representr (version 0.1.5)

Create Representative Records After Entity Resolution

Description

An implementation of Kaplan, Betancourt, and Steorts (2022) that creates representative records for use in downstream tasks after entity resolution has been performed. Several methods for creating the representative records (data sets) are provided.
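To make the idea of a representative record concrete, here is a minimal base R sketch that keeps one randomly chosen record per linked cluster. It does not call representr. The toy data, the `linkage` vector, and the helper `random_prototypes()` are hypothetical names introduced only for illustration, and the package's own functions may behave differently.

```r
# Toy data: five records referring to three latent entities.
records <- data.frame(
  fname = c("ann", "anne", "bob", "bob", "cara"),
  age   = c(31, 31, 45, 46, 28),
  stringsAsFactors = FALSE
)

# Hypothetical linkage: one cluster label per record, as an
# entity resolution step might return it.
linkage <- c(1, 1, 2, 2, 3)

# Hypothetical helper: pick one record uniformly at random
# from each cluster and return those rows.
random_prototypes <- function(data, linkage) {
  rows_by_cluster <- split(seq_len(nrow(data)), linkage)
  chosen <- vapply(
    rows_by_cluster,
    function(rows) rows[sample.int(length(rows), 1)],
    integer(1)
  )
  data[chosen, , drop = FALSE]
}

set.seed(42)
prototypes <- random_prototypes(records, linkage)
print(prototypes)
```

The result has one row per cluster, so downstream analyses see each entity once rather than once per duplicate record.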

Install

install.packages('representr')

Monthly Downloads

202

Version

0.1.5

License

GPL-3

Maintainer

Andee Kaplan

Last Published

September 5th, 2023

Functions in representr (0.1.5)

rl_reg1

500 records suitable for record linkage with additional regression variables.

pp_weights

Get posterior weights for each record post record-linkage using posterior prototyping.

emp_kl_div

Calculate the empirical KL divergence for a representative dataset as compared to the true dataset.

representr

representr: A package for creating representative records post-record linkage.

represent

Create a representative dataset post record-linkage.

within_category_compare_cpp

Inner column type record distance function.

dist_binary

The distance between two records.

clust_proto_random

Prototype record from a cluster.

clust_composite

Composite record from a cluster using a weighted average of each column's values.

dist_col_type

Inner column type record distance function.
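
As a rough illustration of the composite approach listed above, the following base R sketch collapses one cluster into a single record column by column. It is not the package's implementation: the helper `composite_record()`, the toy cluster, and the weights are hypothetical, and using the highest-weight value for non-numeric columns is an assumption made here.

```r
# Toy cluster: three records believed to describe the same entity.
cluster <- data.frame(
  age  = c(45, 46, 45),
  city = c("ames", "ames", "boone"),
  stringsAsFactors = FALSE
)

# Hypothetical per-record weights (non-negative).
w <- c(0.5, 0.3, 0.2)

# Hypothetical helper: numeric columns get a weighted mean;
# other columns get the value with the largest total weight
# (an assumption for this sketch).
composite_record <- function(cluster, weights) {
  weights <- weights / sum(weights)
  combined <- lapply(cluster, function(col) {
    if (is.numeric(col)) {
      sum(weights * col)
    } else {
      totals <- tapply(weights, col, sum)
      names(totals)[which.max(totals)]
    }
  })
  as.data.frame(combined, stringsAsFactors = FALSE)
}

composite <- composite_record(cluster, w)
print(composite)
```

Unlike a prototype, a composite record need not match any single observed record, since each column is summarized separately.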