multilink

multilink is an R package that implements the methodology presented in the manuscript “Multifile Partitioning for Record Linkage and Duplicate Detection” by Serge Aleshin-Guendel and Mauricio Sadinle, published in the Journal of the American Statistical Association and available on arXiv. It addresses the general problem of multifile record linkage and duplicate detection, in which any number of files are to be linked and any of the files may contain duplicates.
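
To make the problem statement concrete, here is a toy sketch in base R of what a partition of records across several files looks like. It does not call any multilink function. The data frame, its column names, and the entity labels are all made up for illustration: records that share a label are treated as referring to the same entity, a matched pair within one file counts as a duplicate, and a matched pair across files counts as a link.

```r
# Toy illustration only: this does not use the multilink package.
# Six records spread over three files; `entity` is a hypothetical partition
# label assigning each record to a latent entity.
records <- data.frame(
  record = 1:6,
  file   = c(1, 1, 2, 2, 3, 3),
  entity = c(1, 1, 1, 2, 2, 3)
)

# Enumerate all record pairs and keep those that share an entity label.
# Record ids equal row indices here, so they can be used to index columns.
pairs <- t(combn(records$record, 2))
same_entity <- records$entity[pairs[, 1]] == records$entity[pairs[, 2]]
matched <- pairs[same_entity, , drop = FALSE]

# A matched pair within one file is a duplicate; across files it is a link.
same_file <- records$file[matched[, 1]] == records$file[matched[, 2]]
duplicates <- matched[same_file, , drop = FALSE]
links <- matched[!same_file, , drop = FALSE]

print(duplicates)
print(links)
```

In this toy partition, records 1 and 2 are duplicates within file 1, while records 3 and 5 are linked to records in other files. Representing the answer as a single partition is what lets links and duplicates be described together rather than as separate pairwise decisions.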

Installation

You can install the development version of multilink from GitHub with:

install.packages("devtools")
devtools::install_github("aleshing/multilink")

You can install the released version of multilink from CRAN with:

install.packages('multilink')

Monthly Downloads: 144
Version: 0.1.1
License: GPL-3

Maintainer: Serge Aleshin-Guendel
Last Published: June 9th, 2023

Functions in multilink (0.1.1)

gibbs_sampler: Gibbs Sampler for Posterior Inference
dup_data_small: Small Duplicate Dataset
initialize_partition: Initialize the Partition
specify_prior: Specify the Prior Distributions
find_bayes_estimate: Find the Bayes Estimate of a Partition
relabel_bayes_estimate: Relabel the Bayes Estimate of a Partition
reduce_comparison_data: Reduce Comparison Data Size
multilink: Multifile Record Linkage and Duplicate Detection
dup_data: Duplicate Dataset
no_dup_data: No Duplicate Dataset
create_comparison_data: Create Comparison Data
no_dup_data_small: Small No Duplicate Dataset