RecordLinkage (version 0.4-12.4)

Record Linkage Functions for Linking and Deduplicating Data Sets

Description

Provides functions for linking and deduplicating data sets. Both stochastic methods and classification algorithms from the machine learning domain are implemented. For details, see our paper "The RecordLinkage Package: Detecting Errors in Data", Sariyar M / Borg A (2010).
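As a rough illustration of the deduplication workflow described above, the following sketch uses the RLdata500 example data shipped with the package. The blocking fields and the threshold of 0.55 are illustrative assumptions, not recommended settings, and EpiLink weighting is only one of the weighting methods the package offers.

```r
library(RecordLinkage)

# RLdata500 is example data shipped with the package; identity.RLdata500
# holds the true identity of each record, so results can be evaluated.
data(RLdata500)

# Build comparison patterns for deduplication. Blocking on column 1
# (first name) or on columns 5 to 7 (date of birth) limits the number
# of record pairs that have to be compared.
rpairs <- compare.dedup(
  RLdata500,
  identity = identity.RLdata500,
  blockfld = list(1, 5:7)
)

# Compute EpiLink weights and classify pairs against a threshold.
# The value 0.55 is an assumption chosen for illustration only.
rpairs <- epiWeights(rpairs)
result <- epiClassify(rpairs, threshold.upper = 0.55)

# Print counts of links, non-links and, since the true identity is
# known here, error figures.
summary(result)
```

Linking two separate data sets follows the same pattern, with compare.linkage taking the place of compare.dedup.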

Install

install.packages('RecordLinkage')

Monthly Downloads

20,085

License

GPL (>= 2)

Maintainer

Murat Sariyar Developer

Last Published

November 8th, 2022

Functions in RecordLinkage (0.4-12.4)

emClassify: Weight-based Classification of Data Pairs
RLdata: Test data for Record Linkage
classifySupv: Supervised Classification
classifyUnsup: Unsupervised Classification
RLResult-class: Class "RLResult"
deleteNULLs: Remove NULL Values
editMatch: Edit Matching Status
RLBigDataDedup-class: Class "RLBigDataDedup"
RLBigDataLinkage-class: Class "RLBigDataLinkage"
clone: Serialization of record linkage object.
getFrequencies-methods: Get attribute frequencies
epiWeights: Calculate EpiLink weights
compare: Compare Records
epiClassify: Classify record pairs with EpiLink weights
getExpectedSize: Estimate number of record pairs.
emWeights: Calculate weights
%append%-methods: Concatenate comparison patterns or classification results
RecLinkResult.object: Record Linkage Result Object
genSamples: Generate Training Set
getErrorMeasures-methods: Calculate Error Measures
mygllm: Generalized Log-Linear Fitting
mrl: Mean Residual Life Plot
optimalThreshold: Optimal Threshold for Record Linkage
makeBlockingPairs: Create record pairs from blocks of ids.
getMinimalTrain: Create a minimal training set
gpdEst: Estimate Threshold from Pareto Distribution
getTable-methods: Build contingency table
stochastic: Stochastic record linkage.
getPairs: Extract Record Pairs
internals: Internal functions and methods
ffdf-class: Class "ffdf"
ff_vector-class: Class "ff_vector"
getPairsBackend: Backend function for getPairs
getParetoThreshold: Estimate Threshold from Pareto Distribution
summary: Print Summary of Record Linkage Data
unorderedPairs: Create Unordered Pairs
isFALSE: Check for FALSE
phonetics: Phonetic Code
splitData: Split Data
resample: Safe Sampling
summary.RLBigData: summary methods for "RLBigData" objects.
summary.RLResult: Summary method for "RLResult" objects.
trainSupv: Train a Classifier
strcmp: String Metrics
texSummary: LaTeX Summary of linkage results
subset: Subset operator for record linkage objects
show: Show a RLBigData object
RecLinkClassif-class: Class "RecLinkClassif"
RLBigData-class: Class "RLBigData"
RecLinkData.object: Record Linkage Data Object
RecLinkData-class: Class "RecLinkData"
RecLinkResult-class: Class "RecLinkResult"
RLBigDataDedup: Constructors for big data objects.
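
The string metric and phonetic code entries in the index above can be tried directly on character vectors. The sketch below is a minimal illustration; the two example names are made up for this purpose and do not come from the package documentation.

```r
library(RecordLinkage)

# Two hypothetical spellings of the same surname
a <- "Meier"
b <- "Meyer"

# String metrics (documented under "strcmp")
jw  <- jarowinkler(a, b)      # Jaro-Winkler similarity between 0 and 1
lev <- levenshteinSim(a, b)   # Levenshtein-based similarity between 0 and 1
d   <- levenshteinDist(a, b)  # edit distance; one substitution separates the names

# Phonetic code (documented under "phonetics")
sx <- soundex(c(a, b))

print(c(jarowinkler = jw, levenshteinSim = lev, levenshteinDist = d))
print(sx)
```

Similarity values close to 1 indicate near-identical strings. Which metric suits a given data set depends on the kind of errors expected in it.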