RecordLinkage (version 0.4-12.5)

Record Linkage Functions for Linking and Deduplicating Data Sets

Description

Provides functions for linking and deduplicating data sets. It implements methods based on a stochastic approach as well as classification algorithms from the machine learning domain. For details, see our paper "The RecordLinkage Package: Detecting Errors in Data" (Sariyar M, Borg A, 2010).
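
As a rough illustration of the stochastic approach mentioned above, the following sketch deduplicates the bundled RLdata500 test data using EM-based weights. The blocking columns and the classification threshold of 0 are illustrative assumptions, not recommended settings; a real analysis would choose them from the data.

```r
library(RecordLinkage)

# RLdata500 is a test data set shipped with the package;
# identity.RLdata500 holds the true identity of each record.
data(RLdata500)

# Build comparison patterns for deduplication. Each element of blockfld is a
# separate blocking criterion: two records are compared if they agree on any
# one of columns 1, 3, 5, 6 or 7 (an illustrative choice).
rpairs <- compare.dedup(RLdata500,
                        identity = identity.RLdata500,
                        blockfld = list(1, 3, 5, 6, 7))

# Stochastic approach: estimate a weight for every pair with the EM algorithm.
rpairs <- emWeights(rpairs)

# Classify pairs by weight. The threshold 0 is an arbitrary example value.
result <- emClassify(rpairs, threshold.upper = 0)

# Print an overview of the classification result.
summary(result)
```

Replacing emWeights and emClassify with epiWeights and epiClassify would follow the same pattern with EpiLink weights, though a suitable threshold then lies on a different scale.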

Install

install.packages('RecordLinkage')
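
After installation, a quick way to check that the package loads is to attach it and inspect the bundled test data. This is a minimal sketch and assumes nothing beyond the RLdata500 data set that ships with the package.

```r
library(RecordLinkage)

# Load the 500-record test data set and its identity vector.
data(RLdata500)

# Seven columns: two first-name components, two last-name components,
# and the date of birth split into year, month and day.
str(RLdata500)
head(RLdata500)

# One identity label per record; records that share a label are duplicates.
length(identity.RLdata500)
```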

Monthly Downloads

5,480

Version

0.4-12.5

License

GPL (>= 2)

Maintainer

Murat Sariyar

Last Published

July 28th, 2025

Functions in RecordLinkage (0.4-12.5)

emWeights

Calculate weights

editMatch

Edit Matching Status

classifySupv

Supervised Classification

deleteNULLs

Remove NULL Values

emClassify

Weight-based Classification of Data Pairs

RecLinkResult.object

Record Linkage Result Object

clone

Serialization of record linkage object.

classifyUnsup

Unsupervised Classification

compare

Compare Records

%append%-methods

Concatenate comparison patterns or classification results

getMinimalTrain

Create a minimal training set

genSamples

Generate Training Set

epiClassify

Classify record pairs with EpiLink weights

ffdf-class

Class "ffdf"

ff_vector-class

Class "ff_vector"

epiWeights

Calculate EpiLink weights

getErrorMeasures-methods

Calculate Error Measures

getFrequencies-methods

Get attribute frequencies

optimalThreshold

Optimal Threshold for Record Linkage

getExpectedSize

Estimate number of record pairs.

getPairs

Extract Record Pairs

mygllm

Generalized Log-Linear Fitting

gpdEst

Estimate Threshold from Pareto Distribution

getPairsBackend

Backend function for getPairs

mrl

Mean Residual Life Plot

isFALSE

Check for FALSE

makeBlockingPairs

Create record pairs from blocks of ids.

getTable-methods

Build contingency table

getParetoThreshold

Estimate Threshold from Pareto Distribution

internals

Internal functions and methods

show

Show a RLBigData object

summary

Print Summary of Record Linkage Data

stochastic

Stochastic record linkage.

summary.RLResult

Summary method for "RLResult" objects.

strcmp

String Metrics

subset

Subset operator for record linkage objects

splitData

Split Data

texSummary

LaTeX Summary of linkage results

phonetics

Phonetic Code

summary.RLBigData

summary methods for "RLBigData" objects.

resample

Safe Sampling

trainSupv

Train a Classifier

unorderedPairs

Create Unordered Pairs

RLResult-class

Class "RLResult"

RecLinkResult-class

Class "RecLinkResult"

RLBigDataDedup

Constructors for big data objects.

RLBigDataDedup-class

Class "RLBigDataDedup"

RLBigData-class

Class "RLBigData"

RLdata

Test data for Record Linkage

RecLinkData.object

Record Linkage Data Object

RLBigDataLinkage-class

Class "RLBigDataLinkage"

RecLinkData-class

Class "RecLinkData"

RecLinkClassif-class

Class "RecLinkClassif"
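
Several entries in the list are help topics rather than single functions. As a small sketch, the strcmp topic covers string metrics such as jarowinkler, levenshteinSim and levenshteinDist, and the phonetics topic covers soundex. The two example names below are made up for illustration.

```r
library(RecordLinkage)

# Two spellings of a similar-sounding name (illustrative values).
a <- "Meyer"
b <- "Meier"

# String metrics: similarity in [0, 1] and raw edit distance.
jarowinkler(a, b)
levenshteinSim(a, b)
levenshteinDist(a, b)   # a single substitution, so the distance is 1

# Phonetic code: similar-sounding names map to the same code.
soundex(a)
soundex(b)
```

Metrics like these can be passed to the comparison step (for example through the strcmp and strcmpfun arguments of compare.dedup) so that near-identical values count as partial agreement instead of a plain mismatch.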