A newer version (0.4-12.6) of this package is available.

RecordLinkage (version 0.3-3)

Record Linkage in R

Description

Provides functions for linking and deduplicating data sets. Methods based on a stochastic approach are implemented, as well as classification algorithms from the machine learning domain.

Install

install.packages('RecordLinkage')
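
Once the package is installed, a deduplication run with the stochastic approach might look like the sketch below. It assumes the bundled example data RLdata500 and its identity vector identity.RLdata500 are available; the blocking fields and the two thresholds are illustrative choices, not tuned or recommended values.

```r
library(RecordLinkage)

# Example data shipped with the package: 500 person records, some duplicated.
data(RLdata500)

# Build comparison patterns for deduplication. Blocking on the first name
# component (column 1) or on the full date of birth (columns 5 to 7) keeps
# the number of candidate pairs small. Both choices are assumptions made
# for this sketch.
rpairs <- compare.dedup(RLdata500,
                        identity = identity.RLdata500,
                        blockfld = list(1, 5:7))

# Stochastic approach: estimate a weight per comparison pattern with EM.
rpairs <- emWeights(rpairs)

# Inspect the weight distribution before settling on thresholds.
summary(rpairs)

# Classify pairs as links, possible links or non-links.
# The threshold values are illustrative only.
result <- emClassify(rpairs, threshold.upper = 24, threshold.lower = -7)
summary(result)
```

Because the true match status is passed through identity, the final summary can also report error measures; on real data without known matches, the thresholds would have to be chosen by inspecting the weights or reviewing candidate pairs.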

Monthly Downloads

2,587

Version

0.3-3

License

GPL (>= 2)

Maintainer

Murat Sariyar

Last Published

July 2nd, 2011

Functions in RecordLinkage (0.3-3)

RLdata
Test data for Record Linkage

RLBigDataLinkage-class
Class "RLBigDataLinkage"

RecLinkData-class
Class "RecLinkData"

RecLinkData.object
Record Linkage Data Object

RecLinkResult.object
Record Linkage Result Object

RecLinkClassif-class
Class "RecLinkClassif"

RLResult-class
Class "RLResult"

classifyUnsup
Unsupervised Classification

classifySupv
Supervised Classification

RLBigDataDedup
Constructors for big data objects.

RecLinkResult-class
Class "RecLinkResult"

emWeights
Calculate weights

compare
Compare Records

delete.NULLs
Remove NULL Values

genSamples
Generate Training Set

getPairsBackend
Backend function for getPairs

emClassify
Weight-based Classification of Data Pairs

getExpectedSize
Estimate number of record pairs.

getFrequencies-methods
Get attribute frequencies

getPairs
Extract Record Pairs

internals
Internal functions and methods

getMinimalTrain
Create a minimal training set

isFALSE
Check for FALSE

getErrorMeasures-methods
Calculate Error Measures

getTable-methods
Build contingency table

editMatch
Edit Matching Status

phonetics
Phonetic Code

makeBlockingPairs
Create record pairs from blocks of ids.

epiWeights
Calculate EpiLink weights

resample
Safe Sampling

show
Show a RLBigData object

splitData
Split Data

optimalThreshold
Optimal Threshold for Record Linkage

summary.RLBigData
summary methods for "RLBigData" objects.

stochastic
Stochastic record linkage.

subset
Subset operator for record linkage objects

gpdEst
Estimate Threshold from Pareto Distribution

clone
Serialization of record linkage object.

%append%-methods
Concatenate comparison patterns or classification results

strcmp
String Metrics

mygllm
Generalized Log-Linear Fitting

epiClassify
Classify record pairs with EpiLink weights

texSummary
LaTeX Summary of linkage results

mrl
Mean Residual Life Plot

unorderedPairs
Create Unordered Pairs

summary
Print Summary of Record Linkage Data

RLBigDataDedup-class
Class "RLBigDataDedup"

getParetoThreshold
Estimate Threshold from Pareto Distribution

trainSupv
Train a Classifier

RLBigData-class
Class "RLBigData"
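
The strcmp and phonetics entries in the list above cover helper functions that can also be called directly on character vectors. The following is a small sketch; the example names are made up for illustration, and the exact similarity values and codes are left unprinted here because they depend on the implementation.

```r
library(RecordLinkage)

# String metrics (help topic "strcmp"): similarity scores between 0 and 1,
# where 1 means the two strings are identical.
jarowinkler("Meyer", "Meier")
levenshteinSim("Meyer", "Meier")

# Edit distance: the number of single-character insertions, deletions
# and substitutions needed to turn one string into the other.
d <- levenshteinDist("kitten", "sitting")

# Phonetic codes (help topic "phonetics"): intended to map names that
# sound alike to the same code, which is useful for blocking.
soundex(c("Meyer", "Meier"))
pho_h(c("Meyer", "Meier"))
```

Comparing records on phonetic codes or fuzzy string scores rather than exact equality is one way to tolerate typing errors in names when building record pairs.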