
PreProcessRecordLinkage (version 1.0.1)

preprocLinkage: Record Linkage with Data Preprocessing

Description

This function performs record linkage together with data preprocessing. It is designed to work across a wide range of datasets by standardizing variable names using synonyms, which makes it easier to integrate and analyze data from different sources.

Usage

preprocLinkage(d1,d2,chz="NULL",var=c("age","sex"),threshold=0.9)

Value

Two csv files or two rdata files.

Arguments

d1

A data frame.

d2

A data frame.

chz

The number of the variable name that the user does not want to change, chosen based on the output of the preproc function.

var

A character vector naming the blocking variables. The user chooses these from the output of the selVar function, which returns the names of the variables common to the two data sets.
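
As a rough illustration of what "common variables" means here, the sketch below uses only base R on two small made-up data frames. It is not the selVar implementation: the data and the name common_vars are hypothetical, and it does not apply the synonym-based name standardization described above.

```r
# Hypothetical stand-ins for d1 and d2; column names are illustrative only.
d1 <- data.frame(age = c(34, 51), sex = c("F", "M"), city = c("Oslo", "Rome"))
d2 <- data.frame(sex = c("M", "F"), age = c(51, 29), income = c(40, 55))

# Names that appear in both data sets: candidates for the var argument.
common_vars <- intersect(names(d1), names(d2))
print(common_vars)
```

Any subset of such shared names could then be passed as var, for example var = c("age", "sex").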

threshold

A numeric value between 0 and 1. The default is 0.9.

Author

Hossein Hassani and Leila Marvian Mashhad.

Details

The results are stored in .csv files; if the number of records exceeds one million, they are stored in .rdata files instead.
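
The size-based choice of output format can be sketched in base R as follows. This is only an assumption about the general idea, not the package's code: save_result, its arguments, and the file stem are hypothetical, and the row count of a single data frame is used as the "number of records".

```r
# Hypothetical helper, not part of PreProcessRecordLinkage.
# Writes df to <stem>.csv, or to <stem>.rdata when it has more than
# max_csv_rows rows, and returns the path that was written.
save_result <- function(df, stem, max_csv_rows = 1e6) {
  if (nrow(df) > max_csv_rows) {
    path <- paste0(stem, ".rdata")
    save(df, file = path)
  } else {
    path <- paste0(stem, ".csv")
    write.csv(df, file = path, row.names = FALSE)
  }
  path
}

# A small made-up result is written as csv into a temporary directory.
small <- data.frame(id = 1:3, by = c(1970, 1985, 1992))
out <- save_result(small, file.path(tempdir(), "linked_d1"))
```

Lowering max_csv_rows in this sketch (for example to 2) switches the same call to the .rdata branch, which is a convenient way to try both paths without a million-row table.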

See Also

selVar, chzInput

Examples

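  library(PreProcessRecordLinkage)
  # RLdata500 and RLdata10000 are example data sets from the RecordLinkage package
  data(RLdata500, RLdata10000, package = "RecordLinkage")
  d1 = RLdata500
  d2 = RLdata10000
  preprocLinkage(d1, d2, var = "by")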
 
