custom_rec_lin_model

Creates a supervised record linkage model using a custom machine learning (ML) classifier.

The goal of 'automatedRecLin' is to perform record linkage (also known as entity resolution) in unsupervised or supervised settings. It compares pairs of records from two datasets using selected comparison functions to estimate the probability or density ratio between matched and non-matched records. Based on these estimates, it predicts a set of matches that maximizes entropy. For details, see Lee et al. (2022) <https://www150.statcan.gc.ca/n1/pub/12-001-x/2022001/article/00007-eng.htm>, Vo et al. (2023) <https://ideas.repec.org/a/eee/csdana/v179y2023ics0167947322002365.html>, and Sugiyama et al. (2008) <doi:10.1007/s10463-008-0197-x>.

Adam Struzik

automatedRecLin

Record Linkage Based on an Entropy-Maximizing Classifier

Maciej Beręsewicz

custom_rec_lin_model function

<dl><dt>ml_model</dt>
<dd>A trained ML model that predicts the probability of a match based on comparison vectors.</dd>
<dt>vectors</dt>
<dd>An object of class <code>comparison_vectors</code> (a result of the <code>comparison_vectors</code> function), used for training the <code>ml_model</code>.</dd></dl>

Arguments

Author

Create a Custom Record Linkage Model — custom_rec_lin_model

Examples
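
The `ml_model` argument is described as a trained model that predicts the probability of a match from comparison vectors. The sketch below illustrates that idea using base R only. The data are simulated, the column names `dist_name` and `dist_surname` are hypothetical, and logistic regression is just one possible classifier. The sketch does not call `comparison_vectors` or `custom_rec_lin_model`, and it is not confirmed here that the package accepts a `glm` object as `ml_model`.

```r
# Simulated stand-in for comparison vectors: one row per record pair,
# with dissimilarity scores in [0, 1] (0 = identical values) and a
# known match label. Column names are hypothetical.
set.seed(1)
n_match <- 40
n_nonmatch <- 160
train <- data.frame(
  dist_name    = c(rbeta(n_match, 2, 5), rbeta(n_nonmatch, 5, 2)),
  dist_surname = c(rbeta(n_match, 2, 5), rbeta(n_nonmatch, 5, 2)),
  match        = c(rep(1, n_match), rep(0, n_nonmatch))
)

# Logistic regression as an example of a classifier that outputs
# match probabilities.
fit <- glm(match ~ dist_name + dist_surname,
           data = train, family = binomial())

# Predicted match probabilities for two new, made-up record pairs:
# the first is nearly identical on both fields, the second is not.
new_pairs <- data.frame(
  dist_name    = c(0.05, 0.80),
  dist_surname = c(0.10, 0.70)
)
probs <- predict(fit, newdata = new_pairs, type = "response")
print(probs)
```

In the package workflow, the training data would presumably be derived from a `comparison_vectors` object, and the fitted model and that object would then be supplied as the `ml_model` and `vectors` arguments.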