genSamples(dataset, num.non, des.mprop = 0.1)

dataset: RecLinkData. Data pairs from which to sample.

RecLinkResult objects.

pairs components.
The prediction components represent the clustering result. If weights are
present in dataset, the corresponding fractions of Wdata are
stored to train and valid. All other components are copied
from dataset.

classifyUnsup)
requires a sufficient training set of record pairs with known matching status.
Where no such data are available, genSamples can be used to generate
training data. The linkage status is determined by unsupervised
clustering with bclust, and the desired numbers of links and
non-links are then sampled.
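
The cluster-then-sample idea described above can be sketched in base R. This is an illustration only, not the package's implementation: kmeans() stands in for bclust, the comparison patterns are simulated, the names patterns, take, num_non and des_mprop are hypothetical, and des_mprop is assumed to be the ratio of links to non-links in the training set.

```r
# Sketch only: kmeans() replaces the bagged clustering (bclust) named in the text.
set.seed(1)

# Simulated comparison patterns: one row per record pair, values in [0, 1].
n_pairs <- 1000
true_link <- c(rep(TRUE, 50), rep(FALSE, 950))
patterns <- cbind(
  name  = ifelse(true_link, runif(n_pairs, 0.8, 1), runif(n_pairs, 0, 0.3)),
  birth = ifelse(true_link, runif(n_pairs, 0.8, 1), runif(n_pairs, 0, 0.3))
)

# Cluster into two groups; the cluster with the higher mean agreement
# is treated as the predicted links.
km <- kmeans(patterns, centers = 2, nstart = 5)
link_cluster <- which.max(rowMeans(km$centers))
is_link <- km$cluster == link_cluster

# Desired sample sizes (assumed reading of des.mprop: links per non-link).
num_non <- 200
des_mprop <- 0.1
n_link_wanted <- round(des_mprop * num_non)

# Draw `wanted` indices from `idx`, capping at what is available.
take <- function(idx, wanted, label) {
  if (wanted > length(idx)) {
    warning("only ", length(idx), " ", label, " available; using all of them")
    wanted <- length(idx)
  }
  idx[sample.int(length(idx), wanted)]
}

link_idx <- which(is_link)
non_idx <- which(!is_link)
train_idx <- c(take(link_idx, n_link_wanted, "links"),
               take(non_idx, num_non, "non-links"))
valid_idx <- setdiff(seq_len(n_pairs), train_idx)

train <- patterns[train_idx, , drop = FALSE]  # sampled training pairs
valid <- patterns[valid_idx, , drop = FALSE]  # all remaining pairs
```

Indexing with sample.int() rather than sample() avoids R's special handling of length-one vectors when a cluster contains a single pair.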
If the requested number of matches or non-matches is not feasible, a
warning is issued and the maximum possible number is used instead.

splitData for splitting data sets without clustering.
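
A minimal usage sketch follows. It assumes the function comes from the RecordLinkage package and uses that package's bundled RLdata500 data set and compare.dedup function, none of which are named in the text above; the blocking fields are an arbitrary illustrative choice.

```r
library(RecordLinkage)

# Example data shipped with the package (assumption: RLdata500 is available).
data(RLdata500)

# Build comparison patterns; the blocking fields are illustrative only.
rpairs <- compare.dedup(RLdata500, blockfld = list(1, 3, 5, 6, 7))

# Request 200 non-links; des.mprop keeps its default of 0.1.
samples <- genSamples(rpairs, num.non = 200)

# The result holds the sampled training pairs and the remaining pairs.
summary(samples$train)
summary(samples$valid)
```

Because the clustering step involves randomness, the sampled pairs may differ between runs unless a seed is set beforehand.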