genSamples(dataset, num.non, des.mprop = 0.1)
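dataset: Object of class RecLinkData. Data pairs from which to sample.

The return value consists of "RecLinkResult" objects, train and valid,
whose pairs components together hold the record pairs of dataset.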
The prediction components represent the clustering result. If weights are
present in dataset, the corresponding values of Wdata are stored to train
and valid. All other components are copied from dataset.

Supervised classification (e.g. classifySupv)
requires a training set of record pairs with known matching status.
Where no such data are available, genSamples can be used to generate
training data. The matching status is determined by unsupervised
clustering with bclust. Subsequently, the desired numbers of links and
non-links are sampled.
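
A typical workflow might look like the following sketch. It assumes the RecordLinkage package with its bundled RLdata500 data set is installed; the blocking fields, the seed, the value of num.non and the choice of "rpart" as classifier are illustrative assumptions, not recommendations.

```r
library(RecordLinkage)

data(RLdata500)

# Build comparison patterns for deduplication.
# The blocking fields are an arbitrary choice for this illustration.
pairs <- compare.dedup(RLdata500, blockfld = list(1, 3, 5, 6, 7))

# Clustering and sampling are random, so fix a seed for repeatability.
set.seed(42)
samples <- genSamples(pairs, num.non = 200, des.mprop = 0.1)

# Train on the generated sample and classify the remaining pairs.
model  <- trainSupv(samples$train, method = "rpart")
result <- classifySupv(model, samples$valid)
summary(result)
```

Because the training labels come from clustering rather than from verified matches, the resulting classifier is only as reliable as the clustering itself.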
If the requested number of matches or non-matches is not feasible, a
warning is issued and the maximum possible number is used.

See splitData for splitting data sets without clustering.
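
The sampling step and the capping of infeasible requests can be illustrated in base R. This is a minimal sketch, not the package's implementation: sample_training_ids is a hypothetical helper, the link labels are supplied directly instead of coming from bclust, and reading des.mprop as the number of links per non-link is an assumption.

```r
# Hypothetical helper, not part of RecordLinkage: draws a training sample
# from pairs that have already been labelled as link / non-link.
sample_training_ids <- function(is_link, num.non, des.mprop = 0.1) {
  link_ids <- which(is_link)
  non_ids  <- which(!is_link)

  # Assumption: des.mprop is read as links per non-link in the training set.
  n_link <- round(des.mprop * num.non)
  n_non  <- num.non

  # Cap infeasible requests at what is available and warn.
  if (n_link > length(link_ids)) {
    warning("Only ", length(link_ids), " links available; using all of them.")
    n_link <- length(link_ids)
  }
  if (n_non > length(non_ids)) {
    warning("Only ", length(non_ids), " non-links available; using all of them.")
    n_non <- length(non_ids)
  }

  # Sampling positions with sample.int avoids the sample(x, n) pitfall
  # that occurs when x has length one.
  train <- sort(c(link_ids[sample.int(length(link_ids), n_link)],
                  non_ids[sample.int(length(non_ids), n_non)]))
  valid <- setdiff(seq_along(is_link), train)
  list(train = train, valid = valid)
}

set.seed(1)
is_link <- c(rep(TRUE, 5), rep(FALSE, 95))  # toy labels: 5 links, 95 non-links
ids <- sample_training_ids(is_link, num.non = 20, des.mprop = 0.1)
```

Requesting more links than exist, for example des.mprop = 0.5 with the toy labels above, triggers the warning and falls back to all five available links.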