RecordLinkage (version 0.4-12.4)

classifySupv: Supervised Classification

Description

Supervised classification of record pairs based on a trained model.

Usage

classifySupv(model, newdata, ...)

# S4 method for RecLinkClassif,RecLinkData
classifySupv(model, newdata, convert.na = TRUE, ...)

# S4 method for RecLinkClassif,RLBigData
classifySupv(model, newdata, convert.na = TRUE, withProgressBar = (sink.number() == 0), ...)

Value

For the "RecLinkData" method, an S3 object of class "RecLinkResult": a copy of newdata with the additional element rpairs$prediction, which stores the classification result.

For the "RLBigData" method, an S4 object of class "RLResult".

Arguments

model

Object of class "RecLinkClassif". The trained model, as returned by trainSupv.

newdata

Object of class "RecLinkData" or "RLBigData". The data to classify.

convert.na

Logical. Whether to convert missing values in the comparison patterns to 0.
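The effect of this conversion can be illustrated in base R. This is a minimal sketch of the idea, not the package's internal code; the data frame patterns and its column names are hypothetical.

```r
# Hypothetical comparison patterns:
# 1 = agreement, 0 = disagreement, NA = value missing in at least one record
patterns <- data.frame(
  fname = c(1, NA, 0),
  lname = c(1, 1, NA),
  by    = c(NA, 0, 1)
)

# Replace every missing comparison value with 0, so that a missing
# comparison is treated like a disagreement
converted <- patterns
converted[is.na(converted)] <- 0

print(converted)
```

Many classifiers cannot handle missing predictors directly, which is a plausible reason for converting them before prediction.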

withProgressBar

Logical. Whether to display a progress bar.

...

Further arguments for the predict method.

Author

Andreas Borg, Murat Sariyar

Details

The record pairs in newdata are classified by calling the appropriate predict method for model$model.
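The dispatch mechanism described above can be sketched outside the package. The example below fits an rpart tree on hypothetical comparison patterns and calls predict on it; the data and the object names (train, fit, new_patterns) are assumptions made for illustration, and the code is not taken from classifySupv itself.

```r
library(rpart)  # recommended package shipped with standard R installations

# Hypothetical comparison patterns with known match status
train <- data.frame(
  fname    = c(1, 1, 0, 0, 1, 0, 1, 0, 1, 0),
  lname    = c(1, 1, 0, 1, 1, 0, 1, 0, 0, 0),
  is_match = factor(c(1, 1, 0, 0, 1, 0, 1, 0, 0, 0))
)

# Stand-in for the fitted model that a trained classifier object would hold
fit <- rpart(is_match ~ fname + lname, data = train, method = "class",
             control = rpart.control(minsplit = 5))

# New, unlabeled comparison patterns
new_patterns <- data.frame(fname = c(1, 0), lname = c(1, 0))

# predict() dispatches on class(fit), here "rpart", and so selects predict.rpart
pred <- predict(fit, newdata = new_patterns, type = "class")

print(class(fit))
print(pred)
```

A model of a different class, for example one fitted by another training method, would be routed to its own predict method in the same way.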

By default, the "RLBigData" method displays a progress bar unless output is diverted by sink (that is, unless sink.number() is non-zero), e.g. when processing a Sweave file.

See Also

trainSupv for training of classifiers, classifyUnsup for unsupervised classification.

Examples

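# Split data into training and validation set, train and classify with rpart
library(RecordLinkage)
data(RLdata500)
# Build comparison patterns, blocking on fields 1, 3, 5, 6 and 7
pairs <- compare.dedup(RLdata500, identity = identity.RLdata500,
                       blockfld = list(1, 3, 5, 6, 7))
# Use half of the pairs for training and half for validation
l <- splitData(pairs, prop = 0.5, keep.mprop = TRUE)
model <- trainSupv(l$train, method = "rpart", minsplit = 5)
result <- classifySupv(model = model, newdata = l$valid)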
summary(result)
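
Beyond summary, the predictions can be compared with the known match status by hand. The following is a sketch that repeats the workflow above; it assumes that the result stores its predictions in the element prediction and that the true status is available in the is_match column of the pairs table. The fixed seed is an addition to make the random split reproducible.

```r
library(RecordLinkage)

data(RLdata500)
set.seed(1)  # splitData draws a random sample; seed chosen arbitrarily

pairs <- compare.dedup(RLdata500, identity = identity.RLdata500,
                       blockfld = list(1, 3, 5, 6, 7))
l <- splitData(pairs, prop = 0.5, keep.mprop = TRUE)
model <- trainSupv(l$train, method = "rpart", minsplit = 5)
result <- classifySupv(model = model, newdata = l$valid)

# Cross-tabulate true match status against the predicted class
# (assumption: predictions are stored in result$prediction)
confusion <- table(truth = l$valid$pairs$is_match,
                   predicted = result$prediction)
print(confusion)
```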