RecordLinkage (version 0.2-0)

getPairs: Extract Record Pairs

Description

Extracts record pairs from data and result objects.

Usage

getPairs(rpairs, max.weight = Inf, min.weight = -Inf, 
         single.rows = FALSE, show = "all", sort = !is.null(rpairs$Wdata))

Arguments

rpairs
A RecLinkData or RecLinkResult object.
max.weight, min.weight
Real numbers. Upper and lower weight thresholds for the pairs included in the output.
single.rows
Logical. Whether to print record pairs in one row instead of two consecutive rows.
show
Selects which records to show, one of "links", "nonlinks", "possible", "all".
sort
Logical. Whether to sort descending by weight.

Value

A data frame. If single.rows is TRUE, each row holds (in this order) the weight of the data pair (possibly NA), id and data fields of the first record and id and data fields of the second record.

If single.rows is not TRUE, each odd row contains the weight followed by id and data fields of the first record, each even row a blank field followed by id and data fields of the second record.

Details

This function extracts record pairs from a RecLinkData or RecLinkResult object for further processing such as a review of possible links.

Arguments max.weight, min.weight and show control which pairs to include in the output. If weights are stored in rpairs$Wdata, all pairs with rpairs$Wdata < max.weight & rpairs$Wdata >= min.weight are returned, so a pair whose weight equals max.weight is excluded while one whose weight equals min.weight is included. Further selection can be made by show to include all data pairs, only links, only non-links or only possible links.

If single.rows is not TRUE, pairs are output on two consecutive lines to enable easy comparison. All data are converted to character, which can lead to a loss of precision for numeric values. Therefore, the two-row format should be used for printing only.
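
The weight condition above can be sketched in base R without the package. This is only an illustration of the selection rule as stated here, not the package's internal code; the weights vector is made-up data standing in for rpairs$Wdata.

```r
# Hypothetical weights standing in for rpairs$Wdata (made-up values)
weights <- c(0.91, 0.75, 0.40, 0.62, 0.10)
min.weight <- 0.40
max.weight <- 0.91

# Selection rule from the text: weight < max.weight and weight >= min.weight
keep <- weights < max.weight & weights >= min.weight
selected <- which(keep)

# Order the selected pair indices by descending weight (mirrors sort = TRUE)
selected <- selected[order(weights[selected], decreasing = TRUE)]
print(selected)
```

Pair 1 (weight equal to max.weight) is dropped and pair 3 (weight equal to min.weight) is kept, which is the asymmetry to keep in mind when choosing thresholds.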

See Also

RecLinkData, RecLinkResult
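
Examples

A minimal usage sketch, assuming the RecordLinkage package is installed and that compare.dedup, epiWeights, epiClassify and the bundled RLdata500 data set are available in the installed version. The blocking fields and the thresholds are arbitrary choices for illustration, not recommended values.

```r
# Assumes the RecordLinkage package is installed; RLdata500 ships with it
library(RecordLinkage)
data(RLdata500)

# Build comparison patterns, blocking on column 1 or column 3 (illustrative choice)
rpairs <- compare.dedup(RLdata500, blockfld = list(1, 3))

# Compute weights, then classify with an arbitrary threshold of 0.5
rpairs <- epiWeights(rpairs)
result <- epiClassify(rpairs, 0.5)

# Pairs classified as links, one pair per row (suitable for further processing)
links <- getPairs(result, single.rows = TRUE, show = "links")

# Pairs with 0.5 <= weight < 0.7 in the two-row layout (for visual review only)
band <- getPairs(result, max.weight = 0.7, min.weight = 0.5)
```

The single-row call is the one to use when the output feeds later computation; the two-row call is meant for reading on screen.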