reclin (version 0.1.2)

score_simsum: Score pairs by summing the similarity vectors

Description

Computes a score for each pair by summing its similarity vectors, i.e. the comparison columns of the pairs object.

Usage

score_simsum(pairs, var = "simsum", by, add = TRUE, na_value = 0, ...)

Arguments

pairs

a pairs object, such as generated by pair_blocking

var

a character vector of length 1 with the name of the variable that will be created.

by

a character vector with the names of the columns in pairs that should be summed. When missing, the by attribute of pairs is used.

add

when TRUE, add the variable to the pairs object and return the pairs object. Otherwise, return a vector with the scores.

na_value

the value substituted for missing values in the summed columns (0 by default).

...

passed on to other methods.

Value

When add = TRUE the original pairs object is returned with the column given by var added to it. Otherwise a vector with scores is returned.

Details

The scores are calculated by summing the columns given by by. Missing values are replaced by na_value before summing, so with the default na_value = 0 they are counted as zeros.
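
The computation described above can be sketched in base R. This is an illustration of the documented behaviour, not the package's actual implementation; the data frame sims and its values are made up for the example, and rowSums is used here as an assumed stand-in for the summation.

```r
# Hypothetical comparison results for three pairs; NA marks a comparison
# that could not be made (e.g. a value missing in one of the records).
sims <- data.frame(
  lastname  = c(1, 0, NA),
  firstname = c(1, 1, 0),
  sex       = c(NA, 1, 1)
)

by <- c("lastname", "firstname", "sex")  # columns to sum
na_value <- 0                             # replacement for missing values

# Replace missing similarities by na_value, then sum across the columns.
m <- as.matrix(sims[by])
m[is.na(m)] <- na_value
simsum <- rowSums(m)

print(unname(simsum))
```

Choosing a different na_value, for example 0.5, would make a missing comparison count as half an agreement rather than as a disagreement.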

Examples

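library(reclin)

# Load the two example data sets shipped with reclin
data("linkexample1", "linkexample2")
# Generate candidate pairs that agree on postcode
pairs <- pair_blocking(linkexample1, linkexample2, "postcode")
# Compare the pairs on the linkage variables
pairs <- compare_pairs(pairs, c("lastname", "firstname", "address", "sex"))
# Add the summed similarity score as the column "simsum"
pairs <- score_simsum(pairs)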