RecordLinkage (version 0.4-11)

gpdEst: Estimate Threshold from Pareto Distribution

Description

Fits a generalized Pareto distribution (GPD) to the record pair weights and uses a quantile of the fitted model as the classification threshold.

Usage

gpdEst(Wdata, thresh = -Inf, quantil = 0.95)

Arguments

Wdata

A numeric vector representing weights of record pairs.

thresh

Threshold for exceedances. Only weights that exceed this value are used to fit the distribution.

quantil

A real number between 0 and 1. The desired quantile.

Value

A real number representing the resulting classification threshold. It is assured that the threshold lies in a reasonable range.

Details

The weights that exceed thresh are fitted to a generalized Pareto distribution (GPD). The estimated shape and scale parameters are used to calculate the classification threshold by the formula $$\mathit{thresh}+\frac{\mathit{scale}}{\mathit{shape}} \left(\left(\frac{n}{k}(1-\mathit{quantil})\right)^{-\mathit{shape}} -1\right)$$ where \(n\) is the total number of weights and \(k\) is the number of exceedances.
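
The formula can be evaluated directly in base R. The following is a minimal sketch, not the package's implementation: the helper name gpd_quantile_threshold is hypothetical, the parameter values are made up for illustration, and the fitting step that would normally supply shape and scale is left out. The formula is undefined for shape = 0, so the sketch rejects that case.

```r
# Hypothetical helper (not part of RecordLinkage): evaluates the formula
# from the Details section for given GPD parameter estimates.
#   thresh  - threshold for exceedances
#   scale   - estimated GPD scale parameter
#   shape   - estimated GPD shape parameter (must be non-zero here)
#   n       - total number of weights
#   k       - number of weights exceeding thresh
#   quantil - desired quantile, strictly between 0 and 1
gpd_quantile_threshold <- function(thresh, scale, shape, n, k, quantil = 0.95) {
  stopifnot(k > 0, k <= n, quantil > 0, quantil < 1, shape != 0)
  thresh + scale / shape * ((n / k * (1 - quantil))^(-shape) - 1)
}

# Made-up illustrative values, not estimates from real data.
example_threshold <- gpd_quantile_threshold(
  thresh = 0.5, scale = 0.1, shape = -0.2,
  n = 1000, k = 100, quantil = 0.95
)
print(example_threshold)
```

When n/k * (1 - quantil) equals 1, the bracketed term vanishes and the result is thresh itself, which is a quick sanity check on the arithmetic.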

See Also

getParetoThreshold for the user-level function.