Finds a distance threshold that defines which controls qualify to be matched with a case.
get_threshold(data, vars, case_var = "case", p_threshold = 0.5, seed = 1600)
A list with items:
The numeric threshold chosen
The data used to fit the logistic regression model
The strata made by make_knn_strata
The fitted logistic regression model
Arguments:
data: The dataset
vars: The variables to use for calculating distance
case_var: The name of the case identifier variable
p_threshold: The probability that the closest matching approach produces a closer match than the random matching approach. The greater p_threshold, the smaller the threshold.
seed: A random seed.
This function fits a logistic regression on distance to predict whether a control is the closest (unique) match for a case rather than a randomly selected one. By default it returns the distance at which that predicted probability is 50% (p_threshold = 0.5).
For more information, please refer to the vignette using
browseVignettes("nncc").
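The relationship between p_threshold and the returned distance can be illustrated with a small base R sketch. This is not the nncc implementation: the simulated distances, the data frame d, and the helper threshold_at() are assumptions made for illustration only. The sketch fits a logistic regression of "closest match vs. random match" on distance, then inverts the fitted logit to find the distance at which the predicted probability equals a chosen value.

```r
# Illustrative sketch only: simulated data, not the nncc implementation.
set.seed(1600)

n <- 500
# Hypothetical distances: closest matches tend to lie nearer than random ones.
dist_closest <- rexp(n, rate = 2)
dist_random  <- rexp(n, rate = 0.5)

d <- data.frame(
  distance = c(dist_closest, dist_random),
  closest  = rep(c(1, 0), each = n)  # 1 = closest match, 0 = random match
)

# Logistic regression of match type on distance.
fit <- glm(closest ~ distance, data = d, family = binomial())

# Hypothetical helper: solve logit(p) = b0 + b1 * distance for distance.
threshold_at <- function(fit, p_threshold = 0.5) {
  b <- unname(coef(fit))
  (qlogis(p_threshold) - b[1]) / b[2]
}

t50 <- threshold_at(fit, 0.5)  # distance where predicted probability is 0.5
t80 <- threshold_at(fit, 0.8)  # stricter probability, smaller distance
```

Because the fitted slope on distance is negative in this setup, raising the probability from 0.5 to 0.8 moves the cut-off to a smaller distance, which matches the statement that a greater p_threshold gives a smaller threshold.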