matchControls(formula, data = list(), subset, contlabel = "con",
caselabel = NULL, dogrep = TRUE, replace = FALSE)
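data: by default the variables are taken from the environment which matchControls is called from.
dogrep: if TRUE, then contlabel and caselabel are matched using grep, else string comparison (exact equality) is used.
replace: if FALSE, then every control is used only once.
In the returned factor, observations that are neither cases nor matched controls are set to NA.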
The left hand side of the formula must be a factor determining whether an observation belongs to the case or the control group. By default, all observations where a grep of contlabel matches are used as possible controls, and the rest are taken as cases. If caselabel is given, then only those observations are taken as cases. If dogrep = TRUE, then both contlabel and caselabel can be regular expressions.

The right hand side of the formula gives the variables that should be matched. The matching is done using the daisy distance from the cluster package, i.e., a model frame is built from the formula and used as input for daisy. For each case, the nearest control is selected. If replace = FALSE, each control is used only once.
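The matching step described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the package's own code: the data frame d, its column names, and the helper variables are made up for the example, and ties are resolved by simply taking the first minimum.

```r
library(cluster)  # provides daisy()

# Hypothetical toy data: 3 cases followed by 6 possible controls
d <- data.frame(
  group = factor(c(rep("case", 3), rep("cont", 6))),
  Age   = c(30, 40, 50, 29, 33, 41, 48, 53, 60),
  Sex   = factor(c("F", "M", "F", "F", "M", "M", "F", "F", "M"))
)

# Pairwise dissimilarities on the matching variables only;
# as.matrix() labels rows and columns with the row names of d
dist_mat <- as.matrix(daisy(d[, c("Age", "Sex")]))

cases    <- rownames(d)[d$group == "case"]
controls <- rownames(d)[d$group == "cont"]

# For each case pick the nearest remaining control, then drop it
# from the pool so it cannot be reused (the replace = FALSE behaviour)
matched <- character(length(cases))
for (k in seq_along(cases)) {
  dk         <- dist_mat[cases[k], controls]
  matched[k] <- controls[which.min(dk)]
  controls   <- setdiff(controls, matched[k])
}

data.frame(case = cases, control = matched)
```

Because the cases are processed in order and each takes the nearest control still available, the result can depend on the order of the cases when replace = FALSE.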
library(e1071)  # provides matchControls()

## simulate 50 cases and 150 possible controls
Age.case <- 40 + 5 * rnorm(50)
Age.cont <- 45 + 10 * rnorm(150)
Age <- c(Age.case, Age.cont)
Sex.case <- sample(c("M", "F"), 50, prob = c(.4, .6), replace = TRUE)
Sex.cont <- sample(c("M", "F"), 150, prob = c(.6, .4), replace = TRUE)
Sex <- as.factor(c(Sex.case, Sex.cont))
casecont <- as.factor(c(rep("case", 50), rep("cont", 150)))
## now look at the group properties:
boxplot(Age ~ casecont)
barplot(table(Sex, casecont), beside = TRUE)
m <- matchControls(casecont ~ Sex + Age)
## properties of the new groups:
boxplot(Age ~ m$factor)
barplot(table(Sex, m$factor))
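As a further illustration, the call below spells out both labels with exact string comparison and then inspects the returned list. It is a sketch that assumes the e1071 package is installed. The component names cases and controls come from the e1071 documentation rather than from the text above, and the data frame d and the seed are made up for the example.

```r
library(e1071)

set.seed(42)  # arbitrary seed, only for reproducibility
d <- data.frame(
  Age = c(40 + 5 * rnorm(50), 45 + 10 * rnorm(150)),
  Sex = as.factor(c(
    sample(c("M", "F"), 50, prob = c(.4, .6), replace = TRUE),
    sample(c("M", "F"), 150, prob = c(.6, .4), replace = TRUE)
  )),
  casecont = as.factor(c(rep("case", 50), rep("cont", 150)))
)

# With dogrep = FALSE the labels must equal the factor levels exactly
m2 <- matchControls(casecont ~ Sex + Age, data = d,
                    contlabel = "cont", caselabel = "case",
                    dogrep = FALSE, replace = FALSE)

length(m2$cases)                   # number of cases
length(m2$controls)                # one matched control per case
anyDuplicated(m2$controls)         # 0 means no control was reused
table(m2$factor, useNA = "ifany")  # unmatched observations show up as NA
```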