Match implements a variety of algorithms for multivariate
matching including propensity score, Mahalanobis and inverse variance
matching. The function is intended to be used in conjunction with the
MatchBalance function which determines the extent to which
Match has been able to achieve covariate balance. In order to
do propensity score matching, one should estimate the propensity model
before calling Match, and then send Match the propensity
score to use. Match enables a wide variety of matching
options including matching with or without replacement, bias
adjustment, different methods for handling ties, exact and caliper
matching, and a method for the user to fine tune the matches via a
general restriction matrix. Variance estimators include the usual
Neyman standard errors, Abadie-Imbens standard errors, and robust
variances which do not assume a homogeneous causal effect. The
GenMatch function can be used to automatically
find balance via a genetic search algorithm which determines the
optimal weight to give each covariate.

Match(Y = NULL, Tr, X, Z = X, V = rep(1, length(Y)), estimand = "ATT", M = 1,
      BiasAdjust = FALSE, exact = NULL, caliper = NULL, replace = TRUE, ties = TRUE,
      CommonSupport = FALSE, Weight = 1, Weight.matrix = NULL, weights = NULL,
      Var.calc = 0, sample = FALSE, restrict = NULL, match.out = NULL,
      distance.tolerance = 1e-05, tolerance = sqrt(.Machine$double.eps),
      version = "standard")

Match will r

Var.calc option,
which takes precedence.

ties
option.

Z matrix.

X. If a logical vector is provided, a logical value should
be provide

FALSE, the order of matches
generally matters. Matches will be found in the same order as the
data are sorted. Thus, the match(es) for the first

ties==TRUE. If, for example, one treated observation
matches more than one control observation, the matched dataset will
include the multiple matched

caliper option is to
be X. The default value of 1 denotes that weights are equal to
the inverse of the variances. 2 denotes the Mahalano

X; see
the Weight option. This square matrix should have as many
columns as the number of columns of the X

Y which
provides observation specific weights.

Var.calc=0 which means that
homoscedasticity is assumed. For values of Var.calc > 0,
robust variances are calculated using Var.calc ma

distance.tolerance are deemed to be equal to zero.
This option can be used to perform a type of optimal m

Match. If this object is provided, then Match will
use the matches found by the previous invocation of the function.
Hence, Match will run faster. This is
u

ties=FALSE or
replace=FALSE if the dataset is lar

X consists of either covariates or a known
propensity score because it takes into account the uncertainty of the
matching procedure. If an estimated propensity score is used, the
uncertainty involved in its estimation is not accounted for although
the uncertainty of the matching procedure itself still is.

BiasAdjust. If BiasAdjust is not requested, this is the
same as est.weights. Note that the
standard error provided by se takes into account the uncertainty
of the matching procedure while se.standard does not. Neither
se nor se.standard take into account the uncertainty of
estimating a propensity score. se.standard does
not take into account any BiasAdjust. Summary of both types
of standard error results can be requested by setting the
full=TRUE flag when using the summary.Match
function on the object returned by Match.

Match. Three datasets are included in this list: Y,
Tr and X.

index.control
can be used to recover the matched dataset produced by
Match. For example, the X matrix used by Match
can be recovered by
rbind(X[index.treated,],X[index.control,]). The user should
generally just examine the output of mdata.

index.treated
can be used to recover the matched dataset produced by
Match. For example, the X matrix used by Match
can be recovered by
rbind(X[index.treated,],X[index.control,]). The user should
generally just examine the output of mdata.

caliper and
exact. If no observations were dropped, this
index will be NULL.

caliper which was used.

X variables. This object has the same length as the number of
covariates in X.

exact function argument.

ndrops.matches, takes into account observation specific
weights which the user may have provided via the weights
argument.

MatchBalance function which checks if the results of this
function have actually achieved balance. The results of this function
can be summarized by a call to the summary.Match
function.
If one wants to do propensity score matching, one should estimate the
propensity model before calling Match, and then place the
fitted values in the X matrix; see the provided example.
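As a hedged sketch of the workflow just described, the following fits a propensity model first and then passes the fitted values to Match through X. The reduced model formula and the caliper value of 0.25 are illustrative assumptions, not recommendations made on this page. The package documentation describes a scalar caliper as being measured in standard deviations of each column of X.

```r
library(Matching)
data(lalonde)

# Assumption: a deliberately small propensity model, for illustration only.
ps_model <- glm(treat ~ age + educ + black + hisp + married + nodegr +
                  re74 + re75,
                family = binomial, data = lalonde)

# The fitted propensity scores become the matching variable X.
pscore <- fitted(ps_model)

# One-to-one ATT matching with replacement on the propensity score.
# Matches farther apart than the (assumed) caliper are dropped.
m_ps <- Match(Y = lalonde$re78, Tr = lalonde$treat, X = pscore,
              estimand = "ATT", M = 1, caliper = 0.25)

summary(m_ps)

# Weighted observations dropped because of caliper or exact matching.
m_ps$ndrops
```

A tighter caliper usually trades sample size for closer matches, so it is worth checking how many observations are dropped before interpreting the estimate.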
The GenMatch function can be used to automatically
find balance by the use of a genetic search algorithm which determines
the optimal weight to give each covariate. The object returned by
GenMatch can be supplied to the Weight.matrix
option of Match to obtain estimates.
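A minimal sketch of that hand-off is shown here, assuming a small set of lalonde covariates chosen only for illustration. The search settings (pop.size, max.generations, wait.generations) are set very low in the interest of speed and would normally be much larger.

```r
library(Matching)
data(lalonde)

# Assumption: an illustrative subset of covariates, not a full specification.
X <- cbind(age = lalonde$age, educ = lalonde$educ,
           re74 = lalonde$re74, re75 = lalonde$re75)

# Genetic search for covariate weights; tiny settings for speed only.
gen <- GenMatch(Tr = lalonde$treat, X = X, estimand = "ATT", M = 1,
                pop.size = 16, max.generations = 10, wait.generations = 1,
                print.level = 0)

# Supply the GenMatch object to Match through Weight.matrix.
m_gen <- Match(Y = lalonde$re78, Tr = lalonde$treat, X = X,
               estimand = "ATT", M = 1, Weight.matrix = gen)

summary(m_gen)
```

Balance on the resulting matches should still be checked with MatchBalance by passing match.out = m_gen.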
Match is often much faster with large datasets if
ties=FALSE or replace=FALSE, that is, if matching is done
by randomly breaking ties or without replacement. Also see the
Matchby function. It provides a wrapper for
Match which is much faster for large datasets when it can be
used.
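The options mentioned above can be compared side by side. This is only a sketch on the small lalonde data, where speed differences will be negligible; the covariate choice is an assumption made for illustration.

```r
library(Matching)
data(lalonde)

# Assumption: an illustrative covariate matrix.
X  <- cbind(lalonde$age, lalonde$educ, lalonde$re74, lalonde$re75)
Y  <- lalonde$re78
Tr <- lalonde$treat

# Default: with replacement, and all tied matches are kept.
m_ties   <- Match(Y = Y, Tr = Tr, X = X, M = 1, ties = TRUE)

# Ties are broken instead of kept, so each treated unit gets one match.
m_noties <- Match(Y = Y, Tr = Tr, X = X, M = 1, ties = FALSE)

# Matching without replacement.
m_norepl <- Match(Y = Y, Tr = Tr, X = X, M = 1, replace = FALSE)

# Number of matched pairs produced under each setting.
c(ties   = length(m_ties$index.treated),
  noties = length(m_noties$index.treated),
  norepl = length(m_norepl$index.treated))
```

Because the order of the data matters when matching without replacement, results from replace = FALSE can change if the rows are reordered.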
Three demos are included: GerberGreenImai, DehejiaWahba,
and AbadieImbens. These can be run by calling the
demo function, for example demo(DehejiaWahba).

Sekhon, Jasjeet S. 2006. "Alternative Balance Metrics for Bias
Reduction in Matching Methods for Causal Inference." Working Paper.
Abadie, Alberto and Guido Imbens. 2006.
"Large Sample Properties of Matching Estimators for Average
Treatment Effects." Econometrica 74(1): 235-267.
Diamond, Alexis and Jasjeet S. Sekhon. 2005. "Genetic Matching for
Estimating Causal Effects: A General Multivariate Matching Method for
Achieving Balance in Observational Studies." Working Paper.
Imbens, Guido. 2004. Matching Software for Matlab and
Stata.

summary.Match,
GenMatch,
MatchBalance,
Matchby,
balanceMV, balanceUV,
qqstats, ks.boot,
GerberGreenImai, lalonde

#
# Replication of Dehejia and Wahba psid3 model
#
# Dehejia, Rajeev and Sadek Wahba. 1999. "Causal Effects in Non-Experimental Studies: Re-Evaluating the
# Evaluation of Training Programs." Journal of the American Statistical Association 94 (448): 1053-1062.
#
data(lalonde)
#
# Estimate the propensity model
#
glm1 <- glm(treat~age + I(age^2) + educ + I(educ^2) + black +
            hisp + married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
            u74 + u75, family=binomial, data=lalonde)
#
# save data objects
#
X <- glm1$fitted.values
Y <- lalonde$re78
Tr <- lalonde$treat
#
# one-to-one matching with replacement (the "M=1" option).
# Estimating the treatment effect on the treated (the "estimand" option defaults to ATT).
#
rr <- Match(Y=Y, Tr=Tr, X=X, M=1)
summary(rr)
# Let's check the covariate balance
# 'nboots' is set to small values in the interest of speed.
# Please increase to at least 500 each for publication quality p-values.
mb <- MatchBalance(treat~age + I(age^2) + educ + I(educ^2) + black +
                   hisp + married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
                   u74 + u75, data=lalonde, match.out=rr, nboots=10)
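
The returned object can also be inspected directly. The sketch below is self-contained and so refits a propensity model; the reduced formula is an assumption made to keep the example short. It requests both kinds of standard error with full=TRUE and rebuilds the matched dataset from index.treated and index.control.

```r
library(Matching)
data(lalonde)

# Assumption: a reduced propensity model, for illustration only.
ps_fit <- glm(treat ~ age + educ + black + hisp + married + nodegr +
                re74 + re75,
              family = binomial, data = lalonde)

rr2 <- Match(Y = lalonde$re78, Tr = lalonde$treat,
             X = fitted(ps_fit), M = 1)

# Report both se (accounts for the matching procedure) and se.standard.
summary(rr2, full = TRUE)

# Rows of the original data that make up the matched sample:
# treated rows first, then their matched controls, in the same order.
matched <- lalonde[c(rr2$index.treated, rr2$index.control), ]

# Each matched pair contributes one treated row and one control row.
nrow(matched) == 2 * length(rr2$index.treated)
```

Rows can appear more than once in matched because matching is done with replacement by default.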