Match: Multivariate and Propensity Score Matching Estimator for Causal Inference

Description

This function performs multivariate matching. It is intended to be used in conjunction with the MatchBalance function, which checks whether the results of this function have actually achieved balance. To do propensity score matching, estimate the propensity model before calling Match, and then pass Match the propensity scores to use. Match implements the matching algorithm of Abadie and Imbens, which provides principled standard errors when matching is done with covariates or a known propensity score. Ties are handled in a deterministic and coherent fashion.

Usage

Match(Y, Tr, X, Z = X, V = rep(1, length(Y)), estimand = "ATT", M = 1,
      BiasAdj = FALSE, Weight = 1, Weight.matrix = NULL, Var.calc = 0,
      weights = rep(1, length(Y)), caliper = FALSE, exact = FALSE,
      sample = FALSE, tmpdir = NULL, extra.output = FALSE,
      tolerance = 1e-05)

Arguments

Y

A vector containing the outcome of interest. Missing values are not allowed.

Tr

A vector indicating which observations are in the treatment regime and which are not. This can be either a logical vector or a real vector where 0 denotes control and 1 denotes treatment.

X

A matrix containing the variables we wish to match on. This matrix may contain the actual observed covariates, the propensity score, or a combination of both.

Z

A matrix containing the covariates for which we wish to make bias adjustments.

V

A matrix containing the covariates for which the variance of the causal effect may vary. Also see the Var.calc option, which takes precedence.

estimand

A character string for the estimand. The default estimand is "ATT", the sample average treatment effect for the treated. "ATE" is the sample average treatment effect (for all), and "ATC" is the sample average treatment effect for the controls.
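
The difference between the three estimands can be illustrated with a minimal base R sketch. The helper simple_match_effect and the toy data are hypothetical and not part of the package; the sketch assumes a single covariate, one nearest neighbour per unit, and no special handling of ties, so it only illustrates which units each estimand averages over, not what Match computes internally.

```r
# Hypothetical helper (not part of the package): impute each unit's missing
# potential outcome from its nearest neighbour on x in the opposite treatment
# group, then average the unit-level effects over the subpopulation named by
# the estimand.
simple_match_effect <- function(y, tr, x, estimand = c("ATT", "ATE", "ATC")) {
  estimand <- match.arg(estimand)
  tr <- as.logical(tr)
  effect <- numeric(length(y))
  for (i in seq_along(y)) {
    pool <- which(tr != tr[i])                  # units in the opposite group
    j <- pool[which.min(abs(x[pool] - x[i]))]   # nearest neighbour on x
    # Unit-level effect is always (treated outcome) - (control outcome).
    effect[i] <- if (tr[i]) y[i] - y[j] else y[j] - y[i]
  }
  keep <- switch(estimand,
                 ATT = tr,                      # average over treated units
                 ATC = !tr,                     # average over control units
                 ATE = rep(TRUE, length(y)))    # average over all units
  mean(effect[keep])
}

# Toy data, invented for illustration only.
x  <- c(1, 2, 3, 1.1, 2.2, 2.9, 4)
tr <- c(1, 1, 1, 0, 0, 0, 0)
y  <- c(5, 7, 9, 3, 4, 5, 8)

att <- simple_match_effect(y, tr, x, "ATT")
atc <- simple_match_effect(y, tr, x, "ATC")
ate <- simple_match_effect(y, tr, x, "ATE")
```

In this sketch the ATE is the average of the ATT and the ATC weighted by the number of treated and control units.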

M

A scalar for the number of matches which should be found (with replacement). The default is one-to-one matching.
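
As a rough sketch of what finding several matches with replacement means, the following base R code picks the nearest controls for each treated unit on a single covariate. The helper m_nearest_controls is hypothetical, and breaking ties by position (as order does here) is an assumption of the sketch, not a description of how Match handles ties.

```r
# Hypothetical helper (not part of the package): for each treated unit, return
# the indices of its M nearest controls on x. Matching is with replacement,
# so the same control may be returned for several treated units.
m_nearest_controls <- function(x, tr, M = 1) {
  treated  <- which(tr == 1)
  controls <- which(tr == 0)
  matches <- lapply(treated, function(i) {
    d <- abs(x[controls] - x[i])        # distance to every control
    controls[order(d)[seq_len(M)]]      # the M closest controls
  })
  names(matches) <- treated
  matches
}

# Toy data, invented for illustration only.
x  <- c(1, 2, 1.1, 1.4, 2.2)
tr <- c(1, 1, 0, 0, 0)

m2 <- m_nearest_controls(x, tr, M = 2)
```

Here control 4 is among the two nearest controls for both treated units, which is only possible because controls are reused.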

BiasAdj

A logical scalar for whether regression adjustment should be used. See the Z matrix.
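
The idea behind regression adjustment can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the adjustment Match performs: the helper bias_adjusted_att is hypothetical, the regression is a plain lm fit on all controls, matching is one-to-one on a single covariate, and the estimand is the ATT. The matched control's outcome is shifted by the difference in predicted control outcomes between the treated unit and its match.

```r
# Hypothetical helper (not part of the package): nearest-neighbour ATT with
# and without a simple regression adjustment for the remaining covariate gap.
bias_adjusted_att <- function(y, tr, x, z = x) {
  treated  <- which(tr == 1)
  controls <- which(tr == 0)
  # Assumption of this sketch: model the control outcome as linear in z,
  # using all control units.
  fit <- lm(y ~ z, data = data.frame(y = y[controls], z = z[controls]))
  mu0 <- unname(predict(fit, newdata = data.frame(z = z)))
  # Nearest control (on x) for each treated unit.
  j <- vapply(treated, function(i) {
    controls[which.min(abs(x[controls] - x[i]))]
  }, integer(1))
  raw      <- y[treated] - y[j]
  adjusted <- y[treated] - (y[j] + mu0[treated] - mu0[j])
  c(unadjusted = mean(raw), adjusted = mean(adjusted))
}

# Toy data, invented for illustration: control outcomes are exactly 2 * x and
# treated outcomes are 2 * x + 3, so the true effect is 3.
x  <- c(1, 2, 3, 1.4, 2.3, 3.6, 0.2)
tr <- c(1, 1, 1, 0, 0, 0, 0)
y  <- c(5, 7, 9, 2.8, 4.6, 7.2, 0.4)

res <- bias_adjusted_att(y, tr, x)
```

Because the matches are not exact on x, the unadjusted estimate in this toy example is off, while the adjusted estimate recovers the effect built into the data.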

Weight

A scalar for the type of weighting scheme the matching algorithm should use when weighting each of the covariates in X. The default value of 1 denotes that weights are equal to the inverse of the variances. 2 denotes the Maha
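
The inverse-variance weighting described for the default value can be sketched in base R. The helper weighted_distance and the toy matrix are hypothetical, and the quadratic-form distance used here is an assumption about how such weights enter the distance rather than a statement of the package's internals.

```r
# Hypothetical helper (not part of the package): distance between two
# covariate vectors a and b under a weight matrix W, sqrt((a - b)' W (a - b)).
weighted_distance <- function(a, b, W) {
  d <- a - b
  sqrt(drop(t(d) %*% W %*% d))
}

# Toy covariate matrix, invented for illustration only.
X <- cbind(age    = c(20, 30, 40, 50),
           income = c(10000, 30000, 20000, 40000))

# Diagonal weight matrix with the inverse of each covariate's variance.
W <- diag(1 / apply(X, 2, var))

d12 <- weighted_distance(X[1, ], X[2, ], W)
```

With inverse-variance weights the distance does not depend on the units of measurement: expressing income in thousands instead of units leaves d12 unchanged.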

Weight.matrix

This matrix denotes the weights the matching algorithm uses when weighting each of the covariates in X---see the Weight option. This square matrix should have as many columns as the number of columns of the X

Var.calc

A scalar for the variance estimate that should be used. By default Var.calc=0 which means that homoscedasticity is assumed. For values of Var.calc > 0, robust variances are calculated using Var.calc ma

weights

A vector of the same length as Y which provides observation-specific weights.

caliper

A scalar denoting if a caliper should be used when matching. A caliper is the distance which is acceptable for any match. Observations which are outside of the caliper are dropped. If a caliper is to be used, a scalar real value should be

exact

A logical flag for whether exact matching of all covariates in X should be done. When exact matches are not found, observations are dropped. tolerance determines what is considered to be an exact match.
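
A minimal base R sketch of exact matching with a tolerance follows. The helper exact_matches is hypothetical, and comparing each covariate's absolute difference against the tolerance is an assumption of the sketch; Match may measure exactness differently.

```r
# Hypothetical helper (not part of the package): for each treated unit, list
# the controls whose covariates all agree within the tolerance. Treated units
# without any such control are dropped from the result.
exact_matches <- function(X, tr, tolerance = 1e-05) {
  X <- as.matrix(X)
  treated  <- which(tr == 1)
  controls <- which(tr == 0)
  out <- lapply(treated, function(i) {
    same <- vapply(controls, function(j) {
      all(abs(X[i, ] - X[j, ]) < tolerance)
    }, logical(1))
    controls[same]
  })
  names(out) <- treated
  out[lengths(out) > 0]
}

# Toy data, invented for illustration only.
X  <- cbind(c(1, 2, 3, 1, 2 + 1e-7, 5),
            c(0, 1, 0, 0, 1,        1))
tr <- c(1, 1, 1, 0, 0, 0)

em <- exact_matches(X, tr)
```

Treated unit 2 is matched to control 5 even though their first covariate differs by 1e-7, because that difference is below the tolerance. Treated unit 3 has no exact match and is dropped.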

sample

A logical flag for whether the population or sample variance is returned.

tmpdir

The path to a directory where the user has write permission. Match creates temporary files. This directory is by default determined by a call to the tempdir function. In the temporary directory, Match p

extra.output

A logical flag for whether the user wants to have the art.data and aug.data objects returned.

tolerance

This is a scalar which is used to determine if distances are different from zero. Values less than tolerance are deemed to be equal to zero.

Value

est

The estimated average causal effect.

se

The standard error. This standard error is principled if X consists of either covariates or a known propensity score because it takes into account the uncertainty of the matching procedure. If an estimated propensity score is used, the uncertainty involved in its estimation is not accounted for although the uncertainty of the matching procedure itself still is.

est.noadj

The estimated average causal effect without any BiasAdj. If BiasAdj is not requested, this is the same as est.

se.naive

The naive standard error. This is the standard error calculated on the matched data using the usual method of calculating the difference of means (between treated and control) weighted by the observation weights provided by weights. Note that the standard error provided by se takes into account the uncertainty of the matching procedure while se.naive does not. Neither se nor se.naive take into account the uncertainty of estimating a propensity score. se.naive does not take into account any BiasAdj. Summary of the naive results can be requested by setting the full=TRUE flag when using the summary.Match function on the object returned by Match.

se.cond

The conditional standard error. The practitioner should not generally use this.

mdata

A list which contains the matched datasets produced by Match. Three datasets are included in this list: Y, Tr and X.

index.treated

A vector containing the observation numbers from the original dataset for the treated observations in the matched dataset. This index in conjunction with index.control can be used to recover the matched dataset produced by Match. For example, the X matrix used by Match can be recovered by rbind(X[index.treated,],X[index.control,]). The user should generally just examine the output of mdata.

index.control

An index for the control observations in the matched data. This index in conjunction with index.treated can be used to recover the matched dataset produced by Match. For example, the X matrix used by Match can be recovered by rbind(X[index.treated,],X[index.control,]). The user should generally just examine the output of mdata.

weights

The weight for the matched dataset. If all of the observations had a weight of 1 on input, they will have a weight of 1 on output if each observation was only matched once.

orig.nobs

The original number of observations in the dataset.

orig.wnobs

The original number of weighted observations in the dataset.

orig.treated.nobs

The original number of treated observations (unweighted).

nobs

The number of observations in the matched dataset.

wnobs

The number of weighted observations in the matched dataset.

caliper

A logical flag indicating if a caliper was used.

index.caliper

An index listing the observations in the original dataset dropped by the caliper. This is NULL if no caliper was used.

index.treated.nocaliper

Similar to index.treated, but the index that would be produced without a caliper. This is exactly the same as index.treated if no caliper was used.

index.control.nocaliper

Similar to index.control, but the index that would be produced without a caliper. This is exactly the same as index.control if no caliper was used.

index.treated.indata

An index used to index the matched dataset to keep track of dropped observations. The end user should not need to use this index.

index.control.indata

An index used to index the matched dataset to keep track of dropped observations. The end user should not need to use this index.

art.data

Object that can be requested by the extra.output flag for compatibility with Imbens's function.

aug.data

Object that can be requested by the extra.output flag for compatibility with Imbens's function.
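
The use of index.treated and index.control to recover matched data can be sketched in base R without running Match. The index vectors and the covariate matrix below are invented for illustration; in practice they would come from the object returned by Match and from the data passed to it.

```r
# Toy covariate matrix and hypothetical index vectors, invented for
# illustration; real values would be taken from the object returned by Match.
X <- cbind(age  = c(25, 40, 33, 26, 41),
           educ = c(12, 16, 10, 12, 15))
index.treated <- c(1, 2)   # rows of X holding the matched treated units
index.control <- c(4, 5)   # rows of X holding their matched controls

# Stack the treated rows on top of their matched control rows, following the
# rbind() pattern given in the descriptions of the two indices.
matched_X <- rbind(X[index.treated, , drop = FALSE],
                   X[index.control, , drop = FALSE])

# Matched pairs side by side: row k pairs index.treated[k] with index.control[k].
matched_pairs <- data.frame(treated     = index.treated,
                            control     = index.control,
                            age.treated = X[index.treated, "age"],
                            age.control = X[index.control, "age"])
```

The drop = FALSE argument keeps the result a matrix even when only one pair is matched.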

Details

This function is intended to be used in conjunction with the MatchBalance function, which checks whether the results of this function have actually achieved balance. The results of this function can be summarized by a call to the summary.Match function. To do propensity score matching, estimate the propensity model before calling Match, and then place the fitted values in the X matrix; see the provided example. Three demos are included: GerberGreenImai, DehejiaWahba, and AbadieImbens. These can be run by calling the demo function, for example demo(DehejiaWahba).

References

Abadie, Alberto and Guido Imbens. 2004. "Large Sample Properties of Matching Estimators for Average Treatment Effects." Working Paper. http://ksghome.harvard.edu/~.aabadie.academic.ksg/sme.pdf

Sekhon, Jasjeet S. 2004. "The Varying Role of Voter Information Across Democratic Societies." Working Paper. http://jsekhon.fas.harvard.edu/papers/SekhonInformation.pdf

Examples

#
# Replication of Dehejia and Wahba psid3 model
#
# Dehejia, Rajeev and Sadek Wahba. 1999. "Causal Effects in Non-Experimental Studies: Re-Evaluating the
# Evaluation of Training Programs." Journal of the American Statistical Association 94 (448): 1053-1062.
#
data(lalonde)

#
# Estimate the propensity model
#
glm1  <- glm(treat~age + I(age^2) + educ + I(educ^2) + black +
             hisp + married + nodegr + re74  + I(re74^2) + re75 + I(re75^2) +
             u74 + u75, family=binomial, data=lalonde)


#
# Save data objects: the fitted propensity scores, the outcome, and the treatment indicator
#
X  <- glm1$fitted.values
Y  <- lalonde$re78
Tr <- lalonde$treat

#
# One-to-one matching with replacement (the "M=1" option).
# Estimating the treatment effect on the treated (the "estimand" option, which defaults to "ATT").
#
rr <- Match(Y=Y, Tr=Tr, X=X, M=1)
summary(rr)

#
# Let's check for balance
# 'nboots' and 'nmc' are set to small values in the interest of speed.
# Please increase to at least 500 each for publication quality p-values.  
mb  <- MatchBalance(treat~age + I(age^2) + educ + I(educ^2) + black +
                    hisp + married + nodegr + re74  + I(re74^2) + re75 + I(re75^2) +
                    u74 + u75, data=lalonde, match.out=rr, nboots=10, nmc=10)
