Match so as to achieve balance. Balance is
determined by a variety of univariate tests, mainly paired t-tests for
dichotomous variables and univariate Kolmogorov-Smirnov (KS) tests for
multinomial and continuous variables. The loss criterion defining
optimal balance is determined by the loss option. The object
returned by GenMatch can be supplied as the
Weight.matrix option of the Match function to
obtain estimates. GenMatch, via the cluster option, supports
the use of multiple computers, CPUs or cores to perform parallel
computations.

GenMatch(Tr, X, BalanceMatrix=X, estimand="ATT", M=1, weights=NULL,
         pop.size=50, max.generations=100,
         wait.generations=4, hard.generation.limit=FALSE,
         starting.values=rep(1,ncol(X)),
         fit.func="pvals",
         data.type.integer=TRUE,
         MemoryMatrix=TRUE,
         exact=NULL, caliper=NULL, replace=TRUE, ties=TRUE,
         nboots=0, ks=TRUE, verbose=FALSE,
         tolerance=1e-05,
         distance.tolerance=tolerance,
         min.weight=0, max.weight=1000,
         Domains=NULL, print.level=2,
         project.path=NULL,
         paired=TRUE, loss=1,
         restrict=NULL,
         cluster=FALSE, balance=TRUE, ...)

X, but it can
in principle be a matrix which contains more or fewer variables than
X or variables which are transformed in varities
option.

Y which
provides observation-specific weights. If none are provided, equal
weights of 1 for each observation are assumed.

genoud will run when attempting to
optimize a function. This is a soft limit. The maximum
generation limit w

genoud will think that it has
found the optimum. The other variables controlling termination are

max.generations
variable is a binding constraint for genoud. If
hard.generation.limit is FALSE, then

X. This
vector contains the starting weights each of the variables is
given. The starting.values vector is a way for the user
to insert one individual into the

pvals: maximize the p-values from a variety of hypothesis
tests.
qqmean.mean: calculate the mean standardized differe

FALSE, search will be done over floating-point
weights. This is usually an unnecessary degree of precision.

X. If a logical vector is provided, a logical value should
be provided

FALSE, the order of matches generally matters.
Matches will be found in the same order as the data is
sorted. Thus, the match(es) for the first observation will be f

ties==TRUE. If, for example, one treated observation
matches more than one control observation, the matched dataset will
include the multiple matched control observatio

ks test. By default this option is set to zero so no
bootstraps are done. If this is a positive integer, the bootstrap KS
test will be used instead of the usual one. See

cluster option is used.

distance.tolerance are deemed to be equal to zero. This
option can be used to perform a type of optimal

ncol(X) $\times$ 2 matrix.
The first column is the lower bound, and the second column is the
upper bound for each variable over which genoud will
search for weights.

GenMatch will
print details about the population at each generati

t.test should be
used when determining balance.

1,
implies "lexical" optimization: all of the balance statistics will
be sorted from the most discrepant to the least and weights will be
picked which minimize the maximum dis

makeCluster commands in
the snow package or a vector of machine names so GenMatch can
set up the c

genoud.

X.

X. This object corresponds to the
Weight.matrix in the Match function.

index.treated,
index.control and weights objects which are returned by
Match.

X variables. This object has the same length as the number of
covariates in X.

Sekhon, Jasjeet S. 2006. "Matching: Algorithms and Software for
Multivariate and Propensity Score Matching with Balance Optimization
via Genetic Search."

Sekhon, Jasjeet Singh and Walter R. Mebane, Jr. 1998. "Genetic
Optimization Using Derivatives: Theory and Application to Nonlinear
Models." Political Analysis, 7: 187-210.

Match, summary.Match,
MatchBalance, genoud,
balanceMV, balanceUV, qqstats,
ks.boot, GerberGreenImai, lalonde

library(Matching)
set.seed(38913)
data(lalonde)
attach(lalonde)
# The covariates we want to match on
X <- cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74)
# The covariates we want to obtain balance on
BalanceMat <- cbind(age, educ, black, hisp, married, nodegr, u74, u75, re75, re74,
                    I(re74*re75))
# Let's call GenMatch() to find the optimal weight to give each
# covariate in 'X' so that we achieve balance on the covariates in
# 'BalanceMat'. This is only an example, so we want GenMatch to be quick:
# the population size has been set to only 16 via the 'pop.size'
# option.
genout <- GenMatch(Tr=treat, X=X, BalanceMatrix=BalanceMat, estimand="ATE", M=1,
                   pop.size=16, max.generations=10, wait.generations=1)
# The outcome variable
Y <- re78/1000
# Now that GenMatch() has found the optimal weights, let's estimate
# our causal effect of interest using those weights
mout <- Match(Y=Y, Tr=treat, X=X, estimand="ATE", Weight.matrix=genout)
summary(mout)
#
#Let's determine if balance has actually been obtained on the variables of interest
#
mb <- MatchBalance(treat ~ age + educ + black + hisp + married + nodegr + u74 + u75 +
                     re75 + re74 + I(re74*re75),
                   match.out=mout, nboots=500, ks=TRUE, mv=FALSE)
# For more examples see: http://sekhon.berkeley.edu/matching/R.
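
The "lexical" optimization described for the loss option can be easier to follow with a small worked comparison. The following is a minimal base-R sketch and is not code from the Matching package: the helper name lexical_better and the p-values are hypothetical. It assumes that the balance statistics are p-values (so the most discrepant statistic is the smallest one) and that ties on the worst statistic are broken by the next worst.

```r
# Hypothetical helper, not part of Matching: returns TRUE when the balance
# p-values in `p_a` are lexically better than those in `p_b`.
# Assumes both vectors have the same length and contain no NAs.
lexical_better <- function(p_a, p_b) {
  a <- sort(p_a)  # ascending, so the most discrepant (smallest) comes first
  b <- sort(p_b)
  differs <- which(a != b)
  if (length(differs) == 0) return(FALSE)  # identical after sorting: a tie
  first <- differs[1]  # first position at which the sorted vectors disagree
  a[first] > b[first]  # the larger p-value at that position wins
}

# Made-up p-values for three candidate sets of weights
p_a <- c(0.40, 0.12, 0.75)
p_b <- c(0.90, 0.10, 0.95)
p_c <- c(0.12, 0.50, 0.75)

lexical_better(p_a, p_b)  # compares the worst p-values, 0.12 and 0.10
lexical_better(p_c, p_a)  # worst p-values tie at 0.12, so the next worst decides
```

In this sketch p_a is preferred to p_b even though p_b has larger p-values elsewhere, because only the most discrepant statistic matters until there is a tie.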