weightit: Generate Balancing Weights

Description

weightit() allows for the easy generation of balancing weights using a variety of available methods for binary, continuous, and multinomial treatments. Many of these methods exist in other packages, which weightit() calls; these packages must be installed to use the desired method. Also included are print and summary methods for examining the output.
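Because most methods rely on other packages, it can help to check which ones are usable before calling weightit(). The following is a minimal sketch of such a check, not part of the package: the lookup table method_pkgs and the helper method_available() are hypothetical names, and the method-to-package mapping is transcribed from the method list in Details.

```r
# Hypothetical lookup table: method name -> package that supplies it,
# transcribed from the method list in Details ("ps" relies on stats::glm()).
method_pkgs <- c(ps = "stats", gbm = "twang", cbps = "CBPS", npcbps = "CBPS",
                 ebal = "ebal", sbw = "sbw", ebcw = "ATE")

# Hypothetical helper (not part of the package): TRUE if the package behind
# `method` can be loaded, FALSE otherwise.
method_available <- function(method) {
  if (!method %in% names(method_pkgs)) {
    stop("Unknown method: ", method)
  }
  requireNamespace(method_pkgs[[method]], quietly = TRUE)
}

# Report which methods are usable in the current R installation
available <- vapply(names(method_pkgs), method_available, logical(1))
print(available)
```

Any method reported as FALSE needs its package installed before it can be requested.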

Usage

weightit(formula,
         data,
         method = "ps",
         estimand = "ATE",
         stabilize = FALSE,
         focal = NULL,
         exact = NULL,
         s.weights = NULL,
         ps = NULL,
         moments = 1,
         int = FALSE,
         verbose = FALSE,
         ...)
# S3 method for weightit
print(x, ...)

Arguments

formula

a formula with a treatment variable on the left hand side and the covariates to be balanced on the right hand side. See glm for more details. Interactions and functions of covariates are allowed.

data

a data set in the form of a data frame that contains the variables in formula.

method

a string of length 1 containing the name of the method that will be used to estimate weights. See Details below for allowable options. The default is "ps".

estimand

the desired estimand. For binary treatments, can be "ATE", "ATT", "ATC", and, for some methods, "ATO". For multinomial treatments, can be "ATE" or "ATT". The default for both is "ATE". This argument is ignored for continuous treatments.

stabilize

logical; whether or not to stabilize the weights. For the methods that involve estimating propensity scores, this involves multiplying each unit's weight by the sum of the weights in that unit's treatment group. For the "ebal" method, this involves using ebalance.trim() to reduce the variance of the weights. Default is FALSE.

focal

when multinomial treatments are used and the "ATT" is requested, which group to consider the "treated" or focal group. This group will not be weighted, and the other groups will be weighted to be more like the focal group.

exact

a vector or the names of variables in data for which weighting is to be done within categories. For example, if exact = "gender", weights will be generated separately within each level of the variable "gender".
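As an illustration of weighting within categories, the sketch below estimates propensity score weights separately within each level of a grouping variable using base R only. The simulated data and the helper ate_weights() are assumptions made for the example; this is not necessarily how weightit() carries out the computation internally.

```r
set.seed(1)
n <- 400
d <- data.frame(
  gender = sample(c("F", "M"), n, replace = TRUE),
  age    = rnorm(n, 40, 10)
)
d$treat <- rbinom(n, 1, plogis(-2 + 0.05 * d$age + 0.5 * (d$gender == "M")))

# Hypothetical helper: ATE weights from a logistic propensity score model
ate_weights <- function(dat) {
  p <- fitted(glm(treat ~ age, data = dat, family = binomial()))
  ifelse(dat$treat == 1, 1 / p, 1 / (1 - p))
}

# Estimate the weights separately within each level of the exact variable
d$w <- NA_real_
for (g in unique(d$gender)) {
  in_g <- d$gender == g
  d$w[in_g] <- ate_weights(d[in_g, ])
}

# Weighted mean of age by treatment group, within each level of gender;
# the two treatment groups should be close to each other in each column
sapply(split(d, d$gender), function(s) {
  tapply(s$w * s$age, s$treat, sum) / tapply(s$w, s$treat, sum)
})
```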

s.weights

A vector of sampling weights or the name of a variable in data that contains sampling weights. These can also be matching weights if weighting is to be used on matched data.

ps

A vector of propensity scores or the name of a variable in data containing propensity scores. If not NULL, method is ignored, and the propensity scores will be used to create weights. formula must include the treatment variable in data, but the listed covariates will play no role in the weight estimation.
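For a binary treatment, the conversion from propensity scores to weights can be sketched with the standard formulas for each estimand (the ATO weights follow Li, Morgan, & Zaslavsky, 2016). The helper ps_to_weights() is a hypothetical name used only for this illustration, and it assumes the treatment is coded 0/1.

```r
# Hypothetical helper, not part of the package: weights for a binary
# treatment (coded 0/1) from propensity scores p = P(treat = 1 | covariates).
ps_to_weights <- function(treat, p, estimand = c("ATE", "ATT", "ATC", "ATO")) {
  estimand <- match.arg(estimand)
  stopifnot(all(treat %in% c(0, 1)), all(p > 0 & p < 1))
  switch(estimand,
    ATE = treat / p + (1 - treat) / (1 - p),          # inverse probability
    ATT = treat + (1 - treat) * p / (1 - p),          # treated left unweighted
    ATC = treat * (1 - p) / p + (1 - treat),          # controls left unweighted
    ATO = treat * (1 - p) + (1 - treat) * p           # overlap weights
  )
}

treat <- c(1, 1, 0, 0)
p     <- c(0.8, 0.5, 0.5, 0.2)
ps_to_weights(treat, p, "ATE")  # 1.25 2.00 2.00 1.25
ps_to_weights(treat, p, "ATT")  # 1.00 1.00 1.00 0.25
ps_to_weights(treat, p, "ATO")  # 0.2 0.5 0.5 0.2
```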

moments

numeric; for entropy balancing, empirical balancing calibration weights, and stable balancing weights, the greatest moment of the covariate distribution to be balanced. For example, if moments = 3, for all non-categorical covariates, the first moment (mean), second moment (variance), and third moment (skew) of the covariates will be balanced. This argument is ignored for other methods; to balance powers of the covariates, appropriate functions must be entered in formula.

int

logical; for entropy balancing, empirical balancing calibration weights, and stable balancing weights, whether first-order interactions of the covariates are to be balanced (essentially balancing the covariances between covariates). This argument is ignored for other methods; to balance interactions between the variables, appropriate functions must be entered in formula.
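One way to picture what moments and int request is to write out the terms whose means would be balanced. The sketch below builds that set of columns for numeric covariates in base R; expand_terms() is a hypothetical helper for illustration and is not the package's internal code.

```r
set.seed(2)
covs <- data.frame(age = rnorm(5, 40, 10), educ = rnorm(5, 12, 2))

# Hypothetical helper: expand numeric covariates into the terms implied by
# `moments` (powers 1..moments) and `int` (pairwise products).
expand_terms <- function(covs, moments = 1, int = FALSE) {
  out <- list()
  for (v in names(covs)) {
    for (k in seq_len(moments)) {
      nm <- if (k == 1) v else paste0(v, "^", k)
      out[[nm]] <- covs[[v]]^k
    }
  }
  if (int && ncol(covs) > 1) {
    pairs <- combn(names(covs), 2)
    for (j in seq_len(ncol(pairs))) {
      a <- pairs[1, j]
      b <- pairs[2, j]
      out[[paste0(a, ":", b)]] <- covs[[a]] * covs[[b]]
    }
  }
  do.call(cbind, out)
}

# Terms that would be balanced with moments = 3 and int = TRUE
colnames(expand_terms(covs, moments = 3, int = TRUE))
```

With the defaults (moments = 1, int = FALSE), only the covariates themselves appear, so only their means are balanced.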

verbose

whether to print additional information output by the fitting function.

...

other arguments for functions called by weightit that control aspects of fitting that are not covered by the above arguments. See Details.

x

a weightit object; the output of a call to weightit().

Value

A weightit object with the following elements:

weights

The estimated weights, one for each unit.

treat

The values of the treatment variable.

covs

The covariates used in the fitting. Only includes the raw covariates, which may have been altered in the fitting process.

data

The data.frame originally entered to weightit().

estimand

The estimand requested.

method

The weight estimation method specified.

ps

The estimated or provided propensity scores.

s.weights

The provided sampling weights.

treat.type

The type of treatment: binary, continuous, or multinomial ("multi").

focal

The focal group (treatment level) if the ATT was requested with a multinomial treatment.

Details

The primary purpose of weightit() is as a dispatcher to other functions in other packages that perform the estimation of balancing weights. These functions are identified by a name, which is used in method to request them. Each method has some slight distinctions in how it is called, but in general, simply entering the method will cause weightit() to generate the weights correctly using the function. To use each method, the package containing the function must be installed, or else an error will appear. Below are the methods allowed and some information about each.

"ps": Propensity score weighting using GLM. For binary treatments, this method estimates the propensity scores using glm(). An additional argument is link, which uses the same options as link in family. The default link is "logit", but others, including "probit", are allowed. The weights for the ATE, ATT, and ATC are computed from the estimated propensity scores using the standard formulas, and the weights for the ATO are computed as in Li, Morgan, & Zaslavsky (2016). For multinomial treatments, the propensity scores are estimated using multinomial regression from one of two functions depending on the requested link: for logit ("logit") and probit ("probit") links, mlogit() from the mlogit package is used, and for the Bayesian probit ("bayes.probit") link, MNP() from the MNP package is used. If mlogit is not installed, a series of binary regressions using glm() will be run instead, with estimated propensities normalized to sum to 1. These are the only three links allowed for multinomial treatments at this time. (These methods can fail to converge, yielding errors that may seem foreign.) For continuous treatments, the generalized propensity score is estimated using linear regression with a normal density, but other families and links are allowed, such as poisson for count data, using the family and link arguments. An additional argument, num.formula, may be specified, containing the stabilization variables on the right hand side. For all treatment types except multinomial treatments with a Bayesian probit link, sampling weights are supported, but a warning message from glm() may appear.
"gbm": Propensity score weighting using generalized boosted modeling. This method, which can also be requested as "gbr" or "twang", uses functions from the twang package to perform generalized boosted modeling to estimate propensity scores that yield balance on the requested covariates. For binary treatments, ps() is used, and the ATE, ATT, and ATC can be requested. For multinomial treatments, mnps() is used, and the ATE or ATT can be requested. For both, the weightit() argument s.weights corresponds to the ps() and mnps() argument sampw. The weightit() argument focal corresponds to the mnps() argument treatATT. For both, a single stop method must be supplied to stop.method; only one can be entered at a time. The other arguments to ps() and mnps() can be specified in the call to weightit(). See ps and mnps for details.
"cbps": Covariate Balancing Propensity Score weighting. This method uses the CBPS() function from the CBPS package to estimate propensity scores and weights. It works with binary, multinomial, and continuous treatments. For binary treatments, the ATE, ATT, and ATC can be requested. For multinomial treatments, only the ATE can be requested. The weightit() argument s.weights corresponds to the CBPS() argument sampling.weights. CBPS() can fit either an over-identified model or a model that only contains covariate balancing conditions; this option is typically specified with the method argument to CBPS(), but because this argument is already used in weightit(), a new argument, over, can be specified. over = FALSE in weightit() is equivalent to method = "exact" in CBPS(). The other arguments to CBPS() can be specified in the call to weightit(). See CBPS for details.
"npcbps": Non-parametric Covariate Balancing Propensity Score weighting. This method uses the npCBPS() function from the CBPS package to estimate weights. It works with binary, multinomial, and continuous treatments. For binary and multinomial treatments, only the ATE can be requested. Sampling weights are not supported. The other arguments to npCBPS() can be specified in the call to weightit(). See npCBPS for details.
"ebal": Entropy balancing. This method uses the ebalance() function from the ebal package to estimate weights. It works with binary and multinomial treatments. For binary treatments, the ATE, ATT, and ATC can be requested. For multinomial treatments, the ATE and ATT can be requested. If the ATT is requested with a multinomial treatment, one treatment level must be entered to focal to serve as the "treated". Sampling weights are supported and are automatically entered into base.weight in ebalance(). When stabilize = TRUE, ebalance.trim() is used to trim and reduce the variance of the weights. The other arguments to ebalance() can be specified in the call to weightit(). See ebalance for details.
"sbw": Stable balancing weights. This method uses the sbw() function from the sbw package to estimate weights. For binary treatments, the ATE, ATT, and ATC can be requested. For multinomial treatments, the ATE and ATT can be requested. If the ATT is requested with a multinomial treatment, one treatment level must be entered to focal to serve as the "treated". Sampling weights are not supported. The other arguments to sbw() can be specified in the call to weightit(). See sbw for details. There are some slight differences between the default options in weightit() and sbw(); importantly, in weightit() when bal_tols_sd is TRUE (the default), the standardized mean difference is not used for categorical variables, and the denominator of the standardized mean difference corresponds to the standard deviation of the target group (e.g., for the ATT, the denominator is the standard deviation of the treated group).
"ebcw": Empirical balancing calibration weighting. This method uses the ATE() function from the ATE package to estimate weights. It works with binary and multinomial treatments. For binary treatments, the ATE, ATT, and ATC can be requested. For multinomial treatments, only the ATE can be requested. Sampling weights are not supported. The other arguments to ATE() can be specified in the call to weightit(). See ATE for details.

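The fallback described under "ps" for multinomial treatments (a series of binary regressions with the estimated propensities normalized to sum to 1) can be sketched in base R as follows. The simulated data and all object names are assumptions for illustration; the actual internals of weightit() may differ.

```r
set.seed(3)
n <- 600
x <- rnorm(n)

# Simulated three-level treatment whose probabilities depend on x
lin   <- cbind(a = 0, b = 0.8 * x, c = -0.5 * x)
probs <- exp(lin) / rowSums(exp(lin))
treat <- factor(apply(probs, 1, function(p) sample(c("a", "b", "c"), 1, prob = p)))

# One binary logistic regression per treatment level
raw <- sapply(levels(treat), function(lev) {
  fitted(glm(as.numeric(treat == lev) ~ x, family = binomial()))
})

# Normalize so that each unit's estimated propensities sum to 1
ps <- raw / rowSums(raw)

# ATE weight: inverse of the propensity for the level actually received
idx <- cbind(seq_len(n), as.integer(treat))
w   <- 1 / ps[idx]

range(rowSums(ps))     # both values equal 1 up to rounding
tapply(w, treat, sum)  # each group's weights sum to roughly n
```
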
References

Binary treatments

method = "ps"

- estimand = "ATO"

Li, F., Morgan, K. L., & Zaslavsky, A. M. (2016). Balancing Covariates via Propensity Score Weighting. Journal of the American Statistical Association, 0(ja), 0–0.

- Other estimands

Austin, P. C. (2011). An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies. Multivariate Behavioral Research, 46(3), 399–424. 10.1080/00273171.2011.568786

method = "gbm"

McCaffrey, D. F., Ridgeway, G., & Morral, A. R. (2004). Propensity Score Estimation With Boosted Regression for Evaluating Causal Effects in Observational Studies. Psychological Methods, 9(4), 403–425. 10.1037/1082-989X.9.4.403

method = "cbps"

Imai, K., & Ratkovic, M. (2014). Covariate balancing propensity score. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1), 243–263.

method = "npcbps"

Fong, Christian, Chad Hazlett, and Kosuke Imai (2015). Parametric and Nonparametric Covariate Balancing Propensity Score for General Treatment Regimes. Unpublished Manuscript. <http://imai.princeton.edu/research/files/CBGPS.pdf>

method = "ebal"

Hainmueller, J. (2012). Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies. Political Analysis, 20(1), 25–46. 10.1093/pan/mpr025

method = "sbw"

Zubizarreta, J. R. (2015). Stable Weights that Balance Covariates for Estimation With Incomplete Outcome Data. Journal of the American Statistical Association, 110(511), 910–922. 10.1080/01621459.2015.1023805

method = "ebcw"

Chan, K. C. G., Yam, S. C. P., & Zhang, Z. (2016). Globally efficient non-parametric inference of average treatment effects by empirical balancing calibration weighting. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(3), 673–700. 10.1111/rssb.12129

Multinomial treatments

method = "ps"

McCaffrey, D. F., Griffin, B. A., Almirall, D., Slaughter, M. E., Ramchand, R., & Burgette, L. F. (2013). A Tutorial on Propensity Score Estimation for Multiple Treatments Using Generalized Boosted Models. Statistics in Medicine, 32(19), 3388–3414. 10.1002/sim.5753

method = "gbm"

Continuous treatments

method = "ps"

Robins, J. M., Hernán, M. Á., & Brumback, B. (2000). Marginal Structural Models and Causal Inference in Epidemiology. Epidemiology, 11(5), 550–560.

method = "cbps"

Examples

# NOT RUN {
library("WeightIt")  # provides weightit()
library("cobalt")    # provides bal.tab() and the lalonde data
data("lalonde", package = "cobalt")

#Balancing covariates between treatment groups
(W1 <- weightit(treat ~ age + educ + married +
                nodegree + re74, data = lalonde,
                method = "ps", estimand = "ATT"))
summary(W1)
bal.tab(W1)

#Balancing covariates with respect to race (multinomial)
(W2 <- weightit(race ~ age + educ + married +
                nodegree + re74, data = lalonde,
                method = "ebal", estimand = "ATE"))
summary(W2)
bal.tab(W2)

#Balancing covariates with respect to re75 (continuous)
(W3 <- weightit(re75 ~ age + educ + married +
                nodegree + re74, data = lalonde,
                method = "cbps", over = FALSE))
summary(W3)
bal.tab(W3)
# }
