lm_match: Regression-based matching estimator of treatment effects

Description

lm_match estimates treatment effects in matched samples. The function expects the user to provide the outcomes, treatment indicators, and a matching object. It returns point estimates of the average treatment effects and variance estimates. It is possible to estimate treatment effects for subsets of the observations, such as estimates of the average treatment effect for the treated (ATT).

Usage

lm_match(outcomes, treatments, matching, covariates = NULL,
  target = NULL)

Value

The function returns a list of two numeric matrices containing all estimated treatment effects and their estimated variances. The first matrix (effects) contains the estimated treatment effects. Rows in this matrix indicate minuends in the treatment effect contrast, and columns indicate subtrahends. For example, in the matrix:

	a	b	c
a	0.0	4.5	5.5
b	-4.5	0.0	1.0
c	-5.5	-1.0	0.0

the estimated treatment effect between conditions $a$ and $b$ is $4.5$, and the estimated treatment effect between conditions $c$ and $b$ is $-1.0$. In symbols, $E[Y(a) - Y(b) | S] = 4.5$ and $E[Y(c) - Y(b) | S] = -1.0$ where $S$ is the condition set indicated by the target parameter.

The second matrix (effect_variances) contains estimates of variances of the corresponding effect estimators.
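As a minimal sketch of the row/column convention, the example matrix above can be rebuilt in base R and indexed by condition label. The object name `effects` below is only an illustrative stand-in for the first element of the returned list; the values are copied from the example, not produced by lm_match.

```r
# Rebuild the example effects matrix from the text (values copied by hand).
conditions <- c("a", "b", "c")
effects <- matrix(c( 0.0,  4.5, 5.5,
                    -4.5,  0.0, 1.0,
                    -5.5, -1.0, 0.0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(conditions, conditions))

# Row = minuend, column = subtrahend.
effect_a_vs_b <- effects["a", "b"]  # estimate of E[Y(a) - Y(b) | S]
effect_c_vs_b <- effects["c", "b"]  # estimate of E[Y(c) - Y(b) | S]

# Swapping the two conditions in a contrast flips the sign of the estimate.
is_antisymmetric <- all(effects == -t(effects))
```

The same indexing should apply to the effect_variances matrix, since it is described as having the corresponding layout.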

Arguments

outcomes: numeric vector with observed outcomes.
treatments: factor specifying the units' treatment assignments.
matching: qm_matching or scclust object with the matched groups.
covariates: vector, matrix or data frame with covariates to include in the estimation. If NULL, no covariates are included.
target: units to target the estimation for. If NULL, the effect is estimated for all units in the sample (i.e., ATE). A non-null value specifies a subset of units for which the effect should be estimated (e.g., ATT or ATC). If target is a logical vector with the same length as the sample size, units indicated with TRUE will be targeted. If target is an integer vector, the units with indices in target are targeted. If target is a character vector, it should contain treatment labels, and the effect for the corresponding units (as given by treatments) will be estimated.
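
The three accepted forms of target can describe the same set of units. The sketch below illustrates this with a hypothetical helper, `as_target_mask`, which is not part of the package and only mirrors the rules stated above; the package's internal handling may differ.

```r
# Hypothetical treatment assignments for five units.
treatment <- factor(c("T1", "C", "T2", "T1", "C"))

# Three ways to target the units assigned to "T1".
target_logical   <- c(TRUE, FALSE, FALSE, TRUE, FALSE)
target_integer   <- c(1L, 4L)
target_character <- "T1"

# Hypothetical helper (not a package function): convert any accepted
# form of `target` into a logical mask over the units.
as_target_mask <- function(target, treatments) {
  n <- length(treatments)
  if (is.null(target)) return(rep(TRUE, n))        # NULL: all units (ATE)
  if (is.logical(target)) return(target)           # TRUE marks targeted units
  if (is.numeric(target)) return(seq_len(n) %in% target)  # unit indices
  as.character(treatments) %in% target             # treatment labels
}

mask_logical   <- as_target_mask(target_logical, treatment)
mask_integer   <- as_target_mask(target_integer, treatment)
mask_character <- as_target_mask(target_character, treatment)
```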

Details

lm_match estimates treatment effects using weighted regression. The function first derives the unit-level weights implied by the matching. In detail, let $S(g)$ be the number of units indicated by target in group $g$. Let $T$ be the total number of units indicated by target in the sample. Let $A(t, g)$ be the number of units assigned to treatment $t$ in group $g$. The weight for a unit in group $g$ that is assigned to treatment $t$ is given by:

$$\frac{S(g)}{T \times A(t, g)}.$$

See matching_weights for more details.
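
The weight formula can be worked through by hand on a small example. The following base R sketch uses made-up group labels and treatments (all names and values are assumptions for illustration) and targets the treated units; it does not call matching_weights.

```r
# Toy data: six units in two matched groups (hypothetical values).
group     <- c(1, 1, 1, 2, 2, 2)
treatment <- factor(c("T", "C", "C", "T", "T", "C"))
target    <- treatment == "T"   # target the treated units (ATT-style)

# S(g): number of targeted units in each unit's group.
S_g <- ave(as.numeric(target), group, FUN = sum)

# T: total number of targeted units in the sample.
T_total <- sum(target)

# A(t, g): number of units sharing both the unit's treatment and its group.
A_tg <- ave(rep(1, length(group)), treatment, group, FUN = sum)

# Weight for each unit: S(g) / (T * A(t, g)).
unit_weights <- S_g / (T_total * A_tg)

# Within each treatment condition, the weights sum to one in this example.
weight_sums <- tapply(unit_weights, treatment, sum)
```

In this example the treated units each receive weight 1/3, while the single control in the second group receives weight 2/3 because it stands in for two targeted units.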

The function uses the derived weights in a weighted least squares regression (using the lm function) with indicator variables for the treatment conditions. Optionally, covariates can be added to the regression (e.g., a common recommendation is to include the covariates used to construct the matching). Standard errors are estimated with the heteroskedasticity-robust "HC1" estimator in the vcovHC function. Units not assigned to matched groups and units assigned weights of zero are excluded from the estimation.
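
The point-estimate step without covariates can be sketched with base R alone. The data and weights below are hypothetical, and the snippet is an illustration of weighted least squares on treatment indicators rather than the package's actual code. The variance step is omitted; the text above attributes it to the "HC1" option of vcovHC, which comes from the sandwich package.

```r
# Hypothetical toy data: outcomes, treatments, and matching weights.
y         <- c(5, 2, 3, 7, 6, 1)
treatment <- factor(c("T", "C", "C", "T", "T", "C"))
w         <- c(1/3, 1/6, 1/6, 1/3, 1/3, 2/3)

# Weighted least squares with one indicator per treatment condition.
# "0 +" drops the intercept, so each coefficient is a weighted condition mean.
fit <- lm(y ~ 0 + treatment, weights = w)
cond_means <- coef(fit)

# The effect of T relative to C is the difference between the coefficients.
effect_T_vs_C <- unname(cond_means["treatmentT"] - cond_means["treatmentC"])

# The same contrast computed directly from weighted means.
direct <- weighted.mean(y[treatment == "T"], w[treatment == "T"]) -
  weighted.mean(y[treatment == "C"], w[treatment == "C"])
```

Without covariates, the regression contrast and the difference in weighted means coincide; adding covariates to the formula would adjust the contrast for them.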

References

Stuart, Elizabeth A. (2010), ‘Matching Methods for Causal Inference: A Review and a Look Forward’. Statistical Science, 25(1), 1–21. https://doi.org/10.1214/09-STS313

Examples


# Load the package (assumed to be installed)
library(quickmatch)

# Construct example data
my_data <- data.frame(y = rnorm(100),
                      x1 = runif(100),
                      x2 = runif(100),
                      treatment = factor(sample(rep(c("T1", "T2", "C"), c(25, 25, 50)))))

# Make distances
my_distances <- distances(my_data, dist_variables = c("x1", "x2"))

# Make matching
my_matching <- quickmatch(my_distances, my_data$treatment)

# ATE without covariates
lm_match(my_data$y,
         my_data$treatment,
         my_matching)

# ATE with covariates
lm_match(my_data$y,
         my_data$treatment,
         my_matching,
         my_data[c("x1", "x2")])

# ATT for T1
lm_match(my_data$y,
         my_data$treatment,
         my_matching,
         my_data[c("x1", "x2")],
         target = "T1")
