
PSsurvival (version 0.2.0)

estimate_ps: Propensity Score Estimation for PSsurvival Package

Description

Functions for estimating propensity scores for binary and multiple treatment groups.

Fits a propensity score model and extracts propensity scores for binary or multiple treatment groups. For binary treatments, uses binomial logistic regression. For multiple treatments (>2 levels), uses multinomial logistic regression to estimate generalized propensity scores.

Usage

estimate_ps(data, treatment_var, ps_formula, ps_control = list())

Value

A list with the following components:

ps_model

The fitted propensity score model object (class glm for binary treatment or multinom for multiple treatments).

ps

A numeric vector of propensity scores representing the probability of receiving the actual treatment each individual received. Length equals the number of rows in data.

ps_matrix

A numeric matrix of dimension n × K where n is the number of observations and K is the number of treatment levels. Each row contains the predicted probabilities for all treatment levels. Column names correspond to treatment levels.

n_levels

An integer indicating the number of treatment levels.

treatment_levels

A vector of unique treatment values sorted by sort(): numerically for numeric, alphabetically for character, by factor level order for factor.
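The ordering rule for treatment_levels and the relationship between ps and ps_matrix can be illustrated in base R. This is a minimal sketch on hand-made values, not output from estimate_ps: the treatment vector and the probabilities are invented for illustration, and the indexing step is an assumption about how the two components relate, inferred from their descriptions above.

```r
# Level ordering as produced by sort() for three common codings.
lev_num <- sort(unique(c(2, 1, 2, 1)))                       # numeric order
lev_chr <- sort(unique(c("Treated", "Control", "Treated")))  # alphabetical order
f <- factor(c("low", "high", "low"), levels = c("low", "high"))
lev_fac <- as.character(sort(unique(f)))                     # factor level order

# Hypothetical treatment vector and an n x K probability matrix (rows sum to 1).
treatment <- c("A", "C", "B", "A")
treatment_levels <- sort(unique(treatment))
ps_matrix <- matrix(
  c(0.60, 0.30, 0.10,
    0.20, 0.30, 0.50,
    0.10, 0.70, 0.20,
    0.50, 0.25, 0.25),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, treatment_levels)
)

# For each row, pick the column of the treatment actually received.
col_index <- match(treatment, treatment_levels)
ps <- ps_matrix[cbind(seq_along(treatment), col_index)]
ps
```

Here ps has one entry per row of the data, and each entry is the probability in that row's own treatment column.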

Arguments

data

A data.frame containing the analysis data (typically the cleaned data with complete cases).

treatment_var

A character string specifying the name of the treatment variable in data. The variable can be numeric, character, or factor with any coding (e.g., 0/1, 1/2, "Control"/"Treated"). The function assumes the treatment has already been validated to have 2 or more levels.

ps_formula

A formula object for the propensity score model, of the form treatment ~ covariates.

ps_control

An optional list of control parameters to pass to the model fitting function (glm for binary treatment or nnet::multinom for multiple treatments). Default is an empty list.

Details

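Propensity Score Definition: Returns P(Z = observed | X) for each individual, that is, the probability of the treatment that individual actually received, rather than P(Z = 1 | X) for everyone (as in Rosenbaum & Rubin 1983). For a binary treatment this equals P(Z = 1 | X) for treated individuals and 1 - P(Z = 1 | X) for controls. This definition enables direct use in inverse probability weighting (IPW) and extends naturally to multiple treatments.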

Binary Treatments (2 levels): Fits binomial logistic regression via glm(). Treatment is factorized with levels sorted by sort(): numerically for numeric, alphabetically for character, by factor level order for factor. Returns P(Z = observed | X).
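The binary case can be sketched with base R alone. The code below is an illustration under stated assumptions, not the package's internal implementation: the simulated data, the variable names (dat, Zf, p2), and the way the two-column matrix is assembled are all hypothetical.

```r
# Simulated data with a character-coded binary treatment (illustrative only).
set.seed(1)
n <- 200
X1 <- rnorm(n)
Z <- ifelse(rbinom(n, 1, plogis(0.5 * X1)) == 1, "Treated", "Control")
dat <- data.frame(Z = Z, X1 = X1)

# Factorize the treatment with levels ordered by sort().
lev <- sort(unique(dat$Z))
dat$Zf <- factor(dat$Z, levels = lev)

# Binomial logistic regression; glm() models the probability of the second level.
fit <- glm(Zf ~ X1, data = dat, family = binomial())
p2 <- unname(predict(fit, type = "response"))  # P(Z = lev[2] | X)

# n x 2 matrix of predicted probabilities, one column per treatment level.
ps_matrix <- cbind(1 - p2, p2)
colnames(ps_matrix) <- lev

# P(Z = observed | X): take each row's entry in its own treatment column.
ps <- ps_matrix[cbind(seq_len(n), as.integer(dat$Zf))]
summary(ps)
```

For individuals in the second level, ps equals the fitted probability; for those in the first level, it equals one minus that probability.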

Multiple Treatments (>2 levels): Fits multinomial logistic regression via nnet::multinom(). Returns P(Z = observed | X) for each individual from the generalized PS matrix.
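A comparable sketch for three treatment levels uses nnet::multinom() directly. It assumes the nnet package is installed; the simulated data and object names (dat, gps, ps_obs) are hypothetical, and the code shows the general technique rather than what estimate_ps does internally.

```r
library(nnet)

# Simulated data with a three-level treatment that depends on X1 (illustrative only).
set.seed(2)
n <- 300
X1 <- rnorm(n)
lin <- cbind(A = 0, B = 0.5 * X1, C = -0.5 * X1)
prob <- exp(lin) / rowSums(exp(lin))
Z <- apply(prob, 1, function(p) sample(c("A", "B", "C"), 1, prob = p))
dat <- data.frame(Z = factor(Z, levels = c("A", "B", "C")), X1 = X1)

# Multinomial logistic regression.
fit <- multinom(Z ~ X1, data = dat, trace = FALSE)

# Generalized propensity scores: n x 3 matrix, one column per treatment level.
gps <- predict(fit, type = "probs")

# P(Z = observed | X): each row's probability for the treatment it received.
ps_obs <- gps[cbind(seq_len(n), as.integer(dat$Z))]
head(gps)
summary(ps_obs)
```

Each row of gps sums to 1, and ps_obs picks out one entry per row.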

Control Parameters (ps_control):

  • Binary: glm.control() parameters (default: epsilon=1e-08, maxit=25)

  • Multiple: multinom() parameters (default: MaxNWts=10000, maxit=100, trace=FALSE)
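To show what these control parameters do, the sketch below sets them on the underlying fitting functions directly. How estimate_ps forwards ps_control internally is not shown in this documentation, so the calls are illustrative; the data and object names are hypothetical, and the nnet package is assumed to be installed.

```r
library(nnet)

# Simulated data (illustrative only).
set.seed(3)
n <- 200
dat <- data.frame(X1 = rnorm(n))
dat$Z2 <- rbinom(n, 1, plogis(dat$X1))
dat$Z3 <- factor(sample(c("A", "B", "C"), n, replace = TRUE))

# Binary case: tighter convergence tolerance and more iterations via glm.control().
fit_bin <- glm(Z2 ~ X1, data = dat, family = binomial(),
               control = glm.control(epsilon = 1e-10, maxit = 50))
fit_bin$control$maxit

# Multinomial case: options passed as arguments to multinom().
fit_multi <- multinom(Z3 ~ X1, data = dat,
                      MaxNWts = 10000, maxit = 200, trace = FALSE)
fit_multi$convergence  # 0 indicates the optimizer reported convergence
```

Going by the argument description, the corresponding estimate_ps calls would presumably pass ps_control = list(epsilon = 1e-10, maxit = 50) for a binary treatment, or ps_control = list(maxit = 200) for multiple treatments.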

Examples

# \donttest{
# Example 1: Binary treatment
data(simdata_bin)
ps_bin <- estimate_ps(
  data = simdata_bin,
  treatment_var = "Z",
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2
)
summary(ps_bin$ps)
table(simdata_bin$Z)

# Example 2: Multiple treatments
data(simdata_multi)
ps_multi <- estimate_ps(
  data = simdata_multi,
  treatment_var = "Z",
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2
)
head(ps_multi$ps_matrix)
# }
