PSsurvival (version 0.2.0)

estimate_weighted_km: Weighted Kaplan-Meier Estimation with Classic Greenwood Variance

Description

Core estimation function for computing weighted Kaplan-Meier survival curves with propensity score weights. It handles multiple treatment groups simultaneously and uses the classic weighted Greenwood formula for variance estimation.

Computes weighted Kaplan-Meier survival estimates and variances for all treatment groups using propensity score weights. Uses the classic weighted Greenwood formula: \(Var[\hat{S}(t)] = [\hat{S}(t)]^2 \sum_{t_l \le t} D_l / (R_l (R_l - D_l))\).

Usage

estimate_weighted_km(
  data,
  time_var,
  event_var,
  treatment_var,
  weights,
  treatment_levels
)

Value

A list containing:

eval_times

Numeric vector of all unique event times where survival is estimated.

surv_estimates

Matrix [n_times x n_groups] of survival estimates. Column names are treatment levels.

surv_var

Matrix [n_times x n_groups] of variances for survival.

n_risk

Matrix [n_times x n_groups] of weighted number at risk (R).

n_event

Matrix [n_times x n_groups] of weighted number of events (D).

n_acc_event

Matrix [n_times x n_groups] of cumulative weighted events up to each time.

treatment_levels

Treatment levels (column names for matrices).

n_levels

Number of treatment groups.
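To make the shape of the returned matrices concrete, here is a minimal sketch of how one might turn `surv_estimates` and `surv_var` into pointwise 95% intervals. The `res` object is a hand-built mock with made-up values and hypothetical group names, not output from the package. The plain Wald interval is an assumption chosen for brevity, not necessarily what PSsurvival itself reports.

```r
# Mock result with the documented layout (all numbers are illustrative).
groups <- c("control", "treated")   # hypothetical treatment levels
res <- list(
  eval_times = c(2, 5, 8),
  surv_estimates = matrix(c(0.90, 0.75, 0.60,
                            0.95, 0.85, 0.70),
                          ncol = 2, dimnames = list(NULL, groups)),
  surv_var = matrix(c(0.004, 0.008, 0.012,
                      0.002, 0.005, 0.009),
                    ncol = 2, dimnames = list(NULL, groups))
)

# Pointwise Wald limits, truncated to the [0, 1] range of a probability.
z_crit <- qnorm(0.975)
se     <- sqrt(res$surv_var)
lower  <- pmax(res$surv_estimates - z_crit * se, 0)
upper  <- pmin(res$surv_estimates + z_crit * se, 1)

# One row per evaluation time, one column per treatment group.
lower
upper
```

Rows line up with `eval_times` and columns with `treatment_levels`, so a single group's curve is one column of each matrix.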

Arguments

data

A data.frame containing the complete-case analysis data.

time_var

A character string specifying the name of the time variable.

event_var

A character string specifying the name of the event variable. Should be coded as 1 = event, 0 = censored.

treatment_var

A character string specifying the name of the treatment variable in data.

weights

A numeric vector of propensity score weights with length equal to nrow(data). Each observation has one weight corresponding to its observed treatment group. For ATE: \(w_i = 1/e_j(X_i)\), where j is the observed treatment group of observation i.

treatment_levels

A vector of unique treatment values (sorted). Should match the levels from estimate_ps().

Details

**Weighted Kaplan-Meier Formula:**

For treatment group j, at each event time \(t_l\):

$$R_l = \sum_{i: T_i \ge t_l, Z_i = j} w_{i,j}$$

$$D_l = \sum_{i: T_i = t_l, \delta_i = 1, Z_i = j} w_{i,j}$$

$$\hat{S}^w_j(t) = \prod_{t_l \le t} \left(1 - \frac{D_l}{R_l}\right)$$

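where \(R_l\) is the weighted number at risk and \(D_l\) is the weighted number of events. Ties between events and censorings are handled using the Breslow method. Because the risk set is defined by \(T_i \ge t_l\), an observation censored exactly at \(t_l\) still counts as at risk at \(t_l\).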
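The three formulas can be traced by hand on a small example. The base R sketch below covers a single treatment group with made-up data. It illustrates the arithmetic only and is not the package's internal code.

```r
# Toy data for one treatment group (all values are illustrative).
time   <- c(2, 3, 3, 5, 7, 8)               # observed times T_i
event  <- c(1, 1, 0, 1, 0, 1)               # delta_i: 1 = event, 0 = censored
weight <- c(1.5, 2.0, 1.0, 1.2, 0.8, 2.5)   # propensity score weights

# Unique event times t_l at which the curve can drop.
event_times <- sort(unique(time[event == 1]))

# R_l: total weight of observations with T_i >= t_l.
R <- vapply(event_times, function(t) sum(weight[time >= t]), numeric(1))

# D_l: total weight of observations with an event exactly at t_l.
D <- vapply(event_times,
            function(t) sum(weight[time == t & event == 1]), numeric(1))

# Product-limit estimate: cumulative product of (1 - D_l / R_l).
S <- cumprod(1 - D / R)

data.frame(time = event_times, n_risk = R, n_event = D, surv = S)
```

At t = 3 the observation censored at that time (weight 1.0) is still included in `R`, since the risk set uses `time >= t`.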

**Classic Weighted Greenwood Variance:**

$$Var[\hat{S}^w_j(t)] = [\hat{S}^w_j(t)]^2 \sum_{t_l \le t} \frac{D_l}{R_l (R_l - D_l)}$$

This is the standard weighted extension of Greenwood's formula. When all weights equal 1, it reduces to the classical Greenwood formula.
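As a rough check of the variance formula, the base R sketch below defines a hypothetical helper, `weighted_km_var()`, which is not part of PSsurvival, and evaluates it on made-up data for one group. With unit weights, `R` and `D` become plain counts, so the result is the classical Greenwood variance.

```r
# Hypothetical helper: weighted KM and Greenwood variance for one group.
# Assumes R_l - D_l > 0 at every event time (no edge-case handling here).
weighted_km_var <- function(time, event, weight) {
  t_l <- sort(unique(time[event == 1]))
  R <- vapply(t_l, function(t) sum(weight[time >= t]), numeric(1))
  D <- vapply(t_l, function(t) sum(weight[time == t & event == 1]), numeric(1))
  S <- cumprod(1 - D / R)
  # Variance: S(t)^2 times the running sum of D_l / (R_l (R_l - D_l)).
  data.frame(time = t_l, surv = S, var = S^2 * cumsum(D / (R * (R - D))))
}

time  <- c(1, 2, 2, 4, 6)   # illustrative observed times
event <- c(1, 1, 0, 1, 0)   # 1 = event, 0 = censored

# Unit weights: risk sets are 5, 4, 2 with one event at each event time.
unit <- weighted_km_var(time, event, rep(1, length(time)))

# Illustrative non-unit weights for comparison.
wtd <- weighted_km_var(time, event, c(2, 1, 1.5, 1, 0.5))

unit
wtd
```

With unit weights the survival estimates are 0.8, 0.6, 0.3 and the variances are 0.032, 0.048, 0.057, which match a hand calculation with the unweighted formula.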

**Weight Structure:**

The weight vector has length nrow(data). Each observation i in treatment group j has weight \(w_i\) based on its propensity score for group j. For ATE estimation, \(w_i = 1/e_j(X_i)\). When computing weighted KM for group j, only observations with \(Z_i = j\) and their corresponding weights are used.
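A minimal sketch of this weight structure follows, assuming the propensity scores are available as a matrix with one row per observation and one column per treatment level. That layout, the object names, and the numbers are all assumptions made for illustration. The documentation does not state what estimate_ps() returns.

```r
# Hypothetical propensity score matrix: rows = observations,
# columns = treatment levels, each row summing to 1.
treatment_levels <- c("A", "B", "C")
ps <- matrix(
  c(0.50, 0.30, 0.20,
    0.20, 0.60, 0.20,
    0.10, 0.40, 0.50,
    0.25, 0.25, 0.50),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, treatment_levels)
)
z <- c("A", "B", "C", "B")   # observed treatment Z_i for each row

# Select e_j(X_i) for the observed group j = Z_i, then invert for ATE weights.
idx     <- cbind(seq_along(z), match(z, treatment_levels))
weights <- 1 / ps[idx]

# Only observations with Z_i == "B", and their weights, enter the B curve.
weights_B <- weights[z == "B"]

weights
weights_B
```

The result is a single vector of length `nrow(ps)`, one weight per observation, which is the form the `weights` argument expects.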

**Handling Edge Cases:**

- If the weighted at-risk count \(R_l = 0\) at time t, survival remains constant after t (last observation censored).
- Variance is undefined when \(R_l - D_l \le 0\); it is set to NA for that time point.
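Both rules can be seen on a small example. The base R sketch below is one possible reading of them, not the package's actual implementation. It evaluates a single group on a grid that includes a time (t = 9, assumed to be an event time contributed by another group) at which nobody in this group is still at risk.

```r
# One treatment group whose last observation is censored (illustrative data).
time   <- c(1, 3, 5)
event  <- c(1, 1, 0)
weight <- c(2, 1, 1.5)

# Evaluation grid; 9 is a hypothetical event time from a different group.
eval_times <- c(1, 3, 9)

R <- vapply(eval_times, function(t) sum(weight[time >= t]), numeric(1))
D <- vapply(eval_times,
            function(t) sum(weight[time == t & event == 1]), numeric(1))

# Rule 1: with an empty risk set the factor is 1, so survival stays constant.
S <- cumprod(ifelse(R > 0, 1 - D / R, 1))

# Rule 2: the Greenwood term is undefined when R_l - D_l <= 0, so use NA.
gap  <- R - D
term <- ifelse(gap > 0, D / (R * gap), NA_real_)
V    <- S^2 * cumsum(term)

data.frame(time = eval_times, n_risk = R, n_event = D, surv = S, var = V)
```

At t = 9 the estimate carries forward the value from t = 3 and the variance is NA. Note that `cumsum()` would also propagate that NA to any later grid points in this sketch.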