SMMAL

SMMAL is an R package for estimating the Average Treatment Effect (ATE) using semi-supervised learning (SSL), tailored for settings with limited treatment/outcome labels but rich covariates and surrogate variables. It enhances efficiency and robustness over supervised methods by leveraging unlabeled data and supports high-dimensional models via cross-fitting, flexible model fitting, and adaptive LASSO.

Installation

# install.packages("devtools")
devtools::install_github("ShuhengKong/SMMAL")

The GitHub version is available at https://github.com/ShuhengKong/SMMAL

Example

This basic example estimates the ATE from the sample dataset bundled with the package:

library(SMMAL)

# Load the example dataset included with the package
file_path <- system.file("extdata", "sample_data.rds", package = "SMMAL")
dat <- readRDS(file_path)

# Estimate ATE using the SMMAL pipeline
output <- SMMAL(
  Y = dat$Y,
  A = dat$A,
  S = data.frame(dat$S),
  X = data.frame(dat$X),
  nfold = 5,
  cf_model = "bspline"
)

# View the results
print(output)
#> $est
#> [1] 0.1021349
#> 
#> $se
#> [1] 0.03006258
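
The estimate and standard error can be combined into a confidence interval. The following is a minimal sketch that assumes the estimator is approximately normally distributed, so that a Wald interval applies; that assumption is not stated above and should be checked against the package documentation. The values are copied from the printed output rather than recomputed.

```r
# Values copied from the printed output above
output <- list(est = 0.1021349, se = 0.03006258)

# 95% Wald interval: estimate +/- z * standard error
# (assumes approximate normality of the estimator)
z <- qnorm(0.975)
ci <- output$est + c(-1, 1) * z * output$se
round(ci, 3)
#> [1] 0.043 0.161
```

Under that assumption the interval excludes zero, which would suggest a positive treatment effect in the sample data.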

SMMAL input files

Argument          Description
Y                 Observed outcomes. Can be continuous or binary.
A                 Treatment indicator. Must be binary.
S                 Surrogates.
X                 Covariates.
nfold             Number of cross-validation folds. Default is 5.
cf_model          The modeling method to use in cross-fitting. Default is "bspline". Other values are "xgboost" and "randomforest".
custom_model_fun  Optional user-supplied function for feature selection or prediction. Overrides the built-in model fitting. Must return fold-level predictions.
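
The constraints listed above can be checked before calling SMMAL(). The helper below is a hypothetical sketch in base R and is not part of the package. It assumes that the treatment indicator is coded 0/1 and that unlabeled rows carry NA in A; neither detail is confirmed by this page.

```r
# Hypothetical helper (not part of SMMAL): validates inputs against the
# constraints listed above before they are passed to SMMAL().
check_smmal_inputs <- function(Y, A, S, X, nfold = 5) {
  n <- length(Y)
  stopifnot(
    "A must have the same length as Y" = length(A) == n,
    "S must have one row per observation" = nrow(S) == n,
    "X must have one row per observation" = nrow(X) == n,
    "nfold must be a whole number of at least 2" =
      nfold >= 2 && nfold == round(nfold)
  )
  # A must be binary. Missing values are skipped on the ASSUMPTION that
  # unlabeled rows are coded as NA; the 0/1 coding is also an assumption.
  if (!all(A[!is.na(A)] %in% c(0, 1))) {
    stop("A must be binary (0/1)")
  }
  invisible(TRUE)
}

# Toy data for illustration only
set.seed(1)
n <- 100
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
S <- data.frame(s1 = rnorm(n))
A <- rbinom(n, 1, 0.5)
Y <- rnorm(n)

check_smmal_inputs(Y, A, S, X, nfold = 5)
```

Failing early with a descriptive message is usually easier to debug than an error raised from inside the cross-fitting step.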

SMMAL output

Element  Description
est      Estimated value of the ATE
se       Standard error of the ATE estimate

Install

install.packages('SMMAL')

Monthly Downloads: 118
Version: 0.0.5
License: MIT + file LICENSE
Maintainer: Jue Hou
Last Published: August 28th, 2025

Functions in SMMAL (0.0.5)

cf                 Cross-Fitting with Model Selection and Log Loss Evaluation
SMMAL_ada_lasso    Adaptive LASSO with Cross-Validation
compute_parameter  Estimate Nuisance Parameters for Semi-Supervised ATE Estimation
SMMAL              Estimate Average Treatment Effect (ATE) via Semi-Supervised Learning Pipeline
ate.SSL            Estimate Average Treatment Effect (ATE) via Semi-Supervised Learning
cross_validation   Assign Cross-Validation Folds for Labelled and Unlabelled Data
param_fun          Parameter grid function
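
To illustrate the idea behind assigning folds separately for labelled and unlabelled data, here is a minimal base R sketch. It is not the package's cross_validation() implementation, whose signature and behavior are not shown on this page; the function name assign_folds and its arguments are hypothetical. The sketch balances folds within each group so that every fold contains some of the scarce labelled observations.

```r
# Hypothetical sketch, NOT the package's cross_validation() function.
# `labelled` is a logical vector: TRUE for rows with observed labels.
assign_folds <- function(labelled, nfold = 5) {
  folds <- integer(length(labelled))
  for (grp in c(TRUE, FALSE)) {
    idx <- which(labelled == grp)
    # Repeat 1..nfold up to the group size, then shuffle, so fold sizes
    # within each group differ by at most one
    folds[idx] <- sample(rep_len(seq_len(nfold), length(idx)))
  }
  folds
}

set.seed(42)
labelled <- c(rep(TRUE, 20), rep(FALSE, 80))
folds <- assign_folds(labelled, nfold = 5)

# Each fold receives 4 labelled and 16 unlabelled observations
table(labelled, folds)
```

Assigning folds without regard to label status could leave some folds with very few labelled rows when labels are rare, which is the situation this kind of stratification is meant to avoid.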