The SVEMnet package implements Self-Validated Ensemble Models (SVEM)
using Elastic Net (including lasso and ridge) regression via glmnet.
SVEM averages predictions from multiple models fitted to fractionally
weighted bootstraps of the data, tuned with anti-correlated validation
weights. The package supports multi-response optimization with
uncertainty-aware candidate generation for iterative formulation and
process development.
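The ensemble idea described above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes the fractional-weight scheme from Lemkus et al. (2021), with training weights -log(U) and validation weights -log(1 - U) for U ~ Uniform(0, 1), and it substitutes a closed-form weighted ridge fit (the alpha = 0 corner of the elastic net) for the glmnet path. The helper `ridge_fit`, the simulated data, and the lambda grid are all hypothetical.

```r
set.seed(1)
n <- 30
x <- cbind(x1 = runif(n), x2 = runif(n))
y <- 1 + 2 * x[, "x1"] - x[, "x2"] + rnorm(n, sd = 0.2)
X <- cbind(1, x)                          # design matrix with intercept column
lambdas <- 10^seq(-4, 1, length.out = 20) # assumed penalty grid
n_boot <- 50

# Hypothetical helper: weighted ridge in closed form, intercept unpenalized.
ridge_fit <- function(X, y, w, lambda) {
  P <- diag(c(0, rep(1, ncol(X) - 1)))
  solve(crossprod(X, w * X) + lambda * P, crossprod(X, w * y))
}

boot_coefs <- replicate(n_boot, {
  u <- runif(n)
  w_train <- -log(u)       # fractional training weights
  w_valid <- -log(1 - u)   # anti-correlated validation weights
  # Pick the penalty that minimizes the validation-weighted squared error.
  val_err <- vapply(lambdas, function(l) {
    b <- ridge_fit(X, y, w_train, l)
    sum(w_valid * (y - X %*% b)^2)
  }, numeric(1))
  drop(ridge_fit(X, y, w_train, lambdas[which.min(val_err)]))
})

# For a linear predictor, averaging coefficients equals averaging predictions.
svem_coef <- rowMeans(boot_coefs)
svem_pred <- drop(X %*% svem_coef)
```

Every observation contributes to both training and validation in each replicate, only with different weights, which is what makes the approach usable on small designed experiments.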
SVEMnet: Fit an SVEMnet model using Elastic Net regression (including relaxed elastic net) on fractionally weighted bootstraps.
predict.svem_model: Predict method for SVEM models (ensemble-mean aggregation by default, optional debiasing, and percentile prediction intervals when available).
coef.svem_model: Averaged (optionally debiased) coefficients from an SVEM model.
svem_nonzero: Bootstrap nonzero percentages for each coefficient, with an optional quick plot.
plot.svem_model: Quick actual-versus-predicted plot for a fitted model (with optional group colorings).
The bigexp_* helpers build and reuse a locked polynomial/interaction
expansion across multiple responses and datasets:
bigexp_terms: Build a deterministic expanded RHS (polynomials, interactions, optional partial-cubic terms) with locked factor levels and numeric ranges.
bigexp_prepare: Coerce new data to match a stored
bigexp_spec, including factor levels and numeric types.
bigexp_formula: Reuse a locked expansion for another response to ensure an identical factor space across models.
with_bigexp_contrasts: Temporarily restore the
contrast options used when a bigexp_spec was built.
bigexp_train: Convenience wrapper that builds a
bigexp_spec and prepares training data in one call.
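Why a locked expansion matters can be illustrated with base R alone. The sketch below does not use the bigexp_* functions or reproduce their arguments; `spec` and `expand_locked` are hypothetical stand-ins that store a right-hand side together with the factor levels seen in training and reapply both to new data.

```r
train <- data.frame(
  temp = c(20, 30, 40, 50),
  cat  = factor(c("A", "B", "C", "A")),
  y1   = c(1.2, 2.3, 2.9, 4.1)
)

# "Lock" the expansion: keep the RHS and the training factor levels together.
spec <- list(
  rhs    = "~ (temp + cat)^2 + I(temp^2)",
  levels = list(cat = levels(train$cat))
)

# Hypothetical helper: coerce factors to the stored levels, then expand.
expand_locked <- function(spec, data) {
  for (v in names(spec$levels)) {
    data[[v]] <- factor(as.character(data[[v]]), levels = spec$levels[[v]])
  }
  model.matrix(as.formula(spec$rhs), data)
}

# New data arrives as character and never observes level "C".
new_data <- data.frame(temp = c(25, 45), cat = c("B", "A"))

X_train <- expand_locked(spec, train)
X_new   <- expand_locked(spec, new_data)

# Both matrices share the same columns, so coefficients line up across datasets.
same_columns <- identical(colnames(X_train), colnames(X_new))
```

Without the stored levels, the new data would produce fewer dummy columns and could not be passed to a model trained on the full expansion.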
svem_random_table_multi: Generate one shared random
predictor table (with optional mixture constraints) from cached
factor-space information and obtain predictions from multiple SVEM
models at those points. Supports both Gaussian and binomial models;
binomial predictions are returned on the probability scale. This is
the lower-level sampler used by svem_score_random.
svem_score_random: Random-search scoring for multiple
responses with Derringer–Suich desirabilities, user weights,
optional whole-model-test (WMT) reweighting, percentile CI-based
uncertainty, and (optionally) scoring of existing experimental data.
Returns a scored random-search table and, when data is
supplied, an augmented copy of the original data with
<resp>_pred, desirabilities, scores, and an
uncertainty_measure.
svem_select_from_score_table: Given a scored table
(typically svem_score_random()$score_table), select one
"best" row under a chosen objective and a small, diverse set of
medoid candidates via PAM clustering on predictors.
svem_export_candidates_csv: Concatenate one or more
selection objects from svem_select_from_score_table
and export candidate tables (with metadata, predictions, and
optional design-only trimming) to CSV or return them in-memory for
inspection.
svem_significance_test_parallel: Parallel whole-model
significance test (using foreach + doParallel) with
support for mixture-constrained sampling and reuse of a locked
bigexp_spec. Designed for continuous (Gaussian) responses.
svem_wmt_multi: Helper to run
svem_significance_test_parallel across multiple responses and
construct whole-model p-values and reweighting multipliers for use
in svem_score_random.
plot.svem_significance_test: Plot helper for visualizing multiple significance-test outputs (observed vs permutation distances, fitted null, and p-values).
glmnet_with_cv: Convenience wrapper around repeated
cv.glmnet() selection for robust lambda (and optional alpha)
choice.
lipid_screen: Example dataset for multi-response modeling, whole-model testing, and mixture-constrained optimization demonstrations.
SVEMnet currently supports:
Gaussian responses (family = "gaussian") with identity link
and optional debiasing / percentile prediction intervals.
Binomial responses (family = "binomial") with logit link.
The response must be 0/1 numeric or a two-level factor (first level
treated as 0). Use predict(..., type = "response") for event
probabilities or type = "class" for 0/1 labels
(threshold = 0.5 by default).
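The response coding and the two prediction scales can be illustrated with base R's glm() as a stand-in; this sketch does not call SVEMnet. The simulated data and variable names are hypothetical, and treating a probability of exactly 0.5 as class 1 is an assumption, since the text above only states the threshold value.

```r
set.seed(2)
x <- rnorm(60)

# Two-level factor: the first level ("fail") plays the role of 0, "pass" of 1.
outcome <- factor(ifelse(x + rnorm(60) > 0, "pass", "fail"),
                  levels = c("fail", "pass"))

fit <- glm(outcome ~ x, family = binomial())

# Probability scale: estimated P(outcome == "pass") for each row.
p_event <- predict(fit, type = "response")

# Class scale: 0/1 labels from thresholding the probabilities at 0.5.
label <- as.integer(p_event >= 0.5)
```

Checking the factor's level order before fitting avoids silently modeling the complement of the intended event.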
Some higher-level utilities place additional constraints:
svem_significance_test_parallel is designed and
interpreted for continuous (Gaussian) responses.
svem_score_random supports mixed Gaussian + binomial
response sets, treating binomial predictions and CIs on the
probability scale, but WMT-based goal reweighting (via
svem_wmt_multi and the wmt argument) is only
allowed when all responses are Gaussian.
OpenAI's GPT models (o1-preview through GPT-5 Pro) were used to assist with coding and roxygen documentation; all content was reviewed and finalized by the author.
Maintainer: Andrew T. Karl <akarl@asu.edu>
A typical workflow is:
1. Build a wide, deterministic factor expansion (optionally via
   bigexp_terms) and reuse it across responses with
   bigexp_formula.
2. Fit one or more SVEM models with SVEMnet.
3. Optionally run whole-model testing via
   svem_significance_test_parallel (and
   svem_wmt_multi) to assess factor
   relationships or reweight response goals.
4. Call svem_score_random to draw random points in the
   factor space, compute multi-response Derringer–Suich scores,
   optional WMT-reweighted scores, and an uncertainty measure; then use
   svem_select_from_score_table to pick a single "best"
   row and diverse medoid candidates, and
   svem_export_candidates_csv to export candidate tables
   for the next experimental round.
5. Run new experiments at the suggested candidates, append the data, refit the models, and repeat as needed (closed-loop optimization).
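The scoring step in the workflow above can be sketched with the classic Derringer–Suich construction: each predicted response is mapped to a desirability in [0, 1], and the desirabilities are combined by a weighted geometric mean. This is an illustration under stated assumptions, not the internals of svem_score_random; the helpers `d_max` and `d_min`, the column names, the limits, and the weights are all hypothetical.

```r
# Larger-is-better desirability: 0 at or below `low`, 1 at or above `high`.
d_max <- function(y, low, high, s = 1) {
  pmin(pmax((y - low) / (high - low), 0), 1)^s
}

# Smaller-is-better desirability: 1 at or below `low`, 0 at or above `high`.
d_min <- function(y, low, high, s = 1) {
  pmin(pmax((high - y) / (high - low), 0), 1)^s
}

# Hypothetical predictions for three candidate settings.
cand <- data.frame(
  potency_pred = c(80, 95, 60),
  size_pred    = c(120, 90, 150)
)

d1 <- d_max(cand$potency_pred, low = 50, high = 100)
d2 <- d_min(cand$size_pred, low = 80, high = 160)

# User weights normalized to sum to 1; potency counts twice as much as size.
w <- c(2, 1) / 3
cand$score <- d1^w[1] * d2^w[2]

best <- cand[which.max(cand$score), ]
```

Because the combination is multiplicative, a candidate with zero desirability on any response scores zero overall, regardless of how well it does on the others.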
Gotwalt, C., & Ramsey, P. (2018). Model Validation Strategies for Designed Experiments Using Bootstrapping Techniques With Applications to Biopharmaceuticals. JMP Discovery Conference. https://community.jmp.com/t5/Abstracts/Model-Validation-Strategies-for-Designed-Experiments-Using/ev-p/849873/redirect_from_archived_page/true
Karl, A. T. (2024). A randomized permutation whole-model test heuristic for Self-Validated Ensemble Models (SVEM). Chemometrics and Intelligent Laboratory Systems, 249, 105122. doi:10.1016/j.chemolab.2024.105122
Karl, A., Wisnowski, J., & Rushing, H. (2022). JMP Pro 17 Remedies for Practical Struggles with Mixture Experiments. JMP Discovery Conference. doi:10.13140/RG.2.2.34598.40003/1
Lemkus, T., Gotwalt, C., Ramsey, P., & Weese, M. L. (2021). Self-Validated Ensemble Models for Design of Experiments. Chemometrics and Intelligent Laboratory Systems, 219, 104439. doi:10.1016/j.chemolab.2021.104439
Xu, L., Gotwalt, C., Hong, Y., King, C. B., & Meeker, W. Q. (2020). Applications of the Fractional-Random-Weight Bootstrap. The American Statistician, 74(4), 345–358. doi:10.1080/00031305.2020.1731599
Ramsey, P., Gaudard, M., & Levin, W. (2021). Accelerating Innovation with Space Filling Mixture Designs, Neural Networks and SVEM. JMP Discovery Conference. https://community.jmp.com/t5/Abstracts/Accelerating-Innovation-with-Space-Filling-Mixture-Designs/ev-p/756841
Ramsey, P., & Gotwalt, C. (2018). Model Validation Strategies for Designed Experiments Using Bootstrapping Techniques With Applications to Biopharmaceuticals. JMP Discovery Conference - Europe. https://community.jmp.com/t5/Abstracts/Model-Validation-Strategies-for-Designed-Experiments-Using/ev-p/849647/redirect_from_archived_page/true
Ramsey, P., Levin, W., Lemkus, T., & Gotwalt, C. (2021). SVEM: A Paradigm Shift in Design and Analysis of Experiments. JMP Discovery Conference - Europe. https://community.jmp.com/t5/Abstracts/SVEM-A-Paradigm-Shift-in-Design-and-Analysis-of-Experiments-2021/ev-p/756634
Ramsey, P., & McNeill, P. (2023). CMC, SVEM, Neural Networks, DOE, and Complexity: It's All About Prediction. JMP Discovery Conference.
Friedman, J. H., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–22.
Meinshausen, N. (2007). Relaxed Lasso. Computational Statistics & Data Analysis, 52(1), 374–393.
Kish, L. (1965). Survey Sampling. Wiley.
Lumley, T. (2004). Analysis of complex survey samples. Journal of Statistical Software, 9(1), 1–19.
Lumley, T., & Scott, A. (2015). AIC and BIC for modelling with complex survey data. Journal of Survey Statistics and Methodology, 3(1), 1–18.