SVEMnet (version 3.2.0)

svem_random_table_multi: Generate a Random Prediction Table from Multiple SVEMnet Models (no refit)

Description

Samples the original predictor factor space cached in fitted svem_model objects and computes predictions from each model at the same random points. It is intended for multiple responses modeled over the same factor space with the same deterministic factor expansion (for example, via a shared bigexp_terms), so that all models share one sampling schema.

Usage

svem_random_table_multi(
  objects,
  n = 1000,
  mixture_groups = NULL,
  debias = FALSE,
  range_tol = 1e-08,
  numeric_sampler = c("random", "uniform")
)

Value

A list with three data frames:

  • data: the sampled predictor settings, one row per random point.

  • pred: one column per response, aligned to data rows.

  • all: cbind(data, pred) for convenience.

Each prediction column is named by the model's response (left-hand side) with a "_pred" suffix (for example, "y1_pred"). If that name would collide with a predictor name or with another prediction column, the function stops with an error and asks the user to rename the response or predictor.
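The naming rule can be illustrated with a small base-R sketch. The helper make_pred_names() is hypothetical and is not part of SVEMnet; it only mimics the described behavior of appending "_pred" and refusing any collision.

```r
# Hypothetical helper (not part of SVEMnet): build "<response>_pred" names
# and stop if a name collides with a predictor or another prediction column.
make_pred_names <- function(responses, predictor_vars) {
  pred_names <- paste0(responses, "_pred")
  clash <- intersect(pred_names, predictor_vars)
  dup <- unique(pred_names[duplicated(pred_names)])
  if (length(clash) > 0 || length(dup) > 0) {
    stop("Prediction column name collision: ",
         paste(unique(c(clash, dup)), collapse = ", "),
         "; rename the response or predictor.")
  }
  pred_names
}

# Two distinct responses: names are simply suffixed.
pred_names <- make_pred_names(c("y1", "y2"), c("A", "B"))

# Two models with the same response name: the helper stops with an error.
collision <- tryCatch(
  make_pred_names(c("y1", "y1"), c("A", "B")),
  error = function(e) conditionMessage(e)
)
```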

Arguments

objects

A list of fitted svem_model objects returned by SVEMnet(). Each object must contain a valid $sampling_schema produced by the updated SVEMnet() implementation. A single model is also accepted and treated as a length-one list.

n

Number of random points to generate (rows in the output tables). Default is 1000.

mixture_groups

Optional list of mixture constraint groups. Each group is a list with elements vars, lower, upper, total (see Notes on mixtures). Mixture variables must be numeric-like and must also appear in the models' predictor_vars (that is, they must be used as predictors in all models).
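As a sketch of the expected shape, a single mixture group might be written as below. The variable names and bounds are hypothetical, and the interpretation of lower and upper as per-variable bounds and total as the required sum is an assumption based on the element names.

```r
# Hypothetical three-component mixture; names and bounds are illustrative only.
mixture_groups <- list(
  list(
    vars  = c("PEG", "Lipid", "Chol"),
    lower = c(0.01, 0.10, 0.10),  # assumed: per-variable lower bounds
    upper = c(0.05, 0.60, 0.75),  # assumed: per-variable upper bounds
    total = 1                     # assumed: required sum of the group
  )
)

# Basic feasibility check: the bounds must leave room for the total.
g <- mixture_groups[[1]]
feasible <- sum(g$lower) <= g$total && sum(g$upper) >= g$total
```

A list of this shape would then be supplied through the mixture_groups argument shown under Usage.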

debias

Logical; if TRUE, apply each model's calibration during prediction when available (for Gaussian fits). This is passed to predict.svem_model(). Default is FALSE.

range_tol

Numeric tolerance for comparing numeric ranges across models (used when checking that all $sampling_schema$num_ranges agree). Default is 1e-8.

numeric_sampler

Sampler for non-mixture numeric predictors: "random" (default) or "uniform".

  • "random": random Latin hypercube when the lhs package is available; otherwise independent uniforms via runif().

  • "uniform": independent uniform draws within numeric ranges (fastest; no lhs dependency).
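The difference between the two samplers can be sketched in base R. This is an illustration only: sample_uniform() and sample_lhs() are hypothetical helpers, the predictor names are made up, and the Latin hypercube here is a hand-written stand-in rather than a call into the lhs package.

```r
set.seed(1)

# Independent uniform draws on [lo, hi] for each numeric predictor.
sample_uniform <- function(n, ranges) {
  as.data.frame(lapply(ranges, function(r) runif(n, r[1], r[2])))
}

# Base-R random Latin hypercube: one draw per equal-width stratum,
# with the strata shuffled independently for each predictor.
sample_lhs <- function(n, ranges) {
  as.data.frame(lapply(ranges, function(r) {
    u <- (sample.int(n) - runif(n)) / n  # one point in each of n strata of [0, 1]
    r[1] + u * (r[2] - r[1])
  }))
}

ranges <- list(Temp = c(20, 80), pH = c(5, 9))  # hypothetical predictors
x_unif <- sample_uniform(10, ranges)
x_lhs  <- sample_lhs(10, ranges)
```

With the Latin hypercube, every one of the 10 equal-width slices of each range receives exactly one point, whereas independent uniforms can leave slices empty or doubly occupied.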

Typical workflow

  1. Build a deterministic expansion (for example with bigexp_terms) and fit several SVEMnet() models for different responses on the same factor space, using the same expansion / sampling settings.

  2. Ensure that the fitted models were created with a version of SVEMnet() that populates $sampling_schema.

  3. Collect the fitted models in a list and pass them to svem_random_table_multi().

  4. Use $data (predictors), $pred (response columns), or $all (cbind(data, pred)) for downstream plotting, summarization, or cross-response comparison.

Blocking variables

If the models were fit using a bigexp_spec that included blocking variables (for example blocking = c("Operator", "Plate_ID")) and SVEMnet() stored these in $sampling_schema$blocking, then svem_random_table_multi() will:

  • treat those variables as blocking factors; and

  • hold them fixed at a single value across the sampled table.

Specifically:

  • For blocking numeric variables, the function uses the midpoint of the recorded numeric range, (min + max) / 2, for all rows. If the variable also has stored discrete support, the midpoint is snapped deterministically to the nearest allowed discrete value.

  • For blocking categorical variables, the function uses a single reference level equal to the most frequent observed level (mode) in the training data, with ties broken deterministically; if the mode is unavailable, it falls back to the first stored level.
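The two rules above can be sketched in base R as follows. Both helpers are hypothetical, and the tie-breaking choices (smallest allowed value on an exact tie, first stored level on a frequency tie) are assumptions; the text above only states that ties are resolved deterministically.

```r
# Midpoint of the recorded range, optionally snapped to the nearest allowed value.
blocking_numeric_value <- function(range, allowed = NULL) {
  mid <- (range[1] + range[2]) / 2
  if (is.null(allowed)) return(mid)
  allowed <- sort(allowed)
  # which.min() returns the first minimum, so exact ties go to the smaller value.
  allowed[which.min(abs(allowed - mid))]
}

# Most frequent level; frequency ties go to the level listed first in `levels`.
blocking_factor_value <- function(x, levels) {
  counts <- table(factor(x, levels = levels))
  if (sum(counts) == 0) return(levels[1])  # fallback: first stored level
  names(counts)[which.max(counts)]
}

mid_plain   <- blocking_numeric_value(c(1, 4))
mid_snapped <- blocking_numeric_value(c(1, 4), allowed = c(1, 2, 4))
ref_level   <- blocking_factor_value(c("B", "A", "B", "A"), c("A", "B"))
ref_empty   <- blocking_factor_value(character(0), c("A", "B"))
```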

Blocking variables are not allowed to appear in mixture_groups. If any mixture group tries to use a blocking variable, the function stops with an error.

When no blocking information is present in $sampling_schema (for example for models fit without a bigexp_spec or without blocking), the behavior is unchanged from earlier versions: all predictors are sampled according to the rules described under "Sampling strategy".

Details

No refitting is performed. Predictions are obtained by averaging per-bootstrap member predictions on the requested scale.
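As a minimal sketch of what averaging on the requested scale means, consider a hypothetical matrix of per-member linear predictors. The matrix and the transform-then-average order for the probability scale are assumptions made for illustration, not a description of the package internals.

```r
set.seed(42)

# Hypothetical linear predictors: 5 sampled points (rows) x 4 bootstrap members (columns).
eta <- matrix(rnorm(20), nrow = 5, ncol = 4)

# Gaussian-style prediction: average the member predictions directly.
pred_gaussian <- rowMeans(eta)

# Probability scale: transform each member's prediction first, then average.
pred_prob <- rowMeans(plogis(eta))
```

Averaging after the transformation generally gives a different value from transforming the averaged linear predictor, which is why the scale on which members are combined matters.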

All models must share an identical predictor schema. Specifically, their $sampling_schema entries must agree on:

  • The same predictor_vars in the same order.

  • The same var_classes for each predictor.

  • Identical factor levels and level order for all categorical predictors.

  • Numeric num_ranges that match within range_tol for all continuous predictors.

  • When present, the same blocking set (up to order).

The function stops with an informative error message if any of these checks fail.
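The checks listed above can be approximated by a small base-R comparison. This sketch assumes a schema is a plain list; predictor_vars, var_classes, num_ranges, and blocking follow the names used on this page, while factor_levels and the helper schemas_agree() are hypothetical. Unlike the real function, the helper returns TRUE or FALSE instead of stopping.

```r
# Hypothetical helper comparing two schema-like lists.
schemas_agree <- function(a, b, range_tol = 1e-8) {
  same_vars    <- identical(a$predictor_vars, b$predictor_vars)  # same order required
  same_classes <- identical(a$var_classes, b$var_classes)
  same_levels  <- identical(a$factor_levels, b$factor_levels)    # level order matters
  same_block   <- setequal(a$blocking, b$blocking)               # order ignored
  same_ranges  <- identical(names(a$num_ranges), names(b$num_ranges)) &&
    all(vapply(names(a$num_ranges), function(v) {
      all(abs(a$num_ranges[[v]] - b$num_ranges[[v]]) <= range_tol)
    }, logical(1)))
  same_vars && same_classes && same_levels && same_block && same_ranges
}

s1 <- list(
  predictor_vars = c("Temp", "Catalyst"),
  var_classes    = c(Temp = "numeric", Catalyst = "factor"),
  factor_levels  = list(Catalyst = c("A", "B")),
  num_ranges     = list(Temp = c(20, 80)),
  blocking       = character(0)
)
s2 <- s1
s2$num_ranges$Temp <- c(20, 80 + 1e-10)   # differs, but within range_tol
s3 <- s1
s3$factor_levels$Catalyst <- c("B", "A")  # same levels, different order

ok_12 <- schemas_agree(s1, s2)
ok_13 <- schemas_agree(s1, s3)
```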

Discrete numeric predictors (automatic). If any supplied model stores discrete-numeric sampling information in its $sampling_schema, this function will automatically respect it (no separate user argument).

In the updated SVEMnet() implementation this information is stored as:

  • $sampling_schema$discrete_numeric: a character vector of discrete numeric variable names; and

  • $sampling_schema$discrete_levels: a named list mapping those names to allowed numeric values.

(Older objects may use $sampling_schema$discrete_values instead of discrete_levels; this function accepts both for backward compatibility.)

Discrete numeric variables are sampled independently (uniform over allowed values) and are excluded from Latin hypercube sampling; LHS (when used) is applied only to the remaining continuous numeric predictors. Discrete numeric variables are not allowed to be mixture variables.
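Uniform sampling over allowed values can be sketched in base R. The schema fragment and the helper sample_discrete() are hypothetical; the sketch indexes into the allowed values rather than calling sample() on them directly, because sample(v, n) treats a single number v as the range 1:v.

```r
set.seed(7)

# Hypothetical schema fragment: allowed values per discrete numeric variable.
discrete_levels <- list(Cycles = c(1, 2, 4), Dose = 10)

sample_discrete <- function(n, levels_list) {
  as.data.frame(lapply(levels_list, function(v) {
    # Draw indices uniformly, so a single allowed value is returned as-is.
    v[sample.int(length(v), n, replace = TRUE)]
  }))
}

x_disc <- sample_discrete(8, discrete_levels)
```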

Models may be Gaussian or binomial. For binomial fits, predictions are returned on the probability scale (that is, on the response scale) by default, consistent with the default behavior of predict.svem_model().