Dispatches to a dense (Arma/BLAS) backend for in-memory matrices, or to a streaming big.matrix backend when X (or y) is a big.matrix. The algorithm can be one of: "simpls" (default), "nipals", "kernelpls", "widekernelpls", "rkhs" (Rosipal & Trejo), "klogitpls", "sparse_kpls", "rkhs_xy" (double RKHS), and "kf_pls" (Kalman-filter PLS, streaming).
The "kernelpls" paths now include a streaming XX'
variant for big.matrix inputs, with an optional row-chunking loop
controlled by chunk_cols.
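To illustrate the idea of building XX' without holding every intermediate product at once, here is a minimal base-R sketch that accumulates the Gram matrix over blocks of columns. The helper name gram_by_col_chunks and the interpretation of its chunk_cols argument as a number of columns per block are assumptions made for illustration; this is not the package's internal implementation, and it is shown on an ordinary matrix rather than a big.matrix.

```r
# Hypothetical helper (not part of the package): accumulate X %*% t(X)
# one block of columns at a time, so only one block is touched per step.
gram_by_col_chunks <- function(X, chunk_cols = 2L) {
  n <- nrow(X)
  p <- ncol(X)
  G <- matrix(0, n, n)                        # running sum of block products
  starts <- seq.int(1L, p, by = chunk_cols)   # first column of each block
  for (s in starts) {
    cols <- s:min(s + chunk_cols - 1L, p)     # last block may be shorter
    block <- X[, cols, drop = FALSE]
    G <- G + tcrossprod(block)                # block %*% t(block)
  }
  G
}

set.seed(1)
X <- matrix(rnorm(40), nrow = 5)              # 5 observations, 8 predictors
G_chunked <- gram_by_col_chunks(X, chunk_cols = 3L)

# The chunked result should agree with the one-shot Gram matrix.
all.equal(G_chunked, tcrossprod(X))
```

The sketch relies on the identity that XX' equals the sum of the products of each column block with its own transpose, so the block size only affects memory use, not the result.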
pls_fit(
X,
y,
ncomp,
tol = 1e-08,
backend = c("auto", "arma", "bigmem"),
mode = c("auto", "pls1", "pls2"),
algorithm = c("auto", "simpls", "nipals", "kernelpls", "widekernelpls", "rkhs",
"klogitpls", "sparse_kpls", "rkhs_xy", "kf_pls"),
scores = c("none", "r", "big"),
chunk_size = 10000L,
chunk_cols = NULL,
scores_name = "scores",
scores_target = c("auto", "new", "existing"),
scores_bm = NULL,
scores_backingfile = NULL,
scores_backingpath = NULL,
scores_descriptorfile = NULL,
scores_colnames = NULL,
return_scores_descriptor = FALSE,
coef_threshold = NULL,
kernel = c("linear", "rbf", "poly", "sigmoid"),
gamma = 1,
degree = 3L,
coef0 = 0,
approx = c("none", "nystrom", "rff"),
approx_rank = NULL,
class_weights = NULL
)
Returns a list with coefficients, intercept, weights, loadings, means, and optionally $scores.
X: numeric matrix or bigmemory::big.matrix
y: numeric vector/matrix or big.matrix
ncomp: number of latent components
tol: numeric tolerance used in the core solver
backend: one of "auto", "arma", "bigmem"
mode: one of "auto", "pls1", "pls2"
algorithm: one of "auto", "simpls", "nipals", "kernelpls", "widekernelpls", "rkhs", "klogitpls", "sparse_kpls", "rkhs_xy", "kf_pls"
scores: one of "none", "r", "big"
chunk_size: chunk size for the bigmem backend
chunk_cols: column chunk size for the bigmem backend
scores_name: name for dense scores (or output big.matrix)
scores_target: one of "auto", "new", "existing"
scores_bm: optional existing big.matrix or descriptor for scores
scores_backingfile: Character; file name for file-backed scores (when scores="big").
scores_backingpath: Character; directory for the file-backed scores. Defaults to getwd() or tempdir() in streamed predict, unless overridden.
scores_descriptorfile: Character; descriptor file name for the file-backed scores.
scores_colnames: optional character vector for score column names
return_scores_descriptor: logical; if TRUE and scores is a big.matrix, add $scores_descriptor
coef_threshold: Optional non-negative value used to hard-threshold the fitted coefficients after model estimation. When supplied, absolute coefficients strictly below the threshold are set to zero via pls_threshold().
kernel: kernel name for RKHS/KPLS ("linear", "rbf", "poly", "sigmoid")
gamma: RBF/sigmoid/poly scale parameter
degree: polynomial degree
coef0: polynomial/sigmoid bias
approx: kernel approximation: "none", "nystrom", "rff"
approx_rank: rank (columns / features) for the approximation
class_weights: optional numeric weights for classes in klogitpls
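The kernel, gamma, degree and coef0 arguments can be easier to read next to concrete formulas. The base-R sketch below uses the common textbook definitions of the four kernels; the helper kernel_matrix is hypothetical, and the exact parameterisation used inside the package may differ, so treat this as an assumption rather than a description of the package's code.

```r
# Hypothetical helper (not part of the package): kernel matrix between the
# rows of X and the rows of Z, using commonly used kernel definitions.
kernel_matrix <- function(X, Z = X,
                          kernel = c("linear", "rbf", "poly", "sigmoid"),
                          gamma = 1, degree = 3L, coef0 = 0) {
  kernel <- match.arg(kernel)
  inner <- tcrossprod(X, Z)                   # <x_i, z_j> for all pairs
  switch(kernel,
    linear  = inner,
    rbf     = {
      # squared Euclidean distances from the inner products
      d2 <- outer(rowSums(X^2), rowSums(Z^2), "+") - 2 * inner
      exp(-gamma * pmax(d2, 0))               # clamp tiny negative round-off
    },
    poly    = (gamma * inner + coef0)^degree,
    sigmoid = tanh(gamma * inner + coef0)
  )
}

set.seed(42)
X <- matrix(rnorm(12), nrow = 4)              # 4 observations, 3 predictors
K_rbf  <- kernel_matrix(X, kernel = "rbf", gamma = 0.5)
K_poly <- kernel_matrix(X, kernel = "poly", gamma = 1, degree = 2L, coef0 = 1)
dim(K_rbf)
```

Under these definitions gamma scales the inner product (or the squared distance for "rbf"), while coef0 only affects the "poly" and "sigmoid" kernels.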
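set.seed(123)
# Simulate 20 observations of 3 predictors
X <- matrix(rnorm(60), nrow = 20)
# The response depends on the first two predictors plus a little noise
y <- X[, 1] - 0.5 * X[, 2] + rnorm(20, sd = 0.1)
# Fit a 2-component model with SIMPLS and keep the scores as a dense R matrix
fit <- pls_fit(X, y, ncomp = 2, scores = "r", algorithm = "simpls")
# Predict the response on the training data using both components
head(pls_predict_response(fit, X, ncomp = 2))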
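The rule documented for coef_threshold (absolute coefficients strictly below the threshold are set to zero) can be sketched in base R as follows. The function hard_threshold is a hypothetical stand-in written for illustration; it is not pls_threshold() and makes no claim about how that function handles the rest of the fitted model.

```r
# Hypothetical stand-in for the documented thresholding rule:
# zero out coefficients whose absolute value is strictly below `threshold`.
hard_threshold <- function(beta, threshold) {
  stopifnot(is.numeric(threshold), length(threshold) == 1L, threshold >= 0)
  beta[abs(beta) < threshold] <- 0
  beta
}

beta <- c(0.80, -0.05, 0.10, -0.30, 0.02)
# 0.10 is kept because the comparison is strict
hard_threshold(beta, 0.10)
```

With a threshold of 0.10, only the entries -0.05 and 0.02 are zeroed in this toy vector.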