valueprhr (version 0.1.0)

fit_pls_multivariate: Fit PLS Regression with Cross-Validation Component Selection

Description

Fits a partial least squares regression model with automatic selection of the optimal number of components via cross-validation.

Usage

fit_pls_multivariate(
  X_matrix,
  Y_matrix,
  max_components = NULL,
  cv_segments = 10L,
  scale = TRUE,
  center = TRUE
)

Value

A list containing:

model

The fitted pls model object

optimal_ncomp

Number of components that minimizes the cross-validated RMSE

cv_table

Data frame with CV metrics by number of components

metrics_cv

CV metrics at optimal component number

metrics_insample

In-sample metrics at optimal component number

Arguments

X_matrix

Numeric matrix of predictor variables (direct prices).

Y_matrix

Numeric matrix of response variables (production prices).

max_components

Maximum number of components to consider. If NULL (the default), min(ncol(X), nrow(X) - 1, ncol(Y), 25) is used, where X and Y denote X_matrix and Y_matrix.

cv_segments

Number of cross-validation segments. Default 10.

scale

Logical. Scale variables before fitting. Default TRUE.

center

Logical. Center variables before fitting. Default TRUE.
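
The default for max_components described above can be illustrated with a small sketch. The helper name default_max_components and its cap argument are hypothetical, introduced only for illustration; this mirrors the documented formula and is not the package's own code.

```r
# Hypothetical helper mirroring the documented default for max_components.
# Not part of valueprhr; the name and the `cap` argument are assumptions.
default_max_components <- function(X, Y, cap = 25L) {
  as.integer(min(ncol(X), nrow(X) - 1L, ncol(Y), cap))
}

X <- matrix(0, nrow = 50, ncol = 10)  # 50 observations, 10 predictors
Y <- matrix(0, nrow = 50, ncol = 3)   # 50 observations, 3 responses

# min(10, 49, 3, 25) = 3: here the number of responses is the binding term
default_max_components(X, Y)
```

With these dimensions the documented formula yields 3, so an explicit max_components is needed to search beyond that.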

Details

This function uses the pls package for PLS regression. Component selection is based on minimizing cross-validated RMSE. The function handles log-transformed data and reports metrics in both log and original scales.
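
As a rough illustration of component selection by cross-validated RMSE, the sketch below calls the pls package directly. It is not the internal code of fit_pls_multivariate. In particular, pooling the RMSE across response columns is an assumption made here, since the documentation does not say how the per-response errors are combined.

```r
# Sketch only: selecting the number of components by CV RMSE with pls.
# The pooling across responses is an assumption, not the documented method.
if (requireNamespace("pls", quietly = TRUE)) {
  set.seed(1)
  n <- 50
  p <- 10
  X <- matrix(rnorm(n * p), n, p)
  Y <- X[, 1:3] %*% diag(c(1, 0.5, 0.3)) + matrix(rnorm(n * 3, 0, 0.5), n, 3)

  # Fit with 10-segment cross-validation and scaled predictors
  fit <- pls::plsr(Y ~ X, ncomp = 8, validation = "CV",
                   segments = 10, scale = TRUE)

  # val is an array: estimate x response x number of components
  cv_rmsep <- pls::RMSEP(fit, estimate = "CV", intercept = FALSE)

  # Pool the per-response RMSE values into one value per component count
  rmse_by_ncomp <- apply(cv_rmsep$val["CV", , , drop = FALSE], 3,
                         function(v) sqrt(mean(v^2)))

  best_ncomp <- unname(which.min(rmse_by_ncomp))
  print(best_ncomp)
}
```

Because the segments are drawn at random, the selected number of components can vary with the seed.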

Examples

if (requireNamespace("pls", quietly = TRUE)) {
  set.seed(123)
  n <- 50
  p <- 10
  X <- matrix(rnorm(n * p), n, p)
  colnames(X) <- paste0("X", 1:p)
  Y <- X[, 1:3] %*% diag(c(1, 0.5, 0.3)) + matrix(rnorm(n * 3, 0, 0.5), n, 3)
  colnames(Y) <- paste0("Y", 1:3)

  result <- fit_pls_multivariate(X, Y, max_components = 8)
  print(result$optimal_ncomp)
  print(result$cv_table)
}