ocf (version 1.0.3)

multinomial_ml: Multinomial Machine Learning

Description

Estimates conditional choice probabilities for ordered non-numeric outcomes.

Usage

multinomial_ml(Y = NULL, X = NULL, learner = "forest", scale = TRUE)

Value

Object of class mml.

Arguments

Y

Outcome vector.

X

Covariate matrix (no intercept).

learner

String, either "forest" or "l1". Selects the base learner to estimate each expectation.

scale

Logical, whether to scale the covariates. Ignored if learner is not "l1".

Author

Riccardo Di Francesco

Details

Multinomial machine learning expresses conditional choice probabilities as expectations of binary variables:

$$p_m \left( X_i \right) = \mathbb{E} \left[ 1 \left( Y_i = m \right) | X_i \right]$$

Each expectation can therefore be estimated separately with any regression algorithm, yielding an estimate of the conditional probability of each class.
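As a rough illustration of this per-class strategy, the sketch below regresses each class indicator on the covariates using base R only. It is not the package's implementation: the toy data, the use of glm() with a binomial family as the regression algorithm, and the final row normalization are all assumptions made for this example.

```r
## Sketch only: a base-R stand-in for the per-class strategy, not the ocf code.
set.seed(1)

## Hypothetical toy data: three classes driven by two covariates.
n <- 300
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
latent <- X$x1 + 0.5 * X$x2 + rnorm(n)
Y <- cut(latent, breaks = c(-Inf, -0.5, 0.5, Inf), labels = FALSE)

classes <- sort(unique(Y))

## One regression per class: regress the indicator 1(Y = m) on X.
## glm() is an assumed stand-in for "any regression algorithm".
raw_probs <- sapply(classes, function(m) {
  indicator <- as.integer(Y == m)
  fit <- glm(indicator ~ x1 + x2,
             data = cbind(X, indicator = indicator),
             family = binomial())
  predict(fit, newdata = X, type = "response")
})

## Separate fits need not sum to one, so each row is normalized
## (an assumption of this sketch).
probs <- raw_probs / rowSums(raw_probs)
colnames(probs) <- paste0("P(Y=", classes, ")")

head(probs)
```

Each column of probs holds the estimated probability of one class, and each row sums to one after the normalization step.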

multinomial_ml combines this strategy with either regression forests or penalized logistic regressions with an L1 penalty, according to the user-specified parameter learner.

If learner == "l1", the penalty parameters are chosen via 10-fold cross-validation and model.matrix is used to handle non-numeric covariates. Additionally, if scale == TRUE, the covariates are scaled to have zero mean and unit variance.
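The preprocessing described for the "l1" learner can be pictured with the base R functions model.matrix() and scale(). The sketch below only shows what these two functions do on a hypothetical covariate data frame. How multinomial_ml calls them internally is not shown here, and dropping the intercept column is an assumption of this example.

```r
## Sketch only: illustrates model.matrix() and scale(), not the package internals.
## Hypothetical covariates with one non-numeric column.
X <- data.frame(
  age = c(23, 35, 47, 52, 61),
  region = factor(c("north", "south", "south", "west", "north"))
)

## Expand factors into dummy columns, then drop the intercept column
## that model.matrix() adds (assumed here, since X carries no intercept).
X_design <- model.matrix(~ ., data = X)[, -1, drop = FALSE]

## Center each column to zero mean and rescale it to unit variance.
X_scaled <- scale(X_design)

colnames(X_scaled)
round(colMeans(X_scaled), 10)
apply(X_scaled, 2, sd)
```

The factor region is replaced by one dummy column per non-reference level, so the scaled matrix is fully numeric and can be passed to a penalized regression.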

References

  • Di Francesco, R. (2025). Ordered Correlation Forest. Econometric Reviews, 1–17. doi:10.1080/07474938.2024.2429596.

See Also

ordered_ml, ocf

Examples

## Generate synthetic data.
set.seed(1986)

data <- generate_ordered_data(100)
sample <- data$sample
Y <- sample$Y
X <- sample[, -1]

## Training-test split.
train_idx <- sample(seq_len(length(Y)), floor(length(Y) * 0.5))

Y_tr <- Y[train_idx]
X_tr <- X[train_idx, ]

Y_test <- Y[-train_idx]
X_test <- X[-train_idx, ]

## Fit multinomial machine learning on training sample using two different learners.
multinomial_forest <- multinomial_ml(Y_tr, X_tr, learner = "forest")
multinomial_l1 <- multinomial_ml(Y_tr, X_tr, learner = "l1")

## Predict out of sample.
predictions_forest <- predict(multinomial_forest, X_test)
predictions_l1 <- predict(multinomial_l1, X_test)

## Compare predictions.
cbind(head(predictions_forest), head(predictions_l1))
