fit.only.model: Fit Models without Feature Selection

Description

Applies models to high-dimensional data for classification.

Usage

fit.only.model(X, Y, method, p = 0.9, optimize = TRUE, tuning.grid = NULL, k.folds = if (optimize) 10 else NULL, repeats = if (optimize) 3 else NULL, resolution = if (optimize) 3 else NULL, metric = "Accuracy", allowParallel = FALSE, verbose = "none", ...)

Arguments

X

A scaled matrix or data frame containing the numeric values of each feature

Y

A factor vector containing the group membership of each sample

method

A vector listing models to be fit. Available options are "plsda" (Partial Least Squares Discriminant Analysis), "rf" (Random Forest), "gbm" (Gradient Boosting Machine), "svm" (Support Vector Machines), "glmnet" (Elastic-net Generalized Linear Model), and "pam" (Prediction Analysis of Microarrays)

p

Proportion of the data to be used for training. Default "p = 0.9"
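As an illustration of what a training proportion of 0.9 means for a factor of class labels, here is a minimal base R sketch of a stratified split. The helper name stratified.split is hypothetical and is not part of the documented API; how fit.only.model partitions the data internally is not stated on this page.

```r
# Hypothetical helper (not part of the package): keep roughly a
# fraction p of the samples of each class for training.
stratified.split <- function(Y, p = 0.9) {
    stopifnot(is.factor(Y), p > 0, p < 1)
    # Row indices grouped by class; unused factor levels are dropped
    idx.by.class <- split(seq_along(Y), droplevels(Y))
    train.idx <- unlist(lapply(idx.by.class, function(idx) {
        n.train <- max(1, floor(p * length(idx)))
        # Index into idx so single-sample classes are handled safely
        idx[sample.int(length(idx), n.train)]
    }), use.names = FALSE)
    sort(train.idx)
}

set.seed(1)
Y <- factor(rep(c("A", "B"), each = 50))
train.idx <- stratified.split(Y, p = 0.9)
test.idx <- setdiff(seq_along(Y), train.idx)
table(Y[train.idx])
```

With 50 samples per class, 45 of each class end up in the training set and the remaining 10 samples are held out.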

optimize

Logical argument determining if each model should be optimized. Default "optimize = TRUE"

tuning.grid

Optional list of grids containing the parameters to optimize for each algorithm. Default "tuning.grid = NULL" lets the function create a grid determined by "resolution"

k.folds

Number of folds generated during cross-validation. Default "k.folds = 10" when "optimize = TRUE", otherwise NULL

repeats

Number of times cross-validation is repeated. Default "repeats = 3" when "optimize = TRUE", otherwise NULL
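To show how k.folds and repeats combine, the following base R sketch builds fold assignments for repeated k-fold cross-validation. The helper repeated.folds is hypothetical and only illustrates the resampling scheme; it is not the code the function itself runs.

```r
# Hypothetical helper (not part of the package): assign each of n
# samples to one of k folds, with a fresh shuffle for every repeat.
repeated.folds <- function(n, k.folds = 10, repeats = 3) {
    lapply(seq_len(repeats), function(r) {
        # Balanced fold labels 1..k.folds, shuffled across the samples
        sample(rep(seq_len(k.folds), length.out = n))
    })
}

set.seed(1)
folds <- repeated.folds(n = 90, k.folds = 10, repeats = 3)

# Every fold is held out once per repeat, so each candidate parameter
# setting is evaluated on k.folds * repeats resamples.
n.resamples <- 10 * 3
```

With the defaults this gives 30 resamples per candidate setting, which is why optimization time grows quickly as either argument is increased.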

resolution

Resolution of the model optimization grid. Default "resolution = 3" when "optimize = TRUE", otherwise NULL

metric

Criterion for model optimization. Available options are "Accuracy" (Prediction Accuracy), "Kappa" (Kappa Statistic), and "AUC-ROC" (Area Under the Receiver Operating Characteristic Curve)
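For readers unfamiliar with the first two criteria, the sketch below computes accuracy and Cohen's kappa from observed and predicted labels in base R. The helper class.metrics is hypothetical; it shows the standard definitions and is not taken from the package source.

```r
# Hypothetical helper (not part of the package): accuracy and Cohen's
# kappa for two factors that share the same levels.
class.metrics <- function(obs, pred) {
    stopifnot(identical(levels(obs), levels(pred)))
    tab <- table(obs, pred)
    n <- sum(tab)
    p.obs <- sum(diag(tab)) / n                      # observed agreement
    p.exp <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
    c(Accuracy = p.obs, Kappa = (p.obs - p.exp) / (1 - p.exp))
}

obs  <- factor(c("A", "A", "A", "A", "B", "B", "B", "B"))
pred <- factor(c("A", "A", "A", "B", "B", "B", "B", "A"))
m <- class.metrics(obs, pred)
m  # Accuracy 0.75, Kappa 0.5
```

Kappa corrects accuracy for the agreement expected by chance, so it can be the more informative criterion when group sizes are unbalanced.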

allowParallel

Logical argument dictating if parallel processing is allowed via the foreach package. Default "allowParallel = FALSE"

verbose

Argument controlling whether progress is output. Default "verbose = "none""

...

Extra arguments that the user would like to apply to the models

Value

Methods: Vector of models fit to data
performance: Performance metrics of each model and bootstrap iteration
specs: List with the following elements:

Examples

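library(OmicsMarkeR)

# Simulate data: a random matrix (50 variables, 100 samples), with
# correlations and then discriminating variables induced
dat.discr <- create.discr.matrix(
    create.corr.matrix(
        create.random.matrix(nvar = 50,
                             nsamp = 100,
                             st.dev = 1,
                             perturb = 0.2)),
    D = 10
)

# Feature matrix and class labels
vars <- dat.discr$discr.mat
groups <- dat.discr$classes

# Fit a PLS-DA model with the training proportion set to 0.9
fit <- fit.only.model(X = vars,
                      Y = groups,
                      method = "plsda",
                      p = 0.9)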
