OmicsMarkeR (version 1.4.2)

fit.only.model: Fit Models without Feature Selection

Description

Applies models to high-dimensional data for classification.

Usage

fit.only.model(X, Y, method, p = 0.9, optimize = TRUE,
               tuning.grid = NULL,
               k.folds = if (optimize) 10 else NULL,
               repeats = if (optimize) 3 else NULL,
               resolution = if (optimize) 3 else NULL,
               metric = "Accuracy", allowParallel = FALSE,
               verbose = "none", ...)

Arguments

X
A scaled matrix or data frame containing the numeric values of each feature
Y
A factor vector containing group membership of samples
method
A vector listing models to be fit. Available options are "plsda" (Partial Least Squares Discriminant Analysis), "rf" (Random Forest), "gbm" (Gradient Boosting Machine), "svm" (Support Vector Machines), "glmnet" (Elastic-net Generalized Linear Model), and "pam" (Prediction Analysis of Microarrays)
p
Proportion of the data to be used for training. Default "p = 0.9"
optimize
Logical argument determining if each model should be optimized. Default "optimize = TRUE"
tuning.grid
Optional list of grids containing parameters to optimize for each algorithm. Default "tuning.grid = NULL" lets the function create a grid determined by "resolution"
k.folds
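Number of folds generated during cross-validation. Default "k.folds = 10" when "optimize = TRUE", otherwise NULL
repeats
Number of times cross-validation is repeated. Default "repeats = 3" when "optimize = TRUE", otherwise NULL
resolution
Resolution of the model optimization grid. Default "resolution = 3" when "optimize = TRUE", otherwise NULL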
metric
Criterion for model optimization. Available options are "Accuracy" (Prediction Accuracy), "Kappa" (Kappa Statistic), and "AUC-ROC" (Area Under the Receiver Operating Characteristic Curve)
allowParallel
Logical argument dictating if parallel processing is allowed via the foreach package. Default "allowParallel = FALSE"
verbose
Argument controlling whether progress output is printed. The default shown in Usage is verbose = "none"
...
Extra arguments that the user would like to apply to the models
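
The interplay of p, k.folds and repeats can be illustrated with a small base R sketch. It is not the package's internal code: the toy labels are made up, and the stratified split by group is an assumption, since the text above does not say how the training portion is drawn.

```r
# Hypothetical illustration only: not the package's internal code.
set.seed(1)

# Toy class labels: 100 samples in two balanced groups
Y <- factor(rep(c("A", "B"), each = 50))
p <- 0.9

# Assumed stratified split: draw a fraction p of the indices within each group
train.idx <- unlist(lapply(split(seq_along(Y), Y), function(idx) {
    idx[sample.int(length(idx), size = round(p * length(idx)))]
}))
test.idx <- setdiff(seq_along(Y), train.idx)

n.train <- length(train.idx)
n.test <- length(test.idx)

# Repeated k-fold cross-validation evaluates each candidate parameter set
# on k.folds * repeats resamples of the training portion
k.folds <- 10
repeats <- 3
n.resamples <- k.folds * repeats

cat("training samples:", n.train, "\n")
cat("held-out samples:", n.test, "\n")
cat("resamples per candidate:", n.resamples, "\n")
```

With the defaults, most of the computing cost comes from the resamples, so lowering repeats or k.folds is the most direct way to shorten a run.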

Value

Methods
Vector of models fit to data
performance
Performance metrics of each model and bootstrap iteration
specs
List with the following elements:
  • total.samples: Number of samples in original dataset
  • number.features: Number of features in original dataset
  • number.groups: Number of groups
  • group.levels: The specific levels of the groups
  • number.observations.group: Number of observations in each group
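
As a rough guide to what the specs elements describe, the same quantities can be computed by hand in base R. This sketch assumes samples are in rows and features in columns of X, and uses made-up toy data; the list returned by the function may be structured differently.

```r
# Toy data standing in for X and Y (hypothetical values)
set.seed(1)
X <- matrix(rnorm(100 * 50), nrow = 100, ncol = 50)
Y <- factor(rep(c("A", "B"), each = 50))

# Hand-built counterpart of the documented specs elements,
# assuming rows are samples and columns are features
specs <- list(
    total.samples = nrow(X),
    number.features = ncol(X),
    number.groups = nlevels(Y),
    group.levels = levels(Y),
    number.observations.group = table(Y)
)

str(specs)
```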

Examples

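library(OmicsMarkeR)

# Simulate a dataset: random matrix (50 variables, 100 samples),
# then induce correlations and D = 10 discriminatory variables
dat.discr <- create.discr.matrix(
    create.corr.matrix(
        create.random.matrix(nvar = 50, 
                             nsamp = 100, 
                             st.dev = 1, 
                             perturb = 0.2)),
    D = 10
)

# Feature matrix and class labels
vars <- dat.discr$discr.mat
groups <- dat.discr$classes

# Fit a PLS-DA model, training on 90% of the data
fit <- fit.only.model(X=vars, 
                      Y=groups, 
                      method="plsda", 
                      p = 0.9)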
