MUVR2

Multivariate methods with Unbiased Variable selection in R
PhD candidate Yingxiao Yan yingxiao@chalmers.se
Associate Professor Carl Brunius carl.brunius@chalmers.se
Department of Life Sciences, Chalmers University of Technology www.chalmers.se

General description

The MUVR package allows for predictive multivariate modelling with minimally biased variable selection incorporated into a repeated double cross-validation framework. The MUVR procedure simultaneously produces both minimal-optimal and all-relevant variable selections.

The MUVR2 package builds on the MUVR package and adds new functionality.

An easy-to-follow tutorial on how to use the MUVR2 package can be found in this repository at inst/Tutorial/MUVR_Tutorial.docx

In brief, MUVR2 provides the following functionality:

  • Types: classification, regression and multilevel.
  • Model cores: PLS, Random Forest, Elastic Net.
  • Validation: repeated double cross-validation (rdCV; Westerhuis et al. 2008, Filzmoser et al. 2009).
  • Variable selection: recursive feature elimination embedded in the rdCV loop.
  • Resampling tests and permutation tests: assessment of modelling fitness and overfitting.
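
The repeated double cross-validation idea listed above can be illustrated with a minimal base R sketch. It is not the MUVR2 implementation: the data are simulated, the settings `nRep`, `nOuter`, `nInner` and `candidates` are hypothetical values chosen for this example, and a simple correlation filter with a linear model stands in for the recursive feature elimination and the PLS, Random Forest or Elastic Net cores. The point is only the structure: tuning happens in the inner loop, and each outer test segment is predicted by a model that never saw it.

```r
# Sketch of repeated double cross-validation (rdCV); illustrative only.
set.seed(1)
n <- 60; p <- 10
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("v", 1:p)))
y <- 2 * X[, 1] - X[, 2] + rnorm(n, sd = 0.5)   # simulated outcome

nRep <- 3; nOuter <- 5; nInner <- 4   # hypothetical settings for this sketch
candidates <- c(1, 2, 5, 10)          # numbers of variables to try

# Fit a linear model on the k variables most correlated with y,
# ranking the variables on the training data only.
fit_predict <- function(Xtr, ytr, Xte, k) {
  keep <- order(abs(cor(Xtr, ytr)), decreasing = TRUE)[seq_len(k)]
  dtr <- data.frame(y = ytr, Xtr[, keep, drop = FALSE])
  dte <- data.frame(Xte[, keep, drop = FALSE])
  predict(lm(y ~ ., data = dtr), newdata = dte)
}

yPred <- matrix(NA_real_, n, nRep)
for (r in seq_len(nRep)) {
  outer <- sample(rep(seq_len(nOuter), length.out = n))
  for (o in seq_len(nOuter)) {
    test <- which(outer == o)
    train <- which(outer != o)
    inner <- sample(rep(seq_len(nInner), length.out = length(train)))
    # Inner loop: choose the variable count with the lowest validation error
    err <- sapply(candidates, function(k) {
      sq <- 0
      for (i in seq_len(nInner)) {
        val <- train[inner == i]
        cal <- train[inner != i]
        pred <- fit_predict(X[cal, , drop = FALSE], y[cal],
                            X[val, , drop = FALSE], k)
        sq <- sq + sum((y[val] - pred)^2)
      }
      sq
    })
    kBest <- candidates[which.min(err)]
    # Outer loop: the test segment is predicted once and never used for tuning
    yPred[test, r] <- fit_predict(X[train, , drop = FALSE], y[train],
                                  X[test, , drop = FALSE], kBest)
  }
}

# Prediction error from the outer-loop predictions, averaged over repetitions
rmsep <- sqrt(mean((y - rowMeans(yPred))^2))
```

Because variable ranking is repeated inside every training segment, the selection step cannot leak information from the samples later used to estimate prediction error.
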

Installation

You need the remotes R package to install MUVR2 from GitHub. Run the following from an R script or type it directly at the R console (normally the lower left window in RStudio):

install.packages('remotes')

When remotes is installed, you can install the MUVR2 package by running:

library(remotes)
install_github('MetaboComp/MUVR2')

References

  • Yan Y, Schillemans T, Skantze V, Brunius C. Adjusting for covariates and assessing modeling fitness in machine learning using MUVR2. Bioinformatics Advances. 2024, 4(1), vbae051.
  • Shi L, Westerhuis JA, Rosén J, Landberg R, Brunius C. Variable selection and validation in multivariate modelling. Bioinformatics. 2019, 35(6), 972–980.
  • Filzmoser P, Liebmann B, Varmuza K. Repeated double cross validation. Journal of Chemometrics. 2009, 23(4), 160–171.
  • Westerhuis JA, Hoefsloot HCJ, Smit S, Vis DJ, Smilde AK, Velzen EJJ, Duijnhoven JPM, Dorsten FA. Assessment of PLSDA cross validation. Metabolomics. 2008, 4(1), 81–89.

Install

install.packages('MUVR2')

Monthly Downloads

167

Version

0.1.0

License

GPL-3

Maintainer

Yingxiao Yan

Last Published

September 16th, 2024

Functions in MUVR2 (0.1.0)

pPerm

Calculate permutation p-value of actual model performance vs. the null hypothesis distribution. `pPerm` calculates the cumulative (1-tailed) probability of `actual` belonging to `permutation_distribution`. `side` is guessed by comparing the actual value with median(permutation_distribution). The test is performed either on the original data or on ranks for non-parametric statistics.
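
As a rough illustration of the description above, the following base R sketch computes a 1-tailed p-value for an actual value against a null distribution. It is a sketch under stated assumptions, not the package's code: the function name `perm_p_sketch` and its arguments are hypothetical, and the use of a Student's t reference distribution for the cumulative probability is an assumption made for this example.

```r
# Hypothetical helper; not the pPerm function from MUVR2.
perm_p_sketch <- function(actual, h0, side = NULL, ranked = FALSE) {
  # Guess the tail by comparing the actual value with the median of the null
  if (is.null(side)) side <- if (actual < median(h0)) "smaller" else "greater"
  # Optional non-parametric variant: replace the values by their ranks
  if (ranked) {
    rk <- rank(c(actual, h0))
    actual <- rk[1]
    h0 <- rk[-1]
  }
  # Assumption: cumulative probability taken from a Student's t distribution
  p <- pt((actual - mean(h0)) / sd(h0), df = length(h0) - 1)
  if (side == "greater") 1 - p else p
}

set.seed(42)
h0 <- rnorm(100, mean = 20, sd = 3)   # simulated null, e.g. misclassifications
p_param <- perm_p_sketch(actual = 8, h0 = h0)
p_rank <- perm_p_sketch(actual = 8, h0 = h0, ranked = TRUE)
```

With a fitness measure where lower is better (such as misclassifications), an actual value well below the null distribution gives a small p-value. The rank-based variant is bounded by the number of permutations, so it is more conservative here.
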
nearZeroVar

Identify variables with near zero variance
mergeModels

Merge two MUVR class objects
biplotPLS

PLS biplot
get_rmsep

Get RMSEP
plotPred

Plot predictions for PLS regression
getVar

Get min, mid or max model from Elastic Net modelling
plotPerm

Plot for comparison of actual model fitness vs permutation/resampling
getMISS

Get number of misclassifications
getBER

Get BER
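
For readers unfamiliar with the two classification measures above, here is a minimal base R sketch of the number of misclassifications and the balanced error rate (BER, the mean of the per-class error rates). The function name `ber_sketch` and the toy labels are made up for illustration; the signatures of `getMISS` and `getBER` are not shown here and may differ.

```r
# Hypothetical helper; illustrates the BER definition only.
ber_sketch <- function(actual, predicted) {
  actual <- factor(actual)
  predicted <- factor(predicted, levels = levels(actual))
  cm <- table(actual, predicted)     # confusion matrix, rows = true class
  mean(1 - diag(cm) / rowSums(cm))   # average of per-class error rates
}

actual    <- c("A", "A", "A", "A", "B", "B")
predicted <- c("A", "A", "A", "B", "B", "A")

miss <- sum(actual != predicted)     # 2 misclassified samples
ber <- ber_sketch(actual, predicted) # (1/4 + 1/2) / 2 = 0.375
```

BER weights each class equally, so with unbalanced classes it penalizes errors in the smaller class more than the raw misclassification count does.
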
plotVAL

Plot validation metric
plotMV

Plot predictions
plotStability

Plot stability
plotPCA

PCA score plot
permutationPlot

Plot permutation analysis
rdcvNetParams

Make custom parameters for rdcvNet internal modelling
rdCV

Wrapper for repeated double cross-validation without variable selection
predMV

Predict outcomes: predicts an MV object using a MUVR class object and an X testing set. At present, this function only supports predictions for PLS regression type problems.
onehotencoding

One hot encoding
sampling_from_distribution

Sampling from the distribution of something
qMUVR2

Wrapper for speedy access to MUVR2 (autosetup of parallelization)
plotVIRank

Plot variable importance ranking
preProcess

Perform matrix pre-processing
varClass

Report variables belonging to different classes
Xotu

Microbiota composition in mosquitos for the classification tutorial
Q2_calculation

Q2 calculation
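
The index entry does not state the formula, so the sketch below assumes the common definition Q2 = 1 - PRESS/TSS, where PRESS is the sum of squared prediction errors and TSS is the total sum of squares around the mean. The function names `q2_sketch` and `rmsep_sketch` are hypothetical, and the package's own `Q2_calculation` and `get_rmsep` may take different arguments.

```r
# Hypothetical helpers; assume Q2 = 1 - PRESS / TSS.
q2_sketch <- function(y, y_pred) {
  press <- sum((y - y_pred)^2)    # prediction error sum of squares
  tss <- sum((y - mean(y))^2)     # total sum of squares
  1 - press / tss
}
rmsep_sketch <- function(y, y_pred) sqrt(mean((y - y_pred)^2))

y      <- c(1, 2, 3, 4, 5)
y_pred <- c(1.1, 1.9, 3.2, 3.8, 5.0)   # toy cross-validated predictions

q2 <- q2_sketch(y, y_pred)        # 1 - 0.10 / 10 = 0.99
rmsep <- rmsep_sketch(y, y_pred)  # sqrt(0.10 / 5)
```
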
H0_reference

Get reference distribution for resampling tests
MUVR2_EN

MUVR2 with EN
MUVR2

MUVR2 with PLS and RF
H0_test

Perform permutation or resampling tests
customParams

Make custom parameters for internal modelling
IDR

Subject identifiers for the rye metabolomics regression tutorial
YR2

Rye consumption for the rye metabolomics regression tutorial, using unique individuals
IDR2

Subject identifiers for the rye metabolomics regression tutorial, using unique individuals
YR

Rye consumption for the rye metabolomics regression tutorial
crispEM

Effect matrix for the crisp multilevel tutorial
XRVIP2

Metabolomics data for the rye metabolomics regression tutorial, using unique individuals
XRVIP

Metabolomics data for the rye metabolomics regression tutorial
getVIRank

Get variable importance
Yotu

Village of capture of mosquitos for the classification tutorial
checkinput

Check input
confusionMatrix

Confusion matrix