
mdatools (version 0.7.0)

randtest: Randomization test for PLS regression

Description

randtest carries out a randomization/permutation test for a PLS regression model.

Usage

randtest(x, y, ncomp = 15, center = T, scale = F, nperm = 1000, sig.level = 0.05, silent = TRUE)

Arguments

x
matrix with predictors.
y
vector or one-column matrix with response.
ncomp
maximum number of components to test.
center
logical, whether to center predictors and response values.
scale
logical, whether to scale (standardize) predictors and response values.
nperm
number of permutations.
sig.level
significance level.
silent
logical, whether to suppress the test progress output.

Value

Returns an object of class randtest with the following fields:
nperm
number of permutations used for the test.
stat
statistic values calculated for each component.
alpha
alpha values calculated for each component.
statperm
matrix with statistic values for each permutation.
corrperm
matrix with correlation between predicted and reference y-values for each permutation.
ncomp.selected
suggested number of components.

Details

The class implements a method for selecting the optimal number of components in PLS1 regression based on the randomization test [1]. The basic idea is that for each component from 1 to ncomp a statistic T, the covariance between the t-score (X-score, derived from a PLS model) and the reference Y-values, is calculated. Repeating this for randomly permuted Y-values gives a distribution of the statistic. A parameter alpha is then computed to show how often the statistic T calculated for permuted Y-values is equal to or higher than the same statistic calculated for the original, unpermuted data.

If a component is important, the covariance for unpermuted data should be larger than the covariance for permuted data, and therefore the value of alpha will be quite small (there is still a small chance of getting a similar covariance). This makes alpha very similar to a p-value in a statistical test.
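The idea can be illustrated with a minimal base R sketch. It is not the mdatools implementation: it covers only the first PLS1 component, uses simulated data, and the helper first_comp_stat is a hypothetical name introduced here for illustration. It assumes that x and y are already mean-centered.

```r
# Hypothetical helper (not part of mdatools): statistic T for the first
# PLS1 component, i.e. the covariance between the X-score t and y.
# Assumes x and y are mean-centered.
first_comp_stat <- function(x, y) {
  w <- crossprod(x, y)        # weights proportional to X'y
  w <- w / sqrt(sum(w^2))     # normalize weights to unit length
  t <- x %*% w                # X-scores for the first component
  drop(cov(t, y))             # covariance between scores and response
}

# Simulated data: only the first three predictors are related to y
set.seed(42)
n <- 50
x <- scale(matrix(rnorm(n * 10), n, 10), center = TRUE, scale = FALSE)
y <- drop(x[, 1:3] %*% c(1, -1, 0.5)) + rnorm(n, sd = 0.1)
y <- y - mean(y)

# Statistic for the original data and for randomly permuted y-values
nperm <- 200
stat <- first_comp_stat(x, y)
statperm <- replicate(nperm, first_comp_stat(x, sample(y)))

# alpha: share of permutations whose statistic is equal to or higher
# than the statistic for the unpermuted data
alpha <- mean(statperm >= stat)
print(alpha)
```

Because the simulated y depends strongly on x, alpha should come out small here. For later components the statistic would have to be computed on deflated data, which this sketch leaves out.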

The randtest procedure calculates alpha for each component; the values can be inspected using the summary or plot functions. There are also several functions that show, for example, the distribution of the statistic and the critical value for each component.

References

[1] S. Wiklund et al. Journal of Chemometrics 21 (2007) 427-439.

See Also

Methods for randtest objects:
print.randtest
prints information about a randtest object.
summary.randtest
shows summary statistics for the test.
plot.randtest
shows bar plot for alpha values.
plotHist.randtest
shows a plot of the distribution of the statistic.
plotCorr.randtest
shows a plot of the determination coefficient.

Examples

### Examples of using the test

## Get the spectral data from Simdata set and apply SNV transformation

library(mdatools)
data(simdata)

y = simdata$conc.c[, 3]
x = simdata$spectra.c
x = prep.snv(x)

## Run the test and show summary
## (normally use a higher nperm value, e.g. > 1000)
r = randtest(x, y, ncomp = 4, nperm = 200, silent = FALSE)
summary(r)

## Show plots

par(mfrow = c(3, 2))
plot(r)
plotHist(r, comp = 3)
plotHist(r, comp = 4)
plotCorr(r, 3)
plotCorr(r, 4)
par(mfrow = c(1, 1))
