SOUP: SOUP Main Function

Description

Main function of the package and the interface for every analysis. The dataset may be balanced or unbalanced for almost all choices of the input parameters. The function also allows for one or more continuous covariates and for stratified analysis.

Usage

SOUP(Y, covars, data = NULL, analysisType, p.adj.method, p.valuesType, testStatistic, combFunct, univ.p.values = TRUE, tails = NULL, linearInter = FALSE, returnPermSpace = TRUE, nPerms = 999L, alpha = 0.05, seed, iteratedNPC, ...)

Arguments

Y

input matrix where each column is a response variable.

covars

it can be a matrix, a data.frame or a formula. In the first two cases it must contain at least the group labels; in the latter case it has to be a right-sided formula (e.g. ~ v1 + v2) specifying the model to extract from the data input.

data

optional data.frame containing the covariates requested by covars; if covars is not a formula this input is ignored.

analysisType

character, type of the analysis to be performed: it can be "simple" if the only covariate is the group labels, "strata" if there is also a stratifying (categorical) covariate, or "regres" if there are one or more covariates (numerical or not) besides the group labels. In the latter case the linear effect of the covariates is removed from the response variables, which are residualised by the matrix $V^{-1/2}$ obtained from $V = I - H$ (where $I$ is the identity matrix and $H$ is the "hat" matrix of the OLS fit) by means of a spectral decomposition.
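The residualisation described for "regres" can be sketched in base R as follows. This is only an illustration, not the package's internal code: the design matrix X and the simulated data are hypothetical, and because $V = I - H$ is a projection matrix (hence singular), the sketch assumes that the inverse square root is taken over the non-zero eigenvalues only. Whether SOUP treats the zero eigenvalues in this way is an assumption.

```r
set.seed(1)
nObs <- 12L
X <- cbind(1, rnorm(nObs))                 # hypothetical design: intercept + one covariate
Y <- matrix(rnorm(nObs * 2L), nObs, 2L)    # two hypothetical response variables

## "hat" matrix of the OLS fit and its complement V = I - H
H <- X %*% solve(crossprod(X), t(X))
V <- diag(nObs) - H

## spectral decomposition of V, keeping only non-zero eigenvalues (assumption)
eig <- eigen(V, symmetric = TRUE)
keep <- eig$values > 1e-8
U <- eig$vectors[, keep, drop = FALSE]
VinvSqrt <- U %*% diag(1 / sqrt(eig$values[keep]), nrow = sum(keep)) %*% t(U)

## residualised responses: linear effect of the covariates removed
Yres <- VinvSqrt %*% Y
```

Under this assumption the non-zero eigenvalues of $V$ are all equal to one, so Yres coincides with the ordinary OLS residuals of Y on X.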

p.adj.method

character string containing the type of required p-value adjustment

p.valuesType

character string indicating the type of p-value to be used, it can be "permutation" or "asymptotic"

testStatistic

character string indicating the test statistic to be used, it depends on both analysisType and on p.valuesType and the alternatives are:

AD, meanDiff: for all analysisType but only using permutation p-values
Ttest: for all analysisType but only using asymptotic p-values
Hotelling: with both permutation and asymptotic p-values, with "simple" and "regres" but not with "strata" analysisType
lmCoef: only with "strata" analysisType and with "asymptotic" p-values

combFunct

character string containing the desired combining function to be used, choices are:

Fisher: the famous Fisher's p-values combining function
Liptak: it uses the quantile function of the Normal distribution to combine p-values
minP, tippett: combine p-values by taking the minimum across the set
maxT: combines directly the test statistics by taking the maximum across the set
direct, sumT: combine the test statistics by summing them
sumT2: combines the test statistics by squaring and summing them

See the references for more details about their properties.
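As a rough illustration of the p-value based combining functions listed above, the following base R sketch applies the textbook definitions of Fisher, Liptak and minP (Tippett) to a hypothetical vector of partial p-values. It does not call SOUP, and the package's own implementation (for example its use of weights, or of the permutation distribution to obtain the combined p-value) may differ.

```r
## hypothetical partial p-values for three response variables
p <- c(0.01, 0.20, 0.50)

combined <- c(
    Fisher = -2 * sum(log(p)),      # large values are evidence against the null
    Liptak = sum(qnorm(1 - p)),     # Normal quantile function of 1 - p, summed
    minP   = min(p)                 # small values are evidence against the null
)
combined
```

In a permutation setting the same function would be applied to every row of the matrix of permutation p-values, and the combined p-value would be read off the resulting permutation distribution.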

univ.p.values

logical, if TRUE (default) p-values are returned for each variable separately in a 3-way array, and the chosen multiplicity correction is performed independently for each variable

tails

integer vector with elements in $\{+1, -1\}$ containing the alternatives for the response variables: +1 means "the higher the better", -1 means "the lower the better" (direction of preference); if NULL (default) all variables are considered to be of the type "the higher the better"

linearInter

logical, if TRUE a linear interaction is assumed between the levels of the stratifying covariate and the response variables. This affects only the "lmCoef" test statistic (in the "strata" analysisType): basically, the contrasts matrix of groups is multiplied by the levels of the stratifying factor.

returnPermSpace

logical, if TRUE (default) the whole permutation space is returned as an object of class PermSpace, otherwise an empty instance of the class is returned.

nPerms

integer, number of permutations to be performed
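To make the role of nPerms concrete, here is a minimal base R sketch of a permutation p-value for a difference in means between two groups. It is an illustration under simplifying assumptions (one response, two groups, a hypothetical meanDiff helper defined locally) and is not the code SOUP runs internally.

```r
set.seed(12345)                            # fixed only so the sketch is reproducible
nPerms <- 999L

y <- c(rnorm(5), rnorm(5, mean = 1.5))     # hypothetical response, two groups of 5
g <- gl(2, 5)                              # group labels "1" and "2"

## hypothetical helper: difference between the two group means
meanDiff <- function(y, g) mean(y[g == "2"]) - mean(y[g == "1"])

tObs <- meanDiff(y, g)
## statistic recomputed on nPerms random permutations of the group labels
tPerm <- replicate(nPerms, meanDiff(y, sample(g)))

## one-sided p-value; the observed statistic counts as one permutation
pValue <- (1 + sum(tPerm >= tObs)) / (nPerms + 1)
pValue
```

With nPerms = 999 the smallest attainable p-value is 1/1000, which is one reason the number of permutations bounds the resolution of the resulting p-values.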

alpha

numeric, desired significance level, i.e. type-I error rate

seed

integer seed for the Random Number Generator

iteratedNPC

logical, single or iterated Non-Parametric Combination, see iterNPC for details.

...

optional arguments: the weights and subsets for the NPC function, and the permutation space of row indexes permSpaceID. The latter makes it possible to reproduce a previous analysis exactly, if all other inputs are kept equal, or to see what happens when changing, for example, only the testStatistic.

Value

an object of class SoupObject.

Details

Depending on the chosen p-values type and on the analysis type, only some options can be selected:

With "simple" or "regres" analysis and "asymptotic" p-values, "Hotelling" and "Ttest" can be selected; with "permutation" p-values, "AD", "Hotelling" and "meanDiff" can be selected.
With "strata" analysis and "asymptotic" p-values, "lmCoef" and "Ttest" can be selected; with "permutation" p-values, "AD" and "meanDiff" can be selected.
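The compatibility rules above can be written down as a small lookup function. The helper allowedStatistics below is hypothetical and not part of SOUP; it simply encodes the two rules stated in this section so that a choice of testStatistic can be checked before calling the main function.

```r
## Hypothetical helper (not part of SOUP): test statistics allowed for a
## given combination of analysisType and p.valuesType, as listed above.
allowedStatistics <- function(analysisType = c("simple", "regres", "strata"),
                              p.valuesType = c("permutation", "asymptotic")) {
    analysisType <- match.arg(analysisType)
    p.valuesType <- match.arg(p.valuesType)
    if (analysisType == "strata") {
        switch(p.valuesType,
            asymptotic  = c("lmCoef", "Ttest"),
            permutation = c("AD", "meanDiff"))
    } else {
        switch(p.valuesType,
            asymptotic  = c("Hotelling", "Ttest"),
            permutation = c("AD", "Hotelling", "meanDiff"))
    }
}

allowedStatistics("strata", "asymptotic")
"Hotelling" %in% allowedStatistics("strata", "permutation")
```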

References

Pesarin, F. and Salmaso, L. (2010) Permutation Tests for Complex Data. Wiley: United Kingdom.

Pesarin, F. (2001) Multivariate Permutation Tests with Applications in Biostatistics. Wiley: New York.

Mattiello, F. (2010) Some resampling-based procedures for ranking of multivariate populations. Master's Thesis, Faculty of Statistical Sciences: Padova.

Examples


###
### testing SOUP
###
rm(list = ls()); gc(reset = TRUE)

require(SOUP)
n <- 5L         # replication of the experiment
G <- 4L         # number of groups
nVar <- 10L     # number of variables
shift <- 1.5    # shift to be added to group 3
alpha <- c(0.01, 0.05, 0.1)        # significance levels

## groups factor
groups <- gl(G, n, labels = paste("gr", seq_len(G), sep = "_"))  # one label per group

set.seed(12345)
Y <- matrix(rnorm(n * G * nVar), nrow = n * G, ncol = nVar)
colnames(Y) <- paste("var", seq_len(nVar), sep = "_")
ind1 <- groups == unique(groups)[3L]
Y[ind1, ] <- Y[ind1, ] + shift

res <- SOUP(Y = Y, covars = as.matrix(groups), analysisType = "simple",
        testStatistic = "meanDiff", combFunct = "Fisher",
        alpha = alpha,
        subsets = list("first" = 1:5, "second" = 6:10),
        weights = list(
                "firstW" = c(.1, .2, .1, .5, .1),
                "secondW" = rep.int(1, 5)),
        p.valuesType = "permutation", p.adj.method = "FWEminP")
res
