creer_matrice: Create p-values matrix from pairwise tests of all possible ratios of a compositional vector

Description

This function performs hypothesis testing on all possible pairwise ratios or differences of a set of variables in a given data frame, and stores the results in a (symmetric) matrix.

Usage

creer.Mp( d, noms, f.p, log = FALSE, en.log = !log,
          nom.var = 'R', n.coeurs = 1, ... )

Value

This function returns the matrix obtained as described above, with row and column names set to the names in noms (after conversion into column names and removal of all non-numeric variables).

Arguments

d

The data frame that contains the compositional variables. Other objects will be coerced to a data frame using as.data.frame.

noms

A character vector containing the column names of the compositional variables to be used for ratio computations. Names absent from the data frame will be ignored with a warning.

Optionally, an integer vector containing the column numbers can be given instead. They will be converted to column names before further processing.
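The conversion from column numbers to names, and the removal of non-numeric variables, can be pictured with the following minimal base R sketch. The data frame d.ex and the vector idx are made up for illustration, and the code is not taken from the package.

```r
# Made-up data frame: two numeric columns and one non-numeric column
d.ex <- data.frame( Al = c( 1.2, 3.4 ), Site = c( 'a', 'b' ), Fe = c( 0.5, 0.7 ) )

# Column numbers given instead of names
idx <- c( 1, 3 )
noms <- names( d.ex )[ idx ]          # converted to column names

# Keep only names present in the data frame that refer to numeric columns
noms <- noms[ noms %in% names( d.ex ) ]
noms <- noms[ vapply( d.ex[ noms ], is.numeric, logical( 1 ) ) ]
noms
```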

f.p

An R function that will perform the hypothesis test on a single ratio (or log ratio, depending on log and en.log values).

This function should return a single numerical value, typically the p-value from the test.

This function must accept at least two named arguments: d that will contain the data frame containing all required variables and variable that will contain the name of the column that contains the (log) ratio in this data frame. All other needed arguments can be passed through ....

Such functions are provided for several common situations, see references at the end of this manual page.
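As an illustration of the interface described above, here is a minimal sketch of a user-written test function. The name mon.fp and its extra argument v.groupe are hypothetical (they are not part of the package); only the d and variable arguments come from the description above. The function is checked here on made-up data, without calling creer.Mp.

```r
# Hypothetical custom test function: Welch two-sample t-test on the (log) ratio.
# `d` and `variable` are the two named arguments required for f.p;
# `v.groupe` is an illustrative extra argument, meant to be passed through `...`.
mon.fp <- function( d, variable, v.groupe, ... ) {
    # Build the formula  <ratio column> ~ <grouping column>
    f <- reformulate( v.groupe, response = variable )
    # Return a single numerical value: the p-value of the test
    t.test( f, data = d )$p.value
}

# Stand-alone check on made-up data
set.seed( 1 )
d.test <- data.frame( R = rnorm( 20 ),
                      Groupe = factor( rep( c( 'A', 'B' ), each = 10 ) ) )
p <- mon.fp( d.test, variable = 'R', v.groupe = 'Groupe' )
p
```

Under these assumptions, such a function would be given as f.p = mon.fp, with v.groupe supplied as an additional argument.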

log

If TRUE, values in the columns are assumed to be log-transformed, and consequently ratios are computed as differences of the columns. The result is in the log scale.

If FALSE, values are assumed to be raw data and ratios are computed directly.

en.log

If TRUE, the ratio will be log-transformed before applying the hypothesis test computed by f.p. Don't change the default unless you really know what you are doing.
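The relation between log and en.log can be checked numerically: the log of a ratio of raw values equals the difference of the log-transformed values, which is why the default en.log = !log leads to a test on the log-ratio in both cases. The vectors x and y below are made-up values.

```r
# Made-up positive values for two compositional variables
x <- c( 2, 4, 8 )
y <- c( 1, 2, 16 )

# log = FALSE, en.log = TRUE: ratio of raw values, then log-transformation
lr.brut <- log( x / y )

# log = TRUE, en.log = FALSE: difference of already log-transformed values
lr.log <- log( x ) - log( y )

# Both routes give the same log-ratio (up to rounding error)
all.equal( lr.brut, lr.log )
```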

nom.var

A length-one character vector giving the name of the variable containing a single ratio (or log-ratio). No sanity check is performed on it: if you experience strange behaviour, check you gave a valid column name, for instance using make.names.

n.coeurs

The number of CPU cores to use in computation, with parallelization using forks (does not work on Windows) with the help of the parallel package.

...

Additional arguments to f.p, passed unchanged to it.

Author

Emmanuel Curis (emmanuel.curis@parisdescartes.fr)

Details

This function constructs an \(n\times n\) matrix, where n = length( noms ) (after removing, if necessary, any names in noms that do not correspond to numeric variables). Term \((i,j)\) in this matrix is the result of the f.p function applied to the ratio of variables noms[ i ] and noms[ j ] (or to its log, if either (log == TRUE) && (en.log == FALSE) or (log == FALSE) && (en.log == TRUE)).

The f.p function is called only once per pair of variables, for \(i < j\); the term \((j,i)\) is obtained by symmetry.

The diagonal of the matrix is filled with 1 without calling f.p: the corresponding ratios are identically equal to 1, so there is nothing useful to test.
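The construction described above can be sketched in base R as follows. This is an illustrative re-implementation, not the package's code: the names esquisse.Mp, f.test and d.jouet are hypothetical, only the case log = FALSE, en.log = TRUE is handled, and no parallelization or input checking is done.

```r
# Illustrative sketch of the matrix construction (not the package's code)
esquisse.Mp <- function( d, noms, f.p, nom.var = 'R', ... ) {
    n <- length( noms )
    # Start from a matrix of 1: the diagonal is never tested
    M <- matrix( 1, nrow = n, ncol = n, dimnames = list( noms, noms ) )
    if ( n < 2 ) return( M )
    for ( i in 1:( n - 1 ) ) {
        for ( j in ( i + 1 ):n ) {
            # Log-ratio of raw data, stored in the column named nom.var
            d[[ nom.var ]] <- log( d[[ noms[ i ] ]] / d[[ noms[ j ] ]] )
            p <- f.p( d = d, variable = nom.var, ... )
            M[ i, j ] <- p
            M[ j, i ] <- p   # symmetric term, no second call to f.p
        }
    }
    M
}

# Hypothetical test function: one-way ANOVA p-value on the log-ratio
f.test <- function( d, variable, v.X, ... ) {
    anova( lm( reformulate( v.X, response = variable ), data = d ) )[ 1, 'Pr(>F)' ]
}

# Made-up data: three positive variables and a grouping factor
set.seed( 42 )
d.jouet <- data.frame( A = rlnorm( 12 ), B = rlnorm( 12 ), C = rlnorm( 12 ),
                       Site = factor( rep( c( 'S1', 'S2', 'S3' ), each = 4 ) ) )
M <- esquisse.Mp( d.jouet, c( 'A', 'B', 'C' ), f.p = f.test, v.X = 'Site' )
M
```

With n variables, f.p is called n(n-1)/2 times in this sketch, here 3 times for 3 variables.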

Examples

   # Load the pottery data set
   data( poteries )

   # Compute one-way ANOVA p-values for all ratios in this data set   
   Mp <- creer.Mp( poteries, c( 'Al', 'Na', 'Fe', 'Ca', 'Mg' ),
                   f.p = anva1.fpc, v.X = 'Site' )
   Mp

   # Make a graph from it and plot it
   plot( grf.Mp( Mp ) )
