best.r.sq: Use R^2 to find the variables that best explain a multivariate response.

Description

Finds the subset of explanatory variables in a formula that best explains the variation in a multivariate response, as measured by a chosen definition of R^2. Modifications are included for high-dimensional data, such as multivariate abundance data in ecology.

Usage

best.r.sq(formula, data = parent.frame(), subset, var.subset,
  n.xvars= min(3, length(xn)), R2="h", ...)

Arguments

formula

a mvformula, a multivariate formula.

data

optional, the data.frame (or list) from which the variables in formula should be taken.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

var.subset

an optional vector specifying the subset of the responses to be used.

n.xvars

the number of independent variables with the highest average R^2 that should be found.

R2

the type of R^2 (coefficient of determination) that should be shown. Possible values are:
"h" = Hooper's R^2 = tr(SST^(-1) SSR)/p
"v" = vector R^2 = det(SSR)/det(SST)
"n" = none
Note that for a univariate response, all of these are equivalent to the ordi
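The two matrix definitions of R^2 can be illustrated with a minimal base-R sketch. It assumes a linear model with an intercept, takes SST as the centered cross-product matrix of the responses and SSR as the centered cross-product matrix of the fitted values, and sets p to the number of response columns. The helper name multi_r2 is hypothetical and is not part of mvabund; the package may compute these quantities differently.

```r
# Hypothetical helper (not part of mvabund): computes Hooper's R^2 and the
# vector R^2 from the sums-of-squares matrices of an lm fit with intercept.
multi_r2 <- function(Y, X) {
  Y <- as.matrix(Y)
  X <- as.data.frame(X)
  fit <- lm(Y ~ ., data = X)                 # multivariate linear model
  Yc <- scale(Y, center = TRUE, scale = FALSE)
  Fc <- scale(as.matrix(fitted(fit)), center = TRUE, scale = FALSE)
  SST <- crossprod(Yc)                       # total sums of squares matrix
  SSR <- crossprod(Fc)                       # regression sums of squares matrix
  c(h = sum(diag(solve(SST, SSR))) / ncol(Y),  # tr(SST^(-1) SSR) / p
    v = det(SSR) / det(SST))                   # det(SSR) / det(SST)
}

# Two responses explained by two predictors from the built-in iris data
multi_r2(iris[, 1:2], iris[, 3:4])
```

With a single response column, both values coincide with the R^2 reported by summary() for the corresponding lm fit, which is a convenient sanity check.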

...

further arguments that are passed on to lm.

Value

A vector giving the indices of the independent variables with the greatest explanatory power.

Details

best.r.sq finds the n.xvars most influential explanatory variables by forward selection in the multivariate linear model given by formula. Only the response variables given by var.subset are considered; if var.subset is NULL, all response variables are considered. Interactions are excluded from the search, but the indices that are returned correspond to the indices in the model. This function should not be used for model selection, but only for plots.
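The forward selection described above might look roughly like the following base-R sketch. It is an illustration under stated assumptions, not the mvabund source: the function names hooper_r2 and forward_select are hypothetical, only main effects are considered, Hooper's R^2 is used as the criterion, and the returned indices refer to columns of the predictor data frame.

```r
# Hypothetical helper: Hooper's R^2 for an lm fit with intercept.
hooper_r2 <- function(Y, X) {
  fit <- lm(Y ~ ., data = X)
  SST <- crossprod(scale(Y, scale = FALSE))
  SSR <- crossprod(scale(as.matrix(fitted(fit)), scale = FALSE))
  sum(diag(solve(SST, SSR))) / ncol(Y)
}

# Hypothetical greedy forward selection: at each step, add the predictor
# that gives the largest R^2 together with the predictors already chosen.
forward_select <- function(Y, X, n.xvars = min(3, ncol(X))) {
  Y <- as.matrix(Y)
  X <- as.data.frame(X)
  chosen <- integer(0)
  for (step in seq_len(n.xvars)) {
    candidates <- setdiff(seq_len(ncol(X)), chosen)
    scores <- vapply(candidates, function(j) {
      hooper_r2(Y, X[, c(chosen, j), drop = FALSE])
    }, numeric(1))
    chosen <- c(chosen, candidates[which.max(scores)])
  }
  chosen  # column indices, in the order they were selected
}

# Two responses and four candidate predictors from the built-in mtcars data
forward_select(mtcars[, c("mpg", "qsec")],
               mtcars[, c("wt", "hp", "disp", "drat")], n.xvars = 2)
```

Because the search is greedy, the selected set is not guaranteed to be the best subset of that size, which is consistent with the advice to use the result for plotting rather than for model selection.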

Examples

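library(mvabund)

# Load the spider data and convert the abundances to an mvabund object
data(spider)
spiddat <- mvabund(spider$abund)
X <- spider$x

# Indices of the explanatory variables with the greatest explanatory power
best.r.sq(spiddat ~ X)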
