selectiveInference (version 1.2.5)

groupfs: Select a model with forward stepwise.

Description

This function implements forward selection of linear models almost identically to step with direction = "forward". The reason this is a separate function from fs is that groups of variables (e.g. dummies encoding levels of a categorical variable) must be handled differently in the selective inference framework.

Usage

groupfs(x, y, index, maxsteps, sigma = NULL, k = 2, intercept = TRUE,
  center = TRUE, normalize = TRUE, aicstop = 0, verbose = FALSE)

Arguments

x

Matrix of predictors (n by p).

y

Vector of outcomes (length n).

index

Group membership indicator of length p: index[j] is the group of column j of x. The labels must satisfy sort(unique(index)) = 1:G, where G is the number of distinct groups.
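The requirement on index can be checked before calling groupfs. The following is a minimal base-R sketch, assuming the groups start out as arbitrary labels; the labels vector is hypothetical and not part of the package.

```r
# Hypothetical raw group labels, one per column of x (here p = 6).
labels <- c("age", "age", "income", "region", "region", "region")

# Recode the labels to consecutive integers 1..G.
index <- as.integer(factor(labels))
G <- length(unique(index))

# The documented condition: the distinct values of index are exactly 1, ..., G.
stopifnot(all(sort(unique(index)) == seq_len(G)))
index
```

Recoding through factor() is one convenient way to obtain consecutive labels; any mapping that yields 1:G would serve.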

maxsteps

Maximum number of steps for forward stepwise.

sigma

Estimate of the error standard deviation for use in the AIC criterion. This determines the relative scale between the RSS and the degrees-of-freedom penalty. Default is NULL, corresponding to unknown sigma. When sigma is NULL, groupfsInf performs truncated F inference instead of truncated \(\chi\) inference. See extractAIC for details on the AIC criterion.

k

Multiplier of the model size penalty. The default is k = 2 for AIC. Use k = log(n) for BIC, or k = 2 * log(p) for RIC (best for high dimensions, when \(p > n\)). If \(G < p\) then RIC may be too restrictive, and it would be better to choose k with log(G) < k < 2 * log(p).
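As an illustration of the penalty choices, the sketch below computes each value of k for the dimensions used in the example on this page (n = 20, p = 40, G = 20). The midpoint used for the intermediate penalty is an arbitrary choice made for this sketch, not a recommendation from the package.

```r
n <- 20   # number of observations
p <- 40   # number of predictor columns
G <- 20   # number of groups

k_aic <- 2            # AIC (the default)
k_bic <- log(n)       # BIC
k_ric <- 2 * log(p)   # RIC

# When G < p, one value inside the suggested interval (log(G), 2 * log(p)).
# The midpoint is an assumption of this sketch.
k_mid <- (log(G) + 2 * log(p)) / 2
stopifnot(log(G) < k_mid, k_mid < 2 * log(p))

c(aic = k_aic, bic = k_bic, ric = k_ric, mid = k_mid)
```

Any of these values can be passed as the k argument, e.g. groupfs(x, y, index, maxsteps = 5, k = k_bic).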

intercept

Should an intercept be included in the model? Default is TRUE. The intercept does not count as a step.

center

Should the columns of the design matrix be centered? Default is TRUE.

normalize

Should the design matrix be normalized? Default is TRUE.

aicstop

Early stopping if the AIC increases. Default is 0, corresponding to no early stopping. A positive integer specifies how many consecutive increases of the AIC trigger a stop, e.g. with aicstop = 2 the algorithm will stop if the AIC criterion increases for 2 steps in a row. The default behavior of step corresponds to aicstop = 1.
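The stopping rule can be illustrated on a made-up sequence of AIC values. The helper stop_step below is hypothetical and written only for this sketch; it is not the package's internal implementation, and it assumes one AIC value per step with each step compared to the previous one.

```r
# Hypothetical helper: the step at which a rule like aicstop would halt,
# given one AIC value per step.
stop_step <- function(aic, aicstop) {
  if (aicstop <= 0) return(length(aic))  # 0 means no early stopping
  run <- 0                               # current run of consecutive increases
  for (s in seq_along(aic)[-1]) {
    if (aic[s] > aic[s - 1]) run <- run + 1 else run <- 0
    if (run >= aicstop) return(s)
  }
  length(aic)                            # never triggered: use all steps
}

aic_path <- c(100, 95, 96, 94, 95, 97, 99)  # made-up values
stop_step(aic_path, aicstop = 1)  # 3: first increase is at step 3
stop_step(aic_path, aicstop = 2)  # 6: steps 5 and 6 both increase
stop_step(aic_path, aicstop = 0)  # 7: no early stopping
```

With aicstop = 1 the path halts at the first increase, which matches the description of the default behavior of step given above.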

verbose

Print out progress along the way? Default is FALSE.

Value

An object of class "groupfs" containing information about the sequence of models in the forward stepwise algorithm. Call the function groupfsInf on this object to compute selective p-values.

See Also

groupfsInf, factorDesign.

Examples

# NOT RUN {
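library(selectiveInference)
set.seed(1)  # added here for reproducibility; any seed works
# 20 observations, 40 predictors in 20 groups of 2 adjacent columns
x = matrix(rnorm(20*40), nrow=20)
index = sort(rep(1:20, 2))
# Only columns 1 and 4 (groups 1 and 2) carry signal
y = rnorm(20) + 2 * x[,1] - x[,4]
# Forward stepwise over groups, then selective p-values
fit = groupfs(x, y, index, maxsteps = 5)
out = groupfsInf(fit)
out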
# }
