estimateSigma: Estimate the noise standard deviation in regression

Description

Estimates the standard deviation of the noise, for use in the selectiveInference package

Usage

estimateSigma(x, y, intercept=TRUE, standardize=TRUE)

Arguments

Matrix of predictors (n by p)

Vector of outcomes (length n)

intercept

Should glmnet be run with an intercept? Default is TRUE

standardize

Should glmnet be run with standardized predictors? Default is TRUE

Value

sigmahat

The estimate of sigma

The degrees of freedom of lasso fit used

Details

This function estimates the standard deviation of the noise, in a linear regresion setting. A lasso regression is fit, using cross-validation to estimate the tuning parameter lambda. With sample size n, yhat equal to the predicted values and df being the number of nonzero coefficients from the lasso fit, the estimate of sigma is sqrt(sum((y-yhat)^2) / (n-df-1)). Important: if you are using glmnet to compute the lasso estimate, be sure to use the settings for the "intercept" and "standardize" arguments in glmnet and estimateSigma. Same applies to fs or lar, where the argument for standardization is called "normalize".

References

Stephen Reid, Jerome Friedman, and Rob Tibshirani (2014). A study of error variance estimation in lasso regression. arXiv:1311.5274.

Examples

Run this code

# NOT RUN {
set.seed(33)
n = 50
p = 10
sigma = 1
x = matrix(rnorm(n*p),n,p)
beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)

# run forward stepwise
fsfit = fs(x,y)

# estimate sigma
sigmahat = estimateSigma(x,y)$sigmahat

# run sequential inference with estimated sigma
out = fsInf(fsfit,sigma=sigmahat)
out
# }

Run the code above in your browser using DataLab