hdi (version 0.1-6)

hdi: Function to perform inference in high-dimensional (generalized) linear models

Description

Perform inference in high-dimensional (generalized) linear models using various approaches.

Usage

hdi(x, y, method = "multi.split", B = NULL, fraction = 0.5,
    model.selector = NULL, EV = NULL, threshold = 0.75,
    gamma = seq(0.05, 0.99, by = 0.01),
    classical.fit = NULL,
    args.model.selector = NULL, args.classical.fit = NULL,
    verbose = FALSE, ...)

Arguments

x

Design matrix (without intercept).

y

Response vector.

method

Multi-splitting ("multi.split") or stability-selection ("stability").

B

Number of sample-splits (for "multi.split") or sub-sample iterations (for "stability"). Default is 50 ("multi.split") or 100 ("stability"). Ignored otherwise.

fraction

Fraction of data used at each of the B iterations.

model.selector

Function to perform model selection. Default is lasso.cv ("multi.split") and lasso.firstq ("stability"). Function must have at least two arguments: x (the design matrix) and y (the response vector). Return value is the index vector of selected columns. See lasso.cv and lasso.firstq for examples. Additional arguments can be passed through args.model.selector.

EV

(only for "stability"). Bound(s) for the expected number of false positives. Can be a vector.

threshold

(only for "stability"). Bound on selection frequency.

gamma

(only for "multi.split"). Vector of gamma-values.

classical.fit

(only for "multi.split"). Function to calculate (classical) p-values. Default is lm.pval. Function must have at least two arguments: x (the design matrix) and y (the response vector). Return value is the vector of p-values. See lm.pval for an example. Additional arguments can be passed through args.classical.fit.

args.model.selector

Named list of further arguments for function model.selector.

args.classical.fit

Named list of further arguments for function classical.fit.

verbose

Logical. Should progress information be printed during computation?

...

Other arguments to be passed to the underlying functions.
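
The interfaces required of model.selector and classical.fit can be illustrated with a minimal base-R sketch. The names screen.topq and ols.pval are hypothetical and not part of hdi; they only mirror the contract described above (arguments x and y, returning an index vector of selected columns and a vector of p-values, respectively).

```r
## Hypothetical selector (not part of hdi): keep the q columns with the
## largest absolute marginal correlation with y.
screen.topq <- function(x, y, q = 10, ...) {
  q <- min(q, ncol(x))
  score <- abs(drop(cor(x, y)))
  sort(order(score, decreasing = TRUE)[seq_len(q)])
}

## Hypothetical p-value function (not part of hdi): least-squares t-test
## p-values, one per column of x, with the intercept dropped.
## Assumes x has full column rank and fewer columns than rows.
ols.pval <- function(x, y, ...) {
  coefs <- summary(lm(y ~ x))$coefficients
  unname(coefs[-1, "Pr(>|t|)"])
}

set.seed(1)
x <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)
y <- 2 * x[, 1] + 2.5 * x[, 2] + rnorm(100)

sel <- screen.topq(x, y, q = 5)             # index vector of selected columns
pv  <- ols.pval(x[, sel, drop = FALSE], y)  # one p-value per selected column
```

Functions of this shape could then be supplied as model.selector = screen.topq with args.model.selector = list(q = 5), and as classical.fit = ols.pval.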

Value

pval

(only for "multi.split"). Vector of p-values.

gamma.min

(only for "multi.split"). Value of gamma where the minimal p-value was attained.

select

(only for "stability"). List with selected predictors for the supplied values of EV.

EV

(only for "stability"). Vector of corresponding values of EV.

thresholds

(only for "stability"). Used thresholds.

freq

(only for "stability"). Vector of selection frequencies.
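
For "multi.split", the reported p-values and gamma.min come from aggregating the per-split p-values over the gamma grid. The following is a minimal base-R sketch of the quantile aggregation described in Meinshausen, Meier and Bühlmann (2009). The helper aggregate_pvals is hypothetical, and the package's internal implementation may differ in details such as the quantile type.

```r
## Hypothetical helper (not part of hdi): aggregate the B per-split,
## already multiplicity-adjusted p-values of ONE predictor.
aggregate_pvals <- function(p.splits, gamma = seq(0.05, 0.99, by = 0.01)) {
  ## Q(gamma) = min(1, gamma-quantile of p / gamma)
  Q <- vapply(gamma, function(g) {
    min(1, unname(quantile(p.splits / g, probs = g)))
  }, numeric(1))
  ## Searching over the gamma grid costs a factor (1 - log(min(gamma)))
  pval <- min(1, (1 - log(min(gamma))) * min(Q))
  list(pval = pval, gamma.min = gamma[which.min(Q)])
}

res.null   <- aggregate_pvals(rep(1, 50))     # p-value 1 in every split
res.signal <- aggregate_pvals(rep(1e-4, 50))  # tiny p-value in every split
```

A predictor that is never selected has p-value 1 in every split and therefore keeps an aggregated p-value of 1, whereas a predictor that is significant in most splits retains a small aggregated p-value.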

References

Meinshausen, N., Meier, L. and Bühlmann, P. (2009) P-values for high-dimensional regression. Journal of the American Statistical Association 104, 1671-1681.

Meinshausen, N. and Bühlmann, P. (2010) Stability selection (with discussion). Journal of the Royal Statistical Society: Series B 72, 417-473.

See Also

stability, multi.split

Examples

# NOT RUN {
x <- matrix(rnorm(100*1000), nrow = 100, ncol = 1000)
y <- x[,1] * 2 + x[,2] * 2.5 + rnorm(100)

## Multi-splitting with lasso.firstq as model selector function
fit.multi <- hdi(x, y, method = "multi.split",
                 model.selector = lasso.firstq,
                 args.model.selector = list(q = 10))
fit.multi
fit.multi$pval.corr[1:10] ## the first 10 p-values

## Stability selection
fit.stab <- hdi(x, y, method = "stability", EV = 2)
fit.stab
fit.stab$freq[1:10] ## frequency of the first 10 predictors
# }
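
For stability selection, EV, threshold and the number q of variables selected per sub-sample are linked by the bound E(V) <= q^2 / ((2 * threshold - 1) * p) from Meinshausen and Bühlmann (2010), where p is the number of predictors. The sketch below only evaluates that bound. The helper names ev.bound and q.for.ev and the numbers used are illustrative assumptions, and how hdi chooses or rounds q internally is not specified here.

```r
## Hypothetical helpers (not part of hdi).

## Upper bound on the expected number of false positives for a given q.
ev.bound <- function(q, p, threshold = 0.75) {
  stopifnot(threshold > 0.5, threshold <= 1)
  q^2 / ((2 * threshold - 1) * p)
}

## Largest (real-valued) q for which the bound does not exceed EV.
q.for.ev <- function(EV, p, threshold = 0.75) {
  stopifnot(threshold > 0.5, threshold <= 1)
  sqrt(EV * p * (2 * threshold - 1))
}

q.max <- q.for.ev(EV = 2, p = 1000)   # illustrative values
ev.bound(floor(q.max), p = 1000)      # stays at or below EV = 2
```

Raising threshold or lowering q tightens the bound, which is why a small EV forces the selector to pick only a few variables per sub-sample.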
