hiertest (version 1.1)

hiertest: Convex Hierarchical Testing Method

Description

This is the main function of the package. It implements the Convex Hierarchical Testing (CHT) procedure, which produces a set of test statistics for both main effects and interactions with the property that an interaction's statistic never exceeds the larger of its two main effect statistics. This is accomplished by formulating a convex optimization problem that enforces a hierarchical sparsity relationship between the main effects and interactions. As a result, interactions with large main effects receive a "boost" relative to interactions whose main effects are small.

Usage

hiertest(x, y, type = c("Fisher", "simple"))

Arguments

x
n by p design matrix
y
binary (0 or 1) vector of length n indicating class
type
determines whether the Fisher transform is applied to the interaction contrasts; see Details for an explanation. The default, "Fisher", is the recommended choice.
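
As a rough illustration of the argument descriptions above, the following sketch checks that x and y have the expected shapes before they are passed to hiertest. The helper check_inputs is hypothetical and is not part of the package; the use of match.arg is only an assumption suggested by the form of the default in the Usage line.

```r
# Hypothetical helper (not part of the package): verifies that x is a numeric
# n by p matrix and y is a 0/1 vector of length n, as described above.
check_inputs <- function(x, y) {
  stopifnot(
    is.matrix(x), is.numeric(x),
    length(y) == nrow(x),
    all(y %in% c(0, 1))
  )
  invisible(TRUE)
}

set.seed(1)
x <- matrix(rnorm(20 * 3), 20, 3)
y <- rep(0:1, each = 10)
check_inputs(x, y)

# Assumption: a default written as c("Fisher", "simple") is typically resolved
# with match.arg, which selects the first element when type is not supplied.
type <- match.arg("Fisher", c("Fisher", "simple"))
```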

Value

A hiertest object, which consists of an ordered list of the main effects and interactions and a vector indicating which of these are interactions.

Details

The Convex Hierarchical Testing test statistics are the knots of the CHT optimization problem. That is, the statistic for a given main effect or interaction is the value of lambda at which the corresponding parameter becomes nonzero in the regularization path. Theorem 1 of the CHT paper gives the closed form expression used to compute these knots (recall that for the interaction test statistics, one takes the maximum of the two corresponding knots).

Section 2.1 of the CHT paper defines the raw main effect and interaction contrasts, referred to there as "w" and "z". The main effect contrast "w" is the standard two-sample t-statistic. The interaction contrast "z" is the normalized difference between the two classes of the Fisher-transformed sample correlations. With type="simple", "z" is instead a two-sample statistic computed on the products of features. We recommend type="Fisher" over "simple".
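
The raw contrasts described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper names are hypothetical, the t-statistic is taken to be the pooled-variance version, and the Fisher contrast is normalized by the usual large-sample standard error sqrt(1/(n1 - 3) + 1/(n0 - 3)). The paper's exact normalization may differ.

```r
# Hypothetical helpers; names and normalization choices are assumptions.

# Main effect contrast "w": two-sample t-statistic for each column of x
# (pooled variance assumed here).
main_contrast <- function(x, y) {
  apply(x, 2, function(col) {
    unname(t.test(col[y == 1], col[y == 0], var.equal = TRUE)$statistic)
  })
}

# Interaction contrast "z" in the spirit of type = "Fisher": difference of the
# Fisher-transformed (atanh) within-class correlations of features j and k,
# divided by an assumed large-sample standard error.
fisher_contrast <- function(x, y, j, k) {
  x1 <- x[y == 1, , drop = FALSE]
  x0 <- x[y == 0, , drop = FALSE]
  r1 <- cor(x1[, j], x1[, k])
  r0 <- cor(x0[, j], x0[, k])
  se <- sqrt(1 / (nrow(x1) - 3) + 1 / (nrow(x0) - 3))
  (atanh(r1) - atanh(r0)) / se
}

# Interaction contrast "z" in the spirit of type = "simple": two-sample
# t-statistic on the product of features j and k.
simple_contrast <- function(x, y, j, k) {
  prod_jk <- x[, j] * x[, k]
  unname(t.test(prod_jk[y == 1], prod_jk[y == 0], var.equal = TRUE)$statistic)
}

set.seed(1)
n <- 200
p <- 4
y <- rep(0:1, each = n / 2)
x <- matrix(rnorm(n * p), n, p)

w <- main_contrast(x, y)            # one value per feature
z12 <- fisher_contrast(x, y, 1, 2)  # contrast for the pair (1, 2)
s12 <- simple_contrast(x, y, 1, 2)
```

Swapping the class labels flips the sign of each contrast, which is a quick sanity check on any implementation of "w" and "z".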

References

Bien, Simon, and Tibshirani (2015) Convex Hierarchical Testing of Interactions. Annals of Applied Statistics. Vol. 9, No. 1, 27-42.

See Also

estimate.fdr

Examples

# generate some data according to the backward model:
set.seed(1)
n <- 200
p <- 50
y <- rep(0:1, each=n/2)
x <- matrix(rnorm(n*p), n, p)
colnames(x) <- c(letters,LETTERS)[1:p]
# make some interactions between several pairs of variables:
R <- matrix(0.3, 5, 5)
diag(R) <- 1
x[y==1, 1:5] <- x[y==1, 1:5] %*% R
# and a main effect for variables 1 through 5:
x[y==1, 1:5] <- x[y==1, 1:5] + 0.5
testobj <- hiertest(x=x, y=y, type="Fisher")
# look at test statistics
print(testobj)
plot(testobj)
## Not run: 
# lamlist <- seq(5, 2, length=100)
# estfdr <- estimate.fdr(x, y, lamlist, type="Fisher", B=200)
# plot(estfdr)
# print(estfdr)
# # the cutoff lamlist[70] is estimated to have roughly 10% FDR:
# estfdr$fdr[70]
# # this allows us to reject this many interactions:
# nrejected <- estfdr$ncalled[70]
# # These are the interactions rejected:
# interactions.above(testobj, lamlist[70])
# ## End(Not run)
