This function runs a permutation test to evaluate the effect of a subset of covariates on the covariance matrix estimates. Returns an estimated p-value.
significance.test(
formula,
data,
params.rfsrc = list(ntree = 1000, mtry = ceiling(px/3), nsplit = max(round(n/50),
10)),
nodesize.set = round(0.5^(1:100) * round(0.632 * n))[round(0.5^(1:100) * round(0.632
* n)) > py],
nperm = 500,
test.vars = NULL
)
An object of class (covregrf, significancetest), which is a list
with the following components:
Estimated *p*-value, see below for details.
Best nodesize value selected with the proposed
tuning method using all covariates including the test.vars.
Best nodesize value selected with the
proposed tuning method using only the set of controlling covariates. If
test.vars is NULL, returns NULL.
Covariates whose effect on the covariance matrix estimates is evaluated.
Controlling set of covariates.
OOB predicted covariance matrices for training
observations using all covariates including the test.vars.
Predicted covariance matrices for the permutations
using all covariates including the test.vars. A list of
predictions for each permutation.
OOB predicted covariance matrices for training
observations using only the set of controlling covariates. If
test.vars is NULL, returns NULL.
Predicted covariance matrices for the
permutations using only the set of controlling covariates. If
test.vars is NULL, returns NULL.
formula: Object of class formula or character describing
the model to fit. Interaction terms are not supported.
data: The multivariate data set, which has \(n\) observations and \(px+py\) variables, where \(px\) and \(py\) are the number of covariates (\(X\)) and response variables (\(Y\)), respectively. Should be a data.frame.
params.rfsrc: List of parameters that should be passed to
randomForestSRC. In the default parameter set, ntree = 1000,
mtry = \(px/3\) (rounded up), nsplit =
\(max(round(n/50), 10)\). See randomForestSRC for possible
parameters.
nodesize.set: The set of nodesize levels for tuning. The default set
consists of the sub-sample size (\(.632n\)) multiplied by successive
powers of one half and rounded, keeping only the values greater
than the number of response variables (\(py\)).
nperm: Number of permutations.
test.vars: Subset of covariates whose effect on the covariance matrix
estimates will be evaluated. A character vector defining the names of the
covariates. The default is NULL, which tests for the global effect
of the whole set of covariates.
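The default values of params.rfsrc and nodesize.set depend on \(n\), \(px\) and \(py\). The following base R sketch evaluates the default expressions from the usage above for hypothetical dimensions (n = 200, px = 5, py = 3 are illustrative assumptions, not values from the package).

```r
# Hypothetical dimensions, chosen only for illustration
n  <- 200  # number of observations
px <- 5    # number of covariates
py <- 3    # number of response variables

# Default params.rfsrc, copied from the usage
params.rfsrc <- list(ntree  = 1000,
                     mtry   = ceiling(px / 3),
                     nsplit = max(round(n / 50), 10))

# Default nodesize.set: the sub-sample size (0.632 * n) halved repeatedly,
# rounded, and restricted to values greater than py
subsample    <- round(0.632 * n)
candidates   <- round(0.5^(1:100) * subsample)
nodesize.set <- candidates[candidates > py]

print(params.rfsrc)
print(nodesize.set)
```

The candidate set shrinks geometrically, so only a handful of nodesize levels are tuned even for large \(n\).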
We perform a hypothesis test to evaluate the effect of a subset of covariates on the covariance matrix estimates, while controlling for the rest of the covariates. Define the conditional covariance matrix of \(Y\) given all \(X\) variables as \(\Sigma_{X}\), and the conditional covariance matrix of \(Y\) given only the set of controlling \(X\) variables as \(\Sigma_{X}^{c}\). If a subset of covariates has an effect on the covariance matrix estimates obtained with the proposed method, then \(\Sigma_{X}\) should be significantly different from \(\Sigma_{X}^{c}\). We conduct a permutation test for the null hypothesis
$$H_0 : \Sigma_{X} = \Sigma_{X}^{c}$$
and estimate a \(p\)-value from the permutations. If the \(p\)-value is less than the pre-specified significance level \(\alpha\), we reject the null hypothesis.
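How a permutation test turns permuted statistics into a \(p\)-value can be sketched on a toy problem. The example below is a generic illustration in base R and is not the statistic or the permutation scheme used by significance.test: it assumes a single binary grouping variable, and it measures the difference between the two group covariance matrices with a Frobenius norm (the helper cov.distance is hypothetical).

```r
set.seed(1)

# Toy data: two responses whose scale depends on a binary group (assumption)
n     <- 100
group <- rep(c(0, 1), each = n / 2)
Y     <- cbind(rnorm(n), rnorm(n)) * ifelse(group == 1, 3, 1)

# Hypothetical statistic: Frobenius norm of the difference between
# the sample covariance matrices of the two groups
cov.distance <- function(Y, g) {
  d <- cov(Y[g == 1, , drop = FALSE]) - cov(Y[g == 0, , drop = FALSE])
  sqrt(sum(d^2))
}

observed <- cov.distance(Y, group)

# Permuting the group labels breaks any link between group and Y
nperm    <- 200
permuted <- replicate(nperm, cov.distance(Y, sample(group)))

# Estimated p-value: share of permuted statistics at least as large as the
# observed one, with the usual +1 correction
p.value <- (1 + sum(permuted >= observed)) / (1 + nperm)

alpha  <- 0.05
reject <- p.value < alpha
print(p.value)
```

A small p.value indicates that the observed difference is rarely matched once the link to the covariate is broken, which is the decision rule described above.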
Testing the global effect of the covariates on the conditional covariance
estimates is a particular case of the proposed significance test. Define
the unconditional covariance matrix estimate of \(Y\) as
\(\Sigma_{root}\) which is computed as the sample covariance matrix of
\(Y\), and the conditional covariance matrix of \(Y\) given \(X\) as
\(\Sigma_{X}\), which is obtained with covregrf(). If there is a
global effect of \(X\) on the covariance matrix estimates, then
\(\Sigma_{X}\) should be significantly different from \(\Sigma_{root}\).
The null hypothesis for this particular case is
$$H_0 : \Sigma_{X} = \Sigma_{root}$$
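A possible call pattern for both the global and the partial test is sketched below on simulated data. Several points are assumptions rather than facts taken from this page: that the function is provided by a package named CovRegRF, that multiple responses are written on the left-hand side of the formula joined by +, and the variable names x1, x2, y1, y2. The call is skipped when the package is not installed, and nperm is kept small only to shorten the run.

```r
set.seed(42)

# Simulated data (assumption): the variance of y1 depends on x1
n   <- 100
x1  <- rnorm(n)
x2  <- rnorm(n)
y1  <- rnorm(n, sd = exp(0.5 * x1))
y2  <- rnorm(n)
dat <- data.frame(x1, x2, y1, y2)

# Sigma_root: the unconditional sample covariance matrix of Y
sigma.root <- cov(dat[, c("y1", "y2")])
print(sigma.root)

if (requireNamespace("CovRegRF", quietly = TRUE)) {
  # Global test: test.vars = NULL tests the whole set of covariates.
  # Assumption: responses are listed on the left-hand side as y1 + y2.
  global <- CovRegRF::significance.test(y1 + y2 ~ x1 + x2, data = dat,
                                        nperm = 10, test.vars = NULL)

  # Partial test: effect of x1 while controlling for x2
  partial <- CovRegRF::significance.test(y1 + y2 ~ x1 + x2, data = dat,
                                         nperm = 10, test.vars = "x1")

  # Inspect the top-level components of the returned lists
  str(global, max.level = 1)
  str(partial, max.level = 1)
}
```

With so few permutations the estimated \(p\)-value is coarse; the default nperm = 500 gives a finer resolution at a higher computational cost.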
covregrf
predict.covregrf
print.covregrf