logicFS (version 1.42.0)

vim.permSNP: Permutation Based Importance Measures

Description

Computes the importances of input variables, SNPs, or sets of SNPs, respectively, based on permutations of the response. Currently only available for the classification and the logistic regression approach of logic regression.

Usage

vim.permInput(object, n.perm = NULL, standardize = TRUE, rebuild = FALSE, prob.case = 0.5, useAll = FALSE, version = 1, adjust = "bonferroni", addMatPerm = FALSE, rand = NA)
vim.permSNP(object, n.perm = NULL, standardize = TRUE, rebuild = FALSE, prob.case = 0.5, useAll = FALSE, version = 1, adjust = "bonferroni", addMatPerm = FALSE, rand = NA)
vim.permSet(object, set = NULL, n.perm = NULL, standardize = TRUE, rebuild = FALSE, prob.case = 0.5, useAll = FALSE, version = 1, adjust = "bonferroni", addMatPerm = FALSE, rand = NA)

Arguments

object
an object of class logicBagg, i.e. the output of logic.bagging.
set
either a list or a character or numeric vector. If NULL (default), it is assumed that data, i.e. the data set used in the application of logic.bagging, has been generated using make.snp.dummy or a similar function for coding variables by binary variables, i.e. a function that splits a variable, say SNPx, into the dummy variables SNPx.1, SNPx.2, ... (where the "." can also be any other sign, e.g., an underscore). If set is a character or a numeric vector, then its length must be equal to the number of variables used in object, i.e. the number of columns of data in the logicBagg object, and it must specify the set to which each variable belongs either by an integer between 1 and the number of sets, or by a set name. If a variable should not be included in any of the sets, set the corresponding entry of set to NA. With this specification it is not possible to assign a variable to more than one set; for that case, specify set as a list. If set is a list, then each element of this list represents a set of variables. Therefore, each element must be either a character or a numeric vector specifying either the names of the variables that belong to the respective set or the columns of data that contain these variables. If names(set) is NULL, generic names are used for the sets. Otherwise, names(set) are used.
n.perm
number of permutations used in the computation of the importances. By default (i.e. if n.perm = NULL), 100 permutations are used if rebuild = TRUE and the regression approach of logic regression has been used in logic.bagging (by setting ntrees to an integer larger than 1, or glm.if.1tree = TRUE). Otherwise, 1000 permutations are employed. Note that in practice many more permutations should be used.
standardize
should the standardized importance measure be used?
rebuild
logical indicating whether the logic regression models should be rebuilt (i.e. the parameters $\beta$ of the generalized linear models should be recomputed) after removing a variable or a set of variables from the logic trees and for each permutation of the response. Note that setting rebuild = TRUE increases the computation time substantially.
prob.case
a numeric value between 0 and 1. If the logistic regression approach of logic regression has been used in logic.bagging, then an observation will be classified as a case (or more exactly, as 1), if the class probability of this observation is larger than prob.case. Otherwise, prob.case is ignored.
useAll
logical indicating whether all $m *$ n.perm permuted values should be used in the computation of the permutation based p-values, where $m$ is the number of variables or sets of variables, respectively. If FALSE, the n.perm permuted values corresponding to the respective variable (or set of variables) are employed in the determination of the p-value of this variable (or set of variables).
version
either 1 or 2. If 1, then the importance measure is computed by 1 - padj, where padj is the adjusted p-value. If 2, the importance measure is determined by -log10(padj), where a raw p-value equal to 0 is set to 1 / (10 * n.perm) to avoid infinite importances.
adjust
character vector naming the method with which the raw permutation based p-values are adjusted for multiplicity. If "qvalue", the function qvalue.cal from the package siggenes is used to compute q-values. Otherwise, p.adjust is used to adjust for multiple comparisons. See p.adjust for all other possible specifications of adjust. If "none", the raw p-values will be used.
addMatPerm
should the (n.perm + 1) x $m$ matrix containing the original values (first column) and the permuted values (the remaining columns) of the importance measure for the $m$ variables or $m$ sets of variables be added to the output?
rand
an integer for setting the random number generator to a reproducible state.
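
The two explicit forms of the set argument described above can be sketched in base R as follows. The variable names, set names, and the object names vars, set.vec, and set.list are hypothetical and chosen only for illustration; the last two steps show the grouping implied by each form and are not the package's internal code.

```r
# Hypothetical column names of a dummy-coded SNP matrix (illustration only).
vars <- c("SNP1.1", "SNP1.2", "SNP2.1", "SNP2.2", "SNP3.1", "SNP3.2")

# Vector form: one entry per variable; NA keeps a variable out of all sets.
set.vec <- c("GeneA", "GeneA", "GeneB", "GeneB", NA, NA)
stopifnot(length(set.vec) == length(vars))

# List form: one element per set; a variable may belong to several sets.
set.list <- list(
  GeneA   = c("SNP1.1", "SNP1.2"),
  GeneB   = c("SNP2.1", "SNP2.2"),
  Pathway = c("SNP1.1", "SNP1.2", "SNP2.1", "SNP2.2")
)

# Grouping implied by the vector form (split() drops the NA entries).
vec.as.list <- split(vars, set.vec)

# Grouping implied by the SNPx.1, SNPx.2, ... naming convention that is
# assumed when set = NULL (a guess at the idea, not the package's code).
by.name <- split(vars, sub("[._][0-9]+$", "", vars))
```

The overlapping set Pathway can only be expressed in the list form, which is why the vector form is limited to non-overlapping sets.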

Value

An object of class logicFS containing
vim
the values of the importance measure for the input variables, the SNPs, or the sets of SNPs, respectively,
prop
NULL,
primes
the names of the inputs, SNPs, or sets of variables, respectively,
type
the type of model (1: classification, 3: logistic regression),
param
NULL,
mat.imp
NULL,
measure
the name of the used importance measure,
threshold
0.95, i.e. the suggested threshold for calling an input, SNP or set of SNPs, respectively, important (this is just used as default value when plotting the importances, see argument thres of plot.logicFS),
mu
NULL,
useN
TRUE,
name
either "Variable", "SNP", or "Set",
mat.perm
if addMatPerm = FALSE, NULL; otherwise, a matrix containing the original and the permuted values of the respective importance measure.
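
How the arguments useAll, adjust, and version turn permutation results into the values stored in vim can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper importance(), the simulated statistics, and the rule "share of permuted values at least as large as the observed one" for the raw p-value are assumptions made for this example, and adjust = "qvalue" is not covered because it needs the siggenes package.

```r
set.seed(1)          # arbitrary seed for this illustration
m <- 3               # number of variables (or sets)
n.perm <- 1000

# Placeholder statistics: one observed value per variable and an
# m x n.perm matrix of values under permutation (simulated, not package output).
obs  <- c(4.0, 1.0, 0.2)
perm <- matrix(rnorm(m * n.perm), nrow = m)

# Raw p-values using only the variable's own permutations (cf. useAll = FALSE).
p.own <- rowMeans(perm >= obs)
# Raw p-values using all m * n.perm permuted values (cf. useAll = TRUE).
p.all <- sapply(obs, function(x) mean(perm >= x))

# Hypothetical helper: adjusted p-values mapped to an importance.
importance <- function(p.raw, n.perm, version = 1, adjust = "bonferroni") {
  if (version == 2) {
    # Replace raw p-values of 0 so that -log10 stays finite.
    p.raw[p.raw == 0] <- 1 / (10 * n.perm)
  }
  p.adj <- p.adjust(p.raw, method = adjust)
  if (version == 1) 1 - p.adj else -log10(p.adj)
}

imp.v1 <- importance(c(0, 0.01, 0.5), n.perm, version = 1)
imp.v2 <- importance(c(0, 0.01, 0.5), n.perm, version = 2)
```

With version = 1, an importance of at least 0.95 is the same as an adjusted p-value of at most 0.05, which matches the value reported in threshold.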

References

Schwender, H., Ruczinski, I., Ickstadt, K. (2011). Testing SNPs and Sets of SNPs for Importance in Association Studies. Biostatistics, 12, 18-32.

See Also

logic.bagging, vim.input, vim.set, vim.signperm