kernel.pca.permute: Assess variable importance

Description

Assess the importance of variables on a given principal component (PC) by computing the Crone-Crosby distance between the original sample positions and the sample positions obtained after a random permutation of the variables.

Usage

kernel.pca.permute(kpca.result, ncomp = 1, ..., directory = NULL)

Value

kernel.pca.permute returns a copy of the input kpca.result with values added to three entries: cc.distances, cc.variables and cc.blocks.

Arguments

kpca.result: a kernel.pca object returned by the kernel.pca function.
ncomp: integer. Number of KPCA components used to compute the importance. Default: 1.
...: character vectors, passed as named arguments. Each argument name must be the name of the kernel whose variables are to be permuted. Each vector must have a length equal to the number of variables in the corresponding input dataset. One kernel is computed for each unique value found in the vector. A KPCA is performed on each resulting kernel or meta-kernel, a Crone-Crosby distance is computed for each of them, and the results can be displayed with plotVar.kernel.pca.
directory: character. Path to a directory used to store and read back temporarily computed kernels, which limits the computational burden. Default: NULL.
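The role of the vectors passed through ... can be illustrated with a small base R sketch. It is not the package's implementation: the toy matrix X, the labels in groups and the helper permute_group are hypothetical, and shuffling the samples jointly for all variables that share a label is an assumption made for illustration only.

```r
set.seed(1)

# Toy dataset: 5 samples, 6 variables (all names are made up)
X <- matrix(rnorm(30), nrow = 5,
            dimnames = list(paste0("s", 1:5), paste0("v", 1:6)))

# One label per variable; variables sharing a label form one group
groups <- c("temperature", "temperature", "salinity",
            "nutrients", "nutrients", "nutrients")
stopifnot(length(groups) == ncol(X))

# Shuffle the samples for the columns of one group, leaving the others unchanged
permute_group <- function(X, groups, label) {
  cols <- which(groups == label)
  X[, cols] <- X[sample(nrow(X)), cols, drop = FALSE]
  X
}

# One permuted dataset per unique label
permuted <- lapply(unique(groups), function(g) permute_group(X, groups, g))
names(permuted) <- unique(groups)
```

With three unique labels, three permuted datasets are produced, which mirrors the statement that one kernel is computed per unique value of the vector. Passing colnames of the dataset instead gives one group per variable.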

Author

Jerome Mariette <jerome.mariette@inrae.fr>, Nathalie Vialaneix <nathalie.vialaneix@inrae.fr>

Details

plotVar.kernel.pca produces a barplot for each block. The variables for which the importance has been computed with kernel.pca.permute are displayed. The representation is limited to the ndisplay most important variables.
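For readers unfamiliar with the Crone-Crosby distance, the base R sketch below may help. It assumes the usual formulation of the distance between two subspaces as the Frobenius norm of the difference between their orthogonal projectors, divided by sqrt(2). The helpers projector and cc_distance and the random matrices are hypothetical, and this is not necessarily how kernel.pca.permute computes its values.

```r
# Orthogonal projector onto the column space of a full-rank matrix (via QR)
projector <- function(A) {
  Q <- qr.Q(qr(A))
  Q %*% t(Q)
}

# Distance between the subspaces spanned by the columns of A and B:
# Frobenius norm of the difference of the two projectors, divided by sqrt(2)
cc_distance <- function(A, B) {
  norm(projector(A) - projector(B), type = "F") / sqrt(2)
}

set.seed(42)
original <- matrix(rnorm(20), nrow = 10, ncol = 2)  # 10 samples, 2 components
permuted <- matrix(rnorm(20), nrow = 10, ncol = 2)  # stand-in for positions after a permutation

d_same <- cc_distance(original, original)  # identical subspaces
d_diff <- cc_distance(original, permuted)  # different subspaces
```

Under this formulation the distance is zero when both sets of positions span the same subspace and grows as the subspaces diverge, so a large value after permuting a variable suggests that the variable matters for the components considered.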

References

Mariette J. and Villa-Vialaneix N. (2018). Unsupervised multiple kernel learning for heterogeneous data integration. Bioinformatics, 34(6), 1009-1015. DOI: 10.1093/bioinformatics/btx682

Crone L. and Crosby D. (1995). Statistical applications of a metric on subspaces to satellite meteorology. Technometrics, 37(3), 324-328.

Examples

library(mixKernel)
data(TARAoceans)

# compute one kernel for the phychem dataset
phychem.kernel <- compute.kernel(TARAoceans$phychem, kernel.func = "linear")
# perform a KPCA
kernel.pca.result <- kernel.pca(phychem.kernel)

# compute importance for all variables in this kernel
kernel.pca.result <- kernel.pca.permute(kernel.pca.result, 
                                        phychem = colnames(TARAoceans$phychem))
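As a possible follow-up, the sketch below inspects the entries listed in the Value section and draws the barplot described in Details. It assumes that mixKernel is installed (the whole block is skipped otherwise) and that plotVar.kernel.pca accepts an ndisplay argument, as the Details section suggests; the value 10 is arbitrary, and has_mixkernel is a helper variable introduced here.

```r
# Run only when the mixKernel package is available
has_mixkernel <- requireNamespace("mixKernel", quietly = TRUE)

if (has_mixkernel) {
  library(mixKernel)
  data(TARAoceans)

  # same steps as in the example above
  phychem.kernel <- compute.kernel(TARAoceans$phychem, kernel.func = "linear")
  kernel.pca.result <- kernel.pca(phychem.kernel)
  kernel.pca.result <- kernel.pca.permute(kernel.pca.result,
                                          phychem = colnames(TARAoceans$phychem))

  # entries filled in by kernel.pca.permute
  str(kernel.pca.result$cc.distances)
  str(kernel.pca.result$cc.variables)
  str(kernel.pca.result$cc.blocks)

  # barplot of the most important variables (ndisplay = 10 is an arbitrary choice)
  plotVar.kernel.pca(kernel.pca.result, ndisplay = 10)
}
```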