fit.networkBasedSVM: Implementation of the network-based Support Vector Machine introduced by Yanni Zhu et al., 2009.

Description

mapping must be a data.frame with at least two columns. The column names have to be c('probesetID','graphID'). Where 'probesetID' is the probeset ID present in the expression matrix (i.e. colnames(x)) and 'graphID' is any ID that represents the nodes in the diffusionKernel (i.e. colnames(diffusionKernel) or rownames(diffusionKernel)). The purpose of the this mapping is that a gene or protein in the network might be represented by more than one probe set on the chip. Therefore, the algorithm must know which genes/protein in the network belongs to which probeset on the chip.

Usage

fit.networkBasedSVM(exps, y, DEBUG = FALSE, n.inner = 3, scale = c("center", "scale"), sd.cutoff = 1, lambdas = 10^(-2:4), adjacencyList)

Arguments

exps

a p x n matrix of expression measurements with p samples and n genes.

a factor of length p comprising the class labels.

DEBUG

should debugging information be plotted.

n.inner

number of fold for the inner cross-validation.

scale

a character vector defining if the data should be centered and/or scaled. Possible values are center and/or scale. Defaults to c('center', 'scale').

sd.cutoff

a cutoff on the standard deviation (sd) of genes. Only genes with sd > sd.cutoff stay in the analysis.

lambdas

a set of values for lambda regularization parameter of the L$_\infty$-Norm. Which, if properly chosen, eliminates factors that are completely irrelevant to the response, what in turn leads to a factor-wise (subnetwork-wise) feature selection. The 'best' lambda is found by an inner-cross validation.

adjacencyList

a adjacency list representing the network structure. The list can be generated from a adjacency matrix by using the function as.adjacencyList

Value

features: the selected features
lambda.performance: overview how different values of lambda performed in the inner cross validation
fit: the fitted network based SVM model

References

Zhu Y. et al. (2009). Network-based support vector machine for classification of microarray samples. BMC Bioinformatics

Examples

Run this code

## Not run: 
# library(Biobase)
# data(sample.ExpressionSet)
# x <- t(exprs(sample.ExpressionSet))
# y <- factor(pData(sample.ExpressionSet)$sex)
# # create the mapping
# library('hgu95av2.db')
# mapped.probes <- mappedkeys(hgu95av2REFSEQ)
# refseq <- as.list(hgu95av2REFSEQ[mapped.probes])
# times <- sapply(refseq, length)
# mapping <- data.frame(probesetID=rep(names(refseq), times=times), graphID=unlist(refseq), 
# row.names=NULL, stringsAsFactors=FALSE)
# mapping <- unique(mapping)
# library(pathClass)
# data(adjacency.matrix)
# matched <- matchMatrices(x=x, adjacency=adjacency.matrix, mapping=mapping)
# ad.list <- as.adjacencyList(matched$adjacency)
# res.nBSVM <- crossval(matched$x, y, theta.fit=fit.networkBasedSVM, folds=3, repeats=1, DEBUG=TRUE,
# parallel=FALSE, adjacencyList=ad.list, lambdas=10^(-1:2), sd.cutoff=50)
# ## End(Not run)

Run the code above in your browser using DataLab