fit.networkBasedSVM: Implementation of the network-based Support Vector Machine introduced by Yanni Zhu et al.

Description

Implementation of the network-based Support Vector Machine introduced by Yanni Zhu et al., 2009.

Usage

fit.networkBasedSVM(exps, y, DEBUG=FALSE, n.inner=3, scale=c("center", "scale"),
    sd.cutoff=1, lambdas=10^(-2:4), adjacencyList)

Arguments

exps

a p x n matrix of expression measurements with p samples and n genes.

y

a factor of length p comprising the class labels.

DEBUG

logical; should debugging information be plotted?

n.inner

number of folds for the inner cross-validation.

scale

a character vector defining if the data should be centered and/or scaled. Possible values are center and/or scale. Defaults to c('center', 'scale').

sd.cutoff

a cutoff on the standard deviation (sd) of genes. Only genes with sd > sd.cutoff stay in the analysis.

lambdas

a set of values for the regularization parameter lambda of the L$_\infty$-norm penalty. If properly chosen, lambda eliminates factors that are completely irrelevant to the response, which in turn leads to factor-wise (subnetwork-wise) feature selection. The 'best'

adjacencyList

an adjacency list representing the network structure. The list can be generated from an adjacency matrix by using the function as.adjacencyList.
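
The roles of sd.cutoff, scale and adjacencyList can be illustrated with a small base-R sketch on made-up data. It is not the package's internal code: the order of the steps (filtering before scaling) and the layout of the list (one integer vector of neighbour indices per node) are assumptions made for illustration, and the output of as.adjacencyList may be structured differently. All names below (sd_cutoff, scale_opt, adj_list, and so on) are hypothetical.

```r
# Toy data: 6 samples (rows) x 4 genes (columns); all values are made up.
exps <- cbind(g1 = c(1, 2, 3, 4, 5, 6),
              g2 = c(5, 5, 5, 5, 5, 6),
              g3 = c(10, 20, 30, 40, 50, 60),
              g4 = c(0, 0, 1, 0, 0, 1))

# sd.cutoff: keep only genes whose standard deviation exceeds the cutoff.
sd_cutoff <- 1
keep <- apply(exps, 2, sd) > sd_cutoff
exps_kept <- exps[, keep, drop = FALSE]

# scale: translate the character vector into the two flags of base::scale().
scale_opt <- c("center", "scale")
exps_scaled <- scale(exps_kept,
                     center = "center" %in% scale_opt,
                     scale = "scale" %in% scale_opt)

# adjacencyList: toy undirected network with edges g1-g3 and g2-g4,
# restricted to the genes that survived the filter.
adj <- matrix(0, 4, 4, dimnames = list(colnames(exps), colnames(exps)))
adj["g1", "g3"] <- adj["g3", "g1"] <- 1
adj["g2", "g4"] <- adj["g4", "g2"] <- 1
adj_kept <- adj[keep, keep, drop = FALSE]

# Hypothetical list layout: one integer vector of neighbour indices per node.
adj_list <- lapply(seq_len(nrow(adj_kept)),
                   function(i) unname(which(adj_kept[i, ] != 0)))
names(adj_list) <- rownames(adj_kept)
```

In this toy example only g1 and g3 pass the cutoff, so the network shrinks to a single edge between them. In practice the list should be built with as.adjacencyList so that it matches what fit.networkBasedSVM expects.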

Value

a networkBasedSVM object containing

features

the selected features

lambda.performance

overview of how different values of lambda performed in the inner cross-validation

fit

the fitted network-based SVM model

Details

mapping must be a data.frame with at least two columns. The column names have to be c('probesetID','graphID'), where 'probesetID' is the probe set ID present in the expression matrix (i.e. colnames(x)) and 'graphID' is any ID that represents the nodes in the diffusionKernel (i.e. colnames(diffusionKernel) or rownames(diffusionKernel)). The mapping is needed because a gene or protein in the network might be represented by more than one probe set on the chip. The algorithm therefore must know which gene or protein in the network belongs to which probe set on the chip.
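
As a minimal sketch of the structure described above, the following builds a toy mapping in which one network node is covered by two probe sets. The identifiers are invented for illustration and do not correspond to real probe sets or network nodes.

```r
# Hypothetical IDs: two probe sets map to the same network node "node_A".
mapping <- data.frame(
  probesetID = c("probe_1", "probe_2", "probe_3"),
  graphID    = c("node_A", "node_A", "node_B"),
  stringsAsFactors = FALSE
)

# The two required column names must be present.
stopifnot(all(c("probesetID", "graphID") %in% colnames(mapping)))

# Group probe sets by network node to see the one-to-many relationship.
probes_per_node <- split(mapping$probesetID, mapping$graphID)
```

Here probes_per_node$node_A holds both probe_1 and probe_2, which is the situation the mapping is meant to resolve.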

References

Zhu Y. et al. (2009). Network-based support vector machine for classification of microarray samples. BMC Bioinformatics

Examples

library(Biobase)
data(sample.ExpressionSet)
# samples in rows, probe sets in columns
x <- t(exprs(sample.ExpressionSet))
y <- factor(pData(sample.ExpressionSet)$sex)
# create the mapping from probe set IDs to RefSeq IDs
library('hgu95av2.db')
mapped.probes <- mappedkeys(hgu95av2REFSEQ)
refseq <- as.list(hgu95av2REFSEQ[mapped.probes])
times <- sapply(refseq, length)
mapping <- data.frame(probesetID=rep(names(refseq), times=times),
                      graphID=unlist(refseq),
                      row.names=NULL, stringsAsFactors=FALSE)
mapping <- unique(mapping)
# match the expression matrix to the network and build the adjacency list
library(pathClass)
data(adjacency.matrix)
matched <- matchMatrices(x=x, adjacency=adjacency.matrix, mapping=mapping)
ad.list <- as.adjacencyList(matched$adjacency)
# 3-fold cross-validation of the network-based SVM
res.nBSVM <- crossval(matched$x, y, theta.fit=fit.networkBasedSVM,
                      folds=3, repeats=1, DEBUG=TRUE, parallel=FALSE,
                      adjacencyList=ad.list, lambdas=10^(-1:2),
                      sd.cutoff=50)
