TwoHC_assign(X, index1, index2, new.X, dis.method = "cor", link.method = "ward", minclus = 4, maxmiss = 30, surv.time, status, method1 = "BIC", method2 = "g2")
X: ExpressionSet or data matrix from which the two HC trees are to be derived. Columns are assumed to represent the samples, and rows the samples' features. Missing values are allowed.
new.X: ExpressionSet or data matrix corresponding to the new samples. Columns are assumed to represent the samples, and rows the samples' features. Missing values are allowed.
dis.method: Distance measure, either a method of the dist function or the Pearson correlation (default).

This function assigns new samples to clusters drawn from two HC trees built on the given data. It works as follows. First, two independent HC trees are derived from the given data. Second, partitions are extracted from each HC tree and the optimal partition is selected for each tree separately. Third, the new patient's molecular profile is compared with each cluster in each optimal partition to calculate the average similarity, which identifies the two most similar clusters (the competing clusters), one from each HC tree. Finally, the new sample is assigned to whichever of the two competing clusters has the better overall survival.
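The four steps above can be sketched in base R. This is a rough illustration, not the package's implementation. The data are simulated, and every name in it (X1, X2, build_tree, closest_cluster and so on) is hypothetical. Three simplifications are assumed: each tree is built from its own feature block, a fixed cut at k = 3 replaces the optimal-partition search, and the median survival time (ignoring censoring) stands in for a proper overall-survival comparison.

```r
# Illustrative sketch only: base R, simulated data, simplified choices throughout.
set.seed(1)
n <- 40                                   # training samples (columns)
X1 <- matrix(rnorm(50 * n), nrow = 50)    # hypothetical feature block for tree 1
X2 <- matrix(rnorm(50 * n), nrow = 50)    # hypothetical feature block for tree 2
surv_time <- rexp(n, rate = 0.1)          # simulated survival times
new1 <- rnorm(50)                         # new sample, features matching X1
new2 <- rnorm(50)                         # new sample, features matching X2

# Step 1: one HC tree per block, with 1 - Pearson correlation as the distance.
build_tree <- function(X) {
  d <- as.dist(1 - cor(X, use = "pairwise.complete.obs"))
  hclust(d, method = "ward.D")
}
tree1 <- build_tree(X1)
tree2 <- build_tree(X2)

# Step 2: extract one partition from each tree.
# ASSUMPTION: a fixed cut at k = 3 stands in for the optimal-partition search.
part1 <- cutree(tree1, k = 3)
part2 <- cutree(tree2, k = 3)

# Step 3: average similarity of the new sample to each cluster; keep the closest.
closest_cluster <- function(X, part, new_x) {
  sims <- as.vector(cor(X, new_x, use = "pairwise.complete.obs"))
  avg <- tapply(sims, part, mean)
  as.integer(names(avg)[which.max(avg)])
}
c1 <- closest_cluster(X1, part1, new1)
c2 <- closest_cluster(X2, part2, new2)

# Step 4: choose between the two competing clusters.
# ASSUMPTION: median survival time, ignoring censoring, is used as a crude proxy.
med1 <- median(surv_time[part1 == c1])
med2 <- median(surv_time[part2 == c2])
assigned <- if (med1 >= med2) list(tree = 1, cluster = c1) else list(tree = 2, cluster = c2)
```

Here assigned$tree records which tree supplied the winning cluster and assigned$cluster gives its label within that tree's partition. A real analysis would account for censoring when comparing survival between the competing clusters.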
Obulkasim, A. et al. (2011). "Stepwise classification of cancer samples using clinical and molecular data". BMC Bioinformatics, 12, 422.
Troyanskaya, O. et al. (2001). "Missing value estimation methods for DNA microarrays". Bioinformatics, 17, 520-525.
Obulkasim, A. et al. (2013). "Semi-supervised adaptive-height snipping of the Hierarchical Clustering tree". Submitted.
TwoHC_perm, cluster_pred
library(HCsnip)
data(TcgaGBM)
attach(TcgaGBM)
# Select patients by treatment
id1 <- which(drugs == "Avastin")
id2 <- which(drugs == "Temodar")
# The first 30 patients per drug build the trees; the next 30 per drug are the new samples
train <- c(id1[1:30], id2[1:30])
result <- TwoHC_assign(X = em[, train], index1 = 1:30, index2 = 31:60,
                       new.X = em[, c(id1[31:60], id2[31:60])], minclus = 4,
                       surv.time = surv.time[train],
                       status = status[train])
detach(TcgaGBM)