This function extracts features using an ML method, finds optimal cutoff values for those features using a sequential Cox PH model, and obtains the most consistent level according to the cutoffs.
mlhighHet(cols, idSurv, idEvent, idFrail, num, fold = 3, data)
Data frames containing the optimal gene cutoff values and the most consistent level according to those cutoffs, together with the frailty variance.
cols: A numeric vector of column numbers indicating the features for which the log-loss functions are to be computed
idSurv: The name of the survival time variable
idEvent: The name of the survival event variable
idFrail: The name of the frailty variable
num: Number of features to be selected
fold: An integer giving the number of folds in cross-validation; the default value is 3
data: A data frame that contains the survival and covariate information for the subjects
Atanu Bhattacharjee, Gajendra K. Vishwakarma & Souvik Banerjee
Performs heterogeneity analysis in gene expression
This function selects features from high-dimensional survival data by minimizing the log-loss function, using a Cox proportional hazards (PH) model as the learner. For the selected genes, optimal cutoff values are obtained by minimizing the p-value in a Cox PH model. The Cox PH model is then fitted sequentially to each combination of genes, and all possible gene combinations are tested to find the combination with the minimum BIC value. The subjects are classified according to the different levels of those genes. Using a Cox PH frailty model, we obtain the most consistent level, i.e. the level for which the frailty variance is minimum.

The data are split using cross-validation, and the performance measure is the logarithmic loss function, defined as $$L(f,t)=-\log(f(t))$$

The Cox PH frailty model is defined as $$\lambda(t)=\lambda_0(t)\,\nu\exp\{X'\beta\}$$ where \(\nu\) is called the frailty. The variance of the frailty term is interpreted as the heterogeneity among the subjects or patients. The frailty component is assumed to follow a Gaussian distribution with mean 0.
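The cutoff search and the frailty fit described above can be illustrated with a minimal sketch built directly on the survival package. It is not the internal code of mlhighHet. The simulated data frame `toy`, the variable names `gene`, `high` and `id`, and the quantile grid of candidate cutoffs are all assumptions made for illustration only.

```r
library(survival)

# Simulated clustered survival data (hypothetical, for illustration only)
set.seed(1)
n_clusters  <- 40
per_cluster <- 5
n    <- n_clusters * per_cluster
id   <- rep(seq_len(n_clusters), each = per_cluster)
frl  <- rnorm(n_clusters, sd = 0.5)[id]        # cluster-level random effect
gene <- rnorm(n)                               # one expression feature
rate <- 0.1 * exp(0.8 * (gene > 0.3) + frl)    # hazard jumps above 0.3
time <- rexp(n, rate = rate)
cens <- rexp(n, rate = 0.05)
toy  <- data.frame(id = id, gene = gene,
                   OS = pmin(time, cens),
                   Death = as.integer(time <= cens))

# Candidate cutoffs: inner quantiles, so neither group is too small
cands <- quantile(toy$gene, probs = seq(0.2, 0.8, by = 0.05))

# For each candidate, fit a Cox PH model on the dichotomized gene
# and record the Wald p-value of the high-vs-low indicator
pvals <- sapply(cands, function(cutoff) {
  fit <- coxph(Surv(OS, Death) ~ I(gene > cutoff), data = toy)
  summary(fit)$coefficients[1, "Pr(>|z|)"]
})

# The cutoff with the minimum p-value
best_cut <- unname(cands[which.min(pvals)])

# Cox PH model with a Gaussian frailty term for the cluster identifier
toy$high  <- as.integer(toy$gene > best_cut)
fit_frail <- coxph(Surv(OS, Death) ~ high +
                     frailty(id, distribution = "gaussian"),
                   data = toy)
print(fit_frail)  # the printed summary reports the variance of the random effect
```

Repeating the frailty fit for each level of the dichotomized genes and comparing the reported variances would correspond to the "most consistent level" step. The sketch fits a single level indicator only.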
Sonabend, R., Király, F. J., Bender, A., Bischl, B. and Lang, M. mlr3proba: An R Package for Machine Learning in Survival Analysis, 2021, Bioinformatics, <https://doi.org/10.1093/bioinformatics/btab039>
Bhattacharjee, A., Vishwakarma, G.K. and Banerjee, S. A modified risk detection approach of biomarkers by frailty effect on multiple time to event data, 2020, <arXiv:2012.02102>.
mlhighCox, mlhighFrail
if (FALSE) {
data(hnscc)
mlhighHet(cols=c(27:32), idSurv="OS", idEvent="Death", idFrail="ID", num=2, fold = 3, data=hnscc)
}