Benjamin Haibe-Kains

5 packages on CRAN

4 packages on Bioconductor

mRMRe

CRAN, 67th percentile

This package contains a set of functions to compute mutual information matrices from continuous, categorical and survival variables. It also contains functions to perform feature selection with mRMR and a new ensemble mRMR technique.
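As an illustration of the mRMR (minimum redundancy, maximum relevance) idea mentioned above, here is a minimal Python sketch of classic greedy mRMR selection. It is not the mRMRe API and does not cover the ensemble variant. It assumes continuous variables and estimates mutual information from the Pearson correlation under a Gaussian approximation, I = -0.5 * log(1 - rho^2); the function names `pearson`, `gaussian_mi` and `mrmr_select` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(x, y):
    """Mutual information estimate, assuming x and y are jointly Gaussian."""
    rho2 = min(pearson(x, y) ** 2, 1.0 - 1e-12)  # clamp to avoid log(0)
    return -0.5 * math.log(1.0 - rho2)


def mrmr_select(features, target, k):
    """Greedily pick k feature names from a dict of name -> list of values.

    Each step picks the feature maximizing relevance to the target minus
    mean redundancy with the features already selected.
    """
    relevance = {name: gaussian_mi(col, target) for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # sorted for deterministic tie-breaking
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                gaussian_mi(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: "b" is nearly a copy of "a", while "c" carries less redundant signal.
target = [1, 2, 3, 4, 5, 6, 7, 8]
features = {
    "a": [1, 2, 3, 4, 5, 6, 7, 9],
    "b": [1, 2, 3, 4, 5, 6, 7, 9.5],
    "c": [2, 1, 4, 3, 6, 5, 8, 7],
}
print(mrmr_select(features, target, 2))
```

Ranking by relevance alone would pick the two near-duplicate features; the redundancy term is what pushes the second choice toward a feature that adds different information.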

CREAM

CRAN, 18th percentile

Provides a new method for identifying clusters of genomic regions within chromosomes. It is primarily used for calling clusters of cis-regulatory elements (COREs). 'CREAM' uses genome-wide maps of genomic regions in the tissue or cell type of interest, such as those generated from chromatin-based assays including DNaseI, ATAC or ChIP-Seq. 'CREAM' considers the proximity of the elements within the chromosomes of a given sample to identify COREs in the following steps: 1) it identifies the window size, that is, the maximum allowed distance between the elements within each CORE; 2) it identifies the number of elements that should be clustered as a CORE; 3) it calls COREs; 4) it filters out the lowest-order COREs that do not pass the threshold considered in the approach.
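The proximity-based calling step can be illustrated with a much simplified Python sketch. This is not the CREAM algorithm: CREAM learns the window size and cluster order from the data, whereas here the maximum gap (`max_gap`) and the minimum number of elements (`min_elements`) are supplied by hand. The function name `call_clusters` and the `(chrom, start, end)` tuple layout are assumptions made for illustration.

```python
def call_clusters(regions, max_gap, min_elements):
    """Group regions on the same chromosome whose gaps are at most max_gap.

    regions: iterable of (chrom, start, end) tuples.
    Returns a list of clusters (lists of regions), keeping only clusters
    with at least min_elements members.
    """
    clusters = []
    current = []
    current_end = None  # furthest end coordinate seen in the current cluster

    for region in sorted(regions):
        chrom, start, end = region
        same_cluster = (
            current
            and chrom == current[-1][0]
            and start - current_end <= max_gap
        )
        if same_cluster:
            current.append(region)
            current_end = max(current_end, end)
        else:
            # Close the previous cluster, keeping it only if it is large enough.
            if len(current) >= min_elements:
                clusters.append(current)
            current = [region]
            current_end = end

    if len(current) >= min_elements:
        clusters.append(current)
    return clusters


# Hypothetical peaks: three close elements on chr1, one isolated, two on chr2.
peaks = [
    ("chr1", 100, 200), ("chr1", 250, 300), ("chr1", 320, 400),
    ("chr1", 5000, 5100), ("chr2", 120, 180), ("chr2", 190, 260),
]
for cluster in call_clusters(peaks, max_gap=100, min_elements=3):
    print(cluster[0][0], cluster[0][1], max(r[2] for r in cluster), len(cluster))
```

Fixing the two parameters by hand keeps the sketch short; it only shows how a distance threshold and an element count together define a cluster.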

genefu

Bioconductor, 14th percentile

This package contains functions implementing various tasks usually required by gene expression analysis, especially in breast cancer studies: gene mapping between different microarray platforms, identification of molecular subtypes, implementation of published gene signatures, gene selection, and survival analysis.

PharmacoGx

CRAN, 14th percentile

Contains a set of functions to perform large-scale analysis of pharmacogenomic data.

predictionet

Bioconductor, 14th percentile

This package contains a set of functions for network inference that combine genomic data with prior information extracted from biomedical literature and structured biological databases. The main function generates networks using Bayesian or regression-based inference methods; while the former is limited to fewer than 100 variables, the latter can infer networks with hundreds of variables. Several statistics at the edge and node levels have been implemented (edge stability, predictive ability of each node, ...) to help the user focus on high-quality subnetworks. Ultimately, this package is used in the 'Predictive Networks' web application developed by the Dana-Farber Cancer Institute in collaboration with Entagen.

survcomp

Bioconductor, 14th percentile

R package providing functions to assess and to compare the performance of risk prediction (survival) models.

MM2S

CRAN, 18th percentile

A single-sample classifier that generates Medulloblastoma (MB) subtype predictions for single samples of human MB patients and model systems, including cell lines and mouse models. The MM2S algorithm uses a systems-based methodology that facilitates application of the algorithm to samples irrespective of their platform or source of origin. MM2S demonstrates > 96% accuracy for patients of well-characterized normal cerebellum, Wingless (WNT), or Sonic hedgehog (SHH) subtypes, and lower accuracy for the less-characterized Group4 (86%) and Group3 (78.2%) subtypes. MM2S also enables classification of MB cell lines and mouse models into their human counterparts. This package contains functions for applying the classifier to human and mouse data, as well as for graphical rendering of the results as PCA plots and heatmaps.

MM2Sdata

CRAN, 18th percentile

Gene Expression datasets for the 'MM2S' package. Contains normalized expression data for Human Medulloblastoma ('GSE37418') as well as Mouse Medulloblastoma models ('GSE36594').

saps

Bioconductor, 15th percentile

Functions implementing the Significance Analysis of Prognostic Signatures method (SAPS). SAPS provides a robust method for identifying biologically significant gene sets associated with patient survival. Three basic statistics are computed. First, patients are clustered into two survival groups based on differential expression of a candidate gene set. P_pure is calculated as the probability of no survival difference between the two groups. Next, the same procedure is applied to randomly generated gene sets, and P_random is calculated as the proportion achieving a P_pure as significant as the candidate gene set. Finally, a pre-ranked Gene Set Enrichment Analysis (GSEA) is performed by ranking all genes by concordance index, and P_enrich is computed to indicate the degree to which the candidate gene set is enriched for genes with univariate prognostic significance. A SAPS_score is calculated to summarize the three statistics, and optionally a Q-value is computed to estimate the significance of the SAPS_score by calculating SAPS_scores for random gene sets.
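The P_random statistic described above is an empirical proportion over random gene sets, which can be sketched in a few lines of Python. This is not the saps package API. The sketch assumes a caller-supplied function `p_pure_fn` (hypothetical) that returns P_pure for a given gene set, and it reads "as significant as" to mean a P_pure less than or equal to the candidate's value; the function name `p_random` and its parameters are likewise assumptions.

```python
import random


def p_random(candidate_p_pure, gene_universe, set_size, p_pure_fn,
             n_random=1000, seed=0):
    """Proportion of random gene sets whose P_pure is at least as
    significant (less than or equal) as the candidate's P_pure.

    gene_universe: list of all gene identifiers to sample from.
    p_pure_fn: hypothetical callable mapping a gene set to its P_pure.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_random):
        random_set = rng.sample(gene_universe, set_size)
        if p_pure_fn(random_set) <= candidate_p_pure:
            hits += 1
    return hits / n_random
```

In practice `p_pure_fn` would wrap the clustering and survival comparison steps; passing it in as a callable keeps this sketch independent of how P_pure is actually computed.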