do.unstratified.cv.data

matrix of the flat scores. It must be a named matrix, where rows are examples (e.g. genes) and columns are classes/terms (e.g. HPO terms)

number of folds in which to split the dataset (<code>def. k=5</code>)

seed for the random generator. If <code>NULL</code> (def.), no initialization is performed

seed

This function splits a dataset into k folds in an unstratified way (that is, a fold may not contain an equal number of positive and 
negative examples). It is used to perform k-fold cross-validation experiments in a hierarchical correction context, where 
splitting the dataset in a stratified way is not needed.
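
The unstratified split described above can be illustrated with a minimal base R sketch. This is not the HEMDAG implementation: the helper `unstratified_folds`, the fold names, and the toy matrix are hypothetical, and the return shape (a list of row indices, one component per fold) is an assumption made for illustration.

```r
# Hypothetical helper, NOT the HEMDAG source: a sketch of an unstratified k-fold split.
unstratified_folds <- function(S, kk = 5, seed = NULL) {
  # A NULL seed means the random generator is left uninitialized
  if (!is.null(seed)) set.seed(seed)
  n <- nrow(S)
  # Random permutation of the row indices; class labels are never inspected
  shuffled <- sample(n)
  # Deal the shuffled indices into kk folds of near-equal size
  fold_id <- factor(rep(seq_len(kk), length.out = n), levels = seq_len(kk))
  folds <- split(shuffled, fold_id)
  names(folds) <- paste0("fold", seq_len(kk))  # assumed naming scheme
  folds
}

# Toy named score matrix: 10 examples (rows) x 3 classes (columns)
S <- matrix(runif(30), nrow = 10,
            dimnames = list(paste0("g", 1:10), paste0("term", 1:3)))
folds <- unstratified_folds(S, kk = 5, seed = 23)

# Row names of the examples that fall in the first fold
rownames(S)[folds[[1]]]
```

Because the folds are built from a plain permutation of the rows, nothing guarantees that each fold has the same proportion of positive examples for any given class; this is what distinguishes the split from a stratified one.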

An implementation of Hierarchical Ensemble Methods for Directed Acyclic Graphs (DAGs). The 'HEMDAG' package can be used to enhance the predictions of virtually any flat learning method by taking into account the hierarchical nature of the classes of a bio-ontology. 'HEMDAG' is specifically designed to exploit the hierarchical relationships of DAG-structured taxonomies, such as the Human Phenotype Ontology (HPO) or the Gene Ontology (GO), but it can also be safely applied to tree-structured taxonomies (such as FunCat), since trees are DAGs. 'HEMDAG' scales nicely both in terms of the complexity of the taxonomy and in the cardinality of the examples. (Marco Notaro, Max Schubach, Peter N. Robinson and Giorgio Valentini (2017) <doi:10.1186/s12859-017-1854-y>).

do.unstratified.cv.data: Unstratified cross-validation

Description

Usage

Arguments

Value

Examples