hmda.partition

Partition a data frame into training, testing, and,
optionally, validation sets, and upload these sets to a local
H2O server. If an outcome column <code>y</code> is provided and is a
factor or character, stratified splitting is used; otherwise, a
random split is performed. The proportions of the requested sets must sum to 1.
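The difference between the stratified and random branches described above can be illustrated with a minimal base R sketch. This is not the package's implementation: `split_rows` and `assign_labels` are hypothetical helpers, the sketch covers only a train/test split, and it does not upload anything to H2O.

```r
# Hypothetical helper (not part of HMDA): split a data frame into train/test.
# Stratifies when y names a factor or character column, otherwise splits at random.
split_rows <- function(df, y = NULL, train = 0.8, test = 0.2, seed = 2025) {
  stopifnot(isTRUE(all.equal(train + test, 1)))
  set.seed(seed)

  # Build a shuffled vector of "train"/"test" labels for n rows
  assign_labels <- function(n) {
    n_train <- round(n * train)
    sample(rep(c("train", "test"), times = c(n_train, n - n_train)))
  }

  stratify <- !is.null(y) && (is.factor(df[[y]]) || is.character(df[[y]]))
  if (stratify) {
    # ave() applies assign_labels() separately within each outcome class,
    # so every class keeps roughly the same train/test ratio
    labels <- ave(character(nrow(df)), df[[y]],
                  FUN = function(x) assign_labels(length(x)))
  } else {
    labels <- assign_labels(nrow(df))
  }
  split(df, factor(labels, levels = c("train", "test")))
}

parts <- split_rows(iris, y = "Species", train = 0.8, test = 0.2)
table(parts$train$Species)  # class counts in the training set
```

With the built-in `iris` data (50 rows per species), the stratified branch places 40 rows of each species in the training set, whereas a purely random split would only approximate that balance.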

Holistic Multimodel Domain Analysis (HMDA) is a robust and transparent framework designed for exploratory machine learning research, aiming to enhance the process of feature assessment and selection. HMDA addresses key limitations of traditional machine learning methods by evaluating the consistency across multiple high-performing models within a fine-tuned modeling grid, thereby improving the interpretability and reliability of feature importance assessments. Specifically, it computes Weighted Mean SHapley Additive exPlanations (WMSHAP), which aggregate feature contributions from multiple models based on weighted performance metrics. HMDA also provides confidence intervals to demonstrate the stability of these feature importance estimates. This framework is particularly beneficial for analyzing complex, multidimensional datasets common in health research, supporting reliable exploration of mental health outcomes such as suicidal ideation, suicide attempts, and other psychological conditions. Additionally, HMDA includes automated procedures for feature selection based on WMSHAP ratios and performs dimension reduction analyses to identify underlying structures among features. For more details see Haghish (2025) <doi:10.13140/RG.2.2.32473.63846>.

E. F. Haghish 

HMDA

Holistic Multimodel Domain Analysis for Exploratory Machine
Learning

hmda.partition function

<dl><dt>df</dt>
<dd>A data frame to partition.</dd>
<dt>y</dt>
<dd>A string with the name of the outcome column.
Must match a column in <code>df</code>.</dd>
<dt>train</dt>
<dd>A numeric value for the proportion of the
training set.</dd>
<dt>test</dt>
<dd>A numeric value for the proportion of the
testing set.</dd>
<dt>validation</dt>
<dd>Optional numeric value for the proportion of
the validation set. Default is <code>NULL</code>. If
specified, train + test + validation must equal 1.</dd>
<dt>seed</dt>
<dd>A numeric seed for reproducibility.
Default is 2025.</dd></dl>
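Because the proportions are decimal fractions, an exact `==` comparison of their sum against 1 can fail from floating-point rounding. The sketch below shows one way a caller might validate the values before partitioning. `check_proportions` is a hypothetical helper, not a function of the package, and the rule that each proportion lies strictly between 0 and 1 is an assumption made for this example.

```r
# Hypothetical helper (not part of HMDA): validate split proportions.
check_proportions <- function(train, test, validation = NULL) {
  # c() drops NULL, so validation is included only when it is supplied
  props <- c(train = train, test = test, validation = validation)

  # Assumption for this sketch: each proportion lies strictly between 0 and 1
  if (!is.numeric(props) || any(props <= 0 | props >= 1)) {
    stop("each proportion must be a number strictly between 0 and 1")
  }
  # all.equal() compares with a small tolerance instead of exact equality
  if (!isTRUE(all.equal(sum(props), 1))) {
    stop("proportions must sum to 1, got ", sum(props))
  }
  invisible(props)
}

check_proportions(train = 0.7, test = 0.2, validation = 0.1)  # passes
check_proportions(train = 0.8, test = 0.2)                    # passes, no validation set

# With a local H2O server running, the same values would then be passed on, e.g.:
# splits <- hmda.partition(df = iris, y = "Species",
#                          train = 0.7, test = 0.2, validation = 0.1, seed = 2025)
```

The commented call uses only the arguments documented above. It is left unevaluated because it requires a running H2O server.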

Partition Data for HMDA Analysis — hmda.partition
