train_test_split

Splits a data frame or matrix into random train and test subsets. The column at position y_index of data is taken as the response variable (y),
and the remaining columns as the independent variables (X).

"Learning with Subset Stacking" is a supervised learning algorithm that is based on training many local estimators on subsets of a given dataset, and then passing their predictions to a global estimator. You can find the details about LESS in our manuscript at <arXiv:2112.06251>.

Burhan Ozer Cavdar

less

Learning with Subset Stacking

Ilker Birbil

train_test_split function


Arguments

Dataset splitting — train_test_split

<dl>

<dt>data</dt>
<dd>Dataset that is going to be split</dd>


<dt>test_size</dt>
<dd>Represents the proportion of the dataset to include in the test split.
Should be between 0.0 and 1.0 (defaults to 0.3)</dd>


<dt>random_state</dt>
<dd>Controls the shuffling applied to the data before applying the split.
Pass an int for reproducible output across multiple function calls (defaults to NULL)</dd>


<dt>y_index</dt>
<dd>Corresponding column index of the response variable y (defaults to last column of data)</dd>

</dl>

`X_train`	Training input variables

`X_test`	Test input variables

`y_train`	Training response variables

`y_test`	Test response variables
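The arguments and return values above can be illustrated with a small base-R sketch. This is not the package's implementation: `split_sketch` is a hypothetical helper written only to mirror the documented behaviour, and rounding the number of test rows to the nearest integer is an assumption. The element names of the returned list follow the value names listed above.

```r
# Hypothetical helper (not part of the less package): a base-R sketch of the
# documented behaviour, with the same argument names and defaults.
split_sketch <- function(data, test_size = 0.3, random_state = NULL,
                         y_index = ncol(data)) {
  stopifnot(test_size > 0, test_size < 1)

  # Seed the generator only when a value is given; NULL leaves the split random.
  # Note that set.seed() changes the global RNG state.
  if (!is.null(random_state)) set.seed(random_state)

  n <- nrow(data)
  n_test <- round(n * test_size)      # assumption: round to the nearest integer
  stopifnot(n_test >= 1, n_test < n)  # both subsets must be non-empty

  test_rows <- sample.int(n, n_test)  # row indices drawn without replacement

  list(
    X_train = data[-test_rows, -y_index, drop = FALSE],
    X_test  = data[test_rows, -y_index, drop = FALSE],
    y_train = data[-test_rows, y_index],
    y_test  = data[test_rows, y_index]
  )
}

# mtcars has 32 rows and 11 columns; use the first column (mpg) as the response.
parts <- split_sketch(mtcars, test_size = 0.25, random_state = 42, y_index = 1)
dim(parts$X_train)    # 24 10
length(parts$y_test)  # 8
```

Calling the sketch twice with the same `random_state` returns the same rows, which is the reproducibility described for that argument.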

