h2o_automl

Dataframe. Dataframe containing all your data, including 
the dependent variable labeled as 'tag'. To use a different column 
as the dependent variable, set the y parameter.

Character. Name of the dependent variable

Character vector. Columns the model should be forced to ignore

ignore

Character. If needed, name of the column in df that holds 'test' 
and 'train' values, used to split the data

train_test

Numeric. Value between 0 and 1 used to split the data into train 
and test datasets. The value is the proportion assigned to the training set.

split
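
To make the meaning of the split value concrete, here is a minimal base R sketch that sends 70% of the rows to a training set. The data frame `toy`, the value 0.7, and the random sampling approach are assumptions for illustration; the function's internal splitting logic may differ.

```r
# Hypothetical toy data; 'tag' mirrors the column name used on this page.
set.seed(42)
toy <- data.frame(x = rnorm(100), tag = rbinom(100, 1, 0.5))

split <- 0.7                                   # share of rows used for training
n_train <- round(split * nrow(toy))            # number of training rows
train_idx <- sample(seq_len(nrow(toy)), size = n_train)

train <- toy[train_idx, ]                      # rows sampled for training
test  <- toy[-train_idx, ]                     # remaining rows form the test set

c(train = nrow(train), test = nrow(test))
```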

Column with observation weights. Giving some observation a
weight of zero is equivalent to excluding it from the dataset; giving an 
observation a relative weight of 2 is equivalent to repeating that 
row twice. Negative weights are not allowed.

weight
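
The equivalences described for observation weights can be checked on a small example. The sketch below uses an ordinary least squares fit from base R (`lm`), not an H2O model, and the data are made up; it only illustrates the idea that a weight of 2 behaves like a repeated row and a weight of 0 behaves like a dropped row.

```r
# Hypothetical toy data for illustration only.
toy <- data.frame(x = c(1, 2, 3, 4, 5),
                  y = c(2.1, 3.9, 6.2, 7.8, 10.5))

# Weight of 2 on the first row vs. physically repeating that row.
fit_weighted <- lm(y ~ x, data = toy, weights = c(2, 1, 1, 1, 1))
fit_repeated <- lm(y ~ x, data = toy[c(1, 1, 2, 3, 4, 5), ])

# Weight of 0 on the last row vs. dropping that row.
fit_zero    <- lm(y ~ x, data = toy, weights = c(1, 1, 1, 1, 0))
fit_dropped <- lm(y ~ x, data = toy[-5, ])

# Compare the fitted coefficients of each pair.
same_when_doubled <- isTRUE(all.equal(coef(fit_weighted), coef(fit_repeated)))
same_when_zeroed  <- isTRUE(all.equal(coef(fit_zero), coef(fit_dropped)))
```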

Boolean. Auto-balance train dataset with under-sampling?

balance
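
As a rough picture of what under-sampling does, the base R sketch below shrinks every class to the size of the smallest one. The data, the class labels, and the sampling strategy are assumptions for illustration; the function's own balancing procedure may work differently.

```r
# Hypothetical imbalanced data: 80 rows of class "a", 20 rows of class "b".
set.seed(1)
toy <- data.frame(x = rnorm(100),
                  tag = rep(c("a", "b"), times = c(80, 20)))

n_min <- min(table(toy$tag))                        # size of the smallest class
rows_by_class <- split(seq_len(nrow(toy)), toy$tag) # row indices per class

# Keep n_min randomly chosen rows from each class.
keep <- unlist(lapply(rows_by_class, function(idx) {
  idx[sample.int(length(idx), n_min)]
}))

balanced <- toy[keep, ]
table(balanced$tag)
```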

Boolean. Fill NA values with MICE?

impute

Boolean. Using the base function scale, do you wish
to center and/or scale all numerical values?

center, scale
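
For reference, this is roughly what centering and scaling with the base function `scale` looks like when applied to the numeric columns only. The data frame and the choice to leave non-numeric columns untouched are assumptions for illustration.

```r
# Hypothetical data with two numeric columns and one character column.
toy <- data.frame(a = c(1, 2, 3, 4),
                  b = c(10, 20, 30, 40),
                  g = c("u", "v", "u", "v"))

num_cols <- vapply(toy, is.numeric, logical(1))   # which columns are numeric

scaled <- toy
# scale() returns a matrix, so convert back before assigning the columns.
scaled[num_cols] <- as.data.frame(scale(toy[num_cols], center = TRUE, scale = TRUE))

colMeans(scaled[num_cols])                        # approximately 0 after centering
vapply(scaled[num_cols], sd, numeric(1))          # 1 after scaling
```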

Integer. Set a seed for reproducibility. AutoML can only 
guarantee reproducibility if max_models is used because max_time is 
resource limited.

seed

Integer. Number of folds for k-fold cross-validation of 
the models. If set to 0, the test data will be used as validation, and
cross-validation and Stacked Ensembles will be disabled

nfolds

Integer. Threshold for choosing between classification and 
regression models: the number of unique values in 'tag' is compared 
against it (more than: regression; less than: classification)

thresh
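
The threshold rule can be sketched as a small helper. `guess_model_type` is a hypothetical function written for this illustration and is not part of lares; the default of 10 and the handling of a count exactly equal to the threshold are assumptions, since the description only covers "more than" and "less than".

```r
# Hypothetical helper, not part of lares.
guess_model_type <- function(tag, thresh = 10) {
  n_unique <- length(unique(tag))   # number of distinct values in the target
  # Assumption: a count exactly equal to thresh is treated as classification.
  if (n_unique > thresh) "regression" else "classification"
}

guess_model_type(c("yes", "no", "yes"))     # 2 unique values
guess_model_type(seq(0.5, 50, by = 0.5))    # 100 unique values
```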

Numeric. Maximum number of seconds the function 
is allowed to run

max_time

Numeric. Maximum number of models the function 
should create

max_models

Boolean. Erase everything in the current h2o 
instance before we start to train models?

start_clean

Vector of character strings. Algorithms to 
skip during the model-building phase. Set to NULL to use all algorithms

exclude_algos

plots

Boolean. Ping an alarm when ready!

alarm

Boolean. Quiet messages, warnings, recommendations?

quiet

Boolean. Do you wish to save/export results into your 
working directory?

save

Character. In which directory do you wish to save 
the results? Working directory as default.

subdir

project

This function lets the user create a robust and fast model, using 
H2O's AutoML function. The result is a list with the best model, 
its parameters, datasets, performance metrics, variable 
importances, and other useful metrics.
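
A call might look like the sketch below, which uses only argument names described on this page. The data frame `toy`, the `id` column, and every argument value are assumptions chosen for illustration, not defaults or recommendations. Training is switched off by the hypothetical `run_training` flag, because the real call needs the lares and h2o packages plus a working Java installation.

```r
# Hypothetical data; 'tag' is the column name this page says is expected.
set.seed(123)
toy <- data.frame(x1 = rnorm(200),
                  x2 = runif(200),
                  id = seq_len(200),
                  tag = rbinom(200, 1, 0.5))

# Illustrative argument values (assumptions, not defaults).
args <- list(df = toy, y = "tag", ignore = "id", split = 0.7,
             seed = 123, max_models = 3, nfolds = 5,
             quiet = TRUE, save = FALSE)

run_training <- FALSE   # flip to TRUE to actually start H2O and train models

if (run_training &&
    requireNamespace("lares", quietly = TRUE) &&
    requireNamespace("h2o", quietly = TRUE)) {
  results <- do.call(lares::h2o_automl, args)
  names(results)        # inspect which elements the returned list contains
}
```

Setting max_models rather than relying on a time limit follows the note on seed above: reproducibility is only expected when the number of models is fixed.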

R library for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, such as Machine Learning, Data Wrangling, Exploratory, and Scrapper, lares helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or extensive programming skills.

h2o_automl: Automated H2O's AutoML
