Verifies parameters and data, then attempts to run the experiment.
mljar_fit(x, y, validx = NULL, validy = NULL, proj_title = NULL,
exp_title = NULL, dataset_title = NULL, val_dataset_title = NULL,
algorithms = c(), metric = "", wait_till_all_done = TRUE,
validation_kfolds = MLJAR_DEFAULT_FOLDS,
validation_shuffle = MLJAR_DEFAULT_SHUFFLE,
validation_stratify = MLJAR_DEFAULT_STRATIFY,
validation_train_split = MLJAR_DEFAULT_TRAIN_SPLIT,
tuning_mode = MLJAR_DEFAULT_TUNING_MODE,
create_ensemble = MLJAR_DEFAULT_ENSEMBLE,
single_algorithm_time_limit = MLJAR_DEFAULT_TIME_CONSTRAINT)
x: data.frame/matrix with training data
y: data.frame/matrix with training labels
validx: data.frame/matrix with validation data
validy: data.frame/matrix with validation labels
proj_title: character with project title
exp_title: character with experiment title
dataset_title: character with dataset name
val_dataset_title: character with validation dataset name
algorithms: character vector of algorithms to use. For binary classification the available algorithms are: "xgb" (Xgboost), "lgb" (LightGBM), "mlp" (Neural Network), "rfc" (Random Forest), "etc" (Extra Trees), "rgfc" (Regularized Greedy Forest), "knnc" (k-Nearest Neighbors) and "logreg" (Logistic Regression). For regression the available algorithms are: "xgbr" (Xgboost), "lgbr" (LightGBM), "rgfr" (Regularized Greedy Forest), "rfr" (Random Forest) and "etr" (Extra Trees).
metric: character with the metric to optimize. For binary classification the available metrics are: "auc" (Area Under ROC Curve) and "logloss" (Logarithmic Loss). For regression: "rmse" (Root Mean Square Error), "mse" (Mean Square Error) and "mase" (Mean Absolute Error).
wait_till_all_done: boolean indicating whether the function should wait until all models are done
validation_kfolds: number of folds to be used in validation
validation_shuffle: boolean specifying whether to shuffle samples before training
validation_stratify: boolean deciding whether samples are divided into folds with the same class distribution
validation_train_split: ratio used to split the training dataset into train and validation parts
tuning_mode: tuning mode
create_ensemble: whether or not to create an ensemble
single_algorithm_time_limit: numeric with the time limit for computing a single algorithm
Returns a structure with the best model.
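A minimal usage sketch. This assumes the mljar package is installed and a valid MLJAR token is available to the client (typically via the MLJAR_TOKEN environment variable); the project/experiment titles and the derived binary labels are illustrative, and the call requires network access to the MLJAR service:

```r
library(mljar)

# Build a small binary classification problem from iris (illustrative only):
# keep two species and encode the label as 0/1.
data(iris)
binary <- iris[iris$Species != "setosa", ]
x <- binary[, 1:4]
y <- as.numeric(binary$Species == "versicolor")

# Fit with two binary-classification algorithms, optimizing log loss,
# and wait until all models are done (the default).
model <- mljar_fit(x, y,
                   proj_title = "Iris project",      # hypothetical title
                   exp_title  = "First experiment",  # hypothetical title
                   algorithms = c("logreg", "xgb"),
                   metric     = "logloss")
```

Because `wait_till_all_done = TRUE` by default, the call blocks until training finishes and returns the best model found.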