Verifies parameters and data, then attempts to run the experiment.
mljar_fit(x, y, validx = NULL, validy = NULL, proj_title = NULL,
exp_title = NULL, dataset_title = NULL, val_dataset_title = NULL,
algorithms = c(), metric = "", wait_till_all_done = TRUE,
validation_kfolds = MLJAR_DEFAULT_FOLDS,
validation_shuffle = MLJAR_DEFAULT_SHUFFLE,
validation_stratify = MLJAR_DEFAULT_STRATIFY,
validation_train_split = MLJAR_DEFAULT_TRAIN_SPLIT,
tuning_mode = MLJAR_DEFAULT_TUNING_MODE,
create_ensemble = MLJAR_DEFAULT_ENSEMBLE,
single_algorithm_time_limit = MLJAR_DEFAULT_TIME_CONSTRAINT)
x: data.frame/matrix with training data
y: data.frame/matrix with training labels
validx: data.frame/matrix with validation data
validy: data.frame/matrix with validation labels
proj_title: character with project title
exp_title: character with experiment title
dataset_title: character with dataset name
val_dataset_title: character with validation dataset name
algorithms: character vector of algorithms to use. For binary classification the available algorithms are: "xgb" (Xgboost), "lgb" (LightGBM), "mlp" (Neural Network), "rfc" (Random Forest), "etc" (Extra Trees), "rgfc" (Regularized Greedy Forest), "knnc" (k-Nearest Neighbors) and "logreg" (Logistic Regression). For regression the available algorithms are: "xgbr" (Xgboost), "lgbr" (LightGBM), "rgfr" (Regularized Greedy Forest), "rfr" (Random Forest) and "etr" (Extra Trees).
metric: character with the metric to optimize. For binary classification the available metrics are: "auc" (Area Under ROC Curve) and "logloss" (Logarithmic Loss). For regression: "rmse" (Root Mean Square Error), "mse" (Mean Square Error) and "mase" (Mean Absolute Error).
wait_till_all_done: boolean indicating whether the function should wait until all models are done
validation_kfolds: number of folds to be used in validation
validation_shuffle: boolean specifying whether to shuffle samples before training
validation_stratify: boolean deciding whether samples are divided into folds with the same class distribution
validation_train_split: ratio used to split the training dataset into train and validation parts
tuning_mode: tuning mode
create_ensemble: whether or not to create an ensemble
single_algorithm_time_limit: numeric with the time limit for computing a single algorithm
Returns a structure with the best model.
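A minimal usage sketch. This assumes the mljar package is installed and a valid MLJAR token is available to the client (typically via the MLJAR_TOKEN environment variable); the project/experiment titles and the derived binary labels are illustrative, and the call requires network access to the MLJAR service:

```r
library(mljar)

# Build a small binary classification problem from iris (illustrative only):
# keep two species and encode the label as 0/1.
data(iris)
binary <- iris[iris$Species != "setosa", ]
x <- binary[, 1:4]
y <- as.numeric(binary$Species == "versicolor")

# Fit with two binary-classification algorithms, optimizing log loss,
# and wait until all models are done (the default).
model <- mljar_fit(x, y,
                   proj_title = "Iris project",      # hypothetical title
                   exp_title  = "First experiment",  # hypothetical title
                   algorithms = c("logreg", "xgb"),
                   metric     = "logloss")
```

Because `wait_till_all_done = TRUE` by default, the call blocks until training finishes and returns the best model found.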