Build a Deep Neural Network model using CPUs
Builds a feed-forward multilayer artificial neural network on an H2OFrame.
h2o.deeplearning(x, y, training_frame, model_id = NULL,
  validation_frame = NULL, nfolds = 0,
  keep_cross_validation_predictions = FALSE,
  keep_cross_validation_fold_assignment = FALSE,
  fold_assignment = c("AUTO", "Random", "Modulo", "Stratified"),
  fold_column = NULL, ignore_const_cols = TRUE,
  score_each_iteration = FALSE, weights_column = NULL,
  offset_column = NULL, balance_classes = FALSE,
  class_sampling_factors = NULL, max_after_balance_size = 5,
  max_hit_ratio_k = 0, checkpoint = NULL, pretrained_autoencoder = NULL,
  overwrite_with_best_model = TRUE, use_all_factor_levels = TRUE,
  standardize = TRUE,
  activation = c("Tanh", "TanhWithDropout", "Rectifier",
    "RectifierWithDropout", "Maxout", "MaxoutWithDropout"),
  hidden = c(200, 200), epochs = 10, train_samples_per_iteration = -2,
  target_ratio_comm_to_comp = 0.05, seed = -1, adaptive_rate = TRUE,
  rho = 0.99, epsilon = 1e-08, rate = 0.005, rate_annealing = 1e-06,
  rate_decay = 1, momentum_start = 0, momentum_ramp = 1e+06,
  momentum_stable = 0, nesterov_accelerated_gradient = TRUE,
  input_dropout_ratio = 0, hidden_dropout_ratios = NULL, l1 = 0, l2 = 0,
  max_w2 = 3.4028235e+38,
  initial_weight_distribution = c("UniformAdaptive", "Uniform", "Normal"),
  initial_weight_scale = 1, initial_weights = NULL, initial_biases = NULL,
  loss = c("Automatic", "CrossEntropy", "Quadratic", "Huber", "Absolute",
    "Quantile"),
  distribution = c("AUTO", "bernoulli", "multinomial", "gaussian",
    "poisson", "gamma", "tweedie", "laplace", "quantile", "huber"),
  quantile_alpha = 0.5, tweedie_power = 1.5, huber_alpha = 0.9,
  score_interval = 5, score_training_samples = 10000,
  score_validation_samples = 0, score_duty_cycle = 0.1,
  classification_stop = 0, regression_stop = 1e-06, stopping_rounds = 5,
  stopping_metric = c("AUTO", "deviance", "logloss", "MSE", "RMSE", "MAE",
    "RMSLE", "AUC", "lift_top_group", "misclassification",
    "mean_per_class_error"),
  stopping_tolerance = 0, max_runtime_secs = 0,
  score_validation_sampling = c("Uniform", "Stratified"),
  diagnostics = TRUE, fast_mode = TRUE, force_load_balance = TRUE,
  variable_importances = TRUE, replicate_training_data = TRUE,
  single_node_mode = FALSE, shuffle_training_data = FALSE,
  missing_values_handling = c("MeanImputation", "Skip"),
  quiet_mode = FALSE, autoencoder = FALSE, sparse = FALSE,
  col_major = FALSE, average_activation = 0, sparsity_beta = 0,
  max_categorical_features = 2147483647, reproducible = FALSE,
  export_weights_and_biases = FALSE, mini_batch_size = 1,
  categorical_encoding = c("AUTO", "Enum", "OneHotInternal",
    "OneHotExplicit", "Binary", "Eigen", "LabelEncoder", "SortByResponse",
    "EnumLimited"),
  elastic_averaging = FALSE, elastic_averaging_moving_rate = 0.9,
  elastic_averaging_regularization = 0.001)
A vector containing the names or indices of the predictor variables to use in building the model. If x is missing, then all columns except y are used.
The name of the response variable in the model. If the data does not contain a header, this is the column index, starting at the first column and increasing from left to right. (The response must be either an integer or a categorical variable.)
Id of the training data frame (Not required, to allow initial validation of model parameters).
Destination id for this model; auto-generated if not specified.
Id of the validation data frame.
Number of folds for N-fold cross-validation (0 to disable or >= 2). Defaults to 0.
Logical. Whether to keep the predictions of the cross-validation models. Defaults to FALSE.
Logical. Whether to keep the cross-validation fold assignment. Defaults to FALSE.
Cross-validation fold assignment scheme, if fold_column is not specified. The 'Stratified' option will stratify the folds based on the response variable, for classification problems. Must be one of: "AUTO", "Random", "Modulo", "Stratified". Defaults to AUTO.
Column with cross-validation fold index assignment per observation.
Logical. Ignore constant columns. Defaults to TRUE.
Logical. Whether to score during each iteration of model training. Defaults to FALSE.
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed.
Offset column. This will be added to the combination of columns before applying the link function.
Logical. Balance training data class counts via over/under-sampling (for imbalanced data). Defaults to FALSE.
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes. Defaults to 5.0.
Max. number (top K) of predictions to use for hit ratio computation (for multi-class only, 0 to disable). Defaults to 0.
Model checkpoint to resume training with.
Pretrained autoencoder model to initialize this model with.
Logical. If enabled, override the final model with the best model found during training. Defaults to TRUE.
Logical. Use all factor levels of categorical variables. Otherwise, the first factor level is omitted (without loss of accuracy). Useful for variable importances and auto-enabled for autoencoder. Defaults to TRUE.
Logical. If enabled, automatically standardize the data. If disabled, the user must provide properly scaled input data. Defaults to TRUE.
Activation function. Must be one of: "Tanh", "TanhWithDropout", "Rectifier", "RectifierWithDropout", "Maxout", "MaxoutWithDropout". Defaults to Rectifier.
Hidden layer sizes (e.g. c(100, 100)). Defaults to c(200, 200).
How many times the dataset should be iterated (streamed), can be fractional. Defaults to 10.
Number of training samples (globally) per MapReduce iteration. Special values are 0: one epoch, -1: all available data (e.g., replicated training data), -2: automatic. Defaults to -2.
Target ratio of communication overhead to computation. Only for multi-node operation and train_samples_per_iteration = -2 (auto-tuning). Defaults to 0.05.
Seed for random numbers (affects certain parts of the algorithm that are stochastic and that might or might not be enabled by default). Note: results are only reproducible when running single-threaded. Defaults to -1 (time-based random number).
Logical. Adaptive learning rate. Defaults to TRUE.
Adaptive learning rate time decay factor (similarity to prior updates). Defaults to 0.99.
Adaptive learning rate smoothing factor (to avoid divisions by zero and allow progress). Defaults to 1e-08.
Learning rate (higher => less stable, lower => slower convergence). Defaults to 0.005.
Learning rate annealing: rate / (1 + rate_annealing * samples). Defaults to 1e-06.
Learning rate decay factor between layers (N-th layer: rate * rate_decay ^ (N - 1)). Defaults to 1.
Initial momentum at the beginning of training (try 0.5). Defaults to 0.
Number of training samples for which momentum increases. Defaults to 1000000.
Final momentum after the ramp is over (try 0.99). Defaults to 0.
Logical. Use Nesterov accelerated gradient (recommended). Defaults to TRUE.
Input layer dropout ratio (can improve generalization, try 0.1 or 0.2). Defaults to 0.
Hidden layer dropout ratios (can improve generalization). Specify one value per hidden layer. Defaults to 0.5.
L1 regularization (can add stability and improve generalization, causes many weights to become 0). Defaults to 0.
L2 regularization (can add stability and improve generalization, causes many weights to be small). Defaults to 0.
Constraint for squared sum of incoming weights per unit (e.g. for Rectifier). Defaults to 3.4028235e+38.
Initial weight distribution. Must be one of: "UniformAdaptive", "Uniform", "Normal". Defaults to UniformAdaptive.
Uniform: -value...value, Normal: stddev. Defaults to 1.
A list of H2OFrame ids to initialize the weight matrices of this model with.
A list of H2OFrame ids to initialize the bias vectors of this model with.
Loss function. Must be one of: "Automatic", "CrossEntropy", "Quadratic", "Huber", "Absolute", "Quantile". Defaults to Automatic.
Distribution function. Must be one of: "AUTO", "bernoulli", "multinomial", "gaussian", "poisson", "gamma", "tweedie", "laplace", "quantile", "huber". Defaults to AUTO.
Desired quantile for Quantile regression, must be between 0 and 1. Defaults to 0.5.
Tweedie power for Tweedie regression, must be between 1 and 2. Defaults to 1.5.
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1). Defaults to 0.9.
Shortest time interval (in seconds) between model scoring. Defaults to 5.
Number of training set samples for scoring (0 for all). Defaults to 10000.
Number of validation set samples for scoring (0 for all). Defaults to 0.
Maximum duty cycle fraction for scoring (lower: more training, higher: more scoring). Defaults to 0.1.
Stopping criterion for classification error fraction on training data (-1 to disable). Defaults to 0.
Stopping criterion for regression error (MSE) on training data (-1 to disable). Defaults to 1e-06.
Early stopping based on convergence of stopping_metric. Stop if the simple moving average of length k of the stopping_metric does not improve for k := stopping_rounds scoring events (0 to disable). Defaults to 5.
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression). Must be one of: "AUTO", "deviance", "logloss", "MSE", "RMSE", "MAE", "RMSLE", "AUC", "lift_top_group", "misclassification", "mean_per_class_error". Defaults to AUTO.
Relative tolerance for the metric-based stopping criterion (stop if the relative improvement is not at least this much). Defaults to 0.
Maximum allowed runtime in seconds for model training. Use 0 to disable. Defaults to 0.
Method used to sample validation dataset for scoring. Must be one of: "Uniform", "Stratified". Defaults to Uniform.
Logical. Enable diagnostics for hidden layers. Defaults to TRUE.
Logical. Enable fast mode (minor approximation in back-propagation). Defaults to TRUE.
Logical. Force extra load balancing to increase training speed for small datasets (to keep all cores busy). Defaults to TRUE.
Logical. Compute variable importances for input features (Gedeon method); can be slow for large networks. Defaults to TRUE.
Logical. Replicate the entire training dataset onto every node for faster training on small datasets. Defaults to TRUE.
Logical. Run on a single node for fine-tuning of model parameters. Defaults to FALSE.
Logical. Enable shuffling of training data (recommended if training data is replicated and train_samples_per_iteration is close to #nodes x #rows, or if using balance_classes). Defaults to FALSE.
Handling of missing values. Either MeanImputation or Skip. Must be one of: "MeanImputation", "Skip". Defaults to MeanImputation.
Logical. Enable quiet mode for less output to standard output. Defaults to FALSE.
Logical. Auto-Encoder. Defaults to FALSE.
Logical. Sparse data handling (more efficient for data with lots of 0 values). Defaults to FALSE.
Logical. Deprecated. Use a column major weight matrix for the input layer. Can speed up forward propagation, but might slow down backpropagation. Defaults to FALSE.
Average activation for sparse auto-encoder (experimental). Defaults to 0.
Sparsity regularization (experimental). Defaults to 0.
Max. number of categorical features, enforced via hashing (experimental). Defaults to 2147483647.
Logical. Force reproducibility on small data (will be slow - only uses 1 thread). Defaults to FALSE.
Logical. Whether to export Neural Network weights and biases to H2O Frames. Defaults to FALSE.
Mini-batch size (smaller leads to better fit, larger can speed up and generalize better). Defaults to 1.
Encoding scheme for categorical features. Must be one of: "AUTO", "Enum", "OneHotInternal", "OneHotExplicit", "Binary", "Eigen", "LabelEncoder", "SortByResponse", "EnumLimited". Defaults to AUTO.
Logical. Elastic averaging between compute nodes can improve distributed model convergence (experimental). Defaults to FALSE.
Elastic averaging moving rate (only if elastic averaging is enabled). Defaults to 0.9.
Elastic averaging regularization strength (only if elastic averaging is enabled). Defaults to 0.001.
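The two learning-rate formulas quoted in the argument descriptions (annealing over training samples, and decay between layers) can be illustrated in plain R. This is only a sketch of the arithmetic as written above, not H2O's internal code; the helper names annealed_rate and layer_rates are hypothetical, and how the two schedules combine inside the library is not stated here, so they are kept separate. In H2O these manual rate settings are understood to matter only when adaptive_rate is disabled.

```r
# Annealed learning rate after a given number of training samples,
# following the rate_annealing description:
#   rate / (1 + rate_annealing * samples)
annealed_rate <- function(rate, rate_annealing, samples) {
  rate / (1 + rate_annealing * samples)
}

# Per-layer learning rate, following the rate_decay description:
#   N-th layer: rate * rate_decay ^ (N - 1)
layer_rates <- function(rate, rate_decay, n_layers) {
  rate * rate_decay ^ (seq_len(n_layers) - 1)
}

# Defaults from the usage: rate = 0.005, rate_annealing = 1e-06.
# After one million samples the rate has halved.
after_1m <- annealed_rate(0.005, 1e-06, samples = 1e6)

# Hypothetical rate_decay of 0.5 across three layers (the default of 1
# leaves every layer at the same rate).
per_layer <- layer_rates(0.005, rate_decay = 0.5, n_layers = 3)

print(after_1m)
print(per_layer)
```

With the default rate_annealing of 1e-06, the rate therefore drops slowly: it takes on the order of a million training samples to halve it.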
predict.H2OModel for prediction
library(h2o)
h2o.init()
iris.hex <- as.h2o(iris)
iris.dl <- h2o.deeplearning(x = 1:4, y = 5, training_frame = iris.hex)
# now make a prediction
predictions <- h2o.predict(iris.dl, iris.hex)
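As a variation on the example above, the following sketch adds a validation frame and metric-based early stopping, using only arguments listed in the usage. The split ratio, seed, layer sizes, and stopping settings are arbitrary illustrative choices, not recommendations, and the code assumes a local H2O cluster can be started with h2o.init().

```r
library(h2o)
h2o.init()

iris.hex <- as.h2o(iris)

# Split into training and validation frames (roughly 80/20).
# The seed value is an arbitrary choice for this sketch.
splits <- h2o.splitFrame(iris.hex, ratios = 0.8, seed = 1234)
train <- splits[[1]]
valid <- splits[[2]]

iris.dl <- h2o.deeplearning(
  x = 1:4, y = 5,
  training_frame = train,
  validation_frame = valid,
  hidden = c(32, 32),          # two small hidden layers
  epochs = 50,
  stopping_rounds = 3,         # moving-average window for early stopping
  stopping_metric = "logloss",
  stopping_tolerance = 0.01,   # require at least 1% relative improvement
  seed = 1234,
  reproducible = TRUE          # single-threaded, slow, but repeatable
)

# Log loss on the validation frame
h2o.logloss(iris.dl, valid = TRUE)

# Predictions on the held-out rows
predictions <- h2o.predict(iris.dl, valid)
```

Setting reproducible = TRUE together with a fixed seed follows the notes on those two arguments; it is slow on anything but small data.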