h2o.word2vec

Trains a word2vec model on a String column of an H2O data frame

Usage
h2o.word2vec(training_frame = NULL, model_id = NULL,
  min_word_freq = 5, word_model = c("SkipGram"),
  norm_model = c("HSM"), vec_size = 100, window_size = 5,
  sent_sample_rate = 0.001, init_learning_rate = 0.025, epochs = 5,
  pre_trained = NULL, max_runtime_secs = 0,
  export_checkpoints_dir = NULL)
Arguments
training_frame

Id of the training data frame.

model_id

Destination id for this model; auto-generated if not specified.

min_word_freq

Discards words that appear fewer than this many times. Defaults to 5.

word_model

Use the Skip-Gram model. Must be one of: "SkipGram". Defaults to SkipGram.

norm_model

Use Hierarchical Softmax. Must be one of: "HSM". Defaults to HSM.

vec_size

Set the size of the word vectors. Defaults to 100.

window_size

Set the maximum skip length between words. Defaults to 5.

sent_sample_rate

Set the threshold for word occurrence. Words that appear with higher frequency in the training data will be randomly down-sampled; useful range is (0, 1e-5). Defaults to 0.001.

init_learning_rate

Set the starting learning rate. Defaults to 0.025.

epochs

Number of training iterations to run. Defaults to 5.

pre_trained

Id of a data frame that contains a pre-trained (external) word2vec model.

max_runtime_secs

Maximum allowed runtime in seconds for model training. Use 0 to disable. Defaults to 0.

export_checkpoints_dir

Automatically export generated models to this directory.
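
Examples

The arguments above can be seen together in a minimal sketch such as the following. It is not taken from the package documentation: the toy corpus, the variable names, and the parameter values (for example min_word_freq = 1 and vec_size = 10) are assumptions chosen only so that a very small data set still produces a vocabulary. It assumes a local H2O cluster can be started with h2o.init(), and that the companion functions h2o.tokenize(), h2o.findSynonyms(), and h2o.transform() behave as in this release of the h2o package.

```r
library(h2o)
h2o.init()  # starts or connects to a local H2O cluster

# Hypothetical toy corpus; a real model needs far more text than this.
docs <- data.frame(
  text = c("the quick brown fox jumps over the lazy dog",
           "the lazy dog sleeps while the quick fox runs",
           "a brown dog and a brown fox play in the field"),
  stringsAsFactors = FALSE
)
docs_hf <- as.h2o(docs)

# h2o.word2vec() expects a single String column of words, so make sure the
# text column is a String (not a factor) and split it on spaces.
words <- h2o.tokenize(h2o.ascharacter(docs_hf$text), " ")

# Train with illustrative settings. min_word_freq = 1 keeps every word of the
# tiny corpus; sent_sample_rate = 0 turns off down-sampling of frequent words.
w2v <- h2o.word2vec(training_frame = words,
                    min_word_freq = 1,
                    vec_size = 10,
                    window_size = 3,
                    sent_sample_rate = 0,
                    epochs = 10)

# Words closest to "fox" in the learned vector space.
print(h2o.findSynonyms(w2v, "fox", count = 3))

# One averaged vector per document (documents are delimited by NA rows
# in the tokenized frame).
doc_vecs <- h2o.transform(w2v, words, aggregate_method = "AVERAGE")
print(dim(doc_vecs))
```

With a corpus this small the resulting vectors are not meaningful; the sketch only shows how the arguments fit together. On real data the defaults for min_word_freq, vec_size, and window_size are a more reasonable starting point.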

Aliases
  • h2o.word2vec
Documentation reproduced from package h2o, version 3.22.1.1, License: Apache License (== 2.0)
