h2o (version 3.10.3.6)

h2o.word2vec: Trains a word2vec model on a String column of an H2O data frame.

Description

Trains a word2vec model on a String column of an H2O data frame.

Usage

h2o.word2vec(training_frame, model_id = NULL, min_word_freq = 5,
  word_model = c("SkipGram"), norm_model = c("HSM"), vec_size = 100,
  window_size = 5, sent_sample_rate = 0.001, init_learning_rate = 0.025,
  epochs = 5)
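
A minimal sketch of a call with this signature, not a definitive recipe. It assumes a reachable H2O cluster and a training frame holding one token per row in a single string column. The helper names `tokens_to_column` and `train_word2vec` are hypothetical, and the `h2o.ascharacter()` step is an assumption, included in case `as.h2o()` parses the column as a categorical rather than a String.

```r
# Hypothetical helper (base R only): split sentences into a one-column
# data frame with one lowercase token per row.
tokens_to_column <- function(sentences) {
  tokens <- unlist(strsplit(tolower(sentences), "[^a-z]+"))
  tokens <- tokens[nzchar(tokens)]  # drop empty strings left by leading separators
  data.frame(word = tokens, stringsAsFactors = FALSE)
}

# Hypothetical wrapper: requires the h2o package and a running cluster.
# Every argument passed to h2o.word2vec is taken from the Usage signature;
# the values shown are the documented defaults unless overridden.
train_word2vec <- function(words_df, min_word_freq = 5, vec_size = 100,
                           epochs = 5) {
  h2o::h2o.init()
  words_hf <- h2o::as.h2o(words_df)
  # Assumption: coerce to String in case the column was parsed as enum.
  words_hf <- h2o::h2o.ascharacter(words_hf)
  h2o::h2o.word2vec(training_frame = words_hf,
                    min_word_freq = min_word_freq,
                    word_model = "SkipGram",
                    norm_model = "HSM",
                    vec_size = vec_size,
                    window_size = 5,
                    sent_sample_rate = 0.001,
                    init_learning_rate = 0.025,
                    epochs = epochs)
}

# The tokenising step runs without H2O:
words_df <- tokens_to_column(c("The cat sat.", "A dog ran!"))
```

On a toy corpus like this one, `train_word2vec(words_df, min_word_freq = 1)` would be the likely call, since the default of 5 would discard every word.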

Arguments

training_frame
Id of the training data frame. Not required, so that model parameters can be validated before a frame is supplied.
model_id
Destination id for this model; auto-generated if not specified.
min_word_freq
Discards words that appear fewer than this many times. Defaults to 5.
word_model
Use the Skip-Gram model. Must be one of: "SkipGram". Defaults to SkipGram.
norm_model
Use Hierarchical Softmax. Must be one of: "HSM". Defaults to HSM.
vec_size
Set the size of the word vectors. Defaults to 100.
window_size
Set the maximum skip length between words. Defaults to 5.
sent_sample_rate
Set the threshold for occurrence of words. Words that appear with higher frequency in the training data will be randomly down-sampled; useful range is (0, 1e-5). Defaults to 0.001.
init_learning_rate
Set the starting learning rate. Defaults to 0.025.
epochs
Number of training iterations to run. Defaults to 5.
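
The effect of `min_word_freq` and `sent_sample_rate` can be illustrated in base R. This is a sketch under stated assumptions: the helper names `filter_rare_words` and `keep_probability` are hypothetical, and the down-sampling rule is the one used by the original word2vec C tool. Whether H2O applies exactly this formula is an assumption.

```r
# Hypothetical helper mirroring min_word_freq: keep only tokens whose
# corpus count is at least min_word_freq.
filter_rare_words <- function(tokens, min_word_freq = 5) {
  counts <- table(tokens)
  keep <- names(counts)[counts >= min_word_freq]
  tokens[tokens %in% keep]
}

# Keep probability from the original word2vec C tool (assumed here, not
# confirmed for H2O): p = (sqrt(f / t) + 1) * t / f, capped at 1, where
# f is the word's relative frequency and t is sent_sample_rate.
keep_probability <- function(rel_freq, sent_sample_rate = 0.001) {
  p <- (sqrt(rel_freq / sent_sample_rate) + 1) * sent_sample_rate / rel_freq
  pmin(p, 1)
}

tokens <- c(rep("the", 6), rep("cat", 5), rep("zebra", 2))
kept <- filter_rare_words(tokens)  # "zebra" occurs 2 times, below 5, so it is dropped

# Rare words are always kept; frequent words are kept less often.
probs <- keep_probability(c(0.0005, 0.01, 0.1))
```

Under this rule, a word with relative frequency at or below the threshold is never dropped, and the keep probability falls as frequency rises above it.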