Trains a word2vec model on a String column of an H2O data frame.
h2o.word2vec(
  training_frame = NULL,
  model_id = NULL,
  min_word_freq = 5,
  word_model = c("SkipGram"),
  norm_model = c("HSM"),
  vec_size = 100,
  window_size = 5,
  sent_sample_rate = 0.001,
  init_learning_rate = 0.025,
  epochs = 5,
  pre_trained = NULL,
  max_runtime_secs = 0,
  export_checkpoints_dir = NULL
)
training_frame: Id of the training data frame.
model_id: Destination id for this model; auto-generated if not specified.
min_word_freq: Discard words that appear fewer than this many times. Defaults to 5.
word_model: Use the Skip-Gram model. Must be one of: "SkipGram". Defaults to SkipGram.
norm_model: Use Hierarchical Softmax. Must be one of: "HSM". Defaults to HSM.
vec_size: Size of the word vectors. Defaults to 100.
window_size: Maximum skip length between words. Defaults to 5.
sent_sample_rate: Threshold for occurrence of words. Words that appear with higher frequency in the training data will be randomly down-sampled; useful range is (0, 1e-5). Defaults to 0.001.
init_learning_rate: Starting learning rate. Defaults to 0.025.
epochs: Number of training iterations to run. Defaults to 5.
pre_trained: Id of a data frame that contains a pre-trained (external) word2vec model.
max_runtime_secs: Maximum allowed runtime in seconds for model training. Use 0 to disable. Defaults to 0.
export_checkpoints_dir: Automatically export generated models to this directory.
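
The following is a minimal sketch of how these arguments might be used together, not an official example. It assumes the h2o package is installed and a local H2O cluster can be started. The toy corpus, the object names (sentences, text_frame, words, w2v_model) and the parameter values are hypothetical and chosen only so that the tiny vocabulary survives the min_word_freq filter; with so little data the resulting vectors are not meaningful.

```r
library(h2o)
h2o.init()  # assumption: a local H2O cluster can be started or is already running

# Hypothetical toy corpus, one sentence per row
sentences <- data.frame(
  text = c(
    "the quick brown fox jumps over the lazy dog",
    "the lazy dog sleeps while the quick fox runs",
    "a brown dog and a brown fox play in the field"
  ),
  stringsAsFactors = FALSE
)

# Upload to H2O and make sure the column is of type String
text_frame <- as.h2o(sentences)
text_frame$text <- h2o.ascharacter(text_frame$text)

# Split each sentence on spaces into a single column of words
words <- h2o.tokenize(text_frame$text, " ")

# Train a small model; min_word_freq = 1 keeps every word of the toy corpus
w2v_model <- h2o.word2vec(
  training_frame = words,
  min_word_freq = 1,
  vec_size = 10,
  window_size = 3,
  epochs = 5
)

# Look up the nearest words to "fox" in the learned vector space
synonyms <- h2o.findSynonyms(w2v_model, "fox", count = 3)
print(synonyms)
```

On a real corpus the defaults (min_word_freq = 5, vec_size = 100) are usually a more sensible starting point than the reduced values used here.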