An auxiliary function for defining the control variables.
ft_control(
loss = c("softmax", "hs", "ns"),
learning_rate = 0.05,
learn_update = 100L,
word_vec_size = 100L,
window_size = 5L,
epoch = 5L,
min_count = 5L,
min_count_label = 0L,
neg = 5L,
max_len_ngram = 1L,
nbuckets = 2000000L,
min_ngram = 3L,
max_ngram = 6L,
nthreads = 1L,
threshold = 1e-04,
label = "__label__",
verbose = 0,
pretrained_vectors = "",
output = "",
save_output = FALSE,
seed = 0L,
qnorm = FALSE,
retrain = FALSE,
qout = FALSE,
cutoff = 0L,
dsub = 2L,
autotune_validation_file = "",
autotune_metric = "f1",
autotune_predictions = 1L,
autotune_duration = 300L,
autotune_model_size = ""
)
a list with the control variables.
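A minimal usage sketch follows. It assumes that ft_control() is provided by the fastTextR package (the package name is not stated in this section) and that the package is installed. Only arguments from the signature above are used; the chosen values are illustrative, not recommendations.

```r
# Assumption: ft_control() is exported by the fastTextR package.
library(fastTextR)

# Override a few defaults; every argument not named keeps its default value.
ctrl <- ft_control(
  loss = "hs",            # one of "softmax", "hs", "ns"
  learning_rate = 0.1,
  word_vec_size = 50L,
  epoch = 10L,
  nthreads = 2L,
  seed = 42L
)

# The result is a list of control variables; inspect its structure.
str(ctrl)
```

Integer-valued arguments are written with the L suffix here to match the defaults shown in the signature.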
loss
a character string giving the name of the loss function; allowed values are 'softmax', 'hs' and 'ns'.
learning_rate
a numeric giving the learning rate; the default value is 0.05.
learn_update
an integer giving the number of tokens after which the learning rate is updated. The default value is 100L, which means the learning rate is updated every 100 tokens.
word_vec_size
an integer giving the length (size) of the word vectors.
window_size
an integer giving the size of the context window.
epoch
an integer giving the number of epochs.
min_count
an integer giving the minimal number of word occurrences.
min_count_label
an integer giving the minimal number of label occurrences.
neg
an integer giving how many negatives are sampled (only used if loss is "ns").
max_len_ngram
an integer giving the maximum length of ngrams used.
nbuckets
an integer giving the number of buckets.
min_ngram
an integer giving the minimal ngram length.
max_ngram
an integer giving the maximal ngram length.
nthreads
an integer giving the number of threads.
threshold
a numeric giving the sampling threshold.
label
a character string specifying the label prefix (default is '__label__').
verbose
an integer giving the verbosity level. The default value is 0L and should not be changed, since Rcpp::Rcout cannot handle the traffic.
pretrained_vectors
a character string giving the file path to the pretrained word vectors that are used for supervised learning.
output
a character string giving the output file path.
save_output
a logical (default is FALSE).
seed
an integer.
qnorm
a logical (default is FALSE).
retrain
a logical (default is FALSE).
qout
a logical (default is FALSE).
cutoff
an integer (default is 0L).
dsub
an integer (default is 2L).
autotune_validation_file
a character string.
autotune_metric
a character string (default is "f1").
autotune_predictions
an integer (default is 1L).
autotune_duration
an integer (default is 300L).
autotune_model_size
a character string.
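The interaction between the learning rate and the token-based update interval described above can be illustrated with a small sketch. It assumes that the rate decays linearly with training progress, as in the upstream fastText library; this section does not state the decay rule. The helper lr_schedule() and its arguments tokens_seen and total_tokens are hypothetical and not part of the package, and the sketch ignores multi-threading.

```r
# Hypothetical helper (not part of the package): a linearly decayed
# learning rate that is recomputed only every `learn_update` tokens.
# Assumption: linear decay with progress, single-threaded training.
lr_schedule <- function(tokens_seen, total_tokens,
                        learning_rate = 0.05, learn_update = 100L) {
  # Token count at the most recent update point.
  last_update <- (tokens_seen %/% learn_update) * learn_update
  # Fraction of the total training tokens covered at that point.
  progress <- last_update / total_tokens
  learning_rate * (1 - progress)
}

# total_tokens stands for (number of epochs) x (tokens in the corpus).
rates <- lr_schedule(c(0, 99, 100, 250, 1000), total_tokens = 1000)
print(rates)
```

Under these assumptions the rate stays at its initial value until 100 tokens have been processed and then drops in steps, so a larger update interval gives coarser steps.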