Trains a fastText unsupervised (word vector) model, following the method described in Enriching Word Vectors with Subword Information, using the fasttext implementation.

See the FastText word representation tutorial for more information on training unsupervised models with fasttext.
Usage

build_vectors(
  documents,
  model_path,
  modeltype = c("skipgram", "cbow"),
  bucket = 2e+06,
  dim = 100,
  epoch = 5,
  label = "__label__",
  loss = c("ns", "hs", "softmax", "ova", "one-vs-all"),
  lr = 0.05,
  lrUpdateRate = 100,
  maxn = 6,
  minCount = 5,
  minn = 3,
  neg = 5,
  t = 1e-04,
  thread = 12,
  verbose = 2,
  wordNgrams = 1,
  ws = 5
)
Arguments

documents: character vector of documents used for training
model_path: name of the output file, without file extension
modeltype: should training use skipgram or cbow? Defaults to skipgram.
bucket: number of buckets
dim: size of the word vectors
epoch: number of epochs
label: text string, prefix for labels. Default is "__label__".
loss: loss function: ns (negative sampling), hs (hierarchical softmax), softmax, or ova (one-vs-all)
lr: learning rate
lrUpdateRate: rate of updates for the learning rate
maxn: maximum length of character ngrams
minCount: minimal number of word occurrences
minn: minimum length of character ngrams
neg: number of negatives sampled
t: sampling threshold
thread: number of threads
verbose: verbosity level
wordNgrams: maximum length of word ngrams
ws: size of the context window
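As an illustration, here is a minimal sketch of a call that overrides a few of these defaults; the toy corpus and the output name "my_cbow_model" are placeholders, not values from the original documentation:

library(fastrtext)

# tiny placeholder corpus; real training needs a much larger text collection
corpus <- c("the quick brown fox jumps over the lazy dog",
            "a fast brown dog leaps over a sleepy fox")

# train a cbow model with smaller vectors, more epochs, and a lower
# minCount so the toy vocabulary is not filtered out
model_file <- build_vectors(
  documents = corpus,
  model_path = "my_cbow_model",
  modeltype = "cbow",
  dim = 50,
  epoch = 10,
  minCount = 1
)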
Value

Path to the saved model file, as a character string.
Examples

# NOT RUN {
library(fastrtext)
text <- train_sentences
model_file <- build_vectors(text[['text']], 'my_model')
model <- load_model(model_file)
# }
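Once the model is loaded, word vectors can be retrieved from it; a sketch assuming fastrtext's get_dictionary() and get_word_vectors() helpers:

# NOT RUN {
# retrieve vectors for a few vocabulary words; assumes the fastrtext
# helpers get_dictionary() and get_word_vectors()
words <- head(get_dictionary(model), 2)
vectors <- get_word_vectors(model, words)
dim(vectors)  # one row per word, `dim` columns (100 by default)
# }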