text2vec (version 0.4.0)

LatentSemanticAnalysis: Latent Semantic Analysis model

Description

Creates an LSA (latent semantic analysis) model. See https://en.wikipedia.org/wiki/Latent_semantic_analysis for details.

Usage

LatentSemanticAnalysis

LSA

Format

R6Class object.

Fields

verbose

logical = TRUE. Whether to display training information.

Usage

For usage details see Methods, Arguments and Examples sections.

lsa = LatentSemanticAnalysis$new(n_topics)
lsa$fit_transform(x)
lsa$get_word_vectors()

Methods

$new(n_topics)

create LSA model with n_topics latent topics

$fit(x, ...)

fit model to an input document-term matrix (DTM), preferably in "dgCMatrix" format

$fit_transform(x, ...)

fit model to an input sparse matrix (preferably in "dgCMatrix" format) and then transform x to latent space

$transform(x, ...)

transform new data x to latent space
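
The idea behind fit, fit_transform and transform can be sketched in base R with a truncated SVD. This is only an illustration of LSA on a made-up toy matrix, not text2vec's implementation: the package uses irlba as its SVD backend, and the variable names (doc_topics, new_doc_topics) and the choice to scale document coordinates by the singular values are assumptions made for this sketch.

```r
# Toy document-term matrix: 4 documents (rows) x 5 terms (columns).
# The counts are invented purely for illustration.
dtm <- matrix(
  c(2, 1, 0, 0, 0,
    1, 2, 1, 0, 0,
    0, 0, 1, 2, 1,
    0, 0, 0, 1, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), paste0("term", 1:5))
)

n_topics <- 2

# "fit": truncated SVD, keeping only the first n_topics components
s <- svd(dtm, nu = n_topics, nv = n_topics)
d <- s$d[seq_len(n_topics)]  # leading singular values
U <- s$u                     # documents x topics
V <- s$v                     # terms x topics

# "fit_transform": coordinates of the training documents in latent space
# (assumed convention: left singular vectors scaled by singular values)
doc_topics <- U %*% diag(d, nrow = n_topics)

# "transform": project an unseen document onto the fitted term-topic basis
new_doc <- matrix(c(0, 0, 1, 1, 1), nrow = 1)
new_doc_topics <- new_doc %*% V

# Projecting the training documents the same way reproduces doc_topics,
# because dtm %*% V equals U %*% diag(d) for the retained components.
all.equal(unname(dtm %*% V), doc_topics)
```

Under this convention, transforming the data the model was fitted on gives the same result as fit_transform, which is the property the Examples section checks with all.equal(d1, d2).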

Arguments

lsa

An LSA object.

x

An input document-term matrix.

n_topics

integer. Desired number of latent topics.

...

Arguments to internal functions. Most useful for fit() and fit_transform(): these arguments are passed on to the irlba function, which is used as the backend for SVD.

Examples

# NOT RUN {
library(text2vec)
library(magrittr)  # for %>%

data("movie_review")
N = 100
# lowercase and tokenize the first N reviews
tokens = movie_review$review[1:N] %>% tolower %>% word_tokenizer
# build a document-term matrix using feature hashing
dtm = create_dtm(itoken(tokens), hash_vectorizer())
n_topics = 10

# fit, then transform, in two steps
lsa_1 = LatentSemanticAnalysis$new(n_topics)
fit(dtm, lsa_1) # or lsa_1$fit(dtm)
d1 = lsa_1$transform(dtm)

# fit and transform in one step
lsa_2 = LatentSemanticAnalysis$new(n_topics)
d2 = lsa_2$fit_transform(dtm)
all.equal(d1, d2)

# the same, but wrapped with S3 methods
all.equal(fit_transform(dtm, lsa_2), fit_transform(dtm, lsa_1))
# }