lsa (version 0.73.3)

as.textmatrix: Display a latent semantic space generated by Latent Semantic Analysis (LSA)


Description

Returns a latent semantic space (created by lsa()) in textmatrix format: rows are terms, columns are documents.


Usage

as.textmatrix(LSAspace)



Arguments

LSAspace
    a latent semantic space generated by lsa().



Value

A textmatrix representation of the latent semantic space.


Details

To allow comparisons between terms and documents, the internal format of the latent semantic space needs to be converted to a classical document-term matrix (just like the ones generated by textmatrix(), which are of class 'textmatrix').
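The conversion described above can be sketched in base R without the package. This is an illustration under stated assumptions, not the package's own code: the space is assumed to hold truncated SVD factors, here called tk (term vectors), sk (singular values) and dk (document vectors), and the helper name rebuild is hypothetical. The toy matrix mirrors the four example documents.

```r
# Toy term-document matrix (rows = terms, columns = documents).
X <- matrix(
  c(1, 0, 0, 0,   # cat
    1, 0, 1, 2,   # dog
    0, 1, 0, 0,   # hamster
    0, 0, 2, 0,   # monster
    1, 1, 0, 1,   # mouse
    0, 1, 0, 0),  # sushi
  nrow = 6, byrow = TRUE,
  dimnames = list(c("cat", "dog", "hamster", "monster", "mouse", "sushi"),
                  c("D1", "D2", "D3", "D4"))
)

# Hypothetical helper: rebuild a rank-k matrix from truncated SVD factors.
rebuild <- function(X, k) {
  s  <- svd(X)
  tk <- s$u[, 1:k, drop = FALSE]  # assumed term vectors
  sk <- s$d[1:k]                  # assumed singular values
  dk <- s$v[, 1:k, drop = FALSE]  # assumed document vectors
  Y  <- tk %*% diag(sk, nrow = k) %*% t(dk)
  dimnames(Y) <- dimnames(X)
  Y
}

full    <- rebuild(X, k = 4)  # all dimensions kept
reduced <- rebuild(X, k = 2)  # truncated space

round(full, 2)     # agrees with X up to floating-point error
round(reduced, 2)  # rank-2 approximation, generally different from X
```

Keeping every dimension reproduces the input, while truncating yields the smoothed matrix on which term and document comparisons are then made.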

Remark: Documents and terms can also be compared directly, using the partial matrices of an LSA space without converting it first. See Berry et al. (1995) for more information.
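One such direct route can be sketched in base R. The sketch assumes the partial matrices are the truncated SVD factors (called tk, sk and dk here) and uses a hypothetical helper cos_sim. Because the columns of tk are orthonormal, the cosine between two documents in the rebuilt matrix should equal the cosine between the corresponding rows of dk scaled by sk.

```r
# Toy term-document matrix (rows = terms, columns = documents).
X <- matrix(
  c(1, 0, 0, 0,   # cat
    1, 0, 1, 2,   # dog
    0, 1, 0, 0,   # hamster
    0, 0, 2, 0,   # monster
    1, 1, 0, 1,   # mouse
    0, 1, 0, 0),  # sushi
  nrow = 6, byrow = TRUE
)

k  <- 2
s  <- svd(X)
tk <- s$u[, 1:k, drop = FALSE]  # assumed term vectors
sk <- s$d[1:k]                  # assumed singular values
dk <- s$v[, 1:k, drop = FALSE]  # assumed document vectors

# Hypothetical helper: plain cosine similarity of two numeric vectors.
cos_sim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Route 1: rebuild the full matrix, then compare two document columns.
Y <- tk %*% diag(sk, nrow = k) %*% t(dk)
via_matrix <- cos_sim(Y[, 1], Y[, 4])

# Route 2: compare the k-dimensional document vectors directly.
docs <- dk %*% diag(sk, nrow = k)
via_factors <- cos_sim(docs[1, ], docs[4, ])

all.equal(via_matrix, via_factors)
```

The second route works on vectors of length k rather than on one entry per term, which matters once the vocabulary is large.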


References

Berry, M., Dumais, S., and O'Brien, G. (1995) Using Linear Algebra for Intelligent Information Retrieval. In: SIAM Review, Vol. 37(4), pp. 573-595.

See Also

textmatrix, lsa, fold_in


Examples
# create some files
td = tempfile()
dir.create(td)  # tempfile() only returns a path; the directory must exist before writing
write( c("dog", "cat", "mouse"), file=paste(td, "D1", sep="/"))
write( c("hamster", "mouse", "sushi"), file=paste(td, "D2", sep="/"))
write( c("dog", "monster", "monster"), file=paste(td, "D3", sep="/"))
write( c("dog", "mouse", "dog"), file=paste(td, "D4", sep="/"))

# read files into a document-term matrix
myMatrix = textmatrix(td, minWordLength=1)

# create the latent semantic space
myLSAspace = lsa(myMatrix, dims=dimcalc_raw()) 

# display it as a textmatrix again
round(as.textmatrix(myLSAspace),2) # should give the original

# create the latent semantic space again, this time with a reduced number of dimensions
myLSAspace = lsa(myMatrix, dims=dimcalc_share()) 

# display it as a textmatrix again
myNewMatrix = as.textmatrix(myLSAspace) 
myNewMatrix # should look different!

# compare two terms with the cosine measure
cosine(myNewMatrix["dog",], myNewMatrix["cat",])

# compare two documents with pearson
cor(myNewMatrix[,1], myNewMatrix[,2], method="pearson")

# clean up
unlink(td, recursive=TRUE)
