
This package offers various tools for semantic vector spaces. There are techniques for correspondence analysis (simple, multiple and discriminant), latent semantic analysis, probabilistic latent semantic analysis, non-negative matrix factorization, latent class analysis, EM clustering, logratio analysis and log-multiplicative (association) analysis. Furthermore, the package has specialized distance measures and plotting functions as well as some helper functions.
This package contains the following raw data files (in the folder extdata):
SndT_Fra.txt
Seventeen Dutch source words and their French translations.
SndT_Eng.txt
Seventeen Dutch source words and their English translations.
InvT_Fra.txt
Seventeen Dutch target words and their French source words.
InvT_Eng.txt
Seventeen Dutch target words and their English source words.
Ctxt_Dut.txt
Context words for seventeen Dutch words.
Ctxt_Fra.txt
Context words for seventeen Dutch words translated from French.
Ctxt_Eng.txt
Context words for seventeen Dutch words translated from English.
The techniques in this package, each implemented as a fast procedure, are:
fast_sca
Simple correspondence analysis.
fast_mca
Multiple correspondence analysis.
fast_dca
Discriminant correspondence analysis.
fast_lsa
Latent semantic analysis.
fast_psa
Probabilistic latent semantic analysis.
fast_nmf
Non-negative matrix factorization.
fast_lca
Latent class analysis.
fast_E_M
EM clustering.
fast_lra
Logratio analysis.
fast_lma
Log-multiplicative (association) analysis.
A complete overview of the local and global weighting functions in this package can be found under weighting_functions.
The specialized distance measures are:
dist_chisquare
Chi-square distance.
dist_cosine
Cosine distance.
dist_wrt
Distance with respect to a certain point.
dist_wrt_centers
Distance with respect to cluster centers.
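To make the first two measures concrete, the following is a minimal, language-neutral sketch in plain Python. It is not the package's R code. It assumes the textbook definitions: the chi-square distance between two row profiles of a frequency table, weighted by the inverse column masses as in correspondence analysis, and the cosine distance as one minus the cosine similarity. Whether dist_chisquare and dist_cosine follow exactly these conventions is an assumption, and the function names below are hypothetical.

```python
import math


def chisquare_distance(table, i, k):
    """Chi-square distance between rows i and k of a frequency table.

    Assumed (textbook) definition: Euclidean distance between the two
    row profiles, with each squared difference divided by the column mass.
    Requires positive row totals for i and k and positive column totals.
    """
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    # Column masses: column totals as proportions of the grand total.
    col_mass = [sum(row[j] for row in table) / total for j in range(n_cols)]
    # Row profiles: each row divided by its own total.
    prof_i = [x / sum(table[i]) for x in table[i]]
    prof_k = [x / sum(table[k]) for x in table[k]]
    return math.sqrt(
        sum((a - b) ** 2 / c for a, b, c in zip(prof_i, prof_k, col_mass))
    )


def cosine_distance(u, v):
    """Cosine distance, assumed here to be 1 minus the cosine similarity.

    Requires both vectors to be non-zero.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical toy table: rows 0 and 1 have identical profiles.
toy_table = [[10, 20], [20, 40], [30, 10]]
d_same = chisquare_distance(toy_table, 0, 1)
d_diff = chisquare_distance(toy_table, 0, 2)
```

Under these definitions, rows with proportional frequencies (rows 0 and 1 of the toy table) are at chi-square distance zero, because the measure compares profiles rather than raw counts. Likewise, the cosine distance ignores vector length and depends only on direction.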
The specialized plotting functions are:
There are two helper functions for correspondence analysis:
freq_ca
Compute level frequencies (for a factor).
centers_ca
Compute coordinates for cluster centers.
There is one helper function for pvclust:
complete_pvpick
Complete the output of pvpick.
The remaining helper functions in this package are:
Correspondence analysis is also available in many other packages, including ca, FactoMineR and MASS.
For latent semantic analysis there is also the package lsa.
The package NMF provides more flexibility for non-negative matrix factorization.
For topic models there are the packages lda and topicmodels.
Latent class analysis can also be run in the package poLCA.
Koen Plevoets, koen.plevoets@ugent.be
This package has benefited greatly from the helpful comments of Lore Vandevoorde, Pauline De Baets and Gert De Sutter. Thanks to Kurt Hornik, Uwe Ligges and Brian Ripley for their valuable recommendations when proofing this package.