svs (version 3.0.0)

svs-package: Tools for Semantic Vector Spaces

Description

This package offers various tools for semantic vector spaces. There are techniques for correspondence analysis (simple, multiple and discriminant), latent semantic analysis, probabilistic latent semantic analysis, non-negative matrix factorization, latent class analysis, EM clustering, logratio analysis and log-multiplicative (association) analysis. Furthermore, the package has specialized distance measures and plotting functions as well as some helper functions.

Contents

This package contains the following raw data files (in the folder extdata):

  • SndT_Fra.txt Seventeen Dutch source words and their French translations.

  • SndT_Eng.txt Seventeen Dutch source words and their English translations.

  • InvT_Fra.txt Seventeen Dutch target words and their French source words.

  • InvT_Eng.txt Seventeen Dutch target words and their English source words.

  • Ctxt_Dut.txt Context words for seventeen Dutch words.

  • Ctxt_Fra.txt Context words for seventeen Dutch words translated from French.

  • Ctxt_Eng.txt Context words for seventeen Dutch words translated from English.
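As a minimal sketch, the raw data files listed above can be located with base R's system.file(). The internal layout of the files (separator, header, quoting) is not described on this page, so the example only prints the first few lines rather than assuming a parsing call.

```r
# Locate one of the raw data files in the extdata folder of svs.
# system.file() returns "" when the package or the file is not installed.
path <- system.file("extdata", "SndT_Fra.txt", package = "svs")

if (nzchar(path)) {
  # Peek at the first lines before deciding how to parse the file;
  # the file layout is an unknown here, so no read.table() call is assumed.
  print(readLines(path, n = 3, encoding = "UTF-8"))
} else {
  message("svs (or this file) is not installed; nothing to read.")
}
```

Once the layout is known, the same path can be passed to read.table() or a similar reader.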

The (fast procedures for the) techniques in this package are:

The complete overview of the local and global weighting functions in this package is given on the weighting_functions help page.

The specialized distance measures are:

The specialized plotting functions are:

  • cd_plot Cumulative distribution plot.

  • pc_plot Parallel coordinate plot.

There are two helper functions for correspondence analysis:

  • freq_ca Compute level frequencies (for a factor).

  • centers_ca Compute coordinates for cluster centers.

There is one helper function for pvclust:

The remaining helper functions in this package are:

  • vec2ind Transform a vector into an indicator matrix.

  • tab2dat Transform a table into a data frame.

  • tab2ind Transform a table into an indicator matrix.

  • outerec Recursive application of the outer product.

  • pmi Pointwise mutual information.

  • MI Mutual information.

  • log_or_0 Logarithmic transform.
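To illustrate the quantities behind some of these helpers, here is a minimal base R sketch of an indicator matrix, pointwise mutual information and mutual information. It deliberately does not call vec2ind, pmi or MI, whose exact signatures and conventions (for example the logarithm base or the handling of zero counts) are not given on this page. The toy data and all object names are invented for illustration.

```r
# Hypothetical toy data (not taken from the package).
f <- factor(c("a", "b", "a", "c"))

# Indicator matrix: one row per observation, one column per level,
# with a 1 in the column of the observed level and 0 elsewhere.
ind <- 1 * outer(as.character(f), levels(f), "==")
colnames(ind) <- levels(f)

# Hypothetical 2 x 2 contingency table without zero cells.
tab <- matrix(c(10, 2, 3, 15), nrow = 2, byrow = TRUE,
              dimnames = list(word = c("w1", "w2"), context = c("x", "y")))

# Joint and marginal relative frequencies.
p_joint <- tab / sum(tab)
p_row <- rowSums(p_joint)
p_col <- colSums(p_joint)

# Pointwise mutual information per cell: log of observed over expected
# proportion under independence (natural logarithm is an assumption here).
pmi_manual <- log(p_joint / outer(p_row, p_col))

# Mutual information: expectation of the cellwise PMI under the joint
# distribution. Cells with zero counts would need special treatment.
mi_manual <- sum(p_joint * pmi_manual)
```

Each row of ind sums to 1, and mi_manual is non-negative, being zero only when rows and columns are independent.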

Further reference

  • Many packages contain correspondence analysis: ca, FactoMineR, MASS and others.

  • For latent semantic analysis there is also the package lsa.

  • The package NMF provides more flexibility for non-negative matrix factorization.

  • For topic models there are the packages lda and topicmodels.

  • Latent class analysis can also be run in the package poLCA.

Author

Koen Plevoets, koen.plevoets@ugent.be

Acknowledgements

This package has benefited greatly from the helpful comments of Lore Vandevoorde, Pauline De Baets and Gert De Sutter. Thanks to Kurt Hornik, Uwe Ligges and Brian Ripley for their valuable recommendations when proofing this package.