
LSAfun (version 0.6.2)

multidocs: Comparison of sentence sets

Description

Computes cosine values between sets of sentences and/or documents

Usage

multidocs(x,y=x,chars=10,tvectors=tvectors,breakdown=FALSE)

Arguments

x

a character vector containing different sentences/documents

y

a character vector containing different sentences/documents (y = x by default)

chars

an integer specifying how many letters (starting from the first) of each sentence/document are to be printed in the row.names and col.names of the output matrix

tvectors

the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector)

breakdown

if TRUE, the function breakdown is applied to the input

Value

A list of three elements:

cosmat

A numeric matrix giving the cosines between the input sentences/documents

xdocs

A legend for the row.names of cosmat

ydocs

A legend for the col.names of cosmat

Details

In the traditional LSA approach, the vector D for a document (or a sentence) consisting of the words (t_1, ..., t_n) is computed as the sum of its word vectors: $$D = \sum\limits_{i=1}^n t_i$$

This function computes the cosines between two sets of documents (or sentences). Both x and y should be character vectors with one sentence or document per element, for example x <- c("this is the first text","here is another text")
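The computation described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's actual implementation: the toy_space matrix and its values, and the helper names doc_vector and cosine_sim, are invented for this example, and the tokenization shown (lowercasing and splitting on non-letters) is only one plausible choice.

```r
# Hypothetical toy semantic space: every row is a word vector.
# The values are made up for illustration only.
toy_space <- matrix(
  c(1, 0, 0,
    0, 1, 0,
    0, 0, 1,
    1, 1, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("alice", "queen", "hatter", "tea"), NULL)
)

# Document vector D: the sum of the vectors of all words that
# occur in the semantic space (unknown words are skipped).
doc_vector <- function(text, space) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words <- words[words %in% rownames(space)]
  colSums(space[words, , drop = FALSE])
}

# Cosine between two numeric vectors.
cosine_sim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

x <- c("Alice greeted the queen", "The hatter drank tea")
y <- c("Alice had tea", "The queen")

# Rows correspond to the elements of x, columns to the elements of y.
cosmat <- sapply(y, function(dy) {
  sapply(x, function(dx) {
    cosine_sim(doc_vector(dx, toy_space), doc_vector(dy, toy_space))
  })
})

print(round(cosmat, 3))
```

With a real semantic space such as wonderland, the same two steps (summing word vectors per document, then taking pairwise cosines) would apply; multidocs additionally shortens the row and column labels according to chars and returns the legends xdocs and ydocs.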

References

Landauer, T.K., & Dumais, S.T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104, 211-240.

Dennis, S. (2007). How to use the LSA Web Site. In T. K. Landauer, D. S. McNamara, S. Dennis, & W. Kintsch (Eds.), Handbook of Latent Semantic Analysis (pp. 35-56). Mahwah, NJ: Erlbaum.

http://lsa.colorado.edu/

See Also

cosine, Cosine, multicos, costring

Examples

# NOT RUN {
data(wonderland)
multidocs(x = c("Alice was beginning to get very tired.",
                "The red queen greeted Alice."),
          y = c("The mad hatter and the march hare are having a party.",
                "The hatter sliced the cup of tea in half."), 
          tvectors = wonderland)
# }
