R.temis (version 0.1.3)

corpus_ca: corpus_ca

Description

Run a correspondence analysis on a corpus.

Usage

corpus_ca(corpus, dtm, variables = NULL, ncp = 5, sparsity = 1, ...)

Arguments

corpus

A Corpus object.

dtm

A DocumentTermMatrix object corresponding to corpus with one row per document.

variables

An optional list of variables in meta(corpus) over which to aggregate dtm. If NULL (the default), the analysis is run on the unaggregated matrix.

ncp

The number of axes to compute (5 by default). Note that this determines the number of axes that will be used for clustering by HCPC. Pass Inf to compute all axes.

sparsity

A value between 0 and 1. A term is dropped when the proportion of documents in which it never occurs exceeds this value. By default (sparsity=1) all terms are kept.

...

Additional arguments passed to FactoMineR::CA.
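
The effect of the sparsity and variables arguments can be pictured with a small base R sketch. This is only an illustration of the two ideas as described above, not the code corpus_ca runs: a plain matrix stands in for the DocumentTermMatrix, and the threshold, the grouping vector and all object names are made up for the example. The two steps are shown independently, since the order in which corpus_ca applies them, its handling of values exactly at the threshold, and its treatment of several variables are not stated here.

```r
# Toy document-term counts: 4 documents (rows) x 3 terms (columns).
# A plain matrix is used here in place of a DocumentTermMatrix.
m <- matrix(c(2, 0, 1,
              1, 0, 0,
              0, 3, 0,
              4, 0, 2),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("doc", 1:4), c("oil", "gold", "price")))

# sparsity: proportion of documents in which each term never occurs
empty_share <- colMeans(m == 0)

# Hypothetical threshold: keep terms whose empty share does not exceed it
sparsity <- 0.6
kept <- m[, empty_share <= sparsity, drop = FALSE]
colnames(kept)

# variables: sum the rows of the matrix within each level of a
# (hypothetical) document-level metadata variable
group <- c("A", "A", "B", "B")
agg <- rowsum(m, group)
agg
```

In this toy case "gold" is missing from three documents out of four, so it is the only term removed at a threshold of 0.6, and the aggregated matrix has one row per level of the grouping variable instead of one row per document.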

Value

A CA object containing the correspondence analysis results.
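
As a rough guide to what a CA object holds, the sketch below calls FactoMineR::CA directly on a made-up contingency table and looks at the components most often used. It assumes the object returned by corpus_ca has the same layout as any other FactoMineR CA result. The table and the object names are hypothetical.

```r
library(FactoMineR)

# Made-up contingency table standing in for a document-term matrix
tab <- as.data.frame(matrix(c(10, 2, 3,
                               4, 8, 1,
                               2, 3, 9),
                            nrow = 3, byrow = TRUE,
                            dimnames = list(c("A", "B", "C"),
                                            c("oil", "gold", "price"))))

# graph = FALSE suppresses the default plot
res <- CA(tab, ncp = 2, graph = FALSE)

res$eig          # eigenvalue and share of variance for each axis
res$row$coord    # coordinates of the rows on the computed axes
res$col$coord    # coordinates of the columns (terms) on the computed axes
```

The number of columns in the coordinate matrices matches the ncp value passed to CA.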

Examples

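# NOT RUN {
library(R.temis)

# Sample Factiva export shipped with the tm.plugin.factiva package
file <- system.file("texts", "reut21578-factiva.xml", package="tm.plugin.factiva")
corpus <- import_corpus(file, "factiva", language="en")
dtm <- build_dtm(corpus)

# Compute 3 axes, dropping terms absent from more than 98% of documents
corpus_ca(corpus, dtm, ncp=3, sparsity=0.98)
# }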
