corpus_reshape

corpus whose document units will be reshaped

new document units in which the corpus will be recast

if <code>TRUE</code>, repeat the docvar values for each
segmented text; if <code>FALSE</code>, drop the docvars from the segmented corpus.
Dropping the docvars can conserve space, or is useful when they
are not wanted in the segmented corpus.

use_docvars

additional arguments passed to <code><a rd-options="=tokens" href="/link/tokens()?package=quanteda&version=2.1.2&to=%3Dtokens" data-mini-rdoc="=tokens::tokens()">tokens()</a></code>, since the
syntactic segmenter uses this function

For a corpus, reshape (or recast) the documents to a different level of aggregation.
Units of aggregation can be defined as documents, paragraphs, or sentences.
Because the corpus object records its current "units" status, it is possible
to move from recast units back to the original units, for example from documents
to sentences and then back to documents (possibly after modifying the sentences).
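
The round trip described above can be sketched as follows. This is a minimal illustration, assuming the quanteda package is installed; the two-document corpus and the object names `corp`, `corp_sent`, and `corp_doc` are made up for the example and do not come from the package documentation.

```r
library(quanteda)

# Hypothetical two-document corpus with one document variable
corp <- corpus(
  c(doc1 = "This is the first sentence. This is the second sentence.",
    doc2 = "Here is another document. It also has two sentences."),
  docvars = data.frame(source = c("a", "b"))
)

# Recast so that each sentence becomes its own document unit
corp_sent <- corpus_reshape(corp, to = "sentences")
print(ndoc(corp_sent))

# Recast back to the original document units
corp_doc <- corpus_reshape(corp_sent, to = "documents")
print(ndoc(corp_doc))
```

Comparing `ndoc()` before and after each call is a quick way to confirm which unit of aggregation the corpus currently holds.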

corpus

A fast, flexible, and comprehensive framework for
quantitative text analysis in R.  Provides functionality for corpus management,
creating and manipulating tokens and ngrams, exploring keywords in context,
forming and manipulating sparse matrices
of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and
distances, applying content dictionaries, applying supervised and unsupervised machine learning,
visually representing text and text analyses, and more.

Kenneth Benoit

quanteda

Quantitative Analysis of Textual Data

Kohei Watanabe

Haiyan Wang

Paul Nulty

Adam Obeng

Stefan Müller

Akitaka Matsuo

Jiong Wei Lua

Jouni Kuha

William Lowe

Christian Müller

Lori Young

Stuart Soroka

Ian Fellows

European Research Council 

corpus_reshape function

corpus_reshape: Recast the document units of a corpus

Description

Usage

Arguments

Value

Examples