stylest2_terms

A function to select terms for inclusion in a stylest2 model, based on a document-feature
matrix of texts to predict and a specified cutoff.

Estimates the authors or speakers of texts. Methods developed in Huang, Perry, and Spirling (2020) <doi:10.1017/pan.2019.49>. The model is built on a Bayesian framework in which the distinctiveness of each speaker is defined by how different, on average, the speaker's terms are from those of everyone else in the corpus of texts. An optional cross-validation method is implemented to select the subset of terms that generates the most accurate speaker predictions. Once a set of terms is selected, the model can be estimated. Speaker distinctiveness and term influence can be recovered from the model's parameters using package functions. Once fitted, the model can be used to predict the authorship of new texts.

Package: stylest2 (Estimating Speakers of Texts)

Authors: Christian Baehr, Arthur Spirling, Leslie Huang


Arguments

<dl><dt>dfm</dt>
<dd>a quanteda <code>dfm</code> object.</dd>
<dt>cutoff</dt>
<dd>a single numeric value: the quantile of term frequency below which
terms are dropped.</dd></dl>
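The cutoff argument can be read as a quantile filter on term frequencies. The base R sketch below illustrates that general idea on a toy count matrix. It is not the package's implementation: the helper `select_terms_above`, the toy counts, and the 0 to 1 probability scale are all assumptions made for illustration, and the scale and exact rule used by `stylest2_terms` may differ.

```r
# Toy document-feature counts: rows are documents, columns are terms.
# (Illustrative data only; a real workflow would use a quanteda dfm.)
counts <- matrix(
  c(5, 0, 1, 2,
    3, 1, 0, 4,
    6, 0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"),
                  c("the", "whale", "moor", "and"))
)

# Total frequency of each term across all documents.
term_freq <- colSums(counts)

# Hypothetical helper (not part of stylest2): keep the terms whose total
# frequency is at or above the given quantile of all term frequencies.
select_terms_above <- function(term_freq, prob) {
  threshold <- quantile(term_freq, probs = prob, names = FALSE)
  names(term_freq)[term_freq >= threshold]
}

# Keep terms at or above the median term frequency.
kept <- select_terms_above(term_freq, prob = 0.5)
print(kept)
```

With these toy counts the rare terms fall below the median and are dropped, which mirrors the intent of selecting a vocabulary before estimating the model.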

Select terms above frequency cutoff — stylest2_terms

