stylest_select_vocab

x

Corpus as a text vector. May be a <code>corpus_frame</code> object.

speaker

Vector of speaker labels. Should be the same length as <code>x</code>.

filter

If not <code>NULL</code>, a <code>corpus</code> text_filter.

smooth

Value for smoothing. Defaults to 0.5.

nfold

Number of folds for cross-validation. Defaults to 5.

cutoff_pcts

Vector of cutoff percentages to test. Defaults to <code>c(50, 60, 70, 80, 90, 99)</code>.

Selects the optimal vocabulary quantile(s) for model fitting, based on performance in
predicting the speakers of out-of-sample texts.

Estimates distinctiveness in speakers' (authors') style. Fits models that can be used to predict the speakers of new texts. Methods developed in Spirling et al. (2018) <doi:10.2139/ssrn.3235506> (working paper).

stylest_select_vocab: Select vocabulary using cross-validated out-of-sample prediction

Description

Usage

Arguments

Value

Examples
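
The call below is a minimal sketch of how the arguments documented on this page fit together, not a definitive example. It assumes the stylest package is installed and that it provides the `novels_excerpts` example data with `text` and `author` columns; if your installation differs, substitute your own text vector and speaker labels. The argument values shown are the defaults stated above, written out only for illustration.

```r
# Sketch only. Assumptions: the stylest package is installed, and it ships
# an example data set `novels_excerpts` with columns `text` and `author`.
library(stylest)

data(novels_excerpts)

# Assumption: folds are assigned with R's random number generator, in which
# case fixing the seed makes repeated runs comparable.
set.seed(1234)

cv_result <- stylest_select_vocab(
  x = novels_excerpts$text,          # corpus as a text vector
  speaker = novels_excerpts$author,  # one label per text, same length as x
  filter = NULL,                     # no corpus text_filter
  smooth = 0.5,                      # smoothing value (documented default)
  nfold = 5,                         # number of cross-validation folds
  cutoff_pcts = c(50, 60, 70, 80, 90, 99)  # cutoff percentages to test
)

# Inspect the structure of the result rather than assuming element names.
str(cv_result)
```

Passing a shorter `cutoff_pcts` vector or a smaller `nfold` should reduce the number of model fits, which may be useful for a quick first pass on a large corpus.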