The main function in stylest, stylest_fit, fits a
model using a corpus of texts labeled by speaker.
stylest_fit(
x,
speaker,
terms = NULL,
filter = NULL,
smooth = 0.5,
term_weights = NULL,
fill_method = "value",
fill_weight = 0,
weight_varname = "mean_distance"
)
x Text vector. May be a corpus_frame object.
speaker Vector of speaker labels. Should be the same length as x.
terms If not NULL, terms to be used in the model. If NULL, use all terms.
filter If not NULL, a text filter to specify the tokenization. See corpus for more information about specifying filter.
smooth Numeric value used to smooth term frequencies. Default is 0.5.
term_weights Data frame of distances (or any weights) per word in the vocabulary. It should have one column, word, and a second column containing the weight for each word, named by weight_varname. See the vignette for details.
fill_method If "value" (default), fill_weight is used to fill any terms with NA weight. If "mean", the mean term weight is used as the fill value.
fill_weight Numeric value to fill in as the weight for any term which does not have a weight specified in term_weights. Default is 0.0 (drops any words without weights).
weight_varname Name of the column in term_weights containing the weights. Default is "mean_distance".
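As a rough illustration of the weighting arguments, the sketch below builds a small term_weights data frame and shows, in base R, what the two fill rules amount to. The words and weight values are made up for illustration, and the base R fill step only mirrors the behaviour described above (it assumes the mean is taken over the non-missing weights); it is not the package's internal code. The call to stylest_fit is guarded so that it only runs when stylest is installed.

```r
# Hypothetical weights for a handful of words; the values are invented.
term_weights <- data.frame(
  word = c("the", "and", "heart", "ship"),
  mean_distance = c(0.10, 0.12, NA, 0.85),
  stringsAsFactors = FALSE
)

# Base R illustration of the two documented fill rules.
# Assumption: the "mean" rule averages over the non-missing weights.
fill_value <- 0
is_missing <- is.na(term_weights$mean_distance)
filled_by_value <- ifelse(is_missing, fill_value, term_weights$mean_distance)
filled_by_mean <- ifelse(
  is_missing,
  mean(term_weights$mean_distance, na.rm = TRUE),
  term_weights$mean_distance
)

# Pass the weights to stylest_fit() only if the package is available.
if (requireNamespace("stylest", quietly = TRUE)) {
  data("novels_excerpts", package = "stylest")
  weighted_mod <- stylest::stylest_fit(
    novels_excerpts$text,
    novels_excerpts$author,
    term_weights = term_weights,
    fill_method = "value",
    fill_weight = 0,
    weight_varname = "mean_distance"
  )
}
```

With fill_weight = 0, words that have no weight contribute nothing to the weighted model, which is why the default is described as dropping them.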
An S3 stylest_model object containing:
speakers Vector of unique speakers,
filter text_filter used,
terms terms used in fitting the model,
ntoken Vector of number of tokens per speaker,
smooth Smoothing value,
weights If not NULL, a named matrix of weights for each term in the vocab,
rate Matrix of speaker rates for each term in vocabulary
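The components listed above can be inspected with the usual `$` accessor. The following is a minimal sketch, assuming the stylest package and its novels_excerpts data set are installed; the variable name mod is only illustrative.

```r
# Fit a default model and look at the documented components.
# The whole block is skipped when stylest is not installed.
if (requireNamespace("stylest", quietly = TRUE)) {
  data("novels_excerpts", package = "stylest")
  mod <- stylest::stylest_fit(novels_excerpts$text, novels_excerpts$author)

  print(mod$speakers)         # unique speaker labels
  print(mod$ntoken)           # number of tokens per speaker
  print(mod$smooth)           # smoothing value used in the fit
  print(dim(mod$rate))        # dimensions of the speaker-by-term rate matrix
  print(is.null(mod$weights)) # weights are NULL when term_weights is not given
}
```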
The user may specify only one of terms or cutoff.
If neither is specified, all terms will be used.
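library(stylest)
# Load the example corpus of novel excerpts shipped with the package
data(novels_excerpts)
# Fit a model using one speaker label (the author) per text
speaker_mod <- stylest_fit(novels_excerpts$text, novels_excerpts$author)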