data_corpus_LMRDsample

A sample of 100 positive and 100 negative reviews from the Maas et al. (2011)
dataset for sentiment classification. The original dataset contains 50,000
highly polar movie reviews.

data

Tools for AI-assisted qualitative data coding using large language
models ('LLMs') via the 'ellmer' package, supporting providers including
'OpenAI', 'Anthropic', 'Google', 'Azure', and local models via 'Ollama'.
Provides a 'codebook'-based workflow for defining coding instructions and
applying them to texts, images, and other data. Includes built-in 'codebooks'
for common applications such as sentiment analysis and policy coding, and
functions for creating custom 'codebooks' for specific research questions.
Supports systematic replication across models and settings, computing
inter-coder reliability statistics including Krippendorff's alpha
(Krippendorff 2019, <doi:10.4135/9781071878781>) and Fleiss' kappa
(Fleiss 1971, <doi:10.1037/h0031619>), as well as gold-standard validation
metrics including accuracy, precision, recall, and F1 scores following
Sokolova and Lapalme (2009, <doi:10.1016/j.ipm.2009.03.002>). Provides audit
trail functionality for documenting coding workflows following Lincoln and
Guba's (1985, ISBN:0803924313) framework for establishing trustworthiness
in qualitative research.

Seraphine F. Maerz

quallmer

Qualitative Analysis with Large Language Models

Kenneth Benoit

data_corpus_LMRDsample function


Format

Sample from Large Movie Review Dataset (Maas et al. 2011) — data_corpus_LMRDsample

The corpus docvars consist of:<dl>
<dt>docnumber</dt>
<dd>serial (within set and polarity) document number</dd>
<dt>rating</dt>
<dd>user-assigned movie rating on an integer scale from 1 to 10</dd>
<dt>polarity</dt>
<dd>either <code>neg</code> or <code>pos</code>, indicating whether the
movie review was negative or positive. See Maas et al. (2011) for the
cut-off values that governed this assignment.</dd>
</dl>
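The document variables listed above can be inspected directly once the data are loaded. The following is a minimal sketch, not taken from the package documentation. It assumes that the 'quallmer' and 'quanteda' packages are installed, that the object is a 'quanteda' corpus, and that it can be loaded with `data()`. If any of these assumptions does not hold, the calls will need adjusting.

```r
# Assumption: data_corpus_LMRDsample is a quanteda corpus shipped with quallmer
library(quanteda)

# Load the sample corpus from the package into the workspace
data("data_corpus_LMRDsample", package = "quallmer")

# Number of documents in the sample
ndoc(data_corpus_LMRDsample)

# Extract the document variables (docnumber, rating, polarity) as a data frame
dv <- docvars(data_corpus_LMRDsample)
head(dv)

# Cross-tabulate polarity against the user-assigned rating
table(dv$polarity, dv$rating)
```

The cross-tabulation is a quick way to see how the rating values map onto the <code>neg</code> and <code>pos</code> labels in this sample.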
