childeswordfreq (version 0.2.0)

phrase_counts: Count phrase matches in CHILDES utterances (experimental)

Description

Matches surface phrases against utterance text and returns counts, together with a dataset summary and run metadata. When wildcard = TRUE, phrases may contain simple wildcards: * (any characters) and ? (exactly one character). When normalize = TRUE, counts are also reported as rates per per_utts utterances.
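
To make the wildcard semantics concrete, the sketch below translates a phrase containing * and ? into a regular expression and applies it with base R. It illustrates the described behaviour only and is not the package's implementation: wildcard_to_regex is a hypothetical helper, the utterances are invented, and substring (unanchored) matching is an assumption.

```r
# Hypothetical helper (not part of childeswordfreq): convert a phrase with
# simple wildcards into a regular expression.
#   *  -> ".*"  (any characters)
#   ?  -> "."   (exactly one character)
# Letters, digits and whitespace are kept; other punctuation is escaped.
wildcard_to_regex <- function(phrase) {
  chars <- strsplit(phrase, "", fixed = TRUE)[[1]]
  pieces <- vapply(chars, function(ch) {
    if (ch == "*") {
      ".*"
    } else if (ch == "?") {
      "."
    } else if (grepl("[[:alnum:][:space:]]", ch)) {
      ch
    } else {
      paste0("\\", ch)
    }
  }, character(1))
  paste(pieces, collapse = "")
}

# Invented utterances, for illustration only.
utts <- c("I want that", "thank you", "i wanted more", "thanks")

# Case-insensitive substring match (an assumption about matching mode).
# perl = TRUE so that a backslash before punctuation is always a literal.
want_hits  <- grepl(wildcard_to_regex("i want*"), utts,
                    ignore.case = TRUE, perl = TRUE)
thank_hits <- grepl(wildcard_to_regex("th?nk"), utts,
                    ignore.case = TRUE, perl = TRUE)

sum(want_hits)   # 2
sum(thank_hits)  # 2
```

Whether phrase_counts anchors matches at word or utterance boundaries is not stated on this page, so counts from the real function may differ from this sketch.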

Usage

phrase_counts(
  phrases,
  collection = NULL,
  language = NULL,
  corpus = NULL,
  age = NULL,
  sex = NULL,
  role = NULL,
  role_exclude = NULL,
  wildcard = FALSE,
  ignore_case = TRUE,
  normalize = FALSE,
  per_utts = 10000L,
  db_version = "current",
  cache = FALSE,
  cache_dir = NULL,
  output_file = NULL
)

Value

If output_file is NULL, returns a tibble of phrase counts; otherwise writes an Excel file and returns the file path (invisibly).

Arguments

phrases

Character vector of phrases or patterns.

collection, language, corpus, age, sex, role, role_exclude

Optional CHILDES filters (each defaults to NULL).

wildcard

Logical; enable * and ? in phrases.

ignore_case

Logical; case-insensitive matching.

normalize

Logical; if TRUE, add per-N utterance rates.

per_utts

Integer; denominator for utterance rates (default 10000).

db_version

CHILDES database version (default "current"); recorded in the run metadata.

cache

Logical; cache CHILDES queries on disk.

cache_dir

Optional cache directory.

output_file

Optional .xlsx path; if NULL, returns a tibble.
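
The normalize and per_utts arguments above describe a rate per N utterances. A minimal sketch of that arithmetic follows, assuming the rate is the raw count divided by the number of utterances searched and scaled by per_utts. The helper name per_n_rate and the numbers are hypothetical, and the package's exact formula and rounding are not shown on this page.

```r
# Hypothetical helper: occurrences per `per_utts` utterances.
per_n_rate <- function(count, n_utterances, per_utts = 10000L) {
  count / n_utterances * per_utts
}

# 25 matches in 50,000 utterances, expressed per 10,000 utterances.
rate <- per_n_rate(25, 50000)
rate  # 5
```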

Details

Tier targeting is not applied in phrase mode. Phrases are matched in the main utterance text. For tier-constrained contexts around words, use contexts_for(..., mode = "word", tier = "mor").
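
Examples

A hedged usage sketch built only from the arguments listed under Usage. The phrase strings and the filter values (language = "eng", corpus = "Brown") are assumptions chosen for illustration. The call is guarded because it needs the package installed and, presumably, network access to the CHILDES database.

```r
# Arguments taken from the documented signature; values are illustrative.
example_args <- list(
  phrases   = c("i want *", "thank you"),
  language  = "eng",     # assumed filter value
  corpus    = "Brown",   # assumed filter value
  wildcard  = TRUE,      # enable * and ? in phrases
  normalize = TRUE,      # add per-N utterance rates
  per_utts  = 10000L
)

# Run only interactively and only if the package is available.
if (interactive() && requireNamespace("childeswordfreq", quietly = TRUE)) {
  counts <- do.call(childeswordfreq::phrase_counts, example_args)
  print(counts)
}
```

With output_file left at NULL the call returns a tibble; supplying an .xlsx path would instead write an Excel file and return the path invisibly, as described under Value.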