Learn R Programming

childeswordfreq

Nahar Albudoor

childeswordfreq is an R package for extracting word and phrase frequencies from the CHILDES database using the childesr API. It enables users to:

  • Get word counts by speaker role with word_counts().
  • Count surface phrases in utterance text with phrase_counts().
  • Normalize frequencies by corpus size or utterance count.
  • Export results as Excel workbooks with full metadata for replication.

The primary functions are:

  • word_counts() for type- or token-based frequencies by speaker role, with optional stemming, MOR-tier filtering, wildcard matching, normalization, and Zipf scaling.
  • phrase_counts() for surface phrase counts in utterance text, with optional wildcard patterns and per-utterance normalization.

Requirements

childeswordfreq does not require CLAN or local copies of CHILDES corpora. All queries go through childesr. Optional on-disk caching can be enabled to speed up repeated CHILDES queries.


Installation

# CRAN
install.packages("childeswordfreq")

# Or install the development version from GitHub
remotes::install_github("n-albudoor/childeswordfreq")

Then load the package:

library(childeswordfreq)

The package depends on childesr and its requirements. An active internet connection is required as all queries are executed through the TalkBank API.


Usage & Best Practices

For more detailed guidance on using the package, see the vignette:

browseVignettes("childeswordfreq-best-practices")

License

MIT.


Copy Link

Version

Install

install.packages('childeswordfreq')

Monthly Downloads

165

Version

0.2.0

License

MIT + file LICENSE

Issues

Pull Requests

Stars

Forks

Maintainer

Nahar Albudoor

Last Published

November 15th, 2025

Functions in childeswordfreq (0.2.0)

cwf_cache_enabled

Return TRUE if caching is enabled
childeswordfreq-package

childeswordfreq: Word and Phrase Frequency Tools for CHILDES
cwf_cache_enable

Enable on-disk caching of CHILDES queries
phrase_counts

Count phrase matches in CHILDES utterances (experimental)
cwf_cache_disable

Disable caching
word_counts

Get word counts by speaker role