dhlabR (version 1.0.6)

get_dispersion: Dispersion of tokens in a text

Description

This function wraps a call to the dispersion service, which calculates how a list of tokens is dispersed throughout a text in the National Library of Norway's collection. The text is identified by its URN. It is divided into chunks, and the count of each token in each chunk is returned.

Usage

get_dispersion(urn = NULL, words = list(".", ","), window = 500, pr = 100)

Value

A data frame with the count of tokens in each chunk.

Arguments

urn

A National Library of Norway URN to a text object.

words

A list or vector of words (tokens) to analyze for dispersion.

window

The size of each text chunk within which the tokens are counted.

pr

(Per) The step size for moving forward to the next chunk. If 'pr' is equal to 'window', the text is divided into non-overlapping chunks of size 'window'. If 'pr' is smaller than 'window', the chunks overlap, which produces a smoother curve.
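
The interaction between 'window' and 'pr' can be illustrated locally, without calling the service. The following is a minimal sketch only: count_in_windows is a hypothetical helper that is not part of dhlabR, it assumes that chunk sizes are measured in tokens, and it drops any trailing partial chunk. The dispersion service itself may differ on each of these points.

```r
# Hypothetical helper, NOT part of dhlabR: counts each target word in
# chunks of `window` tokens, moving forward `pr` tokens per step.
# Assumption: sizes are measured in tokens; a trailing partial chunk is dropped.
count_in_windows <- function(tokens, words, window, pr) {
  n <- length(tokens)
  # First token position of every chunk
  starts <- seq(1, max(1, n - window + 1), by = pr)
  counts <- lapply(words, function(w) {
    vapply(starts, function(s) {
      chunk <- tokens[s:min(s + window - 1, n)]
      sum(chunk == w)
    }, integer(1))
  })
  names(counts) <- words
  # One row per chunk, one column per word
  data.frame(start = starts, counts, check.names = FALSE)
}

tokens <- c("a", "a", "b", "c", "c", "c", "a", "b")
words <- c("a", "b")

# pr equal to window: two non-overlapping chunks
non_overlapping <- count_in_windows(tokens, words, window = 4, pr = 4)

# pr smaller than window: three overlapping chunks
overlapping <- count_in_windows(tokens, words, window = 4, pr = 2)
```

With the same 'window', halving 'pr' adds an intermediate chunk between the two non-overlapping ones, which is why a smaller 'pr' gives more points and a smoother curve when the counts are plotted.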

Examples

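library(dhlabR)

# URN of a digitized text in the National Library of Norway's collection
urn <- "URN:NBN:no-nb_digibok_2013060406055"
words <- c("Dracula", "Mina", "Helsing")
# Equal 'window' and 'pr' give non-overlapping chunks
window <- 1000
pr <- 1000
dispersion_result <- get_dispersion(urn, words, window, pr)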
