lnre.vgc: Expected Vocabulary Growth Curves of LNRE Model (zipfR)

Description

lnre.vgc computes expected vocabulary growth curves \(E[V(N)]\) according to an LNRE model and returns an object of class vgc. Data points are returned for the specified values of \(N\), optionally including estimated variances and/or growth curves for the spectrum elements \(E[V_m(N)]\).

Usage

lnre.vgc(model, N, m.max=0, variances=FALSE)

Arguments

model

an object belonging to a subclass of lnre, representing an LNRE model

N

an increasing sequence of non-negative integers, specifying the sample sizes \(N\) for which vocabulary growth data should be calculated

m.max

if specified, include vocabulary growth curves \(E[V_m(N)]\) for spectrum elements up to m.max. Must be a single integer in the range \(1 \ldots 9\).

variances

if TRUE, include variance estimates for the vocabulary size (and the spectrum elements, if applicable)

Value

An object of class vgc, representing the expected vocabulary growth curve \(E[V(N)]\) of the LNRE model specified by model, with data points at the sample sizes N.

If m.max is specified, expected growth curves \(E[V_m(N)]\) for spectrum elements (hapax legomena, dis legomena, etc.) up to m.max are also computed.

If variances=TRUE, the vgc object includes variance data for all growth curves.
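
As a minimal sketch of how the returned object might be inspected, the following assumes the zipfR accessor functions N(), V(), Vm() and VV() for vgc objects and uses the ItaUltra.spc data set from the Examples section. The sample sizes, the choice of m.max=2 and the name res are illustrative only.

```r
library(zipfR)

## fit a ZM model to the Italian ultra- prefix data (illustrative choice)
data(ItaUltra.spc)
zm <- lnre("zm", ItaUltra.spc)

## expected V, V_1 and V_2 at 10 sample sizes, with variances
vgc <- lnre.vgc(zm, (1:10) * 700, m.max=2, variances=TRUE)

## collect the data points in a plain data frame
res <- data.frame(
  N   = N(vgc),      # sample sizes
  EV  = V(vgc),      # expected vocabulary size E[V(N)]
  EV1 = Vm(vgc, 1),  # expected number of hapax legomena E[V_1(N)]
  EV2 = Vm(vgc, 2),  # expected number of dis legomena E[V_2(N)]
  VV  = VV(vgc)      # variance of V(N); requires variances=TRUE
)
print(res)
```

Copying the values into an ordinary data frame keeps later processing independent of the vgc class.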

Examples

# NOT RUN {
## load Dickens dataset and estimate lnre models
data(Dickens.spc)

zm <- lnre("zm",Dickens.spc)
fzm <- lnre("fzm",Dickens.spc,exact=FALSE)
gigp <- lnre("gigp",Dickens.spc)

## compute expected V and V_1 growth up to 100 million tokens
## in 100 steps of 1 million tokens
zm.vgc <- lnre.vgc(zm,(1:100)*1e6, m.max=1)
fzm.vgc <- lnre.vgc(fzm,(1:100)*1e6, m.max=1)
gigp.vgc <- lnre.vgc(gigp,(1:100)*1e6, m.max=1)

## compare
plot(zm.vgc,fzm.vgc,gigp.vgc,add.m=1,legend=c("ZM","fZM","GIGP"))

## load Italian ultra- prefix data
data(ItaUltra.spc)

## compute zm model
zm <- lnre("zm",ItaUltra.spc)

## compute vgc up to about twice the sample size
## with variance of V
zm.vgc <- lnre.vgc(zm,(1:100)*70, variances=TRUE)

## plot with confidence intervals derived from the variances in the
## vgc (with larger datasets, the intervals will typically be almost
## invisible)
plot(zm.vgc)

# }
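
The variance data can also be turned into approximate intervals by hand. The sketch below assumes a normal approximation and a two-sided 95% level, so it will not necessarily reproduce the intervals drawn by plot(). It also assumes that the accessors N(), V() and VV() work on vgc objects; the names z, se and ci are illustrative.

```r
library(zipfR)

## refit the ZM model and recompute the growth curve with variances
data(ItaUltra.spc)
zm <- lnre("zm", ItaUltra.spc)
zm.vgc <- lnre.vgc(zm, (1:100)*70, variances=TRUE)

## assumed: two-sided 95% interval under a normal approximation
z <- qnorm(0.975)
se <- sqrt(VV(zm.vgc))  # standard deviation of V(N)

ci <- data.frame(
  N     = N(zm.vgc),
  lower = V(zm.vgc) - z * se,
  upper = V(zm.vgc) + z * se
)

## show the intervals at the three largest sample sizes
print(tail(ci, 3))
```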