zipfR (version 0.6-5)

spc: Frequency Spectra (zipfR)

Description

In the zipfR library, spc objects are used to represent a word frequency spectrum (either an observed spectrum or the expected spectrum of an LNRE model at a given sample size).

With the spc constructor function, an object can be initialized directly from the specified data vectors. In practice, it is more common to read an observed spectrum from a disk file with read.spc or to compute an expected spectrum with lnre.spc. spc objects should always be treated as read-only.

Usage

spc(Vm, m=1:length(Vm), VVm=NULL, N=NA, V=NA, VV=NA,
      m.max=0, expected=!missing(VVm))

Arguments

m
integer vector of frequency classes $m$ (if omitted, Vm is assumed to list the first $k$ frequency classes $V_1, \ldots, V_k$)
Vm
vector of corresponding class sizes $V_m$ (may be fractional for expected frequency spectrum $E[V_m]$)
VVm
optional vector of estimated variances $\mathop{Var}[V_m]$ (for expected frequency spectrum only)
N, V
total sample size $N$ and vocabulary size $V$ of frequency spectrum. While these values are usually determined automatically from m and Vm, they are required for an incomplete frequency spectrum that does not list all frequency classes
VV
variance $\mathop{Var}[V]$ of expected vocabulary size. If VVm is specified, VV should also be given.
m.max
highest frequency class $m$ listed in incomplete spectrum. If m.max is set, N and V also have to be specified, and all non-zero frequency classes up to m.max have to be included in the input.
expected
set to TRUE if the frequency spectrum represents expected values $E[V_m]$ of the class sizes according to some LNRE model (this is automatically triggered when the VVm argument is specified).
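The relationship between the arguments can be illustrated with a small sketch. The class sizes below are made-up values chosen for illustration, and the "manual" totals use the usual definitions $N = \sum_m m \cdot V_m$ and $V = \sum_m V_m$; it is an assumption here that the constructor derives N and V in the same way for a complete spectrum. The zipfR calls only run if the package is installed.

```r
## class sizes for frequency classes m = 1, 2, 4 (illustrative values only)
m  <- c(1, 2, 4)
Vm <- c(5, 3, 1)

## usual definitions of the totals of a complete spectrum
N.manual <- sum(m * Vm)   # number of tokens
V.manual <- sum(Vm)       # number of types

if (requireNamespace("zipfR", quietly = TRUE)) {
  library(zipfR)

  ## complete spectrum: N and V are left to the constructor
  toy.spc <- spc(Vm = Vm, m = m)
  print(c(N = N(toy.spc), V = V(toy.spc)))
  print(c(N = N.manual, V = V.manual))   # hand-computed totals for comparison

  ## incomplete spectrum listing only classes 1..3:
  ## m.max is set, so N and V have to be supplied explicitly
  part.spc <- spc(Vm = c(5, 3, 2), m = 1:3, m.max = 3, N = 40, V = 12)
  print(summary(part.spc))
}
```

In the incomplete case the supplied N and V should be at least as large as the totals implied by the listed classes, since the unlisted higher frequency classes account for the remainder.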

Value

  • An object of class spc representing the specified frequency spectrum. This object should be treated as read-only (although such behaviour cannot be enforced in R).

Details

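An spc object is a data frame with the following variables:

m
frequency class $m$, an integer vector
Vm
class size, i.e. the number $V_m$ of types in frequency class $m$ (either the observed class size from a sample or the expected class size $E[V_m]$ based on an LNRE model)
VVm
optional: estimated variance $\mathop{Var}[V_m]$ of the expected class size (only meaningful for an expected spectrum derived from an LNRE model)

The following attributes are used to store additional information about the frequency spectrum:

m.max
if non-zero, the frequency spectrum is incomplete and lists only frequency classes up to m.max
N, V
sample size $N$ and vocabulary size $V$ of the frequency spectrum. For a complete spectrum these values can be determined from m and Vm, but they are essential for an incomplete spectrum.
VV
variance $\mathop{Var}[V]$ of the expected vocabulary size; only present if hasVariances is TRUE
expected
if TRUE, the frequency spectrum lists expected class sizes $E[V_m]$ rather than observed sizes
hasVariances
indicates whether or not the VVm variable is present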
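Because an spc object is a data frame with extra attributes, its internal layout can be inspected with ordinary base R tools. The following is a minimal sketch using a made-up toy spectrum; it only lists whatever columns and attributes the object happens to carry and makes no assumption about their names. Since spc objects should be treated as read-only, the accessor functions remain the preferred way to extract values.

```r
if (requireNamespace("zipfR", quietly = TRUE)) {
  library(zipfR)

  ## toy spectrum with illustrative class sizes
  toy.spc <- spc(Vm = c(5, 3, 1), m = c(1, 2, 4))

  ## look at the underlying representation (inspection only, no modification)
  print(class(toy.spc))               # class vector of the object
  print(is.data.frame(toy.spc))       # spc objects are data frames
  print(colnames(toy.spc))            # variables stored in the data frame
  print(names(attributes(toy.spc)))   # all attributes, incl. data-frame housekeeping

  ## preferred access path: generic accessor methods
  print(N(toy.spc))
  print(V(toy.spc))
  print(Vm(toy.spc, 1:2))
}
```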

See Also

read.spc, write.spc, spc.vector, sample.spc, spc2tfl, tfl2spc, lnre.spc, plot.spc

Generic methods supported by spc objects are print, summary, N, V, Vm, VV, and VVm.

Implementation details and non-standard arguments for these methods can be found on the manpages print.spc, summary.spc, N.spc, V.spc, etc.

Examples

## load the zipfR package (provides spc, lnre and the Brown data sets)
library(zipfR)

## load Brown imaginative prose spectrum and inspect it
data(BrownImag.spc)

summary(BrownImag.spc)
print(BrownImag.spc)

plot(BrownImag.spc)

N(BrownImag.spc)
V(BrownImag.spc)
Vm(BrownImag.spc,1)
Vm(BrownImag.spc,1:5)

## compute ZM model, and generate PARTIAL expected spectrum
## with variances for a sample of 10 million tokens
zm <- lnre("zm",BrownImag.spc)
zm.spc <- lnre.spc(zm,1e+7,variances=TRUE)

## inspect extrapolated spectrum
summary(zm.spc)
print(zm.spc)

plot(zm.spc,log="x")

N(zm.spc)
V(zm.spc)
VV(zm.spc)
Vm(zm.spc,1)
VVm(zm.spc,1)

## generate an artificial Zipfian-looking spectrum
## and take a look at it
zipf.spc <- spc(round(1000/(1:1000)^2))

summary(zipf.spc)
plot(zipf.spc)

## see manpages of lnre, and the various *.spc manpages
## for more examples of spc usage
