VLMC (version 1.3-13)

vlmc: Fit a Variable Length Markov Chain (VLMC)

Description

Fit a Variable Length Markov Chain (VLMC) to a discrete time series, in basically two steps: First a large Markov Chain is generated containing (all if threshold.gen = 1) the context states of the time series. In the second step, many states of the MC are collapsed by pruning the corresponding context tree.

Currently, the alphabet may contain at most 26 different characters.
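
The counting idea behind the first step can be illustrated in base R. This is only a rough sketch and not the package's algorithm: the helper name context_counts and the variable threshold_gen are hypothetical, and the code merely counts how often each length-k substring occurs and keeps those seen at least threshold_gen times, mirroring the role of the threshold.gen argument.

```r
# Hypothetical helper (not part of VLMC): count every length-k substring
# of a discrete series.
context_counts <- function(x, k) {
  n <- length(x)
  stopifnot(k >= 1, n >= k)
  starts <- seq_len(n - k + 1)
  ctx <- vapply(starts,
                function(i) paste(x[i:(i + k - 1)], collapse = ""),
                character(1))
  tab <- table(ctx)
  # Return a plain named integer vector, sorted by context label
  setNames(as.integer(tab), names(tab))
}

x <- c(1, 0, 0, 0, 1, 0, 0, 0)
counts <- context_counts(x, k = 2)

# Keep only contexts observed at least threshold_gen times
# (an assumption standing in for threshold.gen)
threshold_gen <- 2
kept <- counts[counts >= threshold_gen]
kept
```

The real fit additionally prunes the resulting tree using cutoff.prune, which this sketch does not attempt.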

Usage

vlmc(dts,
     cutoff.prune = qchisq(alpha.c, df=max(.1,alpha.len-1),lower.tail=FALSE)/2,
     alpha.c = 0.05,
     threshold.gen = 2,
     code1char = TRUE, y = TRUE, debug = FALSE, quiet = FALSE,
     dump = 0, ctl.dump = c(width.ct = 1+log10(n), nmax.set = -1) )

is.vlmc(x)
## S3 method for class 'vlmc'
print(x, digits = max(3, getOption("digits") - 3), ...)

Arguments

dts
a discrete "time series"; can be a numeric, character or factor.
cutoff.prune
non-negative number; the cutoff used for pruning; defaults to half the upper $\alpha$-quantile of a chi-squared distribution with alpha.len - 1 degrees of freedom, where $\alpha =$ alpha.c, the following argument:
alpha.c
number in (0,1) used to specify cutoff.prune in the more intuitive $\chi^2$ quantile scale; defaulting to 5%.
threshold.gen
integer >= 1 (usually left at 2). When generating the initial large tree, only generate nodes with count >= threshold.gen.
code1char
logical; if true (default), the data dts will be ..........FIXME...........
y
logical; if true (default), the data dts will be returned. This ensures that residuals (residuals.vlmc) and "k-step ahead" predictions can be computed from the re
debug
logical; should debugging info be printed to stderr.
quiet
logical; if true, don't print some warnings.
dump
integer in 0:2. If positive, the pruned tree is dumped to stderr; if 2, the initial unpruned tree is dumped as well.
ctl.dump
integer of length 2, say ctl[1:2], controlling the above dump when dump > 0. ctl[1] is the width (number of characters) for the "counts", ctl[2] the maximal number of set elements that are
x
a fitted "vlmc" object.
digits
integer giving the number of significant digits for printing numbers.
...
potentially further arguments [Generic].
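
The default for cutoff.prune can be reproduced by hand from the formula in the Usage section. The sketch below uses only base R; the wrapper name default_cutoff is hypothetical and introduced here for illustration.

```r
# Hypothetical wrapper around the default expression shown in Usage:
#   qchisq(alpha.c, df = max(.1, alpha.len - 1), lower.tail = FALSE) / 2
default_cutoff <- function(alpha.len, alpha.c = 0.05) {
  qchisq(alpha.c, df = max(0.1, alpha.len - 1), lower.tail = FALSE) / 2
}

default_cutoff(2)  # binary alphabet, 1 degree of freedom
default_cutoff(4)  # four-letter alphabet, 3 degrees of freedom

# A smaller alpha.c gives a larger cutoff
cutoffs <- sapply(c(0.01, 0.05, 0.2), function(a) default_cutoff(2, a))
cutoffs
```

Because the cutoff grows as alpha.c shrinks, a smaller alpha.c should lead to more aggressive pruning; passing cutoff.prune directly bypasses this conversion.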

Value

A "vlmc" object, basically a list with components:

  • nobs: length of data series when fit. (Was named "n" in earlier versions.)
  • threshold.gen, cutoff.prune: the arguments (or their defaults).
  • alpha.len: the alphabet size.
  • alpha: the alphabet used, as one string.
  • size: a named integer vector of length (>=) 4, giving characteristic sizes of the fitted VLMC.
  • vlmc.vec: integer vector, containing (an encoding of) the fitted VLMC tree.
  • y: if y = TRUE, the data dts, as character, using the letters from alpha.
  • call: the call vlmc(..) used.
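
As a brief sketch of how these components might be inspected, the following fits the small binary series from the Examples section and reads back a few of the documented list elements. It assumes the VLMC package is installed; the variable name fit is arbitrary.

```r
library(VLMC)  # assumption: the VLMC package is installed

# Same toy series as in the Examples section
f1 <- c(1, 0, 0, 0)
f2 <- rep(1:0, 2)
dt1 <- c(f1, f1, f2, f1, f2, f2, f1)

fit <- vlmc(dt1)

is.vlmc(fit)    # check the class of the fitted object
fit$nobs        # length of the series used for the fit
fit$alpha       # alphabet as one string
fit$alpha.len   # alphabet size
fit$size        # named integer vector of characteristic sizes
head(fit$y)     # the data as characters (returned because y = TRUE)
fit$call        # the call that produced the fit
```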

References

Bühlmann P. and Wyner A. (1998) Variable Length Markov Chains. Annals of Statistics 27, 480–513.

Mächler M. and Bühlmann P. (2004) Variable Length Markov Chains: Methodology, Computing, and Software. J. Computational and Graphical Statistics 2, 435–455.

Mächler M. (2004) VLMC: Implementation and R interface; working paper.

See Also

draw.vlmc, entropy, simulate.vlmc for "VLMC bootstrapping".

Examples

f1 <- c(1,0,0,0)
f2 <- rep(1:0,2)
(dt1 <- c(f1,f1,f2,f1,f2,f2,f1))

(vlmc.dt1  <- vlmc(dt1))
 vlmc(dt1, dump = 1,
      ctl.dump = c(wid = 3, nmax = 20), debug = TRUE)
(vlmc.dt1c01 <- vlmc(dts = dt1, cutoff.prune = .1, dump=1))

data(presidents)
dpres <- cut(presidents, c(0,45,70, 100)) # three values + NA
table(dpres <- factor(dpres, exclude = NULL)) # NA as 4th level
levels(dpres)#-> make the alphabet -> warning
vlmc.pres <- vlmc(dpres, debug = TRUE)
vlmc.pres

## alphabet and its length:
vlmc.pres$alpha
stopifnot(
  length(print(strsplit(vlmc.pres$alpha,NULL)[[1]])) == vlmc.pres$alpha.len
)

## You can now use larger alphabets (up to 95 letters):
set.seed(7); it <- sample(40, 20000, replace=TRUE)
v40 <- vlmc(it)
v40
## even larger alphabets now give an error:
il <- sample(100, 10000, replace=TRUE)
ee <- tryCatch(vlmc(il), error= function(e)e)
stopifnot(is(ee, "error"))