copula (version 0.999-19)

xvCopula: Model (copula) selection based on k-fold cross-validation

Description

Computes the leave-one-out cross-validation criterion (or a k-fold version of it) for the hypothesized parametric copula family using, by default, maximum pseudo-likelihood estimation.

The leave-one-out criterion is a cross-validated log-likelihood. It is denoted by \(\widehat{xv}_n\) in Grønneberg and Hjort (2014) and defined in equation (42) therein. When computed for several parametric copula families, it is thus meaningful to select the family maximizing the criterion.

For \(k < n\), \(n\) the sample size, the k-fold version is an approximation of the leave-one-out criterion that uses \(k\) randomly chosen (almost) equally sized data blocks instead of \(n\). When \(n\) is large, \(k\)-fold cross-validation is considerably faster (if \(k\) is “small” compared to \(n\)).
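The block construction described above can be sketched in base R. This is only an illustration of splitting \(n\) row indices into \(k\) random blocks whose sizes differ by at most one; the helper name make_blocks is hypothetical and the code is not taken from the package, whose internal implementation may differ.

```r
## Hypothetical helper (not part of the copula package):
## split the row indices 1..n into k random, almost equally sized blocks.
make_blocks <- function(n, k) {
  stopifnot(k >= 2, k <= n)
  idx <- sample.int(n)                  # random permutation of the row indices
  split(idx, rep_len(seq_len(k), n))    # deal the shuffled indices into k groups
}

set.seed(1)
blocks <- make_blocks(n = 23, k = 5)
lengths(blocks)                         # block sizes differ by at most one

## With k = n every block holds a single row, i.e. leave-one-out.
loo <- make_blocks(n = 23, k = 23)
```

In a k-fold scheme of this kind, the model would be fitted \(k\) times, each time leaving out one block and evaluating the fitted density on the rows of that block.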

Usage

xvCopula(copula, x, k = NULL, verbose = interactive(),
         ties.method = eval(formals(rank)$ties.method), ...)

Arguments

copula

object of class "copula" representing the hypothesized copula family.

x

a data matrix that will be transformed to pseudo-observations.

k

the number of data blocks; if k = NULL, nrow(x) blocks are considered (which corresponds to leave-one-out cross-validation).

verbose

a logical indicating whether progress of the cross-validation should be displayed via txtProgressBar.

ties.method

string specifying how ranks should be computed if there are ties in any of the coordinate samples of x and fitting is based on maximum pseudo-likelihood; passed to pobs.

...

additional arguments passed to fitCopula().
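To make the roles of x and ties.method more concrete, the following base R sketch computes pseudo-observations as componentwise ranks divided by \(n + 1\), which is the usual definition. The helper name pseudo_obs is hypothetical; in the package this step is handled by pobs, and the sketch is not meant to reproduce every detail of that function.

```r
## Hypothetical helper illustrating the usual definition of pseudo-observations:
## rank each column separately and scale the ranks by (n + 1).
pseudo_obs <- function(x, ties.method = "average") {
  x <- as.matrix(x)
  apply(x, 2, rank, ties.method = ties.method) / (nrow(x) + 1)
}

## Small data matrix with a tie in column "a" (values 3.4 and 3.4)
x_small <- cbind(a = c(1.2, 3.4, 3.4, 0.5),
                 b = c(10, 7, 8, 9))

u_avg <- pseudo_obs(x_small)                       # tied values share the average rank
u_min <- pseudo_obs(x_small, ties.method = "min")  # tied values share the smallest rank
u_avg
u_min
```

Only the tied entries in column a change between the two calls, which is why the choice of ties.method matters when the data contain ties.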

Value

A real number equal to the cross-validation criterion multiplied by the sample size.

References

Grønneberg, S., and Hjort, N.L. (2014) The copula information criteria. Scandinavian Journal of Statistics 41, 436-459.

See Also

fitCopula() for the underlying estimation procedure and gofCopula() for goodness-of-fit tests.

Examples

library(copula)
set.seed(123) # for reproducibility (seed value is arbitrary)

## A two-dimensional data example ----------------------------------
x <- rCopula(200, claytonCopula(3))

## Model (copula) selection -- takes time: each fits 200 copulas to 199 obs.
xvCopula(gumbelCopula(), x)
xvCopula(frankCopula(), x)
xvCopula(joeCopula(), x)
xvCopula(claytonCopula(), x)
xvCopula(normalCopula(), x)
xvCopula(tCopula(), x)
xvCopula(plackettCopula(), x)

## The same with 5-fold cross-validation [to save time ...]
set.seed(1) # k-fold is random (for k < n) !
xvCopula(gumbelCopula(),   x, k = 5)
xvCopula(frankCopula(),    x, k = 5)
xvCopula(joeCopula(),      x, k = 5)
xvCopula(claytonCopula(),  x, k = 5)
xvCopula(normalCopula(),   x, k = 5)
xvCopula(tCopula(),        x, k = 5)
xvCopula(plackettCopula(), x, k = 5)
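Since the family with the largest criterion is the one to select, the individual calls can be collected into a named vector and compared directly. The following is a minimal sketch of that workflow, assuming the copula package is installed; the object names families, xv and best are illustrative only, and only three families are used to keep the run time short.

```r
library(copula)

set.seed(1)                                   # data and k-fold blocks are random
x <- rCopula(200, claytonCopula(3))           # simulated two-dimensional sample

## Candidate families (illustrative subset)
families <- list(gumbel  = gumbelCopula(),
                 frank   = frankCopula(),
                 clayton = claytonCopula())

## 5-fold cross-validation criterion for each candidate, without progress bars
xv <- sapply(families, function(cop) xvCopula(cop, x, k = 5, verbose = FALSE))

xv                                            # one criterion value per family
best <- names(which.max(xv))                  # family maximizing the criterion
best
```

Because the blocks are drawn at random for each call when k is smaller than the sample size, the values can vary slightly between runs unless a seed is set.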
