esaddle (version 0.0.6)

selectDecay: Tuning the Extended Empirical Saddlepoint (EES) density by cross-validation

Description

Performs k-fold cross-validation to choose the EES's tuning parameter, which determines the mixture between a consistent and a Gaussian estimator of the Cumulant Generating Function (CGF).

Usage

selectDecay(
  decay,
  simulator,
  K,
  nrep = 1,
  normalize = FALSE,
  draw = TRUE,
  multicore = !is.null(cluster),
  cluster = NULL,
  ncores = detectCores() - 1,
  control = list(),
  ...
)

Arguments

decay

Numeric vector containing the possible values of the tuning parameter.

simulator

Function with prototype function(...) that will be called nrep times to simulate d-dimensional random variables. Each time simulator is called, it will return an n by d matrix.
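As a minimal sketch of what such a simulator could look like, the function below returns an n by d matrix with independent gamma columns. The name simGamma and its arguments n, d, shape and rate are hypothetical and chosen only for illustration; they are not part of esaddle.

```r
# Hypothetical simulator: n draws of a d-dimensional variable with
# independent gamma margins, returned as an n by d matrix.
# The trailing `...` absorbs any extra arguments it does not use.
simGamma <- function(n = 400, d = 2, shape = 2, rate = 1, ...) {
  matrix(rgamma(n * d, shape = shape, rate = rate), nrow = n, ncol = d)
}

X <- simGamma(n = 100, d = 3)
dim(X)  # rows are draws, columns are dimensions
```

Under the documented interface, arguments such as n or d would be supplied through the ... argument of selectDecay, which forwards them to simulator.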

K

the number of folds to be used in cross-validation.
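The following base R sketch illustrates the general idea of splitting a simulated sample into K folds. It is an illustration of K-fold splitting only and is not taken from the esaddle source; the variable names (foldId, train, test) are assumptions.

```r
# Illustration of a K-fold split (not esaddle's internal code).
n <- 400
K <- 5
set.seed(1)
X <- matrix(rgamma(n, 2, 1), ncol = 1)   # an n by 1 simulated sample

# Assign each row to one of K folds of equal size, in random order
foldId <- sample(rep(seq_len(K), length.out = n))

for (k in seq_len(K)) {
  train <- X[foldId != k, , drop = FALSE]  # used to fit the density
  test  <- X[foldId == k, , drop = FALSE]  # used to score the fitted density
  # ... fit on `train`, evaluate the negative log-likelihood on `test` ...
}

table(foldId)  # number of observations in each fold
```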

nrep

Number of times the whole cross-validation procedure will be repeated, by calling simulator to generate random variables and computing the cross-validation score for every element of the decay vector.

normalize

if TRUE the normalizing constant of EES is computed at each value of decay. FALSE by default.

draw

if TRUE the results of cross-validation will be plotted. TRUE by default.

multicore

if TRUE each fold will run on a different core.

cluster

an object of class c("SOCKcluster", "cluster"). This allows the user to pass her own cluster, which will be used if multicore == TRUE. The user has to remember to stop the cluster.
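A hedged sketch of passing a user-created cluster is given below. It assumes the parallel package is used to create the socket cluster, and the number of workers (2) is arbitrary. The call to selectDecay is guarded so that the snippet still runs when esaddle is not installed, and tryCatch(..., finally = ) makes sure the cluster is stopped even if the call fails.

```r
library(parallel)

# Create a socket cluster with 2 workers (the number is arbitrary here)
cl <- makeCluster(2)
class(cl)  # "SOCKcluster" "cluster"

tryCatch({
  if (requireNamespace("esaddle", quietly = TRUE)) {
    # multicore defaults to !is.null(cluster), so it is TRUE here
    res <- esaddle::selectDecay(decay = c(0.01, 0.1, 1),
                                simulator = function(...) rgamma(400, 2, 1),
                                K = 5,
                                draw = FALSE,
                                cluster = cl)
  }
}, finally = stopCluster(cl))  # the user is responsible for stopping it
```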

ncores

number of cores to be used.

control

a list of control parameters, with entries:

  • method The method used to calculate the normalizing constant. Either "LAP" (Laplace approximation) or "IS" (importance sampling).

  • tol The tolerance used to assess the convergence of the solution to the saddlepoint equation. The default is 1e-6.

  • nNorm Number of simulations to be used in order to estimate the normalizing constant of the saddlepoint density. By default equal to 1e3.

  • ml if method == "IS", nNorm random variables are generated from a Gaussian importance density with covariance matrix ml*cov(X). By default the inflation factor is ml=2.
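The control entries above can be collected in a plain list, as sketched below. The second part of the snippet is a one-dimensional illustration of importance sampling with an inflated Gaussian proposal; it is not esaddle's internal code, and the target g is a toy unnormalized density chosen because its true normalizing constant, sqrt(2 * pi), is known.

```r
# Control list using the documented entry names and default values,
# except method, which is set to "IS" here
ctrl <- list(method = "IS", tol = 1e-6, nNorm = 1000, ml = 2)

# Toy illustration (an assumption, not the package's implementation):
# estimate the normalizing constant of g by importance sampling.
set.seed(1)
x <- rnorm(400)                     # stand-in for the simulated sample X
g <- function(y) exp(-y^2 / 2)      # unnormalized density

propMean <- mean(x)
propSd   <- sqrt(ctrl$ml * var(x))  # proposal variance inflated by ml

y <- rnorm(ctrl$nNorm, mean = propMean, sd = propSd)
w <- g(y) / dnorm(y, mean = propMean, sd = propSd)  # importance weights
normConstHat <- mean(w)             # estimate of the normalizing constant

# Passing the list to selectDecay, guarded in case esaddle is not installed
if (requireNamespace("esaddle", quietly = TRUE)) {
  res <- esaddle::selectDecay(decay = c(0.01, 0.1, 1),
                              simulator = function(...) rgamma(400, 2, 1),
                              K = 5, normalize = TRUE, draw = FALSE,
                              control = ctrl)
}
```

Inflating the proposal covariance gives the importance density heavier coverage of the tails than the target, which tends to keep the importance weights bounded.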

...

extra arguments to be passed to simulator.

Value

A list with entries:

  • negLogLik A matrix length(decay) by K*nrep where the i-th row represents the negative log-likelihood estimated for the i-th value of decay, while each column represents a different fold and repetition.

  • summary A matrix of summary results from the cross-validation procedure.

  • normConst A matrix length(decay) by nrep where the i-th row contains the estimates of the normalizing constant.

The list is returned invisibly. If draw == TRUE the function will also plot the cross-validation curve.
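Because the list is returned invisibly, it has to be assigned to be inspected, for example res <- selectDecay(...). The sketch below uses a mock matrix with the documented shape of negLogLik (it is random noise, not real output) to show one natural way of summarising the scores: average each row over folds and repetitions, then pick the decay value with the lowest average. Whether this matches the contents of the summary entry is an assumption.

```r
decay <- c(0.001, 0.01, 0.05, 0.1, 0.5, 1)
K <- 5
nrep <- 1

# Mock stand-in for res$negLogLik: length(decay) rows, K * nrep columns.
# The values are arbitrary and serve only to show the shape.
set.seed(1)
negLogLik <- matrix(rnorm(length(decay) * K * nrep, mean = 600, sd = 5),
                    nrow = length(decay), ncol = K * nrep)

# Average score for each candidate value of decay
avgScore <- rowMeans(negLogLik)

# Candidate with the lowest average negative log-likelihood
bestDecay <- decay[which.min(avgScore)]
bestDecay
```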

References

Fasiolo, M., Wood, S. N., Hartig, F. and Bravington, M. V. (2016). An Extended Empirical Saddlepoint Approximation for Intractable Likelihoods. ArXiv http://arxiv.org/abs/1601.01849.

Examples

# NOT RUN {
library(esaddle)
# The data is far from normal: saddlepoint is needed and we expect 
# cross validation to be minimized at low "decay"
set.seed(4124)
selectDecay(decay = c(0.001, 0.01, 0.05, 0.1, 0.5, 1), 
            simulator = function(...) rgamma(400, 2, 1), 
            K = 5)
            
# The data is normal: saddlepoint is not needed and we expect 
# the curve to be fairly flat for high "decay"
selectDecay(decay = c(0.001, 0.01, 0.05, 0.1, 0.5, 1), 
            simulator = function(...) rnorm(400, 0, 1), 
            K = 5)

# }
