bcpglm-bcpglmm-mcmcsamp: Bayesian Compound Poisson Linear Models

Description

These functions fit Tweedie compound Poisson linear models using Markov Chain Monte Carlo methods.

Usage

bcpglm(formula, link = "log", data, inits = NULL, 
  weights, offset, subset, na.action, contrasts = NULL, 
  n.chains = 3, n.iter = 2000, n.burnin = floor(n.iter / 2), 
  n.thin = max(1, floor(n.chains * (n.iter - n.burnin) / n.sims)), 
  n.sims = 1000, n.report = 1000, prior.beta.mean, prior.beta.var, 
  bound.phi = 100, bound.p = c(1.01, 1.99), method = "dtweedie", 
  tune.iter = 4000, n.tune = 10, tune.weight = 0.25,...)
  
bcpglmm(formula, link = "log", data, inits = NULL, weights, 
  offset, subset, na.action, contrasts = NULL, n.chains = 3, 
  n.iter = 2000, n.burnin = floor(n.iter / 2), 
  n.thin = max(1, floor(n.chains * (n.iter - n.burnin) / n.sims)), 
  n.sims = 1000, n.report = 1000, prior.beta.mean = NULL, 
  prior.beta.var = NULL, bound.phi = 100, bound.p = c(1.01, 1.99), 
  prior.Sigma = NULL,  tune.iter = 4000, n.tune = 10, tune.weight = 0.25,
  basisGenerators = c("tp","tpU","bsp","sp2d"),
  method = "dtweedie",...)
    
## S3 method for class 'cpglm':
mcmcsamp(object, inits = NULL, n.chains = 3, 
  n.iter = 2000, n.burnin = floor(n.iter / 2),
  n.thin = max(1, floor(n.chains * (n.iter - n.burnin) / n.sims)),
  n.sims = 1000, n.report = 1000, 
  prior.beta.mean = NULL, prior.beta.var = NULL, 
  bound.phi = 100, bound.p = c(1.01,1.99),  
  tune.iter = 4000, n.tune = 10, tune.weight = 0.25,
  method = "dtweedie",...)
## S3 method for class 'cpglmm':
mcmcsamp(object, inits = NULL, n.chains = 3, 
  n.iter = 2000, n.burnin = floor(n.iter / 2),
  n.thin = max(1, floor(n.chains * (n.iter - n.burnin) / n.sims)),
  n.sims = 1000, n.report = 1000, 
  prior.beta.mean = NULL, prior.beta.var = NULL, 
  bound.phi = 100, bound.p = c(1.01,1.99),  prior.Sigma = NULL,
  tune.iter = 4000, n.tune = 10, tune.weight = 0.25,
  method = "dtweedie",...)

Arguments

object

a cpglm or cpglmm object.

formula

an object of class formula. For bcpglm, it is the same as in glm. For bcpglmm, it is a two-sided linear formula object describing the fixed-effects part of the model,

link

a specification for the model link function. This can be either a literal character string or a numeric number. If it is a character string, it must be one of "log", "identity", "sqrt" or "inverse". If it is numeric, it is the same as the link.power

inits

a list of initial values to be used for each chain. It must be of length n.chains. In bcpglm, it is a named list with three components 'beta' (fixed effects), 'phi' (dispersion) and 'p' (index parameter) for each element of the l

data, subset, weights, na.action, offset, contrasts

further model specification arguments as in cpglm; see there for details.

n.chains

an integer indicating the number of Markov chains (default: 3).

n.iter

number of total iterations per chain (including burn in; default: 2000)

n.burnin

length of burn in, i.e. number of iterations to discard at the beginning. Default is n.iter/2, that is, discarding the first half of the simulations.

n.thin

thinning rate. Must be a positive integer. Set n.thin > 1 to save memory and computation time if n.iter is large. Default is max(1, floor(n.chains*(n.iter-n.burnin) / 1000)) which will only thin if there are at least 2000 simulations.

n.sims

The approximate number of simulations to keep after thinning.

n.report

if greater than zero, fitting information will be printed out after each n.report iterations.

prior.beta.mean

a vector of prior means for the model coefficients. Default is a vector of zeros.

prior.beta.var

a vector of prior variance for the model coefficients. Default is a vector of 10000's.

prior.Sigma

a list that specifies the prior for each of the variance components.

bound.phi

upper bound of the uniform prior for the dispersion parameter phi. The default is 100.

bound.p

a vector of lower and upper bound for the index parameter $p$. The default is c(1.01,1.99).

method

determines how the MCMC is implemented. If method="dtweedie" (the default), then full conditionals are computed using numerical methods to approximate the tweedie density. If method="latent", then the latent Poisson variables are

tune.iter

number of iterations used for tuning the variance for the Normal proposal distribution. These iterations will not be used in the final output. Default is 4000, and set it to be zero if the tuning process is not desired.

n.tune

if positive, the tune.iter iterations will be divided into n.tune loops. Default is 10.

tune.weight

the weight to be given to the old covariance matrix as opposed to the covariance matrix estimated from simulated samples. Default is 0.25.

basisGenerators

a character vector of names of functions that generate spline bases. See tp for details.

...

not used.

Value

bcpglm returns an object of class "bcpglm", and bcpglmm returns an object of class "bcpglmm". See bcpglm-class and bcpglmm-class for details of the return values as well as various methods available for this class.

eqn

$\phi$

code

coda

Details

These functions provide Monte Carlo Markov Chain [MCMC] methods for fitting Tweedie compound Poisson linear models within the Bayesian framework. bcpglm handles generalized linear models while bcpglmm fits the mixed models. mcmcsamp takes an existing cpglm or cpglmm object and runs MCMC simulations based on the existing model structure and estimated parameters.

Two methods are provided here: one relies on the capability to approximate the Tweedie density (i.e., the normalizing constant involving $\phi$ and $p$), the other relies on the use of latent Poisson variables to avoid direct evaluation of the Tweedie density (method="latent").

In a Bayesian model, prior distributions have to be specified for all parameters in the model. Here, Normal distributions are used for the model coefficients ($\beta$), a Uniform distribution for the dispersion parameter ($\phi$), a Uniform distribution for the index parameter ($p$), and for the variance component, Inverse-Gamma (igamma(scale=0.001, shape=0.001)) if there is one variance component parameter, or Inverse-Wishart (iwish(df=3,scale=diag(1,df))) if there are more than one parameters. Prior means and variances for the model coefficients can be supplied using the argument prior.beta.mean and prior.beta.var, respectively. The prior distribution for $\phi$ is uniform on (0, bound.phi). And the bounds of the Uniform for $p$ can be specified in the argument bound.p. Prior distribution for the variance components can be changed in the prior.Sigma argument.See details in Section 'Arguments'.

In implementing the MCMC, a Gibbs sampler is constructed as follows:

Ifmethod="latent", simulation of the latent Poisson variables relies on rejection sampling, where zero-truncated Poisson distributions are used as proposals.

Fixed effects and random effects are updated block-wise using the Metropolis-Hastings algorithm, where a random-walk multivariate Normal proposal distribution is used. Updates of the index parameter and the dispersion parameter rely on univariate Metropolis-Hastings, where a truncated Normal distribution is used as a proposal.

Examples

Run this code

# fit the fineroot data with Bayesian models
# first use the latent variable approach
set.seed(10)
fit1 <- bcpglm(RLD~ factor(Zone)*factor(Stock), data=fineroot,
               n.iter=11000, n.burnin=1000,
               n.thin=10, n.report=5000, method="latent")
gelman.diag(fit1$sims.list)
# diagnostic plots                             
acfplot(fit1$sims.list,lag.max=20)
xyplot(fit1$sims.list)                              
densityplot(fit1$sims.list)               

# summary results
summary(fit1)                             

# now try direct density evaluation (This is much slower)
set.seed(10)
fit2 <- bcpglm(RLD~ factor(Zone)*factor(Stock), data=fineroot,
               n.iter=11000, n.burnin=1000,
               n.thin=10, n.report=5000)
gelman.diag(fit2$sims.list)
summary(fit2)                             


# now fit the Bayesian model to an insurance loss triangle 
# (see Peters et al. 2009)
# MLE estimate
fit3 <- cpglm(increLoss~ factor(year)+factor(lag), data=insLoss)

fit4 <- mcmcsamp(fit3, tune.iter=5000, n.tune = 10,
               n.iter=11000, n.burnin=1000,
               n.thin=10, n.report=5000, bound.p=c(1.1,1.95))
gelman.diag(fit4$sims.list)
summary(fit4)                             

# mixed models 

set.seed(10)
fit5 <- bcpglmm(RLD~Zone*Stock+(1|Plant), data = fineroot,  
                   n.chains=3, n.iter=25000, n.burnin=5000,
                   n.sims=2000, n.report=5000)
gelman.diag(fit5$sims.list)                   
summary(fit5)

# now change the prior distribution for the random component
fit6 <- bcpglmm(RLD~Zone*Stock+(1|Plant), data = fineroot,  
                   n.chains=3, n.iter=25000, n.burnin=5000,
                   n.sims=2000, n.report=5000, tune.iter=6000, n.tune=20,                  
                   prior.Sigma = list(igamma(scale=0.01,shape=0.01)))
summary(fit6)                   

# now use latent variables and mcmcsamp
fit7 <- cpglmm(RLD~Zone*Stock+(1|Plant), data = fineroot)   
fit8 <- mcmcsamp(fit7, n.chains=3, n.iter=25000, n.burnin=5000,
                   n.sims=2000, n.report=5000, tune.iter=6000, n.tune=20,                  
                   prior.Sigma = list(igamma(scale=0.01,shape=0.01)),
                   method = "latent")
summary(fit8)

Run the code above in your browser using DataLab