evm(y, data, family, ...)
"evm"(y, data, family=gpd, th=-Inf, qu, ..., penalty = NULL, prior = "gaussian", method = "optimize", cov="observed", start = NULL, priorParameters = NULL, maxit = 10000, trace = NULL, iter = 40500, burn = 500, thin = 4, proposal.dist = c("gaussian", "cauchy"), jump.cov, jump.const=NULL, R=100, verbose = TRUE)
"print"(x, digits=max(3, getOption("digits") - 3), ...)
"summary"(object, nsim=1000, alpha=0.05, ...)
"plot"(x, main=rep(NULL, 4), xlab=rep(NULL, 4), nsim=1000, alpha=0.05, ...)
"AIC"(object, penalized=FALSE, ..., k=2)
"print"(x, print.seed=FALSE, ...)
"plot"(x, which.plots=1:3, density.adjust=2, print.seed=FALSE, ...)
texmexMetropolis(x, log.lik, proposals, verbose, trace)
texmexJumpConst(j, map)
data
. y
and any covariates. family=gpd
and a generalized Pareto distribution is
fit to the data. Alternatively the family could be gev
,
resulting in a generalized extreme value distribution being
fit. No other families are currently available in texmex,
but users may write their own.family=gpd
), the threshold for y
,
exceedances above which will be used to fit the upper tail
model. Note that if you have already thresholded your data
and want to model all of y
, you still need to specify
th
.th
, a probability defined
such that quantile(y,qu)
equals th
.evm
, formulae for the parameters in the
family, e.g. phi = ~ x
. If none are specified,
they all default to ~1
.penalty
is "gaussian" or "lasso" then
the parameters for the penalization are specified through
the priorParameters
argument. See below. Defaults to
penalty=NULL
and applies maximum likelihood estimation.method = "optimize"
, just an alternative way of
specifying the pentalty, and only one or neither of penalty
and prior
should be given. If method = "simulate"
,
prior must be "gaussian" because no other prior distributions
have been implemented.optim
and point estimates (either ML or MAP estimates) are returned with other
information. If "simulate", a Metropolis algorithm is used to simulate
from the joint posterior distribution of the parameters. If "bootstrap",
a parametric boostrap is performed.cov = "observed"
in which case the observed information matrix
is used, if the info
element of the texmexFamily
object
is present. Note that currently, this is not implemented for gev
. The only other
option is cov = "numeric"
in which case a numerical approximation
of the Hessian is used (see the help for optim
). In some cases,
particularly with small samples, the numerical approximation can be
quite different from the closed form (cov="observed"
)
result, and the value derived from the observed information
should be preferred. However, in either case, since the
underlying log-likelihood may be far from quadratic for small samples,
the resulting estimates of standard errors are liable to approximate
poorly the true standard errors. Also see the comments in
the Details section, below.optim
.
If not provided, the function will use the start
element
of the texmexFamily
object if it exists.method = "simulate"
then these represent the
parameters in the Gaussian prior distribution. If
method = 'optimize'
then these represent the
parameters in the penalty function.
If not supplied: all default prior means are zero;
all default prior variances are $10^4$; all covariances
are zero. optim
. method = "optimize"
,
the argument is passed into optim
-- see the help for that
function. If method = "simulate"
, the argument determines at
how many steps of the Markov chain the function should tell the user, and
in this case it defaults to trace = 10000
. method = "simulate"
.
Defaults to 40500.jump.const
.verbose=TRUE
.evmOpt
, evmSim
, summary.evmOpt
or summary.evmSim
returned by evmSim
or
summary.evmSim
. In texmexMetropolis
, x
is a matrix, the first row of which is used as the starting
point of the simulation.plot
method for class evmSim
, titles for
diagnostic plots. Should be a vector of length 4, with values
corresponding to the character strings to appear on the titles of
the pp, qq, return level, and density estimate plots respectively.main
but labels for x-axes rather than titles.plot
and summary
methods for class
evmSim
. The number of replicates to be simulated to produce
the simulated tolerance intervals. Defaults to nsim = 1000
.plot
and summary
methods for class evmSim
.
A 100(1 - alpha)% simulation envelope is produced. Defaults to
alpha = 0.05
penalized=FALSE
and uses the
true log-likelihood.k=2
.print.seed=FALSE
.plot
method for class evmSim
. Which plots
to produce. Option 1 gives kernel density estimates,
2 gives traces of the Markov chains with superimposed
cumulative means, 3 gives autocorrelation functions.
Defaults to which.plots=1:3
.plot
method for class evmSim
.
Passed into density
. Controls the amount of
smoothing of the kernel density estimate. Defaults to
density.adjust=2
.texmexJumpConst
, a model fit by maximum
(penalized) likelihood. The function is not intended to be called by
the end user.method = "bootstrap"
is requested. Defaults to 100.method = "optimize"
, an object of class evmOpt
:evmSim
that produced the object.data
is a list with elements y
and D
. y
is the response variable and D
is a list containing the design
matrices implied by any formlae used in the call to evm
.optim
relating to whether or
not the optimizer converged.texmexFamily
object, if any.penalty='none'
is used, this will be identical to ploglik
, above.predict
method to ensure all factor levels are known, even if they don't appear in
newdata
.method = "simulate"
, an object of class evmSim
:evmSim
that produced the object.evmOpt
and methods for this
class (such as resid and plot) may be useful.method = "bootstrap"
, an object of class evmBoot
:evmBoot
that produced the object.evmOpt
and methods for this
class (such as resid and plot) may be useful.texmex
, the main modelling
function was gpd
and models using the generalized
Pareto distribution (GPD) were used. In this version, the main
modelling function is now evm
(extreme value model)
and the distribution to be used is specified through the
family
argument. Object gpd
is now an object of class texmexFamily
.
Currently, the only other texmexFamily
object available
is gev
which results in fitting a generalized extreme
value (GEV) distribution to the data.
See Coles (2001) for an introduction to extreme value modelling and the GPD and GEV models.
For the GPD model, we use the following parameterisation of evm:
$$P(Y \le y) = 1 - \left(1 + \frac{\xi y}{\sigma}\right)^{-1/\xi}$$ for $y \ge 0$ and $1 + \xi y / \sigma \ge 0.$
For the GEV model, we use: $$P(Y \le y) = \exp \left\{ -\left[ 1 + \xi \left( \frac{y - \mu}{\sigma}\right) \right] ^{-1/\xi} \right\} $$
In each case, the scale parameter is sigma ($\sigma$) and the shape parameter is xi ($\xi$). The GEV distribution also has location parameter mu ($\mu$).
Working with the log of the scale parameter improves the stability of computations, makes a quadratic penalty more appropriate and enables the inclusion of covariates in the model for the scale parameter, which must remain positive. We therefore work with $\phi$=log($\sigma$). All specification of priors or penalty functions refer to $\phi$ rather than $\sigma$. A quadratic penalty can be thought of as a Gaussian prior distribution, whence the terminology of the function.
Parameters of the evm are estimated by using maximum (penalized)
likelihood (method = "optimize"
), or by simulating from the
posterior distribution of the model parameters using a
Metropolis algorithm (method = "simulate"
). In the
latter case, start
is used as a starting value for the
Metropolis algorithm; in its absence, the maximum penalized likelhood
point estimates are computed and used.
A boostrap approach is also available (method = "bootstrap"
).
This runs a parametric bootstrap, simulating from the model fit
by optimization.
When method = "optimize"
, the plot
function produces
diagnostic plots for the fitted model. These differ
depending on whether or not there are covariates in the model. If there
are no covariates then the diagnostic plots are PP- and QQ-plots, a
return level plot (produced by plotrl.evmSim
) and a histogram
of the data with superimposed density estimate. These are all calculated
using the data on the original scale. If there are covariates in the
model then the diagnostics consist of PP- and QQ- plots calculated
by using the model residuals (which will be standard exponential
devaiates under the GPD model and standard Gumbel deviates under
the GEV model), and plots of residuals versus fitted model parameters.
The PP- and QQ-plots show simulated pointwise tolerance intervals.
The region is a 100(1 - alpha)% region based on nsim
simulated
samples.
When method = "simulate"
the plot
function produces
diagnostic plots for the Markov chains used to simulate from the posterior
distributions for the model parameters. If the chains have converged on
the posterior distributions, the trace plots should look like "fat hairy
caterpillars" and their cumulative means should converge rapidly. Moreover,
the autocorrelation functions should converge quickly to zero.
When method = "bootstrap"
, the plot
function plots
the bootstrap distributions of the parameters.
When method = "simulate"
the print
and summary
functions
give posterior means and standard deviations. Posterior means are also
returned by the coef
method. Depending on what you want to do
and what the posterior distributions look like (use plot
method)
you might want to work with quantiles of the posterior distributions
instead of relying on standard errors.
When method = "bootstrap"
, summaries of the bootstrap distribution
and the bootstrap estimate of bias are displayed.
rl.evmOpt
, predict.evmOpt
,
evm.declustered
. mod <- evm(rain, th=30)
mod
par(mfrow=c(2, 2))
plot(mod)
# mod <- evm(rain, th=30, method="sim")
# par(mfrow=c(3, 2))
# plot(mod)
mod <- evm(SeaLevel, data=portpirie, family=gev)
mod
plot(mod)
# mod <- evm(SeaLevel, data=portpirie, family=gev, method="sim")
# par(mfrow=c(3, 3))
# plot(mod)
Run the code above in your browser using DataLab