fitmaxstab: Fits a max-stable process to data

Description

This function fits max-stable processes to data using pairwise likelihood. Two max-stable characterisations are available: the Smith and Schlather representations.

Usage

fitmaxstab(data,  coord, cov.mod, loc.form, scale.form,
shape.form, marg.cov = NULL, temp.cov = NULL, temp.form.loc = NULL,
temp.form.scale = NULL, temp.form.shape = NULL, iso = FALSE, ...,
fit.marge = FALSE, warn = TRUE, method = "Nelder", start, control =
list(), weights = NULL, corr = FALSE, check.grad
= FALSE)

Arguments

data

A matrix representing the data. Each column corresponds to one location.

coord

A matrix that gives the coordinates of each location. Each row corresponds to one location.

cov.mod

A character string corresponding to the covariance model in the max-stable representation. Must be one of "gauss" for the Smith's model; "whitmat", "cauchy", "powexp", "bessel" or "caugen" for the Whittle-Matern, the Cauchy, the Powered Expone

loc.form, scale.form, shape.form

R formulas defining the spatial linear model for the GEV parameters. May be missing. See section Details.

marg.cov

Matrix with named columns giving additional covariates for the GEV parameters. If NULL, no extra covariates are used.

temp.cov

Matrix with names columns giving additional *temporal* covariates for the GEV parameters. If NULL, no temporal trend are assume for the GEV parameters --- see section Details.

temp.form.loc, temp.form.scale, temp.form.shape

R formulas defining the temporal trends for the GEV parameters. May be missing. See section Details.

iso

Logical. If TRUE an isotropic model is fitted to data. Otherwise (default), anisotropy is allowed. Currently, this is only implemented for the Smith's model.

...

Several arguments to be passed to the optim, nlm or nlminb functions. See section details.

fit.marge

Logical. If TRUE, the GEV parameters are estimated pointwise or using the formulas given by loc.form, scale.form and shape.form. If FALSE, observations are supposed to be unit Fr

warn

Logical. If TRUE (default), users are warned if the log-likelihood is infinite at starting values and/or problems arised while computing the standard errors.

method

The method used for the numerical optimisation procedure. Must be one of BFGS, Nelder-Mead, CG, L-BFGS-B, SANN, nlm or nlminb. See

start

A named list giving the initial values for the parameters over which the pairwise likelihood is to be minimized. If start is omitted the routine attempts to find good starting values - but might fail.

control

A list giving the control parameters to be passed to the optim function.

weights

A numeric vector specifying the weights in the pairwise likelihood - and so has length the number of pairs. If NULL (default), no weighting scheme is used.

corr

Logical. If TRUE (non default), the asymptotic correlation matrix is computed.

check.grad

Logical. If TRUE (non default), the analytic gradient of the pairwise likelihood will be compared to the numerical one. Such a checking might be usefull for ill-conditionned situation diagnosis.

Value

This function returns a object of class maxstab. Such objects are list with components:
fitted.valuesA vector containing the estimated parameters.
std.errA vector containing the standard errors.
fixedA vector containing the parameters of the model that have been held fixed.
paramA vector containing all parameters (optimised and fixed).
devianceThe (pairwise) deviance at the maximum pairwise likelihood estimates.
corrThe correlation matrix.
convergence, counts, messageComponents taken from the list returned by optim - for the mle method.
dataThe data analysed.
modelThe max-stable characterisation used.
fit.margeA logical that specifies if the GEV margins were estimated.
cov.funThe estimated covariance function - for the Schlather model only.
extCoeffThe estimated extremal coefficient function.
cov.modThe covariance model for the spatial structure.

Warning

When using reponse surfaces to model spatially the GEV parameters, the likelihood is pretty rough so that the general purpose optimization routines may fail. It is your responsability to check if the numerical optimization succeeded or not. I tried, as best as I can, to provide warning messages if the optimizers failed but in some cases, no warning will appear!

Details

As spatial data often deal with a large number of locations, it is impossible to write analytically the joint distribution. Consequently, the fitting procedure substitutes the "full likelihood" for the pairwise likelihood.

Let define $L_{i,j}(x_{i,j}, \theta)$ the likelihood for site $i$ and $j$, where $i = 1, \dots, N-1$, $j = i+1, \dots, N$, $N$ is the number of site within the region and $x_{i,j}$ are the joint observations for site $i$ and $j$. Then the pairwise likelihood $PL(\theta)$ is defined by:

$$\ell_P = \log PL(\theta) = \sum_{i = 1}^{N-1} \sum_{j=i+1}^{N} \log L_{i,j} (x_{i,j}, \theta)$$

As pairwise likelihood is an approximation of the ``full likelihood'', standard errors cannot be computed directly by the inverse of the Fisher information matrix. Instead, a sandwich estimate must be used to account for model mispecification e.g.

$$\hat{\theta} \sim N(\theta, H^{-1} J H^{-1})$$ where $H$ is the Fisher information matrix (computed from the misspecified model) and $J$ is the variance of the score function. There are two different kind of covariates : "spatial" and "temporal".

A "spatial" covariate may have different values accross station but does not depend on time. For example the coordinates of the stations are obviously "spatial". These "spatial" covariates should be used with the marg.cov and loc.form, scale.form, shape.form.

A "temporal" covariates may have different values accross time but does not depend on space. For example the years where the annual maxima were recorded is "temporal". These "temporal" covariates should be used with the temp.cov and temp.form.loc, temp.form.scale, temp.form.shape.

As a consequence note that marg.cov must have K rows (K being the number of sites) while temp.cov must have n rows (n being the number of observations).

References

Cox, D. R. and Reid, N. (2004) A note on pseudo-likelihood constructed from marginal densities. Biometrika 91, 729--737.

Demarta, S. and McNeil, A. (2005) The t copula and Related Copulas International Statistical Review 73, 111-129.

Gholam--Rezaee, M. (2009) Spatial extreme value: A composite likelihood. PhD Thesis. Ecole Polytechnique Federale de Lausanne.

Kabluchko, Z., Schlather, M. and de Haan, L. (2009) Stationary max-stable fields associated to negative definite functions Annals of Probability 37:5, 2042--2065.

Padoan, S. A. (2008) Computational Methods for Complex Problems in Extreme Value Theory. PhD Thesis. University of Padova.

Padoan, S. A., Ribatet, M. and Sisson, S. A. (2010) Likelihood-based inference for max-stable processes. Journal of the American Statistical Association (Theory and Methods) 105:489, 263-277.

Schlather, M. (2002) Models for Stationary Max-Stable Random Fields. Extremes 5:1, 33--44.

Smith, R. L. (1990) Max-stable processes and spatial extremes. Unpublished.

Examples

Run this code

##Define the coordinate of each location
n.site <- 30
locations <- matrix(runif(2*n.site, 0, 10), ncol = 2)
colnames(locations) <- c("lon", "lat")

##Simulate a max-stable process - with unit Frechet margins
data <- rmaxstab(40, locations, cov.mod = "whitmat", nugget = 0, range = 30,
smooth = 0.5)

##Now define the spatial model for the GEV parameters
param.loc <- -10 + 2 * locations[,2]
param.scale <- 5 + 2 * locations[,1] + locations[,2]^2
param.shape <- rep(0.2, n.site)

##Transform the unit Frechet margins to GEV
for (i in 1:n.site)
  data[,i] <- frech2gev(data[,i], param.loc[i], param.scale[i],
param.shape[i])

##Define a model for the GEV margins to be fitted
##shape ~ 1 stands for the GEV shape parameter is constant
##over the region
loc.form <- loc ~ lat
scale.form <- scale ~ lon + I(lat^2)
shape.form <- shape ~ 1

##Fit a max-stable process using the Schlather's model
fitmaxstab(data, locations, "whitmat", loc.form, scale.form,
           shape.form)

## Model without any spatial structure for the GEV parameters
## Be careful this could be *REALLY* time consuming
fitmaxstab(data, locations, "whitmat")

##  Fixing the smooth parameter of the Whittle-Matern family
##  to 0.5 - e.g. considering exponential family. We suppose the data
##  are unit Frechet here.
fitmaxstab(data, locations, "whitmat", smooth = 0.5, fit.marge = FALSE)

##  Fitting a penalized smoothing splines for the margins with the
##     Smith's model
data <- rmaxstab(40, locations, cov.mod = "gauss", cov11 = 100, cov12 =
                 25, cov22 = 220)

##     And transform it to ordinary GEV margins with a non-linear
##     function
fun <- function(x)
  2 * sin(pi * x / 4) + 10
fun2 <- function(x)
  (fun(x) - 7 ) / 15

param.loc <- fun(locations[,2])
param.scale <- fun(locations[,2])
param.shape <- fun2(locations[,1])

##Transformation from unit Frechet to common GEV margins
for (i in 1:n.site)
  data[,i] <- frech2gev(data[,i], param.loc[i], param.scale[i],
param.shape[i])

##Defining the knots, penalty, degree for the splines
n.knots <- 5
knots <- quantile(locations[,2], prob = 1:n.knots/(n.knots+1))
knots2 <- quantile(locations[,1], prob = 1:n.knots/(n.knots+1))

##Be careful the choice of the penalty (i.e. the smoothing parameter)
##may strongly affect the result Here we use p-splines for each GEV
##parameter - so it's really CPU demanding but one can use 1 p-spline
##and 2 linear models.
##A simple linear model will be clearly faster...
loc.form <- y ~ rb(lat, knots = knots, degree = 3, penalty = .5)
scale.form <- y ~ rb(lat, knots = knots, degree = 3, penalty = .5)
shape.form <- y ~ rb(lon, knots = knots2, degree = 3, penalty = .5)

fitted <- fitmaxstab(data, locations, "gauss", loc.form, scale.form, shape.form,
                     control = list(ndeps = rep(1e-6, 24), trace = 10),
                     method = "BFGS")
fitted
op <- par(mfrow=c(1,3))
plot(locations[,2], param.loc, col = 2, ylim = c(7, 14),
     ylab = "location parameter", xlab = "latitude")
plot(fun, from = 0, to = 10, add = TRUE, col = 2)
points(locations[,2], predict(fitted)[,"loc"], col = "blue", pch = 5)
new.data <- cbind(lon = seq(0, 10, length = 100), lat = seq(0, 10, length = 100))
lines(new.data[,1], predict(fitted, new.data)[,"loc"], col = "blue")
legend("topleft", c("true values", "predict. values", "true curve", "predict. curve"),
       col = c("red", "blue", "red", "blue"), pch = c(1, 5, NA, NA), inset = 0.05,
       lty = c(0, 0, 1, 1), ncol = 2)

plot(locations[,2], param.scale, col = 2, ylim = c(7, 14),
     ylab = "scale parameter", xlab = "latitude")
plot(fun, from = 0, to = 10, add = TRUE, col = 2)
points(locations[,2], predict(fitted)[,"scale"], col = "blue", pch = 5)
lines(new.data[,1], predict(fitted, new.data)[,"scale"], col = "blue")
legend("topleft", c("true values", "predict. values", "true curve", "predict. curve"),
       col = c("red", "blue", "red", "blue"), pch = c(1, 5, NA, NA), inset = 0.05,
       lty = c(0, 0, 1, 1), ncol = 2)

plot(locations[,1], param.shape, col = 2,
     ylab = "shape parameter", xlab = "longitude")
plot(fun2, from = 0, to = 10, add = TRUE, col = 2)
points(locations[,1], predict(fitted)[,"shape"], col = "blue", pch = 5)
lines(new.data[,1], predict(fitted, new.data)[,"shape"], col = "blue")
legend("topleft", c("true values", "predict. values", "true curve", "predict. curve"),
       col = c("red", "blue", "red", "blue"), pch = c(1, 5, NA, NA), inset = 0.05,
       lty = c(0, 0, 1, 1), ncol = 2)
par(op)

Run the code above in your browser using DataLab