Renouv: Fit a 'Renouvellement' model

Description

Fit a 'renouvellement' POT model using Over the Threshold data and possibly historical data of two kinds.

Usage

Renouv(x,
          threshold = NULL,
          effDuration = NULL,
          distname.y = "exponential",
          MAX.data = NULL,
          MAX.effDuration = NULL,
          OTS.data = NULL,
          OTS.effDuration = NULL,
          OTS.threshold = NULL,
          fixed.par.y = NULL,
          start.par.y = NULL,
          force.start.H = FALSE,
          numDeriv = TRUE,
          trans.y = NULL,
          jitter.KS = TRUE,
          pct.conf = c(95, 70),
          rl.prob = NULL,
          prob.max = 1.0-1e-04 ,
          pred.period = NULL,
          suspend.warnings = TRUE,
          control = list(maxit = 300, fnscale = -1),
          control.H = list(maxit = 300, fnscale = -1),
          trace = 0,
          plot = TRUE,
          ...)

Arguments

Either a numeric vector or an object of the class "Rendata". In the first case, x contains all the levels above the threshold for a variable of interest. In the second case, most formal arguments take values in accord

threshold

Value of the threshold for the OT data.

effDuration

Effective duration, i.e. duration of the OT period .

distname.y

Name of the distribution for the exceedances over the threshold.

MAX.data

Either numeric vector or a list of numeric vectors representing historical data $r$-max by blocks. When a vector is given, there is only one block, and the data are the corresponding $r$-max observed levels where $r$ is the vector length;

MAX.effDuration

Vector of (effective) durations, one by block MAX data.

OTS.data

A numeric vector or a list of numeric vectors representing supplementary Over Threshold data in blocks. When a vector is given, there is only one block, and the data contain all the 'historical' levels over the corresponding threshold given in

OTS.effDuration

A numeric vector giving the (effective) durations for the OTS blocks.

OTS.threshold

A vector giving the thresholds for the different OTS blocks.

fixed.par.y

Named list of known (or fixed) parameter values for the y-distribution.

start.par.y

Named list of parameter initial values for the y-distribution. Only used when the distribution does not belong to the list of special distributions.

force.start.H

Logical. When TRUE, the values in start.par.y (which must then be correct) will be used also as starting values in the maximisation of the global likelihood : OT data and historical data. This is useful e.g. when the

numDeriv

Logical: should the hessian be computed using the numDeriv package (value TRUE) or should it be taken from the results of optim?

trans.y

Transformation of the levels before thresholding (if not NULL). This is only possible with the "exponential" value distname.y. The two allowed choices are "square" and "log"

jitter.KS

Logical. When set to TRUE, a small amount of
    noise is added to the "OT" data used in the Kolmogorov-Smirnov test
    in order to remove ties. This is done using the OTjitter function.

pct.conf

Character or numeric vector specifying the
    percentages for the confidence (bilateral) limits on quantiles.

rl.prob

Vector of probabilities for the computation of return levels.
    These are used in plots (hence must be dense enough) and appear on output
    in the data.frame ret.lev.

prob.max

Max value of probability for return level and confidence limits
    evaluations. This argument 'shortens' the default
    prob vector: values > prob.max in the default
    prob vector are omitted. Ignored when a

pred.period

A vector of "pretty" periods at which return level
    and probability will be evaluated and returned in the pred
    data.frame.

suspend.warnings

Logical. If TRUE, the warnings will be suspended
    during optimisation steps. This is useful when the parameters
    are subject to constraints as is usually the case.

control

A named list used in optim for the
    no-history stage (if any). Note that fnscale = -1 says that 
    maximisation is required (not minimisation) and must not be changed!

control.H

A named list used in optim for the historical stage (if any).

trace

Level of verbosity. Value 0 prints nothing.

plot

Draw a return level plot?

...

Arguments passed to plot.Renouv, e.g. main,
    ylim.

`Value`

An object with class "Renouv". This is mainly a list with the various
  results.
est.NEstimate(s) for the count "N" part. This estimate
    does not use the historical data, even if is available.
est.yEstimate(s) for the exceedance "y" part. This
    estimate does not use the historical data, even if available.
cov.N, cov.yThe (co-)variances for the estimates above.
estimateEstimate(s) for the whole set of parameters based on
    OT data and on historical data if available.
ks.testKolmogorov-Smirnov goodness-of-fit test.
ret.levA data.frame containing return levels and confidence
    limits. The  corresponding probabilities are either provided by user
    or taken as default values.
predA data.frame similar to ret.lev, but with
    "pretty" return periods. These are taken as the provided values
    pred.period if any or are chosen as "round" multiples of the
    time unit (taken from effDuration). The periods are chosen in order
    to cover periods ranging from 1/10 to 10 time units.
Other results are available. Use names(result) to see their
  list.
  
  Except in the the special case where distname.y is
  "exponential" and where no historical data are used, the
  inference on quantiles is obtained with the delta method and
  using numerical derivatives. Confidence limits are unreliable for
  return levels much greater than the observation-historical period.
  
  Due to the presence of estimated parameters, the Kolmogorov-Smirnov
  test is unreliable when less than 30 observations are available.

`encoding`

UTF-8

`Warning`

With some distributions or in presence of
  historical data, the estimation can fail due to some problem during the
  optimisation. Even when the optimisation converges, the determination of
  the (numerical) hessian can be impossible: This can happen if one
    or more parameter is too small to compute a finite difference
  approximation of gradient. For instance the 'rate' parameter of the
  exponential distribution (= inverse mean) will be small when the mean of
  exceedances is large.
  A possible solution is then to rescale the data e.g. dividing
  them  by 10 or 100. As a rule of thumb, an
  acceptable scaling leads to data (exceedances) of a few units to a few
  hundreds, but an order of magnitude of thousands or more should be
    avoided and reduced by scaling. The rescaling is recommanded for the square
  exponential distribution (obtained with trans = "square") since
  the observations are squared. 
  Another possible way to solve the problem is to change the
  numDeriv value. 
  This problem will be solved in a future version.

`Details`

The model is fitted using Maximum Likelihood (ML).
  
  Some distributions listed below and here called "special" are
  considered in a special manner.  For these
  distributions, it is not necessary to give starting values nor
  parameter names which are unambiguous. 
  ll{
    distribution 	parameters 
    exponential 	rate 
    weibull 	shape, scale
    gpd 	scale, shape
    log-normal 	meanlog,
    sdlog 
    gamma 	shape, scale
    mixexp2 	prob1, rate1, delta
  }
  Other distributions can be used. Because the
  probability functions are then used in a "black-box" fashion, these
  distributions should respect the following
  formal requirements:
  
  The name for thedensity,distributionandquantilefunctions must obey to theclassical "prefixing
      convention". Prefixes must be respectively:"d","p","q".
      This rules applies for distribution of thestatspackage
    and those of many other packages suchevd.
The first (main) argument must be vectorisablein all three
    functions, i.e. a vector ofx,qorpmust be
    accepted by the density, the distribution and the quantile
    functions.
The density must have alogformalargument. WhenlogisTRUE, the log-density is returned instead of
    the density.
  
  For such a distribution, it is necessary to give arguments names
  in start.par.y. The arguments list must have exactly the required
  number of parameters for the family (e.g. 2 for gamma).
  Some parameters can be fixed (known); then the parameter set will be
  the reunion of those appearing in start.par.y
  and those in fixed.par.y. Anyway, in the present version, at least one parameter
  must be unknown for the y part of model.
  
  Mathematical requirements exist for a correct use of ML. They
  are refered to as "regularity conditions" in ML theory. Note that
  the support of the distribution must be the set of non-negative
  real numbers.
  
  The estimation procedure differs according to the existence of
  historical (MAX) data.
  When no historical data is given, the whole set
  of parameters contains orthogonal subsets: a "point process" part
  concerning the process of events, and an "observation" part
  concerning the threshold exceedances part. The parameters can in this
  case be estimated separately. The rate of the Poisson process is estimated
  by the empirical rate, i.e. the number of events divided by the total
  duration given ineffDuration. The  "Over the Threshold"
  parameters are estimated from the exceedances computed asx.OTminus the threshold.
When historical data is given, the two parameter vectors must
  be coped with together in maximising the global likelihood. In this case, we
  begin the estimation ignoring the historical data and then use the
  estimates as starting values for the maximisation of the global
  likelihood. In some circumstances, the estimates obtained in the
  first stage can not be used with historical data because some of these
  fall outside the support of the distribution fitted. This can happen
  e.g. with adistname.y = "gpd"when historical data exceedthreshold-scale/shapefor the values ofshapeandscalecomputed in the first stage.

`References`

Miquel J. (1984) Guide pratique d'estimation des
probabilités{probabilites} de crues, Eyrolles (coll. EDF DER).

`See Also`

rRenouv to simulate "renouvellement" data,
  RLplot for the return level plot. See
  optim for the tuning of optimisation.

`Examples`

Run this codelibrary(Renext)
data(Garonne)

## use a "Rendata" object as 'x'. Historical data are used!
fit <- Renouv(x = Garonne, distname = "weibull", trace = 1,
              main = "'Garonne' data")
summary(fit)

## generates a warning because of the ties
fit2 <- Renouv(x = Garonne, distname = "gpd",
               jitter.KS = FALSE,
               threshold = 2800, trace = 1,
               main = "'Garonne' data with threshold = 2800 and GPD fit")

## use a numeric vector as 'x'
fit3 <-
    Renouv(x = Garonne$OTdata$Flow,
           threshold = 2500,
           effDuration = 100,
           distname = "gpd",
           OTS.data = list(numeric(0), c(6800, 7200)),
           OTS.effDuration = c(100, 150),
           OTS.threshold = c(7000, 6000), 
           trace = 1,
           main = "'Garonne' data with artificial "OTS" data")

## Add historical (fictive) data
fit4 <- Renouv(x = Garonne$OTdata$Flow,
               threshold = 2500,
               effDuration = 100,
               distname = "weibull",
               fixed.par.y = list(shape = 1.1),
               OTS.data = list(numeric(0), c(6800, 7200)),
               OTS.effDuration = c(100, 150),
               OTS.threshold = c(7000, 6000),
               trace = 0,
               main = "'Garonne' data with artificial "OTS" data")
Run the code above in your browser using DataLab