Bayesian Markov Chain Monte Carlo algorithm for flood frequency analysis with historical and other information. The user can choose between a local and a regional analysis.
BayesianMCMC (xcont, xhist=NA, infhist=NA, suphist=NA,
nbans=NA, seuil=NA, nbpas=1000, nbchaines=3,
confint=c(0.05, 0.95), dist="GEV",
apriori=function(...){1},
parameters0=NA, varparameters0=NA)
BayesianMCMCcont (x, nbpas=NA)
BayesianMCMCreg (xcont, scont, xhist=NA, infhist=NA, suphist=NA, shist=NA,
nbans=NA, seuil=NA, nbpas=1000, nbchaines=3,
confint=c(0.05, 0.95), dist="GEV",
apriori=function(...){1},
parameters0=NA, varparameters0=NA)
BayesianMCMCregcont (x, nbpas=NA)
plotBayesianMCMCreg_surf (x, surf, ask=FALSE, ...)
# S3 method for BayesianMCMC
plot (x, which=1, ask=FALSE, ...)
# S3 method for BayesianMCMC
print (x, ...)
# S3 method for BayesianMCMCreg
plot (x, which=1, ask=FALSE, ...)
# S3 method for BayesianMCMCreg
print (x, ...)
BayesianMCMC
and BayesianMCMCcont
(which just continues the simulations of BayesianMCMC
for local analyses and BayesianMCMCreg
and BayesianMCMCregcont
for regional analyses return the following values:
BayesianMCMCreg
and BayesianMCMCregcont
(which just continues the simulations of BayesianMCMC
starting from its output) return the following values:
parameters
matrix (nbpas)x(nbchaines) with the simulated sets of parameters with the MCMC algorithm;
parametersML
set of parameters correspondent to the maximum likelihood;
returnperiods
return periods for which quantilesML
and intervals
are calculated;
quantilesML
quantiles correspondent to returnperiods
for the distribution whose parameters are parametersML
;
logML
maximum log-likelihood;
intervals
confidence intervals for the quantiles quantilesML
for limits confint
;
varparameters
matrix (nbpas)x(nbchaines)x(number of parameters) with the simulated variances for the MCMC algorithm;
vraisdist
likelihoods for the sets parameters
;
propsaut
vector showing the evolution of the acceptance rate during the Bayesian MCMC fitting;
plot.BayesianMCMC
and plot.BayesianMCMCreg
(for a normalized surface of 1 km2) plot the following figures:
1
data as plotting position (the Cunanne plotting position
2
diagnostic plot of the MCMC simulation (parameters);
3
diagnostic plot of the MCMC simulation (likelihood and MCMC acceptance rate);
4
posterior distribution of parameters obtained with the MCMC simulation (cloud plots);
5
a-priori distribution of parameters (contour plots);
plotBayesianMCMCreg_surf
plots the same plot as the first one given by plot.BayesianMCMCreg
but for each surface in argument, as well as its mean as a function of the surfaces;
object of class BayesianMCMC
, output of function BayesianMCMC
vector of systematic data
vector of upstream catchment surfaces of systematic data
vector of historical data and/or extreme discharges at ungauged sites
vector of inferior limit for historical data and/or extreme discharges at ungauged sites
vector of superior limit for historical data and/or extreme discharges at ungauged sites
vector of upstream catchment surfaces of extreme discharges at ungauged sites
period (in years) over which every threshold has not been exceeded except for the historical data and/or extreme discharges at ungauged sites. If several values of xhist for a same threshold, put the number of years associated to the threshold on the first row, then put 0 (see examples)
threshold not exceeded in the historical period except for the historical data and/or extreme discharges at ungauged sites (several thresholds allowed).
number of iterations for the MCMC algorithm
number of chains for the MCMC algorithm
confidence limits for the flood quantiles
distribution: normal "NORM"
, log-normal with 2 parameters "LN"
, Exponential "EXP"
, Gumbel "GUMBEL"
, Generalized Extreme Value "GEV"
, Generalized Logistic "GENLOGIS"
, Generalized Pareto "GENPAR"
, log-normal with 3 parameters "LN3"
, Pearson type III "P3"
, (log-Pearson type III "LP3"
, not implemented yet)
function of the parameters of the model `proportional to' their a-priori guessed distribution. The default fuction returns always 1, i.e. there is no a-priori information
initial values of the parameters for the MCMC algorithm
initial values of the parameter variances for the MCMC algorithm
a number of a vector of numbers that defines the graph to plot (see details)
if TRUE, the interactive mode is run
a particular surface (number or vector), not necessarily being a surface included in the scont or shist vectors
other arguments
Eric Gaume, Alberto Viglione, Jose Luis Salinas, Olivier Payrastre, Chi Cong N'guyen, Karine Halbert
Supported cases
These functions are taking 4 cases into account, depending on the type of data provided: - Using only the systematic data: xcont provided, xhist=NA, infhist=NA and suphist=NA - Using censored information, historical flood known: xcont and xhist provided, infhist=NA and suphist=NA - Using censored information, historical flood unknown precisely but its lower limit known: xcont and infhist provided, xhist=NA and suphist=NA - Taking into account flood estimation intervals: infhist and suphist (respectively lower and upper limits) provided, xcont provided, xhist=NA - Please note that every other case is NOT supported. For example, you can't have some historical flood values perfectly known as well as some other for which you only know a lower limit or an interval.
Regarding the perception thresholds: - By definition, the number of exceedances of each perception threshold within its application period has to be known precisely, and all the floods exceeding the threshold have to be included in xhist (or infhist or suphist). - Several thresholds are allowed. - It is possible to include in xhist (or infhist or suphist) historical values that do not exceed the associated perception threshold. - If for one or several thresholds you only know that this or these threshold have never been exceeded and no more information is available on floods that did not exceed the threshold(s), this case is also supported. In this case, put for the historical flood corresponding to the threshold xhist=-1 (or infhist=-1 or infhist=suphist=-1).
Bayesian inference
Bayesian inference uses a numerical estimate of the degree of belief in a hypothesis before evidence has been observed and calculates a numerical estimate of the degree of belief in the hypothesis after evidence has been observed.
The name `Bayesian' comes from the frequent use of Bayes' theorem in the inference process.
In our case the problem is: which is the probability that a frequency function dist
) has parameters xcont
, xhist
, infhist
, suphist
, nbans
, seuil
).
The Bayes' theorem writes
apriori
(it can be used to account for causal information);
dist
and depending on the availability of historical data;
Since complex models cannot be processed in closed form by a Bayesian analysis, namely because of the extreme difficulty in computing the normalization factor
MCMC Metropolis algorithm
Markov chain Monte Carlo (MCMC) methods (which include random walk Monte Carlo methods), are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its equilibrium distribution. The state of the chain after a large number of steps is then used as a sample from the desired distribution. The quality of the sample improves as a function of the number of steps.
The MCMC is performed here through a simple Metropolis algorithm, i.e. a Metropolis-Hastings algorithm with symmetric proposal density.
The Metropolis-Hastings algorithm can draw samples from any probability distribution
The Markov chain is started from a random initial value
The MCMC algorithm is based on a code developed by Eric Gaume on Scilab. It is still unstable and not all the distributions have been tested.
.
set.seed(2988)
serie <- rand.GEV(120, xi=40, alfa=20, k=-0.4)
serie100 <- serie[1:100]
serie100[serie100 < 250] <- NA
serie20 <- serie[101:120]
serie <- c(serie100, serie20)
plot(serie, type="h", ylim=c(0, 600), xlab="",
ylab="Annual flood peaks [m3/s]", lwd=3)
abline(h=0)
points(serie100, col=2)
if (FALSE) {
# Using only sistematic data
only_sist <- BayesianMCMC (xcont=serie20, nbpas=5000, nbchaines=3, varparameters0=c(70, 20, 0.5),
confint=c(0.05, 0.95), dist="GEV")
plot(only_sist, which=c(1:3), ask=TRUE, ylim=c(1,600))
only_sist <- BayesianMCMCcont(only_sist)
plot(only_sist, which=c(1:3), ask=TRUE, ylim=c(1,600))
only_sist <- BayesianMCMCcont(only_sist)
plot(only_sist, which=c(1:3), ask=TRUE, ylim=c(1,600))
# Adding the information that the threshold 250 m3/s was exceeded
# 3 times in the past 100 years
with_hist_thresh <- BayesianMCMC (xcont=serie20, infhist=rep(250,3),
nbans=100, seuil=250,
nbpas=5000, nbchaines=3,
confint=c(0.05, 0.95), dist="GEV")
plot(with_hist_thresh, which=c(1:3), ask=TRUE, ylim=c(1,600))
# Assuming that the 3 historical events are known with high uncertainty
with_hist_limits <- BayesianMCMC (xcont=serie20,
infhist=c(320,320,250),
suphist=c(360,400,270),
nbans=100, seuil=250,
nbpas=5000, nbchaines=3,
confint=c(0.05, 0.95), dist="GEV")
plot(with_hist_limits, which=c(1:3), ask=TRUE, ylim=c(1,600))
# Assuming that the 3 historical events are perfectly known
with_hist_known <- BayesianMCMC (xcont=serie20, xhist=serie100[!is.na(serie100)],
nbans=100, seuil=250,
nbpas=5000, nbchaines=3,
confint=c(0.05, 0.95), dist="GEV")
plot(with_hist_known, which=c(1:3), ask=TRUE, ylim=c(1,600))
# Perception threshold without available information on floods
without_info <- BayesianMCMC (xcont=serie20, xhist=-1,
nbans=100, seuil=2400,
nbpas=5000, nbchaines=3,
confint=c(0.05, 0.95), dist="GEV")
plot(without_info, which=c(1:3), ask=TRUE, ylim=c(1,600))
# Using one reasonable a-priori distribution
fNORM3 <- function (x) {
# x = vector of values
# mu = vector of means
mu = c(44, 26, -0.40)
# CM = covariance matrix
CM = matrix(c(13, 7.8, -0.055,
7.8, 15, -0.42,
-0.055, -0.42, 0.056), nrow=3, ncol=3)
CMm1 <- solve(CM)
term2 <- exp(-((x - mu) %*% CMm1 %*% (x - mu))/2)
term1 <- 1/(2*pi)^(3/2)/sqrt(det(CM))
term1*term2
}
with_hist_known2 <- BayesianMCMC (xcont=serie20, xhist=serie100[!is.na(serie100)],
nbans=100, seuil=250,
nbpas=5000, nbchaines=3, apriori=fNORM3,
confint=c(0.05, 0.95), dist="GEV")
plot(with_hist_known2, 5)
plot(with_hist_known2, 4)
plot(with_hist_known, 4)
plot(with_hist_known)
plot(with_hist_known2)
# Using one non-reasonable a-priori distribution
fNORM3 <- function (x) {
# x = vector of values
# mu = vector of means
mu = c(30, 50, -0.10)
# CM = covariance matrix
CM = matrix(c(13, 7.8, -0.055,
7.8, 15, -0.42,
-0.055, -0.42, 0.056), nrow=3, ncol=3)
CMm1 <- solve(CM)
term2 <- exp(-((x - mu) %*% CMm1 %*% (x - mu))/2)
term2
}
with_hist_known3 <- BayesianMCMC (xcont=serie20, xhist=serie100[!is.na(serie100)],
nbans=100, seuil=250,
nbpas=5000, nbchaines=3, apriori=fNORM3,
confint=c(0.05, 0.95), dist="GEV")
plot(with_hist_known3, 5)
plot(with_hist_known3, 4)
plot(with_hist_known, 4)
plot(with_hist_known)
plot(with_hist_known3)
}
if (FALSE) {
# Assuming that the historical events are perfectly known and there are 4 different thresholds
# The data file is presenting this way:
# xhist nbans seuil
# 6000 55 6000
# 7400 28 7250
# 6350 8 3050
# 4000 0 3050
# 4550 0 3050
# 3950 0 3050
# 7550 58 2400
# 4650 0 2400
# 3950 0 2400
## Warning: nbans and seuil should have the same length as xhist.
# So when a threshold is exceeded several times, replicate it as many times it is exceeded
# and part the number of years of exceedance into the number of times of exceedance
# (the way you part the nbans vector is not important, what is important is that you have
# length(nbans)=length(xhist) and the total of years for one same threshold equals the number
# of years covered by the perception threshold)
xhist_thres <- c(6000, 7400, 6350, 4000, 4550, 3950, 7550, 4650, 3950)
seuil_thres <- c(6000, 7250, rep(3050, 4), rep(2400, 3))
nbans_thres <- c(55, 28, 8, 0, 0, 0, 58, 0, 0)
# The threshold at 6000 has been exceeded for 55 years, the one at 7250 for 28 years,
# the one at 3050 for 8 years and the one at 2400 for 58 years
with_hist_known_several_thresholds <- BayesianMCMC (xcont=serie20,
xhist=xhist_thres,
nbans=nbans_thres, seuil=seuil_thres,
nbpas=5000, nbchaines=3,
confint=c(0.05, 0.95), dist="GEV",
varparameters0=c(NA, NA, 0.5))
plot(with_hist_known_several_thresholds, which=c(1:3), ask=TRUE)
## REGIONAL:
# Regional analysis, assuming that the 3 historical events are perfectly known and
# there are 2 perception thresholds
regional_with_hist_known <- BayesianMCMCreg (xcont=serie20,
scont=c(rep(507,9),rep(2240,11)),
xhist=serie100[!is.na(serie100)],
shist=c(495, 495, 87),
nbans=c(100, 0, 50), seuil=c(312, 312, 221),
nbpas=5000, nbchaines=3,
confint=c(0.05, 0.95), dist="GEV",
varparameters0=c(NA, NA, NA, 0.5))
plot(regional_with_hist_known, which=1:3, ask=TRUE, ylim=c(1,600))
surf=c(571, 2240)
plotBayesianMCMCreg_surf(regional_with_hist_known, surf)
}
Run the code above in your browser using DataLab