This function runs the Bayesian Moving Particles algorithm for estimating extreme probabilities and quantiles.
BMP(
dimension,
lsf,
q,
N = 1000,
N.final = N,
N.iter = 30,
adaptive = FALSE,
N.DoE = 5 * dimension,
firstDoE = "uniform",
radius = qnorm(1e-10, lower.tail = FALSE),
X,
y,
covariance = NULL,
learn_each_train = Inf,
km.param = list(nugget.estim = TRUE, multistart = 1, optim.method = "BFGS", coef.trend = q),
burnin = 20,
fast = TRUE,
sur = list(integrated = TRUE, r = 1, approx.pnorm = FALSE),
lower.tail = TRUE,
save.dir,
plot = FALSE,
plot.lsf = TRUE,
plot.lab = c("x_1", "x_2"),
chi2 = FALSE,
verbose = 1,
breaks
)
An object of class list containing the outputs described below:
the estimated conditional expectation of the probability.
the sequence of estimated alpha during the refinement step.
an estimate of the squared coefficient of variation of alpha.
the sequence of the estimated coefficients of variation.
the sequence of the estimated upper bound of the conditional variance divided by estimated alpha.
the sequence of the estimated integrated h.
a list containing the sequence of corresponding thresholds and the -log probability of the sample minimising the SUR criterion.
a list containing, at each iteration, the number of points tried for the SUR criterion as well as the computational time spent.
the reference quantile for the probability estimate.
the empirical cdf, i.e. the estimation of the function q -> E(alpha(q)).
the farthest state reached by the random process. The validity range for the ecdf is then (-Inf, L_max] or [L_max, Inf).
the last Poisson process generated with N.final particles.
the metamodel approximation of the lsf. A call output is a list containing the value and the standard deviation.
the final metamodel, an S4 object from DiceKriging. Note that the algorithm enforces the problem to be the estimation of P[lsf(X)>q], so using ‘predict’ with this object will return inverse values if lower.tail==TRUE; in that case prefer using meta_fun directly, which handles this issue (see the sketch after this list).
the first metamodel built with the initial DoE.
a 95% confidence interval on the estimate of alpha.
a vector containing the number of moves for each one of the N.batch particles.
the output of the chisq.test function.
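For post-processing, the returned list can be queried directly. Below is a hedged sketch, assuming the outputs are exposed under the names used in the descriptions above (alpha, L_max, ecdf, meta_fun) and that meta_fun takes points column-wise; these are illustrative assumptions, check str(p) on an actual run.
# Hedged post-processing sketch; element names and the column-wise point layout are assumptions
if (FALSE) {
  p <- BMP(dimension = 2, lsf = kiureghian, q = 0, N = 100, N.iter = 30)
  p$alpha                              # estimated probability at the reference quantile q
  p$L_max                              # farthest state reached, bounding the validity range of the ecdf
  p$ecdf(-1)                           # estimate of E(alpha(q)) at another threshold
  u <- matrix(rnorm(2 * 5), nrow = 2)  # 5 points of the 2-dimensional standard space
  p$meta_fun(u)                        # metamodel prediction: a list with the value and the standard deviation
}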
dimension: the dimension of the input space.
lsf: the function defining the RV of interest Y = lsf(X).
q: a given quantile to estimate the corresponding probability.
N: the total number of Poisson processes during the refinement step.
N.final: the total number of Poisson processes for the final alpha estimate.
N.iter: the total number of iterations of the algorithm, i.e. the total number of calls to lsf will be N.DoE + N.iter*r.
adaptive: whether the algorithm should stop automatically when the stopping criterion is verified, namely the mean probability of misclassification of the particles being over a given threshold.
N.DoE: the number of points for the initial Design of Experiments.
firstDoE: default is "uniform" for a random uniform sampling over a sphere of radius radius. Also available: "maximin" for a maximin LHS.
radius: the size of the radius of the sphere for the uniform DoE, or the semi-length of the interval on each dimension for the maximin LHS.
X: (optional) a first Design of Experiments to be used instead of building a new DoE (see the sketch after this list).
y: the values of lsf on X.
covariance: (optional) a covariance kernel for the km object.
learn_each_train: an integer: after this limit the covariance parameters are no longer learnt and the model is simply updated with the new data points.
km.param: (optional) a list of parameters to be passed to DiceKriging::km.
burnin: a burn-in parameter for the Markov chain drawing of the metamodel-based Poisson process (this does not change the number of calls to lsf).
fast: in the current implementation the calls to the metamodel appear to be faster when done in batch. This parameter runs the Markov chain the other way around: instead of first selecting a starting point and then applying the transition kernel burnin times, it creates a working population by applying the kernel to all the particles and then makes some moves within the generated discretised distribution.
sur: a list containing any parameters to be passed to estimateSUR. Default is sur$integrated=TRUE and sur$r=1 for a one-step-ahead integrated SUR criterion.
lower.tail: as for the pxxxx functions, TRUE for estimating P(lsf(X) < q), FALSE for P(lsf(X) > q).
save.dir: (optional) a directory in which to save X and y at each iteration.
plot: to plot the DoE and the updated model.
plot.lsf: to plot the contour of the true lsf. Note that this requires its evaluation on a grid and should be used only on toy examples.
plot.lab: the labels of the axes for the plot.
chi2: for a chi2 test on the number of events.
verbose: controls the level of outputs of the algorithm.
breaks: optional, for the final histogram if chi2 == TRUE.
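To illustrate the X and y arguments mentioned above, here is a hedged sketch of reusing a precomputed Design of Experiments instead of letting BMP build one; the toy limit-state function my_lsf and the points-as-columns layout are illustrative assumptions, not package requirements.
# Hedged sketch of passing a precomputed DoE through X and y
if (FALSE) {
  d <- 2
  my_lsf <- function(x) apply(x, 2, function(u) 4 - sum(u))  # toy limit-state function
  X0 <- matrix(rnorm(d * 10), nrow = d)                      # 10 standard Gaussian design points
  y0 <- my_lsf(X0)                                           # evaluate the lsf once, outside BMP
  p <- BMP(dimension = d, lsf = my_lsf, q = 0, N = 100, N.iter = 20, X = X0, y = y0)
}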
Clement WALTER clementwalter@icloud.com
The Bayesian Moving Particles algorithm uses the point process framework for rare events to iteratively estimate the conditional expectation of the (random) limit-state function, quantify the quality of the learning, and propose a new point to be added to the model with a SUR criterion.
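To make the point-process view concrete, here is a hedged toy illustration outside of BMP itself (no metamodel, no SUR refinement), assuming Y ~ N(0,1) so that the conditional distribution Y | Y > y can be sampled exactly: each particle performs an increasing random walk Y_{n+1} ~ Y | Y > Y_n, the total number of moves M made by N particles before passing the level q is Poisson distributed with mean -N*log P(Y > q), and (1 - 1/N)^M is an unbiased estimator of P(Y > q).
# Hedged toy illustration of the point-process framework, not the BMP implementation
set.seed(1)
N <- 1000                       # number of particles, i.e. independent Poisson processes
q <- 4                          # target level; P(Y > 4) is about 3.2e-5
M <- 0                          # total number of moves
for (i in seq_len(N)) {
  y <- rnorm(1)                 # Y_0 ~ Y
  while (y < q) {               # move the particle until it passes the level q
    s <- pnorm(y, lower.tail = FALSE)              # survival function at the current state
    y <- qnorm(runif(1) * s, lower.tail = FALSE)   # exact draw from Y | Y > y
    M <- M + 1
  }
}
(1 - 1/N)^M                     # unbiased estimate of P(Y > q)
pnorm(q, lower.tail = FALSE)    # true value for comparison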
A. Guyader, N. Hengartner and E. Matzner-Lober: Simulation and estimation of extreme quantiles and extreme probabilities. Applied Mathematics and Optimization, 64(2), 171-196.
C. Walter: Moving Particles: a parallel optimal Multilevel Splitting method with application in quantiles estimation and meta-model based algorithms. Structural Safety, 55, 10-25.
J. Bect, L. Li and E. Vazquez: Bayesian subset simulation. arXiv preprint arXiv:1601.02557.
SubsetSimulation
MonteCarlo
IRW
MP
# Estimate P(g(X)<0)
if (FALSE) p <- BMP(dimension = 2, lsf = kiureghian, q = 0, N = 100, N.iter = 30, plot = TRUE)
# More extreme event
if (FALSE) p <- BMP(dimension = 2, lsf = waarts, q = -4, N = 100, N.iter = 50, plot = TRUE)
# One can also estimate probability of the form P(g(X)>q)
if (FALSE) p <- BMP(dimension = 2, lsf = cantilever, q = 1/325, N = 100, N.iter = 30, plot = TRUE)