pval_gen: Simulate p-values from earthquakes and explosions

Description

p-values are simulated for depth and first polarity discriminants to use in testing classification models.

Usage

pval_gen(
  sims = 100,
  grid.dim = c(800, 800, 30),
  seismometer = list(N = 100, max.depth = 2, Sig = NULL, sig.draws = c(15, 2)),
  explosion = list(max.depth = 5, prob = 0.4),
  pwave.arrival = list(H0 = 5, V = 5.9, optim.starts = 15),
  first.polarity = list(read.err = 0.95)
)

Value

A data frame, with the number of rows equal to sims. Columns contain the p-values observed for each simulation along with the true event type.

Arguments

sims

numeric stipulating the number of individual events to generate.

grid.dim

numeric 3-vector providing the extent of the coordinate system in c(X,Y,Z) to be used in km.

seismometer

list stipulating variables relating to the seismometers. Providing an incomplete list reverts to the default values. List elements are:

N a numeric stipulating the number of seismometers
max.depth is a numeric providing the maximum depth for a seismometer location in km
Sig supplying a numeric vector of length N to this argument will stipulate the variance in the error in observed arrival time of each of the N seismometers.
sig.draws a numeric 2-vector, if Sig is not provided the variance in arrival time at each station is drawn from MCMCpack::rinvgamma() using shape = sig.draws[1] and scale = sig.draws[2].

explosion

list stipulating variables regarding a detonation event. Providing an incomplete list reverts to the default values. List elements are:

max.depth is a numeric providing the maximum depth of an explosion in km
prob is the probability of an explosion, controlling the fraction of events which are explosions. Value provided must be in the interval $[0,1]$

pwave.arrival

list stipulating variables regarding the depth from p-wave arrival discriminant. Providing an incomplete list reverts to the default values. List elements are:

H0 a numeric providing the value of depth in km for the null hypothesis
V a numeric stipulating the velocity of p-waves in km. Used for simulating p-wave arrival times as an argument of time_fn().
optim.starts number of stats::optim() starts used to maximize likelihood of event location.

first.polarity

list stipulating variables regarding the depth from the polarity of first motion discriminant. List elements are:

read.err numeric in the interval $[0,1]$ providing the probability of an error in reading the true first polarity.

Depth Discriminant

The equation below is used to model p-wave arrival time $t_i$ at seismometer $i$.

$$t_i = t_0 + T(S_i, S_0) + \epsilon_i$$

Where $t_0$ is the time of the event, $T()$ is a function modeling the arrival time (in this case time_fn), $S_i$ is the location of seismometer $i$, $S_0$ is the location of the event, and $\epsilon_i$ is normally distributed error with known variance $\sigma^2_i$. Given N seismometers, the MLE of the event time $\hat{t}_0$ can be solved as the following:

$$\hat{t}_0 = \frac{\mathrm{tr}\left(\Sigma^{-1} T_i\right) - \mathrm{tr}\left(\Sigma^{-1}T(S,S_0)\right)}{\mathrm{tr}(\Sigma^{-1})}$$

Where $\mathrm{tr}()$ is the matrix trace operation, $\Sigma$ is a matrix containing the elements of each $\sigma_i^2$ on the diagonal, $T_i$ is a diagonal matrix containing each $t_i$, and $T(S, S_0)$ is a diagonal matrix containing each $T(S_i, S_0)$. This result is then plugged back into the first equation, which is then used in a normal likelihood function. Derivatives are taken of the likelihood so that a fast gradient based approch can be used to find the maximum likelihood estimate (MLE) of $S_0$.

The remainder of the calculation of the p-value is consistent with the Depth from P-Wave Arrival Times section of anderson2007mathematicalezECM. First note $S_0$ is equal to the 3-vector $(X_0, Y_0, Z_0)^{\top}$. Given the null hypothesis for the depth of $\mathrm{H}_0: Z_0 \leq z_0$, the MLE $(\hat{X}_0, \hat{Y}_0)$ given $Z_0 = z_0$ is found. The sum of squared errors (SSE) is calculated as follows:

$$\mathrm{SSE}(S_0, t_0) = \sum_{i = 1}^N\left(\frac{t_i - t_0 - T(S_i, S_0)}{\sigma_i}\right)^2$$

If $\mathrm{H}_0$ is true then the following statistic has a central $F$ distribution with $1$ and $N-4$ degrees of freedom.

$$F_{1,N-4} = \frac{\mathrm{SSE}(S_0, t_0|Z_0 = z_0) - \mathrm{SSE}(S_0, t_0)}{\mathrm{SSE}(S_0, t_0)}$$

Because the test has directionality, the $F$ statistic is then converted to a $T$ statistic as such:

$$T_{N-4} = \mathrm{sign}(\hat{Z}_0 - z_0)\sqrt{F_{1,n-4}}$$

This $T$ statistic is then used to compute the p-value

Polarity of First Motion

Under the null hypothesis that the event is an explosion, (and therefore the true polarity of first motion is always one), the error rate for mistakenly identifying the polarity of first motion is stipulated as the argument first.polarity$read.err. For an error rate of $\theta$ the p-value can then be calculated as follows:

$$\sum_{i = 0}^n {N \choose i} \theta^i(1-\theta)^{N-i}$$

Where $n$ is the number of stations where a positive first motion was observed, and $N$ is the total number of stations.

Details

Methods are adapted from the discriminants listed in anderson2007mathematicalezECM.

References

Examples

Run this code


test.data <- pval_gen(sims = 5)

Run the code above in your browser using DataLab