mps2par: Use Maximum Product of Spacings to Estimate the Parameters of a Distribution

Description

This function uses the method of maximum product of spacings (MPS; maximum spacing estimation or maximum product of spacings estimation) to estimate the parameters of a distribution. The method is based on maximization of the geometric mean of probability spacings in the data where the spacings are defined as the differences between the values of the cumulative distribution function, $F(x)$, at sequential data indices.

MPS (Dey et al., 2016, pp. 13--14) is an optimization problem formed by maximizing the geometric mean of the spacing between consecutively ordered observations standardized to a U-statistic. Let $\Theta$ represent a vector of parameters for a candidate fit of $F(x|\Theta)$, and let $U_i(\Theta) = F(X_{i:n}|\Theta)$ be the nonexceedance probabilities of the observed values of the order statistics $x_{i:n}$ for a sample of size $n$. Define the differences $$D_i(\Theta) = U_i(\Theta) - U_{i-1}(\Theta)\mbox{\ for\ } i = 1, \ldots, n+1\mbox{,}$$ with the additions to the vector $U$ of $U_0(\Theta) = 0$ and $U_{n+1}(\Theta) = 1$. The objective function is $$M_n(\Theta) = - \sum_{i=1}^{n+1} \log\, D_i(\Theta)\mbox{,}$$ where the $\Theta$ for a maximized ${-}M_n$ represents the parameters fit by MPS. Some authors to keep with the idea of geometric mean include factor of $1/(n+1)$ for the definition of $M_n$. Whereas other authors (Shao and Hahn, 1999, eq. 2.0), show $$S_n(\Theta) = (n+1)^{-1} \sum_{i=1}^{n+1} \log[(n+1)D_i(\Theta)]\mbox{.}$$ So it seems that some care is needed when considering the implementation when the value of “the summation of the logarithms” is to be directly interpreted. Wong and Li (2006) provide a salient review of MPS in regards to an investigation of maximum likelihood (MLE), MPS, and probability-weighted moments (pwm) for the GEV (quagev) and GPA (quagpa) distributions. Finally, Soukissian and Tsalis (2015) also study MPS, MLE, L-moments, and several other methods for GEV fitting.

If the initial parameters have a support inside the range of the data, infinity is returned immediately by the optimizer and further action stops and the parameters returned are NULL. For the implementation here, if check.support is true, and the initial parameter estimate (if not provided and acceptable by para.int) by default will be seeded through the method of L-moments (unbiased, lmoms), which should be close and convergence will be fairly fast if a solution is possible. If these parameters can not be used for spinup, the implementation will then attempt various probability-weighted moment by plotting position (pwm.pp) converted to L-moments (pwm2lmom) as part of an extended attempt to find a support of the starting distribution encompass the data. Finally, if that approach fails, a last ditch effort using starting parameters from maximum likelihood computed by a default call to mle2par is made. Sometimes data are pathological and user supervision is needed but not always successful---MPS can show failure for certain samples and(or) choice of distribution.

It is important to remark that the support of a fitted distribution is not checked within the loop for optimization once spun up. The reasons are twofold: (1) The speed hit by repeated calls to supdist, but in reality (2) PDFs in lmomco are supposed to report zero density for outside the support of a distribution (see NEWS) and for the $-\log(D_i(\Theta)\rightarrow 0) \rightarrow \infty$ and hence infinity is returned for that state of the optimization loop and alternative solution will be tried.

As a note, if all $U$ are equally spaced, then $|M(\Theta)| = I_o = (n+1)\log(n+1)$. This begins the concept towards goodness-of-fit. The $M_n(\Theta)$ is a form of the Moran-Darling statistic for goodness-of-fit. The $M_n(\Theta)$ is a Normal distribution with $$\mu_M \approx (n+1)[\log(n+1)+\gamma{}] - \frac{1}{2} - \frac{1}{12(n+1)}\mbox{,}$$ $$\sigma_M \approx (n+1)\biggl(\frac{\pi^2}{6\,{}} - 1\biggr)-\frac{1}{2} - \frac{1}{6(n+1)}\mbox{,}$$ where $\gamma \approx 0.577221$ (Euler--Mascheroni constant, -digamma(1)) or as the definite integral

$$\gamma^\mathrm{Euler}_{\mathrm{Mascheroni}} = -\int_0^\infty \mathrm{exp}(-t) \log(t)\; \mathrm{d}{t}\mbox{,}$$

An extension into small samples using the Chi-Square distribution is $$A = C_1 + C_2\times\chi^2_n\mbox{,}$$ where $$C_1 = \mu_M - \sqrt{\frac{\sigma^2_M\,n}{2}}\mbox{\ and\ }C_2 = \sqrt{\frac{\sigma^2_M}{2n}}\mbox{,}$$ and where $\chi^2_n$ is the Chi-Square distribution with $n$ degrees of freedom. A test statistic is $$T(\Theta) = \frac{M_n(\Theta) - C_1 + \frac{p}{2}}{C_2}\mbox{,}$$ where the term $p/2$ is a bias correction based on the number of fitted distribution parameters $p$. The null hypothesis that the fitted distribution is correct is to be rejected if $T(\Theta)$ exceeds a critical value from the Chi-Square distribution. The MPS method has a relation to maximum likelihood (mle2par) and the two are asymptotically equivalent.

Important Remark Concerning Ties---Ties in the data cause instant degeneration with MPS and must be mitigated for and thus attention to this documentation and even the source code itself is required.

Usage

mps2par(x, type, para.int=NULL, ties=c("bernstein", "rounding", "density"),
            delta=0, log10offset=3, get.untied=FALSE, check.support=TRUE,
            moran=TRUE, silent=TRUE, null.on.not.converge=TRUE,
            ptransf=  function(t) return(t),
            pretransf=function(t) return(t),
            mle2par=TRUE, ...)

Arguments

A vector of data values.

type

Three character (minimum) distribution type (for example, type="gev", see dist.list).

para.int

Initial parameters as a vector $\Theta$ or as an lmomco parameter “object” from say vec2par. If a vector is given, then internally vec2par is called with distribution equal to type.

ties

Ties cause degeneration in the computation of $M(\Theta)$: Option bernstein triggers a smoothing of only the ties using the dat2bernqua function---Bernstein-type smoothing for ties is likely near harmless when ties are near the center of the distribution, but of course caution is advised if ties exist near the extremal values; the settings for log10offset and delta are ignored if bernstein is selected Also for a tie-run having an odd number of elements, the middle tied value is left as original data. Option rounding triggers two types of adjustment: if delta > 0 then a round-off error approach inspired by Cheng and Stephens (1989, eq. 4.1) is used (see Note) and log10offset is ignored, but if delta=0, then log10offset is picked up as an order of magnitude offset (see Note). Use of options log10offset and delta are likely to not keep a middle unmodified in an odd-length, tie-run in contrast to use of bernstein. Option density triggers the substitution of the probability density $g(x_{i:n}|\Theta)$ at the $i$th tie from the current fit of the distribution. Warning---It appears that inference is lost almost immediately because the magnitude of $M_n$ losses meaning because probability densities are not in the same scale as changes in probabilities exemplified by the $D_i$. This author has not yet found literature discussing this, but density substitution is a recognized strategy.

delta

The optional $\delta$ value if $\delta > 0$ and if ties="rounding".

log10offset

The optional base-10 logarithmic offset approach to roundoff errors if delta=0 and if ties="rounding".

get.untied

A logical to populate a ties element in the returned list with the untied-pseudo data as it was made available to the optimizer and the number of iternations required to exhaust all ties. An emergency break it implemented if the number of iterations appears to be blowing up.

check.support

A logical to trigger a call to supdist to compute the support of the distribution at the initial parameters. As mentioned, MPS degenerates if min(x) $<$ the lower support or if max(x) $>$ the upper support. Regardless of the setting of check.support and NULL will be returned because this is what the optimizer will do anyway.

moran

A logical to trigger the goodness-of-fit test described previously.

silent

A logical to silence the try() function wrapping the optim() function and to provide a returned list of the optimization output.

null.on.not.converge

A logical to trigging simple return of NULL if the optim() function returns a nonzero convergence status.

ptransf

An optional parameter transformation function (see Examples) that is useful to guide the optimization run. For example, suppose the first parameter of a three parameter distribution resides in the positive domain, then ptransf(t) = function(t) c(log(t[1]), t[2], t[3]).

pretransf

An optional parameter retransformation function (see Examples) that is useful to guide the optimization run. For example, suppose the first parameter of a three parameter distribution resides in the positive domain, then pretransf(t) = function(t) c(exp(t[1]), t[2], t[3]).

mle2par

A logical to turn off the potential last attempt at maximum likelihood estimates of a valid seed as part of check.support=TRUE.

...

Additional arguments for the optim() function and other uses.

Value

An R list is returned. This list should contain at least the following items, but some distributions such as the revgum have extra.

type

The type of distribution in three character (minimum) format.

para

The parameters of the distribution.

source

Attribute specifying source of the parameters.

para.int

The initial parameters. Warning to users, when inspecting returned values make sure that one is referencing the MPS parameters in para and not those shown in para.int!

optim

An optional list of returned content from the optimizer if not silent.

ties

An optional list of untied-pseudo data and number of iterations required to achieve no ties (usually unity!) if and only if there were ties in the original data, get.untied is true, and ties != "density".

MoranTest

An optional list of returned values that will include both diagnostics and statistics. The diagnostics are the computed $\mu_M(n)$, $\sigma^2_M(n)$, $C_1$, $C_2$, and $n$. The statistics are the minimum value $I_o$ theoretically attainable $|M_n(\Theta)|$ for equally spaced differences, the minimized value $M_n(\Theta)$, the $T(\Theta)$, and the corresponding p.value from the upper tail of the $\chi^2_n$ distribution.

References

Cheng, R.C.H., Stephens, M.A., 1989, A goodness-of-fit test using Moran's statistic with estimated parameters: Biometrika, v. 76, no. 2, pp. 385--392.

Dey, D.K., Roy, Dooti, Yan, Jun, 2016, Univariate extreme value analysis, chapter 1, in Dey, D.K., and Yan, Jun, eds., Extreme value modeling and risk analysis---Methods and applications: Boca Raton, FL, CRC Press, pp. 1--22.

Shao, Y., and Hahn, M.G., 1999, Strong consistency of the maximum product of spacings estimates with applications in nonparametrics and in estimation of unimodal densities: Annals of the Institute of Statistical Mathematics, v. 51, no. 1, pp. 31--49.

Soukissian, T.H., and Tsalis, C., 2015, The effect of the generalized extreme value distribution parameter estimation methods in extreme wind speed prediction: Natural Hazards, v. 78, pp. 1777--1809.

Wong, T.S.T., and Li, W.K., 2006, A note on the estimation of extreme value distributions using maximum product of spacings: IMS Lecture Notes, v. 52, pp. 272--283.

Examples

Run this code

# NOT RUN {
pe3 <- vec2par(c(4.2, 0.2, 0.6), type="pe3") # Simulated values should have at least
X <- rlmomco(202, pe3); Xr  <- round(sort(X), digits=3) # one tie-run after rounding,
mps2par(X,  type="pe3")$para      # and the user can observe the (minor in this case)
mps2par(Xr, type="pe3")$para      # effect on parameters.
# Another note on MPS is needed. It is not reflection symmetric.
mps2par( X, type="pe3")$para
mps2par(-X, type="pe3")$para
# }
# NOT RUN {
# }
# NOT RUN {
# Use 1,000 replications for sample size of 75 and estimate the bias and variance of
# the method of L-moments and maximum product spacing (MPS) for the 100-year event
# using the Pearson Type III distribution.
set.seed(1596)
nsim <- 1000; n <- 75; Tyear <- 100; type <- "pe3"
parent.lmr <- vec2lmom(c(5.5, 0.15, 0.03))   # L-moments of the "parent"
parent  <- lmom2par(parent.lmr, type="pe3")  # "the parent"
Q100tru <- qlmomco(T2prob(Tyear), parent)    # "true value"
Q100lmr <- Q100mps <- rep(NA, nsim)          # empty vectors
T3lmr <- T4lmr <- T3mps <- T4mps <- rep(NA, nsim)
for(i in 1:nsim) { # simulate from the parent, compute L-moments
   tmpX <- rlmomco(n, parent); lmrX <- lmoms(tmpX)
   if(! are.lmom.valid(lmrX)) { # quiet check on viability
     lmrX <- pwm2lmom(pwms.pp(tmpX)) # try a pwm by plotting positions instead
     if(! are.lmom.valid(lmrX)) next
   }
   lmrpar <- lmom2par(lmrX, type=type)                  # Method of L-moments
   mpspar <-  mps2par(tmpX, type=type, para.int=lmrpar) # Method of MPS
   if(! is.null(lmrpar)) {
      Q100lmr[i] <- qlmomco(T2prob(Tyear), lmrpar); lmrlmr <- par2lmom(lmrpar)
      T3lmr[i] <- lmrlmr$ratios[3]; T4lmr[i] <- lmrlmr$ratios[4]
   }
   if(! is.null(mpspar)) {
      Q100mps[i] <- qlmomco(T2prob(Tyear), mpspar); mpslmr <- par2lmom(mpspar)
      T3mps[i] <- mpslmr$ratios[3]; T4mps[i] <- mpslmr$ratios[4]
   }
}
print(summary(Q100tru - Q100lmr)) # Method of L-moment   (mean = -0.00176)
print(summary(Q100tru - Q100mps)) # Method of MPS        (mean = -0.02746)
print(var(Q100tru - Q100lmr, na.rm=TRUE)) # Method of L-moments (0.009053)
print(var(Q100tru - Q100mps, na.rm=TRUE)) # Method of MPS       (0.009880)
# CONCLUSION: MPS is very competitive to the mighty L-moments.

LMR <- data.frame(METHOD=rep("Method L-moments",        nsim), T3=T3lmr, T4=T4lmr)
MPS <- data.frame(METHOD=rep("Maximum Product Spacing", nsim), T3=T3mps, T4=T4mps)
ZZ <- merge(LMR, MPS, all=TRUE)
boxplot(ZZ$T3~ZZ$METHOD, data=ZZ); mtext("L-skew Distributions")
boxplot(ZZ$T4~ZZ$METHOD, data=ZZ); mtext("L-kurtosis Distributions") #
# }
# NOT RUN {
# }
# NOT RUN {
# Data shown in Cheng and Stephens (1989). They have typesetting error on their
# "sigma." Results mu=34.072 and sigma=sqrt(6.874)=2.6218
H590 <- c(27.55, 31.82, 33.74, 34.15, 35.32, 36.78,
          29.89, 32.23, 33.74, 34.44, 35.44, 37.07,
          30.07, 32.28, 33.86, 34.62, 35.61, 37.36,
          30.65, 32.69, 33.86, 34.74, 35.61, 37.36,
          31.23, 32.98, 33.86, 34.74, 35.73, 37.36,
          31.53, 33.28, 34.15, 35.03, 35.90, 40.28,
          31.53, 33.28, 34.15, 35.03, 36.20) # breaking stress MPAx1E6 of carbon block.
mps2par(H590, type="nor", ties="rounding", delta=0.005)$para
mps2par(H590, type="nor", ties="rounding" )$para
mps2par(H590, type="nor", ties="bernstein")$para
#        mu     sigma
# 34.071424  2.622484 # using a slight variant on their eq. 4.1.
# 34.071424  2.622614 # using log10offset=3
# 34.088769  2.690781 # using Bernstein smooth and unaffecting middle of odd tie runs
# The MoranTest show rejection of the Normal distribution at alpha=0.05, with the
# "rounding" and "delta=0.005"" and T=63.8 compared to their result of T=63.1,
# which to be considered that the strategy here is not precisely the same as theirs.
# }
# NOT RUN {
# }
# NOT RUN {
# Demonstration of parameter transformation and retransformation
set.seed(9209) # same seed used under mle2par() in parallel example
x <- rlmomco(500, vec2par(c(1,1,3), type="gam")) # 3-p Generalized Gamma
guess <- lmr2par(x, type="gam", p=3) # By providing a 3-p guess the 3-p
# Generalized Gamma will be triggered internally. There are problems passing
# "p" argument to optim() if that function is to pick up the ... argument.
mps2par(x, type="gam", para.int=guess, silent=FALSE,
           ptransf=  function(t) { c(log(t[1]), log(t[2]), t[3])},
           pretransf=function(t) { c(exp(t[1]), exp(t[2]), t[3])})$para
# Reports:       mu     sigma        nu   for some simulated data.
#         0.9997019 1.0135674 3.0259012 
# }