
Compute the optimal p-factor through numerical integration of the smoothed empirical quantile function to estimate the L-moments of the distribution. This function attempts to report an optimal “p-factor” (author's term) for the given parent distribution in para
based on estimating the crossing of the origin of an error between the given L-moment ratio lmr.dist
. The estimated support of the distribution is that shown by Turnbull and Ghosh (2014) and is computed as follows
fix.lower
or fix.upper
. The polynomial type for smooth is provided in poly.type
. These three arguments are the same as those for dat2bernqua
and lmoms.bernstein
. The statistic type used to measure central tendency of the errors for the nsim
simulations per pfactors
argument. The p.lo
and p.hi
are the lower and upper bounds to truncate on immediately after the p-factors to use are assembled. These are made for three purposes: (1) protection against numerical problems for mathematical upper limits (unity), (2) to potentially provide for much faster execution if the user already knows the approximate optimal value for the p-factor, and (3) to potentially use this function in a direct optimization framework using the R functions optim
or uniroot
. It is strongly suggested to keep plot.em
set so the user can inspect the computations.
pfactor.bernstein(para, x=NULL, n=NULL,
bern.control=NULL,
poly.type=c("Bernstein", "Kantorovich"),
stat.type=c("Mean", "Median"),
fix.lower=NULL, fix.upper=NULL,
lmr.dist=NULL, lmr.n=c("3", "4", "5"),
nsim=500, plot.em=TRUE, pfactors=NULL,
p.lo=.Machine$double.eps, p.hi=1)
A mandatory “parent” distribution defined by a usual lmomco distribution parameter object for a distribution. The simulations are based on this distribution, although optimization for lmr.dist
.
An optional vector of data values.
An optional sample size to run the simulations on. This value is computed by length(x)
if x
is provided. If set by argument, then that size supersedes the length of the optional observed sample.
A list
that holds poly.type
, stat.type
, fix.lower
, and fix.upper
. And this list will supersede the respective
values provided as separate arguments. There is an implicit bound.type
of "Carv"
.
Same argument as for dat2bernqua
.
The central estimation statistic for each p-factor evaluated.
Same argument as for dat2bernqua
.
Same argument as for dat2bernqua
.
This is the value for the lmr.n
of the distribution in para
unless explicitly set through lmr.dist
.
The L-moment ratio number for p-factor optimization.
The number of simulations to run. Experiments suggest the default is adequate for reasonably small sample sizes---the simulation count can be reduced as n
becomes large.
A logical to trigger the diagnostic plot of the simulated errors and a smooth line through these errors.
An optional vector of p-factors to loop through for the simulations. The vector computing internall is this is set to NULL
seems to be more than adequate.
An computational lower boundary for which the pfactors
by argument or default are truncated to. The default for lo
is to be quite small and does no truncate the default pfactors
.
An computational upper boundary for which the pfactors
by argument or default are truncated to. The default for hi
is unity, which is the true upper limit that results in a 0 slope between the
An R list
or real
is returned. If pfactors
is a single value, then the single value for the error statistic is returned, otherwise the list described will be. If the returned pfactor
is NA
, then likely the smooth line did not cross zero and the reason the user should keep plot.em=TRUE
and inspect the plot. Perhaps revisions to the arguments will become evident. The contents of the list are
The estimated value of lowess
that has an error of zero, see err.stat
as a function of ps
.
Carv
, which is the same bound type as needed by dat2bernqua
and lmoms.bernstein
.
The given poly.type
.
The given stat.type
. The “Mean” seems to be preferable.
A string of the L-moment type: “Tau3”, “Tau4”, “Tau5”.
The given fixed lower boundary, which could stay NULL
.
The given fixed upper boundary, which could stay NULL
.
An attribute identifying the computational source of the L-moments: “pfactor.bernstein”.
The p-factors actually evaluated.
The error statistic computed by stat.type
of the simulated lmoms.bernstein
minus the “true” value para
or given by lmr.dist
where lmr.n
.
The lowess
-smoothed values for err.stat
and the pfactor
comes from a linear interpolation of this smooth for the error being zero.
Turnbull, B.C., and Ghosh, S.K., 2014, Unimodal density estimation using Bernstein polynomials. Computational Statistics and Data Analysis, v. 72, pp. 13--29.
# NOT RUN {
pdf("pfactor_exampleB.pdf")
X <- exp(rnorm(200)); para <- parexp(lmoms(X))
# nsim is too small, but makes the following three not take too long
pfactor.bernstein(para, n=20, lmr.n="3", nsim=100, p.lo=.06, p.hi=.3)
pfactor.bernstein(para, n=20, lmr.n="4", nsim=100, p.lo=.06, p.hi=.3)
pfactor.bernstein(para, n=20, lmr.n="5", nsim=100, p.lo=.06, p.hi=.3)
dev.off()
# }
# NOT RUN {
# Try intra-sample p-factor optimization from two perspectives. The 3-parameter
# GEV "over fits" the data and provides the parent. Then use Tau3 of the fitted
# GEV for peakedness restraint and then use Tau3 of the data. Then repeat but use
# the apparent "exact" value of Tau3 for the true exponential parent.
pdf("pfactor_exampleB.pdf")
lmr <- vec2lmom(c(60,20)); paraA <- parexp(lmr); n <- 40
tr <- lmorph(par2lmom(paraA))$ratios[3]
X <- rlmomco(n, paraA); para <- pargev(lmoms(X))
F <- seq(0.001,0.999, by=0.001)
plot(qnorm(pp(X, a=0.40)), sort(X), type="n", log="y",
xlab="Standard normal variate", ylab="Quantile",
xlim=qnorm(range(F)), ylim=range(qlmomco(F,paraA)))
lines(qnorm(F), qlmomco(F, paraA), col=8, lwd=2)
lines(qnorm(F), qlmomco(F, para), lty=2)
points(qnorm(pp(X, a=0.40)), sort(X))
# Make sure to fill in the p-factor when needed!
bc <- list(poly.type = "Bernstein", bound.type="Carv",
stat.type="Mean", fix.lower=0, fix.upper=NULL, p=NULL)
kc <- list(poly.type = "Kantorovich", bound.type="Carv",
stat.type="Mean", fix.lower=0, fix.upper=NULL, p=NULL)
# Bernstein
A <- pfactor.bernstein(para, n=n, nsim=100, bern.control=bc)
B <- pfactor.bernstein(para, x=X, n=n, nsim=100, bern.control=bc)
C <- pfactor.bernstein(para, n=n, nsim=100, lmr.dist=tr, bern.control=bc)
D <- pfactor.bernstein(para, x=X, n=n, nsim=100, lmr.dist=tr, bern.control=bc)
plot(qnorm(pp(X, a=0.40)), sort(X), type="n", log="y",
xlab="Standard normal variate", ylab="Quantile",
xlim=qnorm(range(F)), ylim=range(qlmomco(F,paraA)))
lines(qnorm(F), qlmomco(F, paraA), col=8, lwd=2)
lines(qnorm(F), qlmomco(F, para), lty=2)
points(qnorm(pp(X, a=0.40)), sort(X))
bc$p <- A$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=bc), col=2)
bc$p <- B$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=bc), col=3)
bc$p <- C$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=bc), col=2, lty=2)
bc$p <- D$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=bc), col=3, lty=2)
# Kantorovich
A <- pfactor.bernstein(para, n=n, nsim=100, bern.control=kc)
B <- pfactor.bernstein(para, x=X, n=n, nsim=100, bern.control=kc)
C <- pfactor.bernstein(para, n=n, nsim=100, lmr.dist=tr, bern.control=kc)
D <- pfactor.bernstein(para, x=X, n=n, nsim=100, lmr.dist=tr, bern.control=kc)
plot(qnorm(pp(X, a=0.40)), sort(X), type="n", log="y",
xlab="Standard normal variate", ylab="Quantile",
xlim=qnorm(range(F)), ylim=range(qlmomco(F,paraA)))
lines(qnorm(F), qlmomco(F, paraA), col=8, lwd=2)
lines(qnorm(F), qlmomco(F, para), lty=2)
points(qnorm(pp(X, a=0.40)), sort(X))
kc$p <- A$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=kc), col=2)
kc$p <- B$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=kc), col=3)
kc$p <- C$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=kc), col=2, lty=2)
kc$p <- D$pfactor
lines(qnorm(F), dat2bernqua(F,X, bern.control=kc), col=3, lty=2)
dev.off()
# }
# NOT RUN {
X <- exp(rnorm(200)); para <- parexp(lmoms(X))
"pfactor.root" <- function(para, p.lo, p.hi, ...) {
afunc <- function(p, para=NULL, x=NULL, ...) {
return(pfactor.bernstein(para=para, x=x, pfactors=p, ...)) }
rt <- uniroot(afunc, c(p.lo, p.hi),
tol=0.001, maxiter=30, nsim=500, para=para, ...)
return(rt)
}
pfactor.root(para, 0.05, 0.15, n=10, lmr.n="4")
pfactor.bernstein(para, n=10, lmr.n="4", nsim=200, p.lo=.05, p.hi=.15)
# }
Run the code above in your browser using DataLab