subbofit: Fit a power exponential density via maximum likelihood

Description

subbofit returns the parameters, standard errors, negative log-likelihood and covariance matrix of the Subbotin distribution fitted to a sample. The estimation can run up to three steps, depending on the level of accuracy required. See Details below.

Usage

subbofit(
  data,
  verb = 0L,
  method = 3L,
  interv_step = 10L,
  provided_m_ = NULL,
  par = as.numeric(c(2, 1, 0)),
  g_opt_par = as.numeric(c(0.1, 0.01, 100, 0.001, 1e-05, 3)),
  itv_opt_par = as.numeric(c(0.01, 0.001, 200, 0.001, 1e-05, 5))
)

Value

a list containing the following items:

"dt" - dataset containing the parameter estimates and their standard deviations.
"log-likelihood" - negative log-likelihood value.
"matrix" - the covariance matrix for the parameters.

Arguments

data

(NumericVector) - the sample used to fit the distribution.

verb

(int) - the level of verbosity. Select one of:

0 just the final result (default)
1 headings and summary table
2 intermediate steps results
3 intermediate steps internals
4+ details of optim. routine

method

int - the steps that should be used to estimate the parameters.

0 no optimization performed - just return the log-likelihood from the initial guess.
1 initial estimation based on method of moments
2 global optimization not considering lack of smoothness in m
3 interval optimization taking non-smoothness in m into consideration (default, only occurs if provided_m_ is NULL)
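As an illustration of step 1, the sketch below estimates b, a and m by matching moments in base R. It assumes the density given in Details, for which $E|x-m|^k = a^k b^{k/b}\Gamma((k+1)/b)/\Gamma(1/b)$, so the ratio of the second absolute moment to the squared first absolute moment depends on $b$ only. The names moment_ratio and mom_sketch are hypothetical, and this is not necessarily how the package computes its initial guess.

```r
# Ratio E|x-m|^2 / (E|x-m|)^2 as a function of b; it does not depend on a or m.
moment_ratio <- function(b) {
  exp(lgamma(3 / b) + lgamma(1 / b) - 2 * lgamma(2 / b))
}

# Hypothetical helper: rough method-of-moments estimates of (b, a, m).
mom_sketch <- function(x) {
  m <- mean(x)                    # location: sample mean
  abs_dev <- mean(abs(x - m))     # first absolute central moment
  sq_dev <- mean((x - m)^2)       # second central moment
  target <- sq_dev / abs_dev^2
  # Solve moment_ratio(b) = target; the bracket [0.2, 20] is an assumption.
  b <- uniroot(function(b) moment_ratio(b) - target,
               interval = c(0.2, 20))$root
  # Invert E|x-m| = a * b^(1/b) * Gamma(2/b) / Gamma(1/b) for a.
  a <- abs_dev * exp(lgamma(1 / b) - lgamma(2 / b)) / b^(1 / b)
  c(b = b, a = a, m = m)
}

# With b = 2 and a = 1 the density is the standard normal, so a large
# normal sample should give estimates near (2, 1, 0).
set.seed(1)
est <- mom_sketch(rnorm(1e5))
print(est)
```

uniroot stops with an error when the sample ratio falls outside the bracket, which can happen for very heavy-tailed samples.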

interv_step

int - the number of intervals to be explored after the last minimum was found in the interval optimization. Default is 10.
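One possible reading of this stopping rule is sketched below in base R: starting from the interval that contains the current guess for m, neighbouring intervals are explored in each direction until interv_step consecutive intervals fail to improve on the best minimum. The functions nll_sketch and interval_search_sketch, the parameter bounds and the use of optim are all assumptions made for illustration; the package relies on its own optimization routines and may organise the search differently.

```r
# Negative log-likelihood for par = c(b, a, m), using the density from Details.
nll_sketch <- function(par, x) {
  b <- par[1]; a <- par[2]; m <- par[3]
  length(x) * (log(2) + log(a) + log(b) / b + lgamma(1 + 1 / b)) +
    sum(abs((x - m) / a)^b) / b
}

# Hypothetical interval search around the starting value of m.
interval_search_sketch <- function(x, start, interv_step = 10L) {
  xs <- sort(unique(x))
  k <- length(xs) - 1L                  # number of intervals between observations
  fit_in <- function(i) {
    lo <- xs[i]; hi <- xs[i + 1L]
    # m is constrained to [lo, hi]; bounds on b and a are arbitrary safeguards.
    optim(c(start[1], start[2], (lo + hi) / 2), nll_sketch, x = x,
          method = "L-BFGS-B",
          lower = c(0.1, 1e-3, lo), upper = c(20, 1e3, hi))
  }
  i0 <- findInterval(start[3], xs, all.inside = TRUE)
  best <- fit_in(i0)
  for (dir in c(-1L, 1L)) {             # explore to the left, then to the right
    i <- i0 + dir
    since_best <- 0L                    # intervals tried since the last improvement
    while (i >= 1L && i <= k && since_best < interv_step) {
      cand <- fit_in(i)
      if (cand$value < best$value) {
        best <- cand
        since_best <- 0L
      } else {
        since_best <- since_best + 1L
      }
      i <- i + dir
    }
  }
  best
}

set.seed(42)
x <- rnorm(60)
res <- interval_search_sketch(x, start = c(2, 1, 0), interv_step = 5L)
print(res$par)    # b, a, m at the best minimum found
print(res$value)  # negative log-likelihood at that point
```

A larger interv_step widens the search at the cost of more optimization runs; once it reaches the number of intervals, the sketch degenerates into the exhaustive search.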

provided_m_

NumericVector - if NULL (default), the m parameter is estimated by the routine. If numeric, the estimation fixes m to the given value.

par

NumericVector - vector containing the initial guess for parameters b, a and m, respectively. Default values are c(2, 1, 0).

g_opt_par

NumericVector - vector containing the global optimization parameters. The optimization parameters are:

step - (num) initial step size of the searching algorithm.
tol - (num) line search tolerance.
iter - (int) maximum number of iterations.
eps - (num) gradient tolerance. The stopping criterion is $||\text{gradient}||<\text{eps}$.
msize - (num) simplex max size. The stopping criterion is $||\text{max edge}||<\text{msize}$.
algo - (int) algorithm. The optimization method used:
- 0 Fletcher-Reeves
- 1 Polak-Ribiere
- 2 Broyden-Fletcher-Goldfarb-Shanno
- 3 Steepest descent
- 4 Nelder-Mead simplex
- 5 Broyden-Fletcher-Goldfarb-Shanno ver.2

Details for each algorithm are available in the 'GSL' manual. Default values are c(0.1, 0.01, 100, 0.001, 1e-05, 3).
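Because the vector is positional, it can help to build it with names and strip them before the call. The sketch below starts from the defaults shown in Usage and switches the global step to the Nelder-Mead simplex with more iterations. The variable name g_par is hypothetical, and the final call is left commented out because it assumes the package is attached and that sample_subbo exists.

```r
# Field order follows the list above: step, tol, iter, eps, msize, algo.
g_par <- c(step = 0.1, tol = 0.01, iter = 100,
           eps = 1e-3, msize = 1e-5, algo = 3)

g_par["algo"] <- 4   # 4 = Nelder-Mead simplex
g_par["iter"] <- 500 # allow more iterations

# Drop the names so that a plain numeric vector is passed on.
g_par <- unname(g_par)
print(g_par)

# subbofit(sample_subbo, g_opt_par = g_par)
```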

itv_opt_par

NumericVector - interval optimization parameters. Fields are the same as the ones for the global optimization. Default values are c(0.01, 0.001, 200, 0.001, 1e-05, 5).

Details

The Subbotin distribution is an exponential power distribution controlled by three parameters, with density: $$f(x;a,b,m) = \frac{1}{A} e^{-\frac{1}{b} |\frac{x-m}{a}|^b}$$ with: $$A = 2ab^{1/b}\Gamma(1+1/b)$$ where $a$ is a scale parameter, $b$ controls the tails (lower values represent fatter tails), and $m$ is a location parameter.

Because the distribution is symmetric, its moment equations are simple enough for the parameters to be estimated by the method of moments. This produces rough estimates that should be used only for a first exploration. The global maximum likelihood estimation improves on this initial guess with an optimization routine, defaulting to the Broyden-Fletcher-Goldfarb-Shanno method.

However, the likelihood is not smooth in the $m$ parameter (derivatives are zero whenever $m$ equals a sample observation), so an exhaustive search would have to repeat the previous step in every interval between two consecutive observations. For a sample of $n$ observations, this leads to $n-1$ optimization problems. Given the computational cost of such a procedure, an interval search is used instead: the optimization is repeated only in intervals lying at most interv_step intervals away from the last minimum found. Details on the method are available in the package vignette.
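The density and the objective being minimized can be written directly in base R, which is a convenient way to check the normalizing constant. In the sketch below, dsubbo_sketch and nll_sketch are hypothetical names introduced for illustration and are not functions of the package; the parameter order c(b, a, m) follows the par argument.

```r
# Density f(x; a, b, m) = exp(-|(x - m) / a|^b / b) / A,
# with A = 2 * a * b^(1/b) * Gamma(1 + 1/b).
dsubbo_sketch <- function(x, a = 1, b = 2, m = 0) {
  A <- 2 * a * b^(1 / b) * gamma(1 + 1 / b)
  exp(-abs((x - m) / a)^b / b) / A
}

# Negative log-likelihood of a sample x for par = c(b, a, m).
nll_sketch <- function(par, x) {
  -sum(log(dsubbo_sketch(x, a = par[2], b = par[1], m = par[3])))
}

# The density integrates to one. The integral is split at m because the
# integrand has a kink there when b <= 1.
left <- integrate(dsubbo_sketch, -Inf, 2, a = 1.5, b = 0.8, m = 2)$value
right <- integrate(dsubbo_sketch, 2, Inf, a = 1.5, b = 0.8, m = 2)$value
total <- left + right
print(total)

# With b = 2 and a = 1 the density coincides with the standard normal.
same_as_normal <- isTRUE(all.equal(dsubbo_sketch(0.3), dnorm(0.3)))
print(same_as_normal)

# Negative log-likelihood of a small sample at the initial guess c(2, 1, 0).
x <- c(-1.2, -0.4, 0.1, 0.7, 1.5)
nll_at_guess <- nll_sketch(c(2, 1, 0), x)
print(nll_at_guess)
```

At par = c(2, 1, 0) the last value equals the standard normal negative log-likelihood, -sum(dnorm(x, log = TRUE)), which gives a quick cross-check of the formula.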

Examples

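# Simulate 1000 draws from a power exponential (Subbotin) distribution
sample_subbo <- rpower(1000, 1, 2)
# Fit the distribution by maximum likelihood with the default settings
subbofit(sample_subbo)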
