mmedist: Matching moment fit of univariate distributions

Description

Fit of univariate distributions by matching moments (raw or centered) for non censored data.

Usage

mmedist(data, distr, order, memp, start = NULL, fix.arg = NULL, optim.method = "default",  lower = -Inf, upper = Inf, custom.optim = NULL, weights = NULL, silent = TRUE,  gradient = NULL, ...)

Arguments

data

A numeric vector for non censored data.

distr

A character string "name" naming a distribution (see 'details').

order

A numeric vector for the moment order(s). The length of this vector must be equal to the number of parameters to estimate.

memp

A function implementing empirical moments, raw or centered but has to be consistent with distr argument (and weights argument). See details below.

start

A named list giving the initial values of parameters of the named distribution or a function of data computing initial values and returning a named list. This argument may be omitted (default) for some distributions for which reasonable starting values are computed (see the 'details' section of mledist).

fix.arg

An optional named list giving the values of fixed parameters of the named distribution or a function of data computing (fixed) parameter values and returning a named list. Parameters with fixed value are thus NOT estimated.

optim.method

"default" or optimization method to pass to optim.

lower

Left bounds on the parameters for the "L-BFGS-B" method (see optim).

upper

Right bounds on the parameters for the "L-BFGS-B" method (see optim).

custom.optim

a function carrying the optimization .

weights

an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector with strictly positive integers (typically the number of occurences of each observation). If non-NULL, weighted MME is used, otherwise ordinary MME.

silent

A logical to remove or show warnings when bootstraping.

gradient

A function to return the gradient of the squared difference for the "BFGS", "CG" and "L-BFGS-B" methods. If it is NULL, a finite-difference approximation will be used, see details.

...

further arguments passed to the optim, constrOptim or custom.optim function.

Value

mmedist returns a list with following components,

Details

The argument distr can be one of the base R distributions: "norm", "lnorm", "exp" and "pois", "gamma", "logis", "nbinom" , "geom", "beta" and "unif". In that case, no other arguments than data and distr are required, because the estimate is computed by a closed-form formula. For distributions characterized by one parameter ("geom", "pois" and "exp"), this parameter is simply estimated by matching theoretical and observed means, and for distributions characterized by two parameters, these parameters are estimated by matching theoretical and observed means and variances (Vose, 2000). Note that for these closed-form formula, fix.arg cannot be used and start is ignored. The argument distr can also be the distribution name as long as a corresponding mdistr function exists, e.g. "pareto" if "mpareto" exists. In that case arguments arguments order and memp have to be supplied in order to carry out the matching numerically, by minimization of the sum of squared differences between observed and theoretical moments. Optionnally other arguments can be supplied to control optimization (see the 'details' section of mledist for details about arguments for the control of optimization). In that case, fix.arg can be used and start is taken into account. For non closed-form estimators, memp must be provided to compute empirical moments. When weights=NULL, this function must have two arguments x, order: x the numeric vector of the data and order the order of the moment. When weights!=NULL, this function must have three arguments x, order, weights: x the numeric vector of the data, order the order of the moment, weights the numeric vector of weights. See examples below.

Optionally, a vector of weights can be used in the fitting process. By default (when weigths=NULL), ordinary MME is carried out, otherwise the specified weights are used to compute (raw or centered) weighted moments. For closed-form estimators, weighted mean and variance are computed by wtd.mean and wtd.var from the Hmisc package. When a numerical minimization is used, weighted are expected to be computed by the memp function. It is not yet possible to take into account weighths in functions plotdist, plotdistcens, plot.fitdist, plot.fitdistcens, cdfcomp, cdfcompcens, denscomp, ppcomp, qqcomp, gofstat and descdist (developments planned in the future).

This function is not intended to be called directly but is internally called in fitdist and bootdist when used with the matching moments method.

References

Evans M, Hastings N and Peacock B (2000), Statistical distributions. John Wiley and Sons Inc.

Vose D (2000), Risk analysis, a quantitative guide. John Wiley & Sons Ltd, Chischester, England, pp. 99-143.

Delignette-Muller ML and Dutang C (2015), fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64(4), 1-34.

Examples

Run this code


# (1) basic fit of a normal distribution with moment matching estimation
#

set.seed(1234)
n <- 100
x1 <- rnorm(n=n)
mmedist(x1, "norm")

#weighted
w <- c(rep(1, n/2), rep(10, n/2))
mmedist(x1, "norm", weights=w)$estimate


# (2) fit a discrete distribution (Poisson)
#

set.seed(1234)
x2 <- rpois(n=30,lambda = 2)
mmedist(x2, "pois")

# (3) fit a finite-support distribution (beta)
#

set.seed(1234)
x3 <- rbeta(n=100,shape1=5, shape2=10)
mmedist(x3, "beta")


# (4) fit a Pareto distribution
#

## Not run: 
#   require(actuar)
#   #simulate a sample
#   x4  <-  rpareto(1000, 6, 2)
# 
#   #empirical raw moment
#   memp  <-  function(x, order) mean(x^order)
#   memp2 <- function(x, order, weights) sum(x^order * weights)/sum(weights)
# 
#   #fit by MME
#   mmedist(x4, "pareto", order=c(1, 2), memp=memp, 
#     start=list(shape=10, scale=10), lower=1, upper=Inf)
#   #fit by weighted MME
#   w <- rep(1, length(x4))
#   w[x4 < 1] <- 2
#   mmedist(x4, "pareto", order=c(1, 2), memp=memp2, weights=w,
#     start=list(shape=10, scale=10), lower=1, upper=Inf)
# ## End(Not run)

Run the code above in your browser using DataLab