
energyGOF (version 0.1)

energyGOF.test: Goodness-of-fit tests for univariate data via energy

Description

Perform a goodness-of-fit test of univariate data x against a target y. y may be one of the following:

  • A string naming a distribution. For example, "normal". Both simple (known parameter) and composite (unknown parameter) tests are supported, but not all distributions allow for a composite test. See energyGOF-package for the table of supported distributions.

    • Result: A parametric goodness-of-fit test is performed.

    • Allowable values: uniform, exponential, bernoulli, binomial, geometric, normal, gaussian, beta, poisson, lognormal, lnorm, laplace, doubleexponential, asymmetriclaplace, alaplace, inversegaussian, invgaussian, halfnormal, chisq, chisquared, f, gamma, weibull, cauchy, pareto.

  • A numeric vector of data.

    • Result: A two-sample, non-parametric goodness-of-fit test is performed to test if x and y are equal in distribution.

  • A continuous cumulative distribution function. For example, pt. Only simple tests are supported.

    • Result: \(y(x)\) is tested for uniformity.
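The idea behind the distribution-function case can be sketched in base R: if x really follows the distribution whose CDF is y, then y(x) is Uniform(0, 1), so a test of uniformity on the transformed data is a test of the original hypothesis. The helper `energy_unif` below is hypothetical and not part of energyGOF. It uses the standard form of the energy goodness-of-fit statistic from the energy statistics literature; that the package computes its statistic in exactly this way is an assumption.

```r
# Hypothetical helper (not from energyGOF): energy statistic for H0: U ~ Uniform(0, 1).
# Closed forms used: E|u - U| = u^2 - u + 1/2 and E|U - U'| = 1/3.
energy_unif <- function(u) {
  n <- length(u)
  e_xy <- mean(u^2 - u + 1 / 2)          # average of E|u_i - U|
  e_yy <- 1 / 3                          # E|U - U'|
  e_xx <- mean(abs(outer(u, u, "-")))    # (1/n^2) * sum over i, j of |u_i - u_j|
  n * (2 * e_xy - e_yy - e_xx)
}

set.seed(1)
x <- rt(50, df = 4)
u <- pt(x, df = 4)    # probability integral transform: y(x) with y = pt
q <- energy_unif(u)   # small values are consistent with uniformity
q
```

Large values of the statistic indicate that the transformed data are far from uniform, and hence that x is unlikely to follow the hypothesized distribution.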

P-values are determined via parametric bootstrap. For distributions where \(E|Y|\) is not finite (Cauchy, Pareto), a generalized energy goodness-of-fit test is performed, which takes an additional tuning parameter, pow (a default is used if pow is not supplied).
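A minimal base-R sketch of a parametric bootstrap p-value for a simple test of N(0, 1) follows. The function `energy_stat_norm` is a hypothetical helper, not the package's implementation. It uses the closed forms \(E|x - Z| = 2\phi(x) + x(2\Phi(x) - 1)\) and \(E|Z - Z'| = 2/\sqrt{\pi}\) for a standard normal Z. The p-value convention shown (the proportion of simulated statistics at least as large as the observed one) is also an assumption; energyGOF may use a different convention.

```r
# Hypothetical helper (not from energyGOF): energy statistic for H0: X ~ N(0, 1).
energy_stat_norm <- function(x) {
  n <- length(x)
  e_xy <- mean(2 * dnorm(x) + x * (2 * pnorm(x) - 1))  # average of E|x_i - Z|
  e_yy <- 2 / sqrt(pi)                                 # E|Z - Z'|
  e_xx <- mean(abs(outer(x, x, "-")))                  # (1/n^2) * sum |x_i - x_j|
  n * (2 * e_xy - e_yy - e_xx)
}

set.seed(42)
x <- rnorm(30)
nsim <- 200

q_obs <- energy_stat_norm(x)

# Parametric bootstrap: draw samples of the same size from the hypothesized
# distribution and recompute the statistic on each one.
q_sim <- replicate(nsim, energy_stat_norm(rnorm(length(x))))

# Assumed convention: proportion of replicates at least as extreme as observed.
p_value <- mean(q_sim >= q_obs)
p_value
```

For a composite test, the same loop would re-estimate the parameters on each bootstrap sample before computing the statistic; that step is omitted here for brevity.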

Usage

energyGOF.test(x, y, nsim, ...)

egof.test(x, y, nsim, ...)

Value

If y is a string or function, returns an object of class "htest" representing the result of the energy goodness-of-fit hypothesis test. The htest object has the elements:

  • method: Simple or Composite

  • data.name

  • distribution: The distribution object created to test

  • parameter: List of parameters if the test is simple

  • nsim: Number of bootstrap replicates

  • composite_p: TRUE/FALSE composite predicate

  • statistic: The value of the energy statistic (\(Q=nE^*\))

  • p.value

  • sim_reps: bootstrap replicates of energy statistic

  • estimate: Any parameter estimates, if the test is composite

If y is numeric, returns the same htest object as energy::eqdist.etest().
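The elements listed above can be read off the returned object with `$`. The sketch below assumes the energyGOF package is installed and does nothing otherwise. The element names are taken from the list above.

```r
set.seed(1)
x <- rnorm(10)

# Guarded so the snippet is harmless when energyGOF is not installed.
if (requireNamespace("energyGOF", quietly = TRUE)) {
  res <- energyGOF::energyGOF.test(x, "normal", nsim = 10)
  print(inherits(res, "htest"))  # the result is an htest object
  print(res$statistic)           # value of the energy statistic
  print(res$p.value)             # bootstrap p-value
  print(res$estimate)            # parameter estimates (composite test)
}
```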

Arguments

x

A numeric vector.

y

A string, distribution function, or numeric vector. The distribution to test x against.

nsim

A non-negative integer. The number of parametric bootstrap replicates used to calculate the p-value. If 0, no simulation is performed.

...

If y is a string or distribution function, the parameters of the distribution y. Parameters are required for a simple test; for distributions in the stats package, the argument names are the same as in stats (for example, mean and sd for the normal distribution). If y is a string, omit the parameters from ... to test the composite hypothesis that x follows some member of the family of distributions y. Composite testing is not supported if y is a function. For generalized energy tests, the generalized energy exponent pow may also be passed here. Because ... carries several kinds of arguments, energyGOFdist() offers a more structured interface.
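To make the rule concrete, here is a base-R sketch of how the contents of ... could be sorted into a simple or composite test. `classify_test` is a hypothetical helper written for illustration only; it is not how energyGOF is implemented, and it does not check that all parameters of a distribution are present.

```r
# Hypothetical helper (not from energyGOF): decide the test type from `...`.
classify_test <- function(...) {
  args <- list(...)
  pow <- args[["pow"]]      # generalized energy exponent, NULL if absent
  args[["pow"]] <- NULL     # pow is a tuning parameter, not a distribution parameter
  list(
    type   = if (length(args) == 0) "composite" else "simple",
    params = args,
    pow    = pow
  )
}

classify_test()$type                  # "composite": no parameters supplied
classify_test(mean = 0, sd = 1)$type  # "simple": all parameters supplied
classify_test(pow = 0.5)$type         # "composite": pow alone does not make a simple test
```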

Author

John T. Haman

See Also

  • energyGOF-package for specifics on the distributions available to test.

  • energyGOFdist() for the alternate S3 interface for parametric testing.

  • Distributions for a list of distributions available in most R installations.

  • energy::eqdist.etest() for information on the two-sample test.

  • energy::normal.test() for the energy goodness-of-fit test with unknown parameters. The tests for (multivariate) Normal in the energy package are implemented with compiled code, and are faster than the one available in the energyGOF package.

  • energy::poisson.mtest() for a different Poisson goodness-of-fit test based on mean distances.

Examples

x <- rnorm(10)
y <- rt(10, 4)

## Composite energy goodness-of-fit test (test for Normality with unknown
## parameters)

energyGOF.test(x, "normal", nsim = 10)

## Simple energy goodness-of-fit test (test for Normality with known
## parameters). egof.test is an alias for energyGOF.test.

egof.test(x, "normal", nsim = 10, mean = 0, sd = 1)

## Alternatively, use the energyGOFdist generic directly so that you do not need
## to pass parameter names into `...`

energyGOFdist(x, normal_dist(0, 1), nsim = 10)

## Conduct a two-sample test

egof.test(x, y, 0)

## Conduct a test against any continuous distribution function

egof.test(x, pcauchy, 0)

## Simple energy goodness-of-fit test for Weibull distribution

y <- rweibull(10, 1, 1)
energyGOF.test(y, "weibull", shape = 1, scale = 3, nsim = 10)

## Alternatively, use the energyGOFdist generic directly, which is slightly less
## verbose. egofd is an alias for energyGOFdist.

egofd(y, weibull_dist(1, 3), nsim = 10)

## Conduct a generalized GOF test. `pow` is the exponent *s* in the generalized
## energy statistic. It is only needed when testing the Cauchy and Pareto
## distributions. If you don't set pow, a default is used for each of these
## distributions, but the default isn't necessarily better than any other
## value.

egofd(rcauchy(100),
   cauchy_dist(location = 0, scale = 1, pow = 0.5),
   nsim = 10)

## energyGOF does not support tests with a mix of known and unknown
## parameters, so this will result in an error.

## Wrapped in try() so that the rest of the script keeps running.
try(energyGOF.test(x, "normal", mean = 0, nsim = 10)) # sd is missing

