GOFmontecarlo: Goodness of fit tests

Description

Anderson-Darling goodness of fit tests for Regional Frequency Analysis: Monte-Carlo method.

Usage

gofNORMtest (x)
gofEXPtest (x, Nsim=1000)
gofGUMBELtest (x, Nsim=1000)
gofGENLOGIStest (x, Nsim=1000)
gofGENPARtest (x, Nsim=1000)
gofGEVtest (x, Nsim=1000)
gofLOGNORMtest (x, Nsim=1000)
gofP3test (x, Nsim=1000)

Arguments

x

data sample

Nsim

number of simulated samples from the hypothetical parent distribution

Value

gofNORMtest tests the goodness of fit of a normal (Gauss) distribution with the sample x.

gofEXPtest tests the goodness of fit of an exponential distribution with the sample x.

gofGUMBELtest tests the goodness of fit of a Gumbel (EV1) distribution with the sample x.

gofGENLOGIStest tests the goodness of fit of a Generalized Logistic distribution with the sample x.

gofGENPARtest tests the goodness of fit of a Generalized Pareto distribution with the sample x.

gofGEVtest tests the goodness of fit of a Generalized Extreme Value distribution with the sample x.

gofLOGNORMtest tests the goodness of fit of a 3-parameter lognormal distribution with the sample x.

gofP3test tests the goodness of fit of a Pearson type III (gamma) distribution with the sample x.

They return the value $A^2$ of the Anderson-Darling statistic and its non-exceedance probability $P$. Note that $P$ is the probability of obtaining a test statistic $A^2$ lower than the one actually observed, assuming that the null hypothesis is true; that is, $P$ is one minus the p-value usually employed in statistical testing (see http://en.wikipedia.org/wiki/P-value). If $P(A^2)$ is, for example, greater than 0.90, the null hypothesis is rejected at significance level $\alpha=10\%$.
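Because $P$ is a non-exceedance probability rather than a conventional p-value, the decision rule is easy to get backwards. The following base-R sketch spells it out. The helper `decide` is hypothetical (it is not part of the package), and the value of `P` is an illustrative placeholder, not the output of an actual test.

```r
# Hypothetical helper: turn the non-exceedance probability P into a decision.
decide <- function(P, alpha = 0.10) {
  p_value <- 1 - P                 # conventional p-value
  list(p_value = p_value,
       reject  = P > 1 - alpha)    # equivalent to p_value < alpha
}

# Illustrative value only (not produced by an actual test run)
res <- decide(P = 0.95, alpha = 0.10)
res$p_value
res$reject
```

With these placeholder numbers the null hypothesis would be rejected at the 10% level, since $P$ exceeds $1 - \alpha = 0.90$.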

Details

An introduction to the Anderson-Darling test, analogous to the one given here, is available at http://en.wikipedia.org/wiki/Anderson-Darling_test.

Given a sample $x_i \ (i=1,\ldots,m)$ of data extracted from a distribution $F_R(x)$, the test is used to check the null hypothesis $H_0 : F_R(x) = F(x,\theta)$, where $F(x,\theta)$ is the hypothetical distribution and $\theta$ is an array of parameters estimated from the sample $x_i$.

The Anderson-Darling goodness of fit test measures the departure between the hypothetical distribution $F(x,\theta)$ and the cumulative frequency function $F_m(x)$ defined as:
$$F_m(x) = 0 \ , \ x < x_{(1)}$$
$$F_m(x) = i/m \ , \ x_{(i)} \leq x < x_{(i+1)}$$
$$F_m(x) = 1 \ , \ x_{(m)} \leq x$$
where $x_{(i)}$ is the $i$-th element of the ordered sample (in increasing order).
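The cumulative frequency function $F_m(x)$ defined above is the usual empirical distribution function, which base R provides as `ecdf`. A minimal sketch, using a small made-up sample for illustration:

```r
# Small made-up sample, for illustration only
x  <- c(2.3, 0.7, 1.9, 3.1, 1.2)
m  <- length(x)
xs <- sort(x)            # ordered sample x_(1) <= ... <= x_(m)

Fm <- ecdf(x)            # step function F_m(x)

Fm(xs)                   # i/m at each order statistic
Fm(min(x) - 1)           # 0 below x_(1)
Fm(max(x) + 1)           # 1 at or above x_(m)
```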

The test statistic is: $$Q^2 = m \! \int_x \left[ F_m(x) - F(x,\theta) \right]^2 \Psi(x) \,dF(x)$$ where $\Psi(x)$, in the case of the Anderson-Darling test (Laio, 2004), is $\Psi(x) = [F(x,\theta) (1 - F(x,\theta))]^{-1}$. In practice, the statistic is calculated as: $$A^2 = -m -\frac{1}{m} \sum_{i=1}^m \left\{ (2i-1)\ln[F(x_{(i)},\theta)] + (2m+1-2i)\ln[1 - F(x_{(i)},\theta)] \right\}$$
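The computational formula for $A^2$ can be transcribed almost directly into base R. The sketch below is not the package's own code: `ad_statistic` is a hypothetical helper, and the normal parameters are estimated with the sample mean and standard deviation, which is an assumption made only for this illustration.

```r
# Hypothetical helper: A^2 for a sample x and a fitted CDF `cdf`
# (a function returning F(x, theta) for a numeric vector).
ad_statistic <- function(x, cdf) {
  m  <- length(x)
  Fx <- cdf(sort(x))               # F(x_(i), theta) at the order statistics
  i  <- seq_len(m)
  -m - sum((2 * i - 1) * log(Fx) + (2 * m + 1 - 2 * i) * log(1 - Fx)) / m
}

set.seed(1)
x <- rnorm(30, mean = 10, sd = 1)

# Parameters estimated from the sample (assumption of this sketch)
A2 <- ad_statistic(x, function(q) pnorm(q, mean = mean(x), sd = sd(x)))
A2
```

The second term of the sum, re-indexed with $j = m + 1 - i$, gives the other common way of writing the statistic, $-m - \frac{1}{m}\sum_i (2i-1)\{\ln F(x_{(i)}) + \ln[1 - F(x_{(m+1-i)})]\}$.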

The statistic $A^2$ obtained in this way can be compared with the distribution of the $A^2$ values that would be obtained if the samples actually belonged to the hypothetical distribution $F(x,\theta)$. For the test of normality, this distribution is known (see Laio, 2004). In other cases, e.g. the Pearson type III case, it can be derived with a Monte Carlo procedure.
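The Monte Carlo idea can be sketched in base R as follows. This is an illustration of the general procedure, not the package's implementation: the helpers `ad_statistic` and `mc_exp_test` are hypothetical, and the sketch assumes a one-parameter exponential distribution with the rate estimated as the reciprocal of the sample mean, which may differ from the parametrization and estimation method used by `gofEXPtest`.

```r
# A^2 statistic for a sample x and a fitted CDF (hypothetical helper)
ad_statistic <- function(x, cdf) {
  m  <- length(x)
  Fx <- cdf(sort(x))
  i  <- seq_len(m)
  -m - sum((2 * i - 1) * log(Fx) + (2 * m + 1 - 2 * i) * log(1 - Fx)) / m
}

# Hypothetical helper: Monte Carlo non-exceedance probability of A^2
# for a one-parameter exponential fit (rate estimated as 1 / mean).
mc_exp_test <- function(x, Nsim = 1000) {
  fit_A2 <- function(y) {
    rate <- 1 / mean(y)            # re-estimate the parameter on each sample
    ad_statistic(y, function(q) pexp(q, rate = rate))
  }
  A2_obs <- fit_A2(x)
  # Simulate Nsim samples from the fitted parent and compute A^2 for each
  A2_sim <- replicate(Nsim, fit_A2(rexp(length(x), rate = 1 / mean(x))))
  c(A2 = A2_obs, P = mean(A2_sim < A2_obs))
}

set.seed(42)
x <- rexp(50, rate = 0.2)
res <- mc_exp_test(x, Nsim = 200)
res
```

Re-estimating the parameter on every simulated sample matters: the distribution of $A^2$ depends on the fact that $\theta$ is fitted to the same data being tested. A larger `Nsim` gives a finer resolution for $P$ at the cost of run time.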

Examples

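# NOT RUN {
library(nsRFA)  # provides the gof*test and rand.* functions used here

# Normal case: gofNORMtest takes no Nsim argument
x <- rnorm(30,10,1)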
gofNORMtest(x)

x <- rand.gamma(50, 100, 15, 7)
gofP3test(x, Nsim=200)

x <- rand.GEV(50, 0.907, 0.169, 0.0304)
gofGEVtest(x, Nsim=200)

x <- rand.genlogis(50, 0.907, 0.169, 0.0304)
gofGENLOGIStest(x, Nsim=200)

x <- rand.genpar(50, 0.716, 0.418, 0.476)
gofGENPARtest(x, Nsim=200)

x <- rand.lognorm(50, 0.716, 0.418, 0.476)
gofLOGNORMtest(x, Nsim=200)

# }
