GOFTESTS: Goodness of fit tests

Description

Anderson-Darling goodness of fit tests for Regional Frequency Analysis.

Usage

gofNORMtest(x)
gofP3test(x, Nsim=1000)

Arguments

x

data sample

Nsim

number of simulated samples from the hypothetical parent distribution

Value

gofNORMtest tests the goodness of fit of a normal (Gauss) distribution to the sample x.
gofP3test tests the goodness of fit of a Pearson type III (gamma) distribution to the sample x.
They return the value $A^2$ of the Anderson-Darling statistic and its probability $P$. If $P$ is, for example, 0.92, the sample shouldn't be considered extracted from the hypothetical parent distribution at any significance level greater than 8% (that is, $1 - P$).

Details

Given a sample $x_i \ (i=1,\ldots,m)$ of data extracted from a distribution $F_R(x)$, the test is used to check the null hypothesis $H_0 : F_R(x) = F(x,\theta)$, where $F(x,\theta)$ is the hypothetical distribution and $\theta$ is an array of parameters estimated from the sample $x_i$.

The Anderson-Darling goodness of fit test measures the departure between the hypothetical distribution $F(x,\theta)$ and the cumulative frequency function $F_m(x)$ defined as: $$F_m(x) = 0 \ , \ x < x_{(1)}$$ $$F_m(x) = i/m \ , \ x_{(i)} \leq x < x_{(i+1)}$$ $$F_m(x) = 1 \ , \ x_{(m)} \leq x$$ where $x_{(i)}$ is the $i$-th element of the ordered sample (in increasing order).
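As an illustration of the definition above, the following minimal sketch builds $F_m(x)$ by hand and compares it with base R's `ecdf`. The sample values and the helper name `Fm` are made up for this example and are not part of the package.

```r
# Small illustrative sample (values chosen arbitrarily)
x <- c(3.1, 1.4, 2.7, 5.0, 4.2)
m <- length(x)
x_sorted <- sort(x)            # ordered sample x_(1) <= ... <= x_(m)

# Hypothetical helper: F_m(q) = (number of sample values <= q) / m
Fm <- function(q) sapply(q, function(v) sum(x_sorted <= v) / m)

# Base R provides the same step function
Fm_base <- ecdf(x)

q <- c(1.0, 1.4, 3.0, 5.0, 6.0)
Fm(q)
Fm_base(q)
```

Both calls should give the same step values: 0 below the smallest observation, $i/m$ between consecutive ordered observations, and 1 from the largest observation onwards.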

The test statistic is: $$Q^2 = m \! \int_x \left[ F_m(x) - F(x,\theta) \right]^2 \Psi(x) \,dF(x)$$ where $\Psi(x)$, in the case of the Anderson-Darling test (Laio, 2004), is $\Psi(x) = [F(x,\theta) (1 - F(x,\theta))]^{-1}$. In practice, the statistic is calculated as: $$A^2 = -m -\frac{1}{m} \sum_{i=1}^m \left\{ (2i-1)\ln[F(x_{(i)},\theta)] + (2m+1-2i)\ln[1 - F(x_{(i)},\theta)] \right\}$$
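The computational formula for $A^2$ can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name `ad_stat` is hypothetical, and the normal parameters are estimated here with `mean` and `sd`, which may differ from the estimator used by gofNORMtest.

```r
# Hypothetical helper: Anderson-Darling statistic A^2 for a sample x
# and a hypothetical CDF `cdf` (extra arguments are its parameters)
ad_stat <- function(x, cdf, ...) {
  m <- length(x)
  u <- cdf(sort(x), ...)       # F(x_(i), theta) on the ordered sample
  i <- seq_len(m)
  -m - sum((2 * i - 1) * log(u) + (2 * m + 1 - 2 * i) * log(1 - u)) / m
}

set.seed(1)
x <- rnorm(30, mean = 10, sd = 1)

# Assumption: theta = (mean, sd) estimated from the sample itself
A2 <- ad_stat(x, pnorm, mean = mean(x), sd = sd(x))
A2
```

Small values of `A2` indicate a close agreement between the fitted distribution and the sample; how small is small enough depends on the distribution of $A^2$ under the null hypothesis.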

The statistic $A^2$ obtained in this way can be compared with the population of $A^2$ values that would be obtained if the samples really belonged to the hypothetical distribution $F(x,\theta)$. For the test of normality, this distribution is known (see Laio, 2004). In other cases, such as the Pearson type III case here, it can be derived with a Monte Carlo procedure.
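A Monte Carlo procedure of this kind can be sketched as follows. To stay within base R, the sketch swaps in a two-parameter gamma distribution fitted by the method of moments in place of the three-parameter Pearson type III distribution, so it is not how gofP3test is implemented. The helpers `ad_stat`, `fit_gamma` and `ad_gamma` are hypothetical names introduced for this example.

```r
# Hypothetical helper: A^2 from the hypothetical CDF evaluated at the sample
ad_stat <- function(u) {
  m <- length(u)
  u <- sort(u)
  i <- seq_len(m)
  -m - sum((2 * i - 1) * log(u) + (2 * m + 1 - 2 * i) * log(1 - u)) / m
}

# Assumption: two-parameter gamma fitted by the method of moments
fit_gamma <- function(x) {
  list(shape = mean(x)^2 / var(x), rate = mean(x) / var(x))
}

# Fit the gamma to x, then compute A^2 against the fitted distribution
ad_gamma <- function(x) {
  p <- fit_gamma(x)
  ad_stat(pgamma(x, shape = p$shape, rate = p$rate))
}

set.seed(42)
x <- rgamma(50, shape = 7, rate = 0.5)   # illustrative sample
A2_obs <- ad_gamma(x)
p_hat <- fit_gamma(x)

# Simulate Nsim samples from the fitted distribution; refit each one
Nsim <- 200
A2_sim <- replicate(
  Nsim,
  ad_gamma(rgamma(length(x), shape = p_hat$shape, rate = p_hat$rate))
)

# P: fraction of simulated A^2 values not exceeding the observed one
P <- mean(A2_sim <= A2_obs)
P
```

Refitting the parameters on every simulated sample matters: the distribution of $A^2$ when $\theta$ is estimated differs from the one obtained with fixed, known parameters. A value of `P` close to 1 means that the observed statistic is larger than almost all simulated ones.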

References

D'Agostino R., Stephens M. (1986) Goodness-of-Fit Techniques, chapter Tests based on EDF statistics. Marcel Dekker, New York.

Hosking, J.R.M. and Wallis, J.R. (1997) Regional Frequency Analysis: an approach based on L-moments, Cambridge University Press, Cambridge, UK.

Laio, F. (2004) Cramer-von Mises and Anderson-Darling goodness of fit tests for extreme value distributions with unknown parameters, Water Resour. Res., 40, W09308, doi:10.1029/2004WR003204.

Viglione A., Claps P., Laio F. (2006) Utilizzo di criteri di prossimità nell'analisi regionale del deflusso annuo, XXX Convegno di Idraulica e Costruzioni Idrauliche - IDRA 2006, Roma, 10-15 Settembre 2006.

Viglione A. (2007) Metodi statistici non-supervised per la stima di grandezze idrologiche in siti non strumentati, PhD thesis, in press.

Examples

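library(nsRFA)

# Normal sample: test the fit of a normal distribution
x <- rnorm(30, 10, 1)
gofNORMtest(x)

# Pearson type III (gamma) sample: test the fit using 200 simulated samples
x <- rand.gamma(50, 100, 15, 7)
gofP3test(x, Nsim=200)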