gofWhite
tests a given two-dimensional dataset for a copula with the goodness-of-fit (gof) test based on White's information matrix equality. The possible copulae are "normal", "t", "gumbel", "clayton" and "frank". For reference, see Schepsmeier et al. (2015). The parameters are estimated with the pseudo maximum likelihood method; if the estimation fails, inversion of Kendall's tau is used. The margins can be estimated by a number of distributions, and the time needed for the estimation can be reported. The approximate p-values are computed with a parametric bootstrap, which can be accelerated by enabling the built-in parallel computation. The test statistic and p-values are computed by the corresponding functions from the VineCopula package.
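The fallback estimator mentioned in the description, inversion of Kendall's tau, can be illustrated with a minimal sketch in base R. It relies on the standard closed-form relations between Kendall's tau and the parameter of the normal, Clayton and Gumbel copulas. The helper name `tau_to_param` is hypothetical and is not part of the package, whose internal implementation may differ. The "frank" copula is left out because its relation to tau has no closed form and must be inverted numerically.

```r
# Hypothetical helper (not a gofCopula function): map Kendall's tau to a
# copula parameter using the standard closed-form relations.
tau_to_param <- function(tau, copula = c("normal", "clayton", "gumbel")) {
  copula <- match.arg(copula)
  switch(copula,
    normal  = sin(pi * tau / 2),    # rho   = sin(pi * tau / 2)
    clayton = 2 * tau / (1 - tau),  # theta = 2 * tau / (1 - tau)
    gumbel  = 1 / (1 - tau)         # theta = 1 / (1 - tau)
  )
}

# Estimate tau from a toy sample, then invert it
set.seed(1)
x <- rnorm(200)
y <- x + rnorm(200)
tau_hat <- cor(x, y, method = "kendall")
tau_to_param(tau_hat, "clayton")
```

Such an estimate needs no numerical optimization, which is presumably why it serves as a fallback when the likelihood-based estimation fails.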
gofWhite(copula, x, M = 1000, param = 0.5, param.est = T, df = 4, df.est = T, margins = "ranks", execute.times.comp = T, processes = 1)
"normal"
, "clayton"
, "gumbel"
and "frank"
.
param.est
TRUE or FALSE. TRUE means that param will be estimated with a maximum likelihood estimation.
"t"
-copula.
df.est
Indicates whether df shall be estimated. Has to be either FALSE or TRUE, where TRUE means that it will be estimated.
"ranks"
, which is the standard approach to convert data in such a case. Alternatively can the following distributions be specified: "beta"
, "cauchy"
, Chi-squared ("chisq"
), "f"
, "gamma"
, Log normal ("lnorm"
), Normal ("norm"
), "t"
, "weibull"
, Exponential ("exp"
).
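The "ranks" option can be pictured as a rank transformation of each column to pseudo-observations in the open unit interval. The following is a minimal sketch in base R; the helper name `to_pseudo_obs` is hypothetical, and the scaling by n + 1 is an assumption (a common convention), not something confirmed for this package.

```r
# Hypothetical helper: rank-based pseudo-observations in (0, 1).
# The n + 1 scaling is an assumption; the package may scale differently.
to_pseudo_obs <- function(x) {
  x <- as.matrix(x)
  apply(x, 2, rank) / (nrow(x) + 1)
}

# Toy data with two columns on very different scales
x <- cbind(a = c(3.2, -1.0, 0.5), b = c(10, 30, 20))
to_pseudo_obs(x)
```

Because only the ranks enter, the result does not depend on the marginal distributions of the input columns.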
execute.times.comp
The estimated computation time is only given if M is at least 100.
An object of class gofCOP.
The test statistic is derived by $$T_n = n(\bar{d}(\theta_n))^\top V_{\theta_n}^{-1} \bar{d}(\theta_n)$$ with $$\bar{d}(\theta_n) = \frac{1}{n} \sum_{i=1}^n \operatorname{vech}(\mathbf{H}_n(\theta_n|\mathbf{u}) + \mathbf{S}_n(\theta_n|\mathbf{u})),$$
$$d(\theta_n) = \operatorname{vech}(\mathbf{H}_n(\theta_n|\mathbf{u}) + \mathbf{S}_n(\theta_n|\mathbf{u})),$$
$$V_{\theta_n} = \frac{1}{n} \sum_{i=1}^n (d(\theta_n) - D_{\theta_n} \mathbf{H}_n(\theta_n)^{-1} \delta l(\theta_n))(d(\theta_n) - D_{\theta_n} \mathbf{H}_n(\theta_n)^{-1} \delta l(\theta_n))^\top$$ and $$D_{\theta_n} = \frac{1}{n} \sum_{i=1}^n [\delta_{\theta_k} d_l(\theta_n)]_{l=1, \dots, \frac{p(p+1)}{2}, k=1, \dots, p}$$ where $l(\theta_n)$ represents the log likelihood function and $p$ is the length of the parameter vector $\theta$.
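As a worked special case, consider a copula with a single parameter, such as "clayton", "gumbel" or "frank", so that $p = 1$. The following is only a direct specialization of the formulas above, not an additional result: the vech operator acts on a scalar, $d(\theta_n)$ and $V_{\theta_n}$ are scalars, and the quadratic form collapses to a ratio.

```latex
% Special case p = 1: vech of a scalar is the scalar itself
\begin{align*}
d(\theta_n) &= \mathbf{H}_n(\theta_n|\mathbf{u}) + \mathbf{S}_n(\theta_n|\mathbf{u}) \in \mathbb{R}, \\
T_n &= \frac{n\,\bar{d}(\theta_n)^2}{V_{\theta_n}}, \qquad \frac{p(p+1)}{2} = 1 .
\end{align*}
```

In this case the $\chi^2$ reference distribution in the rejection rule has one degree of freedom.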
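The null hypothesis is rejected if $$T_n > (\chi^2_{p(p+1)/2})^{-1}(1 - \alpha),$$ that is, if the test statistic exceeds the $(1 - \alpha)$ quantile of the $\chi^2$ distribution with $p(p+1)/2$ degrees of freedom.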
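The asymptotic critical value in the rejection rule can be computed with base R's `qchisq`. The sketch below assumes nothing beyond the formula above; the helper name `white_critical_value` is hypothetical and not part of the package. Note that gofWhite itself reports p-values from a parametric bootstrap, so this threshold is for illustration only.

```r
# Hypothetical helper (not part of gofCopula): asymptotic critical value
# for White's test with a p-dimensional copula parameter.
white_critical_value <- function(p, alpha = 0.05) {
  stopifnot(p >= 1, alpha > 0, alpha < 1)
  dof <- p * (p + 1) / 2       # number of distinct elements in vech
  qchisq(1 - alpha, df = dof)  # (1 - alpha) quantile of chi-squared
}

white_critical_value(1)  # one-parameter copula: 1 degree of freedom
white_critical_value(2)  # two parameters: 3 degrees of freedom
```

A two-parameter case arises, for example, when both the correlation and the degrees of freedom of a "t"-copula are estimated.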
For small values of M, initializing the parallelization via processes does not make sense, because registering the parallel processes increases the computation time. Consider enabling parallelization only for high values of M.
library(gofCopula)
data(IndexReturns)
# Small M keeps the example fast; use a larger M for reliable p-values
gofWhite("normal", IndexReturns[1:100, 1:2], M = 10)