gofWhite tests a given two-dimensional dataset for a copula using the goodness-of-fit (GOF) test based on White's information matrix equality. The possible copulae are "normal", "t", "gumbel", "clayton" and "frank". For reference, see Schepsmeier et al. (2015). The parameters are estimated with the pseudo maximum likelihood method; if the estimation fails, inversion of Kendall's tau is used. The margins can be estimated with a range of distributions, and the time the estimation is likely to take can be reported. The approximate p-values are computed with a parametric bootstrap, which can be accelerated by enabling the built-in parallel computation. The test statistic and p-values are computed by the corresponding functions from the VineCopula package.
gofWhite(copula, x, M = 1000, param = 0.5, param.est = T, df = 4, df.est = T, margins = "ranks", execute.times.comp = T, processes = 1)
"normal", "clayton", "gumbel" and "frank".
TRUE or FALSE. TRUE means that param will be estimated with a maximum likelihood estimation.
"t"-copula.
df shall be estimated. Has to be either FALSE or TRUE, where TRUE means that it will be estimated.
"ranks", which is the standard approach to convert data in such a case. Alternatively, the following distributions can be specified: "beta", "cauchy", Chi-squared ("chisq"), "f", "gamma", Log normal ("lnorm"), Normal ("norm"), "t", "weibull", Exponential ("exp").
M is at least 100.
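The "ranks" option for the margins refers to converting each column of the data to rank-based pseudo-observations. The following is a minimal base-R sketch of that idea, assuming the common scaling rank / (n + 1); the scaling used internally by the function is not stated here, and `to_pseudo_obs` is a hypothetical helper name, not part of the package.

```r
# Hypothetical helper (not a package function): rank-based pseudo-observations.
# Assumption: ranks are rescaled by (n + 1) so all values lie strictly in (0, 1).
to_pseudo_obs <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Rank each column separately, then rescale.
  apply(x, 2, rank) / (n + 1)
}

set.seed(1)
x <- cbind(rnorm(50), rexp(50))  # simulated two-dimensional data
u <- to_pseudo_obs(x)
range(u)
```

The rescaling by n + 1 rather than n keeps the pseudo-observations away from 1, where many copula densities are unbounded.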
class gofCOP with the components
The test statistic is derived by $$T_n = n(\bar{d}(\theta_n))^\top V_{\theta_n}^{-1} \bar{d}(\theta_n)$$ with $$\bar{d}(\theta_n) = \frac{1}{n} \sum_{i=1}^n \operatorname{vech}(\mathbf{H}_n(\theta_n|\mathbf{u}) + \mathbf{S}_n(\theta_n|\mathbf{u})),$$
$$d(\theta_n) = \operatorname{vech}(\mathbf{H}_n(\theta_n|\mathbf{u}) + \mathbf{S}_n(\theta_n|\mathbf{u})),$$
$$V_{\theta_n} = \frac{1}{n} \sum_{i=1}^n (d(\theta_n) - D_{\theta_n} \mathbf{H}_n(\theta_n)^{-1} \delta l(\theta_n))(d(\theta_n) - D_{\theta_n} \mathbf{H}_n(\theta_n)^{-1} \delta l(\theta_n))^\top$$ and $$D_{\theta_n} = \frac{1}{n} \sum_{i=1}^n [\delta_{\theta_k} d_l(\theta_n)]_{l=1, \dots, \frac{p(p+1)}{2}, k=1, \dots, p}$$ where $l(\theta_n)$ represents the log-likelihood function and $p$ is the length of the parameter vector $\theta$.
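As an illustration of these definitions, consider a copula with a single parameter, so that $p = 1$ (this special case is an added worked example, not part of the original text). The matrices $\mathbf{H}_n$ and $\mathbf{S}_n$ are then $1 \times 1$, the vech operator has no effect, and $\bar{d}(\theta_n)$ and $V_{\theta_n}$ are scalars. The quadratic form reduces to

$$T_n = \frac{n \, \bar{d}(\theta_n)^2}{V_{\theta_n}}, \qquad \frac{p(p+1)}{2} = 1,$$

so the statistic is compared against a $\chi^2$ distribution with one degree of freedom.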
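The null hypothesis is rejected if $$T_n > (\chi^2_{p(p+1)/2})^{-1}(1 - \alpha),$$ that is, if the test statistic exceeds the $(1 - \alpha)$ quantile of the $\chi^2$ distribution with $p(p+1)/2$ degrees of freedom.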
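The asymptotic rejection rule can be sketched in base R as follows. This only illustrates the chi-squared comparison; the function itself reports bootstrap p-values, and the value of `T_n` below is a hypothetical number chosen for illustration, not a result from real data.

```r
alpha <- 0.05
p <- 1                      # assumption: one copula parameter, e.g. "clayton"
dof <- p * (p + 1) / 2      # degrees of freedom of the limiting chi-squared law

# Critical value: the (1 - alpha) quantile of the chi-squared distribution.
crit <- qchisq(1 - alpha, df = dof)

T_n <- 2.7                  # hypothetical observed test statistic
reject <- T_n > crit        # TRUE would mean the copula is rejected

# Corresponding asymptotic p-value (upper tail probability).
p_asym <- pchisq(T_n, df = dof, lower.tail = FALSE)
```

For a copula with two parameters, such as "t" with estimated degrees of freedom, setting `p <- 2` gives three degrees of freedom.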
For small values of M, initializing the parallelization via processes does not make sense, because registering the parallel processes increases the computation time. Consider enabling parallelization only for high values of M.
library(gofCopula)
data(IndexReturns)
# M = 10 keeps the example fast; use a larger M for reliable p-values.
gofWhite("normal", IndexReturns[1:100, 1:2], M = 10)