gofRosenblattChisq
contains the RosenblattChisq goodness-of-fit test for copulae, described in Genest et al. (2009) and Hofert (2014). It compares the empirical copula against a parametric estimate of the copula derived under the null hypothesis. The margins can be estimated by a range of common parametric distributions, and an estimate of the required computation time can be requested. The approximate p-values are computed with a parametric bootstrap, which can be accelerated by enabling the built-in parallel computation. The gof statistics are computed with the function gofTstat
from the package copula. Datasets of any dimension above 1 can be used, and the possible copulae are "normal", "t", "gumbel", "clayton" and "frank". The parameter estimation is performed with the pseudo maximum likelihood method. In case this estimation fails, inversion of Kendall's tau is used instead.
gofRosenblattChisq(copula, x, M = 1000, param = 0.5, param.est = TRUE, df = 4, df.est = TRUE, margins = "ranks", dispstr = "ex", execute.times.comp = TRUE, processes = 1)
"normal"
, "t"
, "clayton"
, "gumbel"
and "frank"
.
TRUE
or FALSE
. TRUE
means that param
will be estimated.
"t"
-copula.
df
shall be estimated. Has to be either FALSE
or TRUE
, where TRUE
means that it will be estimated.
"ranks"
, which is the standard approach to convert data in such a case. Alternatively can the following distributions be specified: "beta"
, "cauchy"
, Chi-squared ("chisq"
), "f"
, "gamma"
, Log normal ("lnorm"
), Normal ("norm"
), "t"
, "weibull"
, Exponential ("exp"
).
copula
.
M
is at least 100.
An object of the class gofCOP.

This test is based on the Rosenblatt probability integral transform, which uses the mapping $\mathcal{R} : (0,1)^d \to (0,1)^d$. Following Genest et al. (2009), this transformation decomposes a random vector $\mathbf{u} \in [0,1]^d$ with distribution $C$ into mutually independent elements with a uniform distribution on the unit interval. The mapping provides pseudo observations $E_i$, given by $$E_1 = \mathcal{R}(U_1), \dots, E_n = \mathcal{R}(U_n).$$ The mapping is performed by assigning to every vector $\mathbf{u}$: $e_1 = u_1$ and, for $i \in \{2, \dots, d\}$, $$e_i = \frac{\partial^{i-1} C(u_1, \dots, u_i, 1, \dots, 1)}{\partial u_1 \cdots \partial u_{i-1}} \Big/ \frac{\partial^{i-1} C(u_1, \dots, u_{i-1}, 1, \dots, 1)}{\partial u_1 \cdots \partial u_{i-1}}.$$
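For illustration, for a bivariate Gaussian copula with correlation $\rho$ the transform has the closed form $e_1 = u_1$ and $e_2 = \Phi\big((\Phi^{-1}(u_2) - \rho\,\Phi^{-1}(u_1))/\sqrt{1-\rho^2}\big)$. A minimal sketch in base R (illustrative only, not the package's internal implementation):

```r
# Rosenblatt transform for a bivariate Gaussian copula with correlation rho
# (illustrative sketch, not the package's internal code)
rosenblatt_gauss2d <- function(u, rho) {
  z1 <- qnorm(u[, 1])                             # map margins to normal scale
  z2 <- qnorm(u[, 2])
  e1 <- u[, 1]                                    # e_1 = u_1
  e2 <- pnorm((z2 - rho * z1) / sqrt(1 - rho^2))  # conditional CDF of U_2 | U_1
  cbind(e1, e2)
}

# Simulate from the Gaussian copula with rho = 0.5, then transform:
# under the null hypothesis the columns of e are independent U(0,1)
set.seed(42)
n <- 500; rho <- 0.5
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)
u <- cbind(pnorm(z1), pnorm(z2))
e <- rosenblatt_gauss2d(u, rho)
```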
The Anderson-Darling test statistic of the variates
$$G(x_j) = \chi_d^2 \left( x_j \right)$$
is computed (via ADGofTest::ad.test), where $x_j = \sum_{i=1}^{d} (\Phi^{-1}(e_{ij}))^2$, $\Phi^{-1}$ denotes the quantile function of the standard normal distribution, $\chi_d^2(\cdot)$ denotes the distribution function of the chi-square distribution with $d$ degrees of freedom (pchisq(., df = d)), and $e_{ij}$ is the $i$th component of the $j$th pseudo observation.
The test statistic is then given by $$T = -n - \sum_{j=1}^n \frac{2j - 1}{n} [\ln(G(x_j)) + \ln(1 - G(x_{n+1-j}))],$$ where the $x_j$ are taken in ascending order.
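The two steps above, aggregating the Rosenblatt pseudo observations into the $x_j$ and forming the Anderson-Darling statistic, can be sketched in base R as follows (a hypothetical helper for illustration; the package itself delegates to copula::gofTstat and ADGofTest::ad.test):

```r
# Sketch of the RosenblattChisq statistic; e is an n x d matrix of Rosenblatt
# pseudo observations (illustrative, not the package's internal code)
rosenblatt_chisq_stat <- function(e) {
  n <- nrow(e)
  d <- ncol(e)
  x <- rowSums(qnorm(e)^2)      # x_j = sum_i (Phi^{-1}(e_ij))^2, chi^2_d under H0
  G <- sort(pchisq(x, df = d))  # chi-square CDF values, in ascending order
  j <- seq_len(n)
  -n - sum((2 * j - 1) / n * (log(G) + log(1 - rev(G))))
}

# Under H0 the G(x_j) behave like an i.i.d. uniform sample, so T stays small
set.seed(1)
e <- matrix(runif(400), ncol = 2)
T_stat <- rosenblatt_chisq_stat(e)
```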
The approximate p-value is computed by the formula,
$$\sum_{b=1}^M \mathbf{I}(|T_b| \geq |T|) / M,$$
where $T$ and $T_b$ denote the test statistic and the bootstrapped test statistic, respectively.
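In R, given the observed statistic and a vector of bootstrapped statistics, this computation amounts to a one-liner (illustrative sketch, not the package's internal code):

```r
# Parametric bootstrap p-value (sketch): share of bootstrap statistics that
# are at least as extreme as the observed one
boot_pvalue <- function(T_obs, T_boot) mean(abs(T_boot) >= abs(T_obs))

p <- boot_pvalue(2, c(1.5, 2.5, 0.5, 3.0))  # 2 of 4 exceed |T|: p = 0.5
```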
For small values of M, initializing parallelization via processes does not pay off, since registering the parallel processes itself increases the computation time. Please consider enabling parallelization only for large values of M.
library(gofCopula)
data(IndexReturns)
gofRosenblattChisq("normal", IndexReturns[c(1:100), c(1:2)], M = 20)