gofKernel tests a two-dimensional dataset with the Scaillet test for a copula. The possible copulae are "normal", "t", "clayton", "gumbel", "frank", "joe", "amh", "galambos", "huslerReiss", "tawn", "tev", "fgm" and "plackett". The parameter estimation is performed with the pseudo maximum likelihood method. If the estimation fails, inversion of Kendall's tau is used instead. The approximate p-values are computed with a parametric bootstrap, whose computation can be accelerated by enabling the built-in parallel computation.
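The fallback to inversion of Kendall's tau mentioned above can be illustrated in base R. This is a minimal sketch on simulated toy data, using the standard closed-form relations between Kendall's tau and the parameter for three of the listed families; it is not the package's internal code, and the variable names are illustrative only.

```r
# Sketch: moment-type parameter estimates obtained by inverting Kendall's tau.
set.seed(1)
n <- 200
z <- rnorm(n)
x <- cbind(z + rnorm(n), z + rnorm(n))   # toy data with positive dependence

tau <- cor(x[, 1], x[, 2], method = "kendall")

theta_normal  <- sin(pi * tau / 2)     # normal copula: tau = (2 / pi) * asin(rho)
theta_clayton <- 2 * tau / (1 - tau)   # Clayton: tau = theta / (theta + 2)
theta_gumbel  <- 1 / (1 - tau)         # Gumbel: tau = 1 - 1 / theta

round(c(tau = tau, normal = theta_normal,
        clayton = theta_clayton, gumbel = theta_gumbel), 3)
```

Such estimates need no numerical optimisation, which is why they are a natural fallback when the likelihood maximisation fails.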
gofKernel(
copula = c("normal", "t", "clayton", "gumbel", "frank", "joe", "amh", "galambos",
"huslerReiss", "tawn", "tev", "fgm", "plackett"),
x,
param = 0.5,
param.est = TRUE,
df = 4,
df.est = TRUE,
margins = "ranks",
flip = 0,
M = 1000,
MJ = 100,
dispstr = "ex",
delta.J = 0.5,
nodes.Integration = 12,
lower = NULL,
upper = NULL,
seed.active = NULL,
processes = 1
)
An object of the class gofCOP with the following components:
a character string describing the performed analysis;
the copula tested for;
the method used to estimate the margin distribution;
the parameters of the estimated margin distributions (only applicable if the margins were not specified as "ranks" or NULL);
the dependence parameters of the copulae;
the degrees of freedom of the copula (only applicable for the t-copula);
a matrix with the p-values and test statistics of the hybrid and the individual tests.
copula: The copula to test for. Possible values are "normal", "t", "clayton", "gumbel", "frank", "joe", "amh", "galambos", "huslerReiss", "tawn", "tev", "fgm" and "plackett".
x: A matrix containing the data, with rows being observations and columns being variables.
param: The parameter to be used.
param.est: Either TRUE or FALSE. TRUE means that param will be estimated by maximum likelihood.
df: Degrees of freedom, if not meant to be estimated. Only necessary when testing for the "t"-copula.
df.est: Indicates whether df shall be estimated. Either FALSE or TRUE, where TRUE means that it will be estimated.
margins: Specifies which estimation method shall be used for the margins. The default is "ranks", which is the standard approach to convert data in such a case. Alternatively, the following distributions can be specified: "beta", "cauchy", Chi-squared ("chisq"), "f", "gamma", Log normal ("lnorm"), Normal ("norm"), "t", "weibull", Exponential ("exp"). The input can be a single method, e.g. "ranks", which is then used for all data sequences, or an individual method for each margin, e.g. c("ranks", "norm", "t") for 3 data sequences. To skip the estimation of the margins, set it to NULL.
flip: The control parameter to flip the copula by 90, 180 or 270 degrees clockwise. Only applicable for bivariate copulae. Default is 0; possible inputs are 0, 90, 180, 270 and NULL.
M: Number of bootstrapping loops.
MJ: Size of bootstrapping sample.
dispstr: A character string specifying the type of the symmetric positive definite matrix characterizing the elliptical copula. Implemented structures are "ex" for exchangeable and "un" for unstructured, see package copula.
delta.J: Scaling parameter for the matrix of smoothing parameters.
nodes.Integration: Number of knots of the bivariate Gauss-Legendre quadrature.
lower: Lower bound for the maximum likelihood estimation of the copula parameter. The constraint is also active in the bootstrapping procedure. The constraint is not active when a switch to inversion of Kendall's tau is necessary. Default NULL.
upper: Upper bound for the maximum likelihood estimation of the copula parameter. The constraint is also active in the bootstrapping procedure. The constraint is not active when a switch to inversion of Kendall's tau is necessary. Default NULL.
seed.active: Has to be either an integer or a vector of M+1 integers. If a single integer is given, the seeds for the bootstrapping procedure are simulated. If M+1 seeds are provided, these seeds are used in the bootstrapping procedure. Defaults to NULL, in which case R generates the seeds from the computer runtime. Controlling the seeds is useful for reproducibility of a simulation study that compares the power of the tests, or for reproducibility of an empirical study.
processes: The number of parallel processes used to speed up the bootstrapping. Should not exceed the number of logical processors. See the details.
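The default margins = "ranks" converts each data column to pseudo-observations on the unit interval. A minimal base R sketch of that idea follows, on simulated toy data; the divisor n + 1 is an assumption of this sketch (a common convention that keeps values strictly inside (0, 1)), and the package's internal scaling may differ.

```r
# Sketch: rank-based pseudo-observations for a two-column data matrix.
set.seed(42)
x <- cbind(rexp(50), rnorm(50))   # toy data: 50 observations, 2 variables
n <- nrow(x)

# Rank each column separately and rescale to the open unit interval.
# The divisor n + 1 is an assumption of this sketch.
u <- apply(x, 2, rank) / (n + 1)

range(u)
```

Because only ranks enter, the result is unaffected by monotone transformations of the individual margins.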
The Scaillet test is a kernel-based goodness-of-fit test with a fixed smoothing parameter. For the copula density \(c(\mathbf{u}, \theta)\), the corresponding kernel estimator is given by
$$c_n(\mathbf{u}) = \frac{1}{n} \sum_{i=1}^n K_H[\mathbf{u} - (U_{i1}, \dots, U_{id})^{\top}], $$
where \(U_{ij}\) for \(i = 1, \dots,n\); \(j = 1, \dots,d\) are the pseudo observations, \(\mathbf{u} \in [0,1]^d\) and \(K_H(y) = K(H^{-1}y)/\det(H)\), for which a bivariate quadratic kernel is used, as in Scaillet (2007). The matrix of smoothing parameters is \(H = 2.6073n^{-1/6} \hat{\Sigma}^{1/2}\) with \(\hat{\Sigma}\) the sample covariance matrix.
The test statistic is then given by
$$T = \int_{[0,1]^d} \{c_n(\mathbf{u}) - K_H * c(\mathbf{u}, \theta_n)\} \omega(\mathbf{u}) d \mathbf{u}, $$
where \(*\) denotes the convolution operator and \(\omega\) is a weight function, see Zhang et al. (2016). The bivariate Gauss-Legendre quadrature method is used to compute the integral in the test statistic numerically, see Scaillet (2007).
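To make the kernel estimator and the quadrature step more concrete, here is a minimal base R sketch under stated assumptions. It uses a bivariate Epanechnikov kernel as a stand-in for the quadratic kernel (the exact kernel used by the package may differ), ignores the delta.J scaling, and integrates only \(c_n\) over the unit square rather than the full test statistic. The helper names c_n, K and gauss_legendre are hypothetical and not part of gofCopula.

```r
# Sketch: kernel copula density estimate and 12-node Gauss-Legendre quadrature.
set.seed(1)
n <- 100
z <- rnorm(n)
x <- cbind(z + rnorm(n), z + rnorm(n))
u <- apply(x, 2, rank) / (n + 1)          # pseudo-observations (assumed scaling)

# Bandwidth matrix H = 2.6073 * n^(-1/6) * Sigma^(1/2)
Sigma <- cov(u)
e <- eigen(Sigma, symmetric = TRUE)
Sigma_half <- e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)
H <- 2.6073 * n^(-1/6) * Sigma_half
H_inv <- solve(H)

# Assumed kernel: K(z) = (2 / pi) * (1 - z'z) on the unit disk, 0 elsewhere
K <- function(z) {
  r2 <- sum(z^2)
  if (r2 <= 1) (2 / pi) * (1 - r2) else 0
}

# c_n(u0) = (1 / n) * sum_i K(H^{-1} (u0 - U_i)) / det(H)
c_n <- function(u0) {
  zz <- H_inv %*% (u0 - t(u))             # column i holds H^{-1} (u0 - U_i)
  mean(apply(zz, 2, K)) / det(H)
}

# Gauss-Legendre nodes and weights on [0, 1] via the Golub-Welsch eigenvalue method
gauss_legendre <- function(m) {
  k <- seq_len(m - 1)
  b <- k / sqrt(4 * k^2 - 1)
  J <- matrix(0, m, m)
  J[cbind(k, k + 1)] <- b
  J[cbind(k + 1, k)] <- b
  ev <- eigen(J, symmetric = TRUE)
  list(nodes = (ev$values + 1) / 2, weights = ev$vectors[1, ]^2)
}

# Tensor-product quadrature of c_n over the unit square
gl <- gauss_legendre(12)
grid <- expand.grid(i = 1:12, j = 1:12)
vals <- mapply(function(i, j) c_n(c(gl$nodes[i], gl$nodes[j])), grid$i, grid$j)
integral <- sum(gl$weights[grid$i] * gl$weights[grid$j] * vals)
integral
```

The number of nodes, 12, matches the default of nodes.Integration; raising it trades computation time for quadrature accuracy.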
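The approximate p-value is computed by the formula $$\sum_{b=1}^M \mathbf{I}(|T_b| \geq |T|) / M,$$ where \(T\) is the test statistic computed from the data, \(T_b\) is the test statistic of the \(b\)-th bootstrap sample and \(\mathbf{I}\) is the indicator function.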
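The p-value formula amounts to the share of bootstrap statistics that are at least as extreme as the observed one. A small sketch with made-up numbers (the values of T_obs and T_boot are hypothetical and serve only to show the arithmetic):

```r
# Sketch: bootstrap p-value as the fraction of |T_b| that reach or exceed |T|.
T_obs  <- 0.8                               # hypothetical observed statistic
T_boot <- c(0.2, -1.1, 0.5, 0.9, -0.3)      # hypothetical bootstrap statistics, M = 5
M <- length(T_boot)

p_value <- sum(abs(T_boot) >= abs(T_obs)) / M
p_value
```

Here two of the five bootstrap statistics (-1.1 and 0.9) reach or exceed 0.8 in absolute value, so the p-value is 2/5. With M this small the p-value can only take multiples of 1/M, which is one reason to use a large M in practice.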
For small values of M, initializing the parallelisation via processes is not worthwhile, because registering the parallel processes itself adds computation time. Consider enabling parallelisation only for high values of M.
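One way to pick a value for processes along these lines is sketched below, using the parallel package that ships with R. The threshold of 1000 bootstrap loops and the cap of 4 processes are purely illustrative assumptions, not recommendations from the package authors.

```r
# Sketch: choose the number of processes from M and the available logical cores.
library(parallel)

M <- 1000
n_cores <- detectCores(logical = TRUE)
if (is.na(n_cores)) n_cores <- 1L          # detectCores() may return NA

# Hypothetical rule of thumb: parallelise only for large M,
# and never request more processes than logical processors.
processes <- if (M >= 1000) min(4L, n_cores) else 1L
processes
```

The resulting value could then be passed on as the processes argument.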
Zhang, S., Okhrin, O., Zhou, Q., and Song, P. (2016). Goodness-of-fit test for specification of semiparametric copula dependence models. Journal of Econometrics, 193:215-233. doi:10.1016/j.jeconom.2016.02.017
Scaillet, O. (2007). Kernel based goodness-of-fit tests for copulas with fixed smoothing parameters. Journal of Multivariate Analysis, 98:533-543.
library(gofCopula)
data(IndexReturns2D)
# Small M and MJ keep the example fast; use larger values in practice.
gofKernel("normal", IndexReturns2D, M = 5, MJ = 5)
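For a reproducible run, the seed.active argument can be given a vector of M+1 integers, as described in the arguments. The sketch below assumes the gofCopula package is installed and skips the call otherwise; the particular seed values 1 to 6 are arbitrary.

```r
# Sketch: reproducible bootstrap by supplying M + 1 explicit seeds.
M <- 5
seeds <- 1:(M + 1)   # arbitrary integers; one seed more than bootstrap loops

if (requireNamespace("gofCopula", quietly = TRUE)) {
  library(gofCopula)
  data(IndexReturns2D)
  res <- gofKernel("normal", IndexReturns2D, M = M, MJ = 5,
                   seed.active = seeds)
  print(res)
}
```

Repeating the call with the same seeds should reproduce the same bootstrap samples and hence the same p-value.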