CKT.kendallReg.fit: Fit Kendall's regression, a GLM-type model for conditional Kendall's tau

Description

The function CKT.kendallReg.fit fits a regression-type model for the conditional Kendall's tau between two variables $X_1$ and $X_2$ conditionally on some predictors $Z$. More precisely, it fits the model $$\Lambda(\tau_{X_1, X_2 | Z = z}) = \sum_{j=1}^{p'} \beta_j \psi_j(z),$$ where $\tau_{X_1, X_2 | Z = z}$ is the conditional Kendall's tau between $X_1$ and $X_2$ given $Z=z$, $\Lambda$ is a function from $[-1, 1]$ to $\mathbb{R}$, $(\beta_1, \dots, \beta_{p'})$ are unknown coefficients to be estimated and $(\psi_1, \dots, \psi_{p'})$ is a dictionary of functions. To estimate $\beta$, we use the penalized estimator, defined as the minimizer of the criterion $$\frac{1}{2n'} \sum_{i=1}^{n'} \Big[\Lambda(\hat\tau_{X_1, X_2 | Z = z_i}) - \sum_{j=1}^{p'} \beta_j \psi_j(z_i)\Big]^2 + \lambda \, |\beta|_1,$$ where the $z_i$, $i = 1, \dots, n'$, form a second sample (here denoted by ZToEstimate).
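
As a rough illustration of the criterion above, the following base R sketch evaluates the penalized objective for a given coefficient vector. The helper kendallRegCriterion and its argument names are hypothetical and not part of the package. The actual fit is delegated to glmnet, whose intercept handling and internal scaling may differ from this simplified version.

```r
# Hypothetical helper (not part of the package): evaluates the penalized
# least-squares criterion for a given coefficient vector beta.
#   tauHat  : first-step estimates of conditional Kendall's tau at the z_i
#   designZ : n' x p' matrix whose (i, j) entry is psi_j(z_i)
#   beta    : coefficient vector of length p'
#   lambda  : non-negative penalty level
#   Lambda  : link function applied to the estimated conditional Kendall's tau
kendallRegCriterion <- function(tauHat, designZ, beta, lambda,
                                Lambda = identity) {
  designZ <- as.matrix(designZ)
  stopifnot(length(tauHat) == nrow(designZ), length(beta) == ncol(designZ))
  residuals <- Lambda(tauHat) - as.vector(designZ %*% beta)
  sum(residuals^2) / (2 * length(tauHat)) + lambda * sum(abs(beta))
}

# Toy check with a cubic dictionary (z, z^2, z^3)
z <- c(-1, 0, 1, 2)
designZ <- cbind(z, z^2, z^3)
beta <- c(0.1, 0, -0.01)
tauHat <- as.vector(designZ %*% beta)  # exact fit, so only the penalty remains
criterionValue <- kendallRegCriterion(tauHat, designZ, beta, lambda = 0.5)
```

In this toy case the residuals vanish, so the criterion reduces to the penalty term lambda times the sum of absolute coefficients.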

The function CKT.kendallReg.predict predicts the conditional Kendall's tau between two variables $X_1$ and $X_2$ given $Z=z$ for some new values of $z$.

Usage

CKT.kendallReg.fit(
  X1 = NULL,
  X2 = NULL,
  Z = NULL,
  ZToEstimate,
  designMatrixZ = cbind(ZToEstimate, ZToEstimate^2, ZToEstimate^3),
  newZ = designMatrixZ,
  h_kernel,
  Lambda = identity,
  Lambda_inv = identity,
  lambda = NULL,
  Kfolds_lambda = 10,
  l_norm = 1,
  h_lambda = h_kernel,
  ...,
  observedX1 = NULL,
  observedX2 = NULL,
  observedZ = NULL
)
CKT.kendallReg.predict(fit, newZ, lambda = NULL, Lambda_inv = identity)

Value

The function CKT.kendallReg.fit returns a list with the following components:

estimatedCKT: the estimated CKT at the new data points newZ.
fit: the fitted model, of S3 class glmnet (see glmnet::glmnet for more details).
lambda: the value of the penalization parameter used (i.e., either the one supplied by the user or the one determined by cross-validation).

CKT.kendallReg.predict returns the predicted values of conditional Kendall's tau.

Arguments

X1: a vector of n observations of the first variable $X_1$.
X2: a vector of n observations of the second variable $X_2$.
Z: a vector of n observations of the conditioning variable, or a matrix with n rows of observations of the conditioning vector (if $Z$ is multivariate).
ZToEstimate: the intermediary dataset of observations of $Z$ at which the conditional Kendall's tau should be estimated.
designMatrixZ: the transformation of ZToEstimate that will be used as predictors. By default, the first three powers of ZToEstimate are used, i.e. cbind(ZToEstimate, ZToEstimate^2, ZToEstimate^3).
newZ: the new observations of the conditioning variable.
h_kernel: bandwidth used for the first step of kernel smoothing.
Lambda: the function to be applied on conditional Kendall's tau. By default, the identity function is used.
Lambda_inv: the functional inverse of Lambda. By default, the identity function is used.
lambda: the regularization parameter. If NULL, it is chosen by K-fold cross-validation. Internally, cross-validation is performed by the function CKT.KendallReg.LambdaCV.
Kfolds_lambda: the number of folds used in the cross-validation procedure to choose lambda.
l_norm: type of norm used for selection of the optimal lambda by cross-validation. l_norm=1 corresponds to the sum of absolute values of differences between predicted and estimated conditional Kendall's tau while l_norm=2 corresponds to the sum of squares of differences.
h_lambda: the smoothing bandwidth used in the cross-validation procedure to choose lambda.
...: other arguments to be passed to CKT.kernel for the first step (kernel-based) estimator of conditional Kendall's tau.
observedX1, observedX2, observedZ: old parameter names for X1, X2, Z. Support for these names will be removed in a later version.
fit: the fitted model, obtained by a call to CKT.kendallReg.fit.
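
To make the Lambda, Lambda_inv, designMatrixZ and l_norm arguments more concrete, here is a small base R sketch. All names below are illustrative assumptions. The atanh/tanh pair is only one possible link choice and is not the package default (which is the identity), and cvDiscrepancy is a hypothetical helper that mirrors the description of l_norm rather than the package's internal code.

```r
# One possible link pair (an assumption, not the package default):
# atanh maps (-1, 1) onto the real line and tanh is its inverse.
LambdaFisher <- function(tau) atanh(tau)
LambdaFisherInv <- function(x) tanh(x)

# A custom dictionary: each column is one function psi_j evaluated
# at the points of ZToEstimate.
ZToEstimate <- seq(2, 10, by = 0.5)
designMatrixZ <- cbind(ZToEstimate, sqrt(ZToEstimate), log(ZToEstimate))

# Discrepancy between predicted and estimated conditional Kendall's tau,
# following the description of l_norm (hypothetical helper).
cvDiscrepancy <- function(predicted, estimated, l_norm = 1) {
  stopifnot(l_norm %in% c(1, 2))
  sum(abs(predicted - estimated)^l_norm)
}

tau <- c(-0.5, 0, 0.3)
roundTrip <- LambdaFisherInv(LambdaFisher(tau))  # should recover tau
errL1 <- cvDiscrepancy(c(0.1, 0.4), c(0.2, 0.2), l_norm = 1)
errL2 <- cvDiscrepancy(c(0.1, 0.4), c(0.2, 0.2), l_norm = 2)
```

Such objects could then be supplied in a call to CKT.kendallReg.fit through the arguments Lambda = LambdaFisher, Lambda_inv = LambdaFisherInv and designMatrixZ = designMatrixZ.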

References

Derumigny, A., & Fermanian, J. D. (2020). On Kendall’s regression. Journal of Multivariate Analysis, 178, 104610. doi:10.1016/j.jmva.2020.104610

Examples

library(CondCopulas)

# We simulate from a conditional copula
set.seed(1)
N = 400
Z = rnorm(n = N, mean = 5, sd = 2)
conditionalTau = -0.9 + 1.8 * pnorm(Z, mean = 5, sd = 2)
simCopula = VineCopula::BiCopSim(N=N , family = 1,
    par = VineCopula::BiCopTau2Par(1 , conditionalTau ))
X1 = qnorm(simCopula[,1])
X2 = qnorm(simCopula[,2])

newZ = seq(2, 10, by = 0.1)
estimatedCKT_kendallReg <- CKT.kendallReg.fit(
   X1 = X1, X2 = X2, Z = Z,
   ZToEstimate = newZ, h_kernel = 0.07)

coef(estimatedCKT_kendallReg$fit,
     s = estimatedCKT_kendallReg$lambda)

# Comparison between true Kendall's tau (in black)
# and estimated Kendall's tau (in red)
trueConditionalTau = -0.9 + 1.8 * pnorm(newZ, mean = 5, sd = 2)
plot(newZ, trueConditionalTau , col="black",
   type = "l", ylim = c(-1, 1))
lines(newZ, estimatedCKT_kendallReg$estimatedCKT, col = "red")