GiniDistance (version 0.1.1)

gCor: Gini Distance Covariance and Correlation Statistics

Description

Computes the Gini distance covariance and correlation statistics between quantitative variables X and a categorical variable Y, where alpha is the exponent on the Euclidean distance, and returns these measures of dependence.

Usage

gCor(x, y, alpha)

Value

gCor returns the sample Gini distance covariance and correlation between x and y.

Arguments

x

data (quantitative variables, one row per sample)

y

label of data or univariate response variable

alpha

exponent on Euclidean distance, in (0,2)

Details

gCor computes the Gini distance correlation statistic. It is a self-contained R function that returns a measure of dependence.

The sample size (number of rows of the data) must equal the length of the label vector, and the samples must not contain missing values. Arguments x and y are treated as data and labels, respectively. If alpha is missing, it defaults to 1; otherwise it is the exponent on the Euclidean distance.

Suppose a sample \( {\mathcal{D}} =\{(\mathbf{x}_i,y_i)\} \), \(i = 1,...,n\), is available. The sample counterparts can then be computed directly. Let \({\mathcal{I}}_k \) be the index set of sample points with \(y_i =L_k\); then \(p_k\) is estimated by the sample proportion of that category, that is, \(\hat{p}_k= \frac{n_k}{n}\), where \(n_k\) is the number of elements in \({\mathcal{I}}_k\). For a given \(\alpha \in (0,2)\), a point estimator of \(\rho_g(\alpha)\) is obtained as follows.

$$\hat{\Delta}_k(\alpha)= {n_k \choose 2}^{-1} \sum_{i<j \in {\mathcal{I}}_k} \|\mathbf{x}_i -\mathbf{x}_j\| ^{\alpha},$$

$$\hat{\Delta}(\alpha)={n \choose 2}^{-1} \sum_{1 \le i<j \le n} \|\mathbf{x}_i -\mathbf{x}_j\| ^{\alpha},$$

$$gCor=\hat{\rho}_g (\alpha)= 1-\frac{\sum_{k=1}^K \hat p_k \hat{\Delta}_k(\alpha)}{\hat{\Delta}(\alpha)}.$$
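To make the estimator concrete, the formulas above can be transcribed into a few lines of base R. This is only an illustrative sketch: the helper name `gini_cor_sketch` is hypothetical and not part of GiniDistance, it assumes every category contains at least two samples, and it is not guaranteed to reproduce the package's internal implementation or its exact output.

```r
# Hypothetical helper (not part of GiniDistance): a direct transcription
# of the sample formulas for Delta_k, Delta and rho_g.
gini_cor_sketch <- function(x, y, alpha = 1) {
  x <- as.matrix(x)
  y <- droplevels(as.factor(y))   # categories L_1, ..., L_K
  n <- nrow(x)
  stopifnot(length(y) == n, !anyNA(x), !anyNA(y), alpha > 0, alpha < 2)

  # Average of ||x_i - x_j||^alpha over all pairs i < j among the given rows.
  mean_pair_dist <- function(rows) {
    d <- as.numeric(dist(x[rows, , drop = FALSE]))  # Euclidean distances
    mean(d^alpha)
  }

  idx     <- split(seq_len(n), y)                     # index sets I_k
  p_hat   <- lengths(idx) / n                         # n_k / n
  delta_k <- vapply(idx, mean_pair_dist, numeric(1))  # within-category terms
  delta   <- mean_pair_dist(seq_len(n))               # overall term

  1 - sum(p_hat * delta_k) / delta
}

# Two categories whose points coincide within each category.
gini_cor_sketch(matrix(c(0, 0, 1, 1)), c(1, 1, 2, 2))
```

In the toy call, every within-category distance is zero, so the numerator vanishes and the statistic equals 1. When the labels carry no information about x, the within-category averages are close to the overall average and the statistic is close to 0; in small samples it can even be negative.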

References

Dang, X., Nguyen, D., Chen, Y. and Zhang, J. (2019). A new Gini correlation between quantitative and qualitative variables. Submitted to the Journal of the American Statistical Association.

See Also

gmd, gCov, KgCov, KgCor

Examples

  library(GiniDistance)
  x <- iris[, 1:4]           # four quantitative measurements
  y <- unclass(iris[, 5])    # species labels as integer codes
  gCor(x, y, alpha = 1)
