mbgraphic (version 1.0.1)

dcor2d: Distance correlation for pairs of variables

Description

Calculates the bivariate distance correlation for a given pair of variables, or for all pairs of variables in a numeric matrix or a data frame.
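
For readers unfamiliar with the measure, the following is a minimal base R sketch of the sample distance correlation as defined in Szekely, Rizzo and Bakirov (2007). The helper name dcor_naive is hypothetical and is not part of mbgraphic; it is an O(n^2) illustration of the definition, not the package's implementation.

```r
# Hypothetical helper: naive sample distance correlation of two numeric vectors.
# Illustrates the definition only; this is not how mbgraphic computes dcor2d.
dcor_naive <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))

  # Pairwise distance matrix, double centered:
  # subtract row means and column means, add back the grand mean.
  center <- function(v) {
    d <- as.matrix(dist(v))
    d - outer(rowMeans(d), colMeans(d), "+") + mean(d)
  }

  A <- center(x)
  B <- center(y)

  dcov2 <- max(mean(A * B), 0)              # squared distance covariance
  denom <- sqrt(mean(A * A) * mean(B * B))  # product of distance standard deviations, sqrt(dVar(x) * dVar(y))
  if (denom == 0) return(0)                 # a constant variable gives 0 by convention
  sqrt(dcov2 / denom)
}

# Usage: a purely quadratic relationship has Pearson correlation near zero,
# while the distance correlation is clearly positive.
x <- seq(-1, 1, length.out = 101)
y <- x^2
cor(x, y)
dcor_naive(x, y)
```

Unlike the Pearson correlation, the population distance correlation is zero only if the two variables are independent, which is why it can pick up nonlinear structure such as the parabola above.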

Usage

dcor2d(x, y = NULL, binning = FALSE, b = 50, anchor = "min", parallel = FALSE)

Arguments

x

A numeric vector, a numeric matrix or a data frame. In case of a data frame only the numeric variables are used.

y

A numeric vector.

binning

A logical value or a character string. Whether binning should be used, and which kind: TRUE or "equi" for equidistant binning, "quant" for quantile-based binning, or "hexb" for hexagonal binning. Default is FALSE.

b

A positive integer. Number of bins in each variable.

anchor

A character string or a numeric value. How the anchor point should be chosen: "min" (default) for the minimum of each variable, "ggplot" for the method used in ggplot graphics, "nice" for a "pretty" anchor point, or a user-specified value.

parallel

A logical value. Whether or not parallelization should be used. Default is FALSE.
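
To make the binning, b and anchor arguments more concrete, here is a rough sketch of equidistant binning with the anchor at the minimum of each variable. The helper bin_equi is hypothetical, and the way mbgraphic bins data internally (bin edges, handling of boundary points, how bin counts enter the measure) may well differ.

```r
# Hypothetical helper: equidistant 2D binning with anchor = "min".
# Returns one row per non-empty bin with its count. Not mbgraphic code.
bin_equi <- function(x, y, b = 50) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y), b >= 1)

  # Map each value to a bin index in 1..b, starting at the minimum.
  to_bin <- function(v) {
    lo <- min(v)
    width <- (max(v) - lo) / b
    if (width == 0) return(rep(1L, length(v)))  # constant variable: single bin
    # The maximum would fall into bin b + 1, so clamp it into bin b.
    pmin(as.integer(floor((v - lo) / width)) + 1L, as.integer(b))
  }

  counts <- as.data.frame(table(bx = to_bin(x), by = to_bin(y)),
                          stringsAsFactors = FALSE)
  counts$bx <- as.integer(counts$bx)
  counts$by <- as.integer(counts$by)
  counts[counts$Freq > 0, ]
}

# Usage: 200 points reduced to at most 10 x 10 weighted bins.
set.seed(1)
bx <- rnorm(200)
by <- rnorm(200)
binned <- bin_equi(bx, by, b = 10)
head(binned)
```

The point of binning is to replace n observations by at most b^2 weighted bin centres, which can reduce the cost of pairwise-distance computations for large n at the price of some approximation error.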

Value

A numeric value describing the value of the measure if a pair of vectors is given. Otherwise a data frame with the following variables:

dcor2d

Value of the measure.

x1

Number of first variable.

x2

Number of second variable.

nx1

Name of first variable (missing if x is not a data frame).

nx2

Name of second variable (missing if x is not a data frame).

References

G. J. Szekely, M. L. Rizzo and N. K. Bakirov (2007) Measuring and testing dependence by correlation of distances. The Annals of Statistics 35(6), 2769-2794.

A. Pilhoefer and A. Unwin (2013) New Approaches in Visualization of Categorical Data: R Package extracat. Journal of Statistical Software 53(1), 1-25.

See Also

splines2d

Examples

# NOT RUN {
library(mbgraphic)
data(Election2005)
# }
# NOT RUN {
# distance correlation for all pairs of variables
dcor <- dcor2d(Election2005)
# put the pairs in decreasing order
o_dcor <- dcor[order(dcor$dcor2d, decreasing = TRUE), ]

# Show the 10 pairs with highest values
o_dcor[1:10, ]

# Show the 4 scatterplots with highest values
op <- par(mfrow = c(2, 2))  # remember the previous graphics settings
for (i in 1:4) {
  xname <- as.character(o_dcor$nx1[i])
  yname <- as.character(o_dcor$nx2[i])
  plot(Election2005[[xname]], Election2005[[yname]],
       xlab = xname, ylab = yname, pch = 19)
}
par(op)  # restore the previous graphics settings
# }