cdplot

Conditional Density Plots

Computes and plots conditional densities describing how the conditional distribution of a categorical variable y changes over a numerical variable x.

Keywords
hplot
Usage
cdplot(x, ...)

# S3 method for default
cdplot(x, y,
  plot = TRUE, tol.ylab = 0.05, ylevels = NULL,
  bw = "nrd0", n = 512, from = NULL, to = NULL,
  col = NULL, border = 1, main = "", xlab = NULL, ylab = NULL,
  yaxlabels = NULL, xlim = NULL, ylim = c(0, 1), ...)

# S3 method for formula
cdplot(formula, data = list(),
  plot = TRUE, tol.ylab = 0.05, ylevels = NULL,
  bw = "nrd0", n = 512, from = NULL, to = NULL,
  col = NULL, border = 1, main = "", xlab = NULL, ylab = NULL,
  yaxlabels = NULL, xlim = NULL, ylim = c(0, 1), ...,
  subset = NULL)

Arguments
x

an object; the default method expects a single numerical variable (or an object coercible to one).

y

a "factor" interpreted as the dependent variable.

formula

a "formula" of type y ~ x with a single dependent "factor" and a single numerical explanatory variable.

data

an optional data frame.

plot

logical. Should the computed conditional densities be plotted?

tol.ylab

convenience tolerance parameter for y-axis annotation. If the distance between two labels drops under this threshold, they are plotted equidistantly.

ylevels

a character or numeric vector specifying in which order the levels of the dependent variable should be plotted.

bw, n, from, to, ...

arguments passed to density.

col

a vector of fill colors of the same length as levels(y). The default is to call gray.colors.

border

border color of shaded polygons.

main, xlab, ylab

character strings for annotation.

yaxlabels

character vector for annotation of y axis, defaults to levels(y).

xlim, ylim

the range of x and y values with sensible defaults.

subset

an optional vector specifying a subset of observations to be used for plotting.

Details

cdplot computes the conditional densities of x given the levels of y weighted by the marginal distribution of y. The densities are derived cumulatively over the levels of y.

This visualization technique is similar to spinograms (see spineplot) and plots \(P(y | x)\) against \(x\). The conditional probabilities are not derived by discretization (as in the spinogram), but using a smoothing approach via density.

Note that the estimates of the conditional densities are more reliable for high-density regions of \(x\). Conversely, they are less reliable in regions with only a few \(x\) observations.
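
The computation described above can be sketched by hand with density. This is a minimal illustration under stated assumptions, not necessarily how cdplot is implemented: it assumes the density of x within a level is estimated on the same grid and with the same numeric bandwidth as the overall density, and that the ratio is then weighted by the marginal proportion of that level. The names dx, dx1, in_first, p_first and cd_no are hypothetical helpers introduced only for this sketch; the data are those from the Examples section.

```r
## Data from the Examples section (space shuttle o-ring failures)
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1,
                 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1),
               levels = 1:2, labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

## Density of x over all observations (default bandwidth rule "nrd0")
dx <- density(temperature, bw = "nrd0", n = 512)

## Density of x within the first level of y, evaluated on the same grid
## and with the same numeric bandwidth as the overall density (assumption)
in_first <- fail == levels(fail)[1]
dx1 <- density(temperature[in_first], bw = dx$bw, n = 512,
               from = min(dx$x), to = max(dx$x))

## Weight by the marginal proportion of that level: estimate of P(y = "no" | x)
p_first <- mean(in_first)
p_no_given_x <- dx1$y / dx$y * p_first

## Turn the grid values into a function of x (constant beyond the grid)
cd_no <- approxfun(dx$x, p_no_given_x, rule = 2)
cd_no(c(55, 65, 75))

## Optional comparison with the function returned by cdplot itself
cdens <- cdplot(fail ~ temperature, plot = FALSE)
max(abs(cd_no(53:81) - cdens[[1]](53:81)))
```

The last line reports how far the hand-made estimate is from the one cdplot returns. A small value would suggest the sketch follows the same approach, but the result depends on the implementation details of the installed R version.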

Value

The conditional density functions (cumulative over the levels of y) are returned invisibly.
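
Because the returned functions are cumulative over the levels of y, per-level conditional probabilities can be recovered by differencing. The following sketch assumes that the return value is a list with one function per level except the last, whose cumulative value is 1 (the `1 - cdens[[1]](...)` idiom in the Examples suggests this). The iris data set and the names cd, x0, cum and probs are choices made for this illustration only.

```r
## Three-level response: Species as a function of Sepal.Length (built-in iris data)
cd <- cdplot(Species ~ Sepal.Length, data = iris, plot = FALSE)

## Evaluate each cumulative function at a few x values
x0 <- c(5, 6, 7)
cum <- sapply(cd, function(f) f(x0))   # one column per returned function

## Difference the cumulative values; the last level closes the sum at 1
probs <- cbind(cum, 1) - cbind(0, cum)
colnames(probs) <- levels(iris$Species)
rownames(probs) <- paste0("x=", x0)
probs
```

Each row of probs holds the estimated P(y | x) for one x value, so the entries in a row sum to 1.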

References

Hofmann, H., Theus, M. (2005), Interactive graphics for visualizing conditional distributions, Unpublished Manuscript.

See Also

spineplot, density

Aliases
  • cdplot
  • cdplot.default
  • cdplot.formula
Examples
library(graphics)
# NOT RUN {
## NASA space shuttle o-ring failures
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1,
                 1, 1, 1, 2, 1, 1, 1, 1, 1),
               levels = 1:2, labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

## CD plot
cdplot(fail ~ temperature)
cdplot(fail ~ temperature, bw = 2)
cdplot(fail ~ temperature, bw = "SJ")

## compare with spinogram
(spineplot(fail ~ temperature, breaks = 3))

## highlighting for failures
cdplot(fail ~ temperature, ylevels = 2:1)

## scatter plot with conditional density
cdens <- cdplot(fail ~ temperature, plot = FALSE)
plot(I(as.numeric(fail) - 1) ~ jitter(temperature, factor = 2),
     xlab = "Temperature", ylab = "Conditional failure probability")
lines(53:81, 1 - cdens[[1]](53:81), col = 2)
# }
Documentation reproduced from package graphics, version 3.5.0, License: Part of R 3.5.0
