B_03_histogram
Histograms and Kernel Density Plots
Draw Histograms and Kernel Density Plots, possibly conditioned on other variables.
- Keywords
- hplot
Usage
histogram(x, data, …)
densityplot(x, data, …)
# S3 method for formula
histogram(x,
data,
allow.multiple, outer = TRUE,
auto.key = FALSE,
aspect = "fill",
panel = lattice.getOption("panel.histogram"),
prepanel, scales, strip, groups,
xlab, xlim, ylab, ylim,
type = c("percent", "count", "density"),
nint = if (is.factor(x)) nlevels(x)
else round(log2(length(x)) + 1),
endpoints = extend.limits(range(as.numeric(x),
finite = TRUE), prop = 0.04),
breaks,
equal.widths = TRUE,
drop.unused.levels =
lattice.getOption("drop.unused.levels"),
…,
lattice.options = NULL,
default.scales = list(),
default.prepanel =
lattice.getOption("prepanel.default.histogram"),
subscripts,
subset)
# S3 method for numeric
histogram(x, data = NULL, xlab, …)
# S3 method for factor
histogram(x, data = NULL, xlab, …)
# S3 method for formula
densityplot(x,
data,
allow.multiple = is.null(groups) || outer,
outer = !is.null(groups),
auto.key = FALSE,
aspect = "fill",
panel = lattice.getOption("panel.densityplot"),
prepanel, scales, strip, groups, weights,
xlab, xlim, ylab, ylim,
bw, adjust, kernel, window, width, give.Rkern,
n = 512, from, to, cut, na.rm,
drop.unused.levels =
lattice.getOption("drop.unused.levels"),
…,
lattice.options = NULL,
default.scales = list(),
default.prepanel =
lattice.getOption("prepanel.default.densityplot"),
subscripts,
subset)
# S3 method for numeric
densityplot(x, data = NULL, xlab, …)
do.breaks(endpoints, nint)
Arguments
- x
The object on which method dispatch is carried out.
For the formula method, x can be a formula of the form ~ x | g1 * g2 * …, indicating that histograms or kernel density estimates of the x variable should be produced conditioned on the levels of the (optional) variables g1, g2, …. x should be numeric (or possibly a factor in the case of histogram), and each of g1, g2, … should be either factors or shingles.
As a special case, the right hand side of the formula can contain more than one term separated by ‘+’ signs (e.g., ~ x1 + x2 | g1 * g2). What happens in this case is described in the documentation for xyplot. Note that in either form, all the terms in the formula must have the same length after evaluation.
For the numeric and factor methods, x is the variable whose histogram or kernel density estimate is drawn. Conditioning is not allowed in these cases.
- data
For the formula method, an optional data source (usually a data frame) in which variables are to be evaluated (see xyplot for details). data should not be specified for the other methods, and is ignored with a warning if it is.
- type
A character string indicating the type of histogram that is to be drawn. "percent" and "count" give relative frequency and frequency histograms respectively, and can be misleading when breakpoints are not equally spaced. "density" produces a density histogram.
type defaults to "density" when the breakpoints are unequally spaced, and when breaks is NULL or a function, and to "percent" otherwise.
- nint
An integer specifying the number of histogram bins, applicable only when breaks is unspecified or NULL in the call. Ignored when the variable being plotted is a factor.
- endpoints
A numeric vector of length 2 indicating the range of x-values that is to be covered by the histogram. This applies only when breaks is unspecified and the variable being plotted is not a factor. In do.breaks, this specifies the interval that is to be divided up.
- breaks
Usually a numeric vector of length (number of bins + 1) defining the breakpoints of the bins. Note that when breakpoints are not equally spaced, the only value of type that makes sense is density.
When breaks is unspecified, the value of lattice.getOption("histogram.breaks") is first checked. If this value is NULL, then the default is to use
breaks = seq_len(1 + nlevels(x)) - 0.5
when x is a factor, and
breaks = do.breaks(endpoints, nint)
otherwise. Breakpoints calculated in such a manner are used in all panels. If the retrieved value is not NULL, or if breaks is explicitly specified, it affects the display in each panel independently. Valid values are those accepted as the breaks argument in hist. In particular, this allows specification of breaks as an integer giving the number of bins (similar to nint), as a character string denoting a method, or as a function.
When specified explicitly, a special value of breaks is NULL, in which case the number of bins is determined by nint and then breakpoints are chosen according to the value of equal.widths.
- equal.widths
A logical flag, relevant only when breaks = NULL. If TRUE, equally spaced bins will be selected; otherwise, approximately equal area bins will be selected (typically producing unequally spaced breakpoints).
- n
Integer, giving the number of points at which the kernel density is to be evaluated. Passed on as an argument to density.
- panel
A function, called once for each panel, that uses the packet (subset of panel variables) corresponding to the panel to create a display. The default panel functions panel.histogram and panel.densityplot are documented separately, and have arguments that can be used to customize their output in various ways. Such arguments can usually be directly supplied to the high-level function.
- allow.multiple, outer
See xyplot.
- auto.key
See xyplot.
- aspect
See xyplot.
- prepanel
See xyplot.
- scales
See xyplot.
- strip
See xyplot.
- groups
See xyplot. Note that the default panel function for histogram does not support grouped displays, whereas the one for densityplot does.
- xlab, ylab
See xyplot.
- xlim, ylim
See xyplot.
- drop.unused.levels
See xyplot.
- lattice.options
See xyplot.
- default.scales
See xyplot.
- subscripts
See xyplot.
- subset
See xyplot.
- default.prepanel
Fallback prepanel function. See xyplot.
- weights
Numeric vector of weights for the density calculations, evaluated in the non-standard manner used for groups and terms in the formula, if any. If this is specified, it is subsetted using subscripts inside the panel function to match it to the corresponding x values.
At the time of writing, weights do not work in conjunction with an extended formula specification (this is not too hard to fix, so just bug the maintainer if you need this feature).
- bw, adjust, width
Arguments controlling bandwidth. Passed on as arguments to density.
- kernel, window
The choice of kernel. Passed on as arguments to density.
- give.Rkern
Logical flag, passed on as argument to density. This argument is made available only for ease of implementation, and will produce an error if TRUE.
- from, to, cut
Control the range over which the density is evaluated. Passed on as arguments to density.
- na.rm
Logical flag specifying whether NA values should be ignored. Passed on as argument to density, but unlike in density, the default is TRUE.
- …
Further arguments. See corresponding entry in xyplot for non-trivial details.
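The interaction between breaks, type, nint and equal.widths described above can be illustrated with a small sketch. The data frame d and its column waits are hypothetical, generated only for this illustration; the calls themselves use only arguments documented on this page.

```r
library(lattice)

# Hypothetical skewed sample, generated only for illustration.
set.seed(1)
d <- data.frame(waits = rexp(200, rate = 1))

# Explicit, unequally spaced breakpoints covering the whole data range.
# With unequal bin widths, a density scale is the sensible choice.
h_unequal <- histogram(~ waits, data = d,
                       breaks = c(0, 0.25, 0.5, 1, 2, ceiling(max(d$waits))),
                       type = "density", xlab = "waits")

# breaks = NULL with equal.widths = FALSE: nint bins of approximately
# equal area, chosen separately from the data.
h_equal_area <- histogram(~ waits, data = d, breaks = NULL, nint = 8,
                          equal.widths = FALSE, xlab = "waits")

# breaks can also be anything hist() accepts, such as a method name.
h_method <- histogram(~ waits, data = d, breaks = "Scott", xlab = "waits")
```

Each call returns a "trellis" object; nothing is drawn until the object is printed.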
Details
histogram draws Conditional Histograms, and densityplot draws Conditional Kernel Density Plots. The default panel function for densityplot uses the density function to compute the density estimate, and all arguments accepted by density can be specified in the call to densityplot to control the output. See documentation of density for details.
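As a rough sketch of how arguments of density can be supplied through densityplot, the calls below forward kernel, adjust, n and bw, and also show a grouped display. The particular values are arbitrary choices for illustration. plot.points is an argument of panel.densityplot, which is documented separately from this page.

```r
library(lattice)

# kernel, adjust and n are forwarded to density() for every panel.
d_cond <- densityplot(~ height | voice.part, data = singer,
                      kernel = "epanechnikov", adjust = 1.5, n = 256,
                      layout = c(2, 4), xlab = "Height (inches)")

# Grouped display: one density curve per voice part in a single panel.
# plot.points = FALSE suppresses the points drawn for the raw data.
d_grouped <- densityplot(~ height, data = singer, groups = voice.part,
                         bw = 2, plot.points = FALSE, auto.key = TRUE,
                         xlab = "Height (inches)")
```

Printing d_cond or d_grouped draws the corresponding plot.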
These and all other high-level Trellis functions have several arguments in common. These are extensively documented only in the help page for xyplot, which should be consulted to learn more detailed usage.
do.breaks is a utility function that calculates breakpoints given an interval and the number of pieces to break it into.
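A minimal illustration of do.breaks, assuming an interval of 0 to 10 split into 5 pieces; the data frame d and its column u are hypothetical and exist only to show the breakpoints being reused in a histogram.

```r
library(lattice)

# Split the interval [0, 10] into 5 equal pieces, giving 6 breakpoints.
b <- do.breaks(endpoints = c(0, 10), nint = 5)
print(b)  # 0 2 4 6 8 10

# The same breakpoints can be passed explicitly to histogram().
set.seed(1)
d <- data.frame(u = runif(100, min = 0, max = 10))
h <- histogram(~ u, data = d, breaks = b)
```

Because the breakpoints are supplied explicitly, the number of bins is length(b) - 1 and nint is not consulted.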
Value
An object of class "trellis". The update method can be used to update components of the object and the print method (usually called by default) will plot it on an appropriate plotting device.
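The workflow described here might look as follows: the plot is stored as an object, modified with update, and then printed explicitly. The object names p and p2 are arbitrary.

```r
library(lattice)

# Build the plot object; nothing is drawn at this point.
p <- histogram(~ height | voice.part, data = singer)

# Modify components of the stored object without re-specifying the call.
p2 <- update(p, xlab = "Height (inches)", layout = c(2, 4))

# Explicit print() draws the plot. This is needed inside functions,
# loops, or sourced scripts, where auto-printing does not happen.
print(p2)
```

At the interactive prompt, typing p2 on its own has the same effect as print(p2).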
Note
The form of the arguments accepted by the default panel function panel.histogram is different from that in S-PLUS. Whereas S-PLUS calculates the heights inside histogram and passes only the breakpoints and the heights to the panel function, lattice simply passes along the original variable x along with the breakpoints. This approach is more flexible; see the example below with an estimated density superimposed over the histogram.
References
Sarkar, Deepayan (2008) Lattice: Multivariate Data Visualization with R, Springer. http://lmdvr.r-forge.r-project.org/
See Also
xyplot, panel.histogram, density, panel.densityplot, panel.mathdensity, Lattice
Examples
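library(lattice)  # provides histogram(), densityplot() and the singer data
require(stats)

# Conditional histograms with 17 bins spanning 59.5 to 76.5 inches
histogram( ~ height | voice.part, data = singer, nint = 17,
          endpoints = c(59.5, 76.5), layout = c(2,4), aspect = 1,
          xlab = "Height (inches)")

# Density histograms with a fitted normal density superimposed in each panel
histogram( ~ height | voice.part, data = singer,
          xlab = "Height (inches)", type = "density",
          panel = function(x, ...) {
              panel.histogram(x, ...)
              panel.mathdensity(dmath = dnorm, col = "black",
                                args = list(mean=mean(x),sd=sd(x)))
          } )

# Conditional kernel density estimates with a fixed bandwidth of 5
densityplot( ~ height | voice.part, data = singer, layout = c(2, 4),
            xlab = "Height (inches)", bw = 5)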
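The numeric and factor methods listed under Usage take the variable directly rather than a formula. A minimal sketch, using hypothetical vectors x_num and x_fac generated only for illustration:

```r
library(lattice)

# Hypothetical data, generated only for illustration.
set.seed(42)
x_num <- rnorm(200)
x_fac <- factor(sample(c("a", "b", "c"), 200, replace = TRUE))

# Numeric method: no formula and no data argument; conditioning is not allowed.
h_num <- histogram(x_num, xlab = "x")
d_num <- densityplot(x_num, xlab = "x")

# Factor method: one bin per level of the factor.
h_fac <- histogram(x_fac, xlab = "level")
```

When conditioning variables are needed, the formula method with a data argument is the appropriate form.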