hist
Histograms
The generic function hist computes a histogram of the given data values. If plot = TRUE, the resulting object of class "histogram" is plotted by plot.histogram, before it is returned.
 Keywords
 hplot, distribution, dplot
Usage
hist(x, ...)

## Default S3 method:
hist(x, breaks = "Sturges",
     freq = NULL, probability = !freq,
     include.lowest = TRUE, right = TRUE,
     density = NULL, angle = 45, col = NULL, border = NULL,
     main = paste("Histogram of", xname),
     xlim = range(breaks), ylim = NULL,
     xlab = xname, ylab,
     axes = TRUE, plot = TRUE, labels = FALSE,
     nclass = NULL, warn.unused = TRUE, ...)
Arguments
 x
 a vector of values for which the histogram is desired.
 breaks
 one of:
 a vector giving the breakpoints between histogram cells,
 a function to compute the vector of breakpoints,
 a single number giving the number of cells for the histogram,
 a character string naming an algorithm to compute the number of cells (see ‘Details’),
 a function to compute the number of cells.
In the last three cases the number is a suggestion only; the breakpoints will be set to pretty values. If breaks is a function, the x vector is supplied to it as the only argument.
 freq
 logical; if TRUE, the histogram graphic is a representation of frequencies, the counts component of the result; if FALSE, probability densities, component density, are plotted (so that the histogram has a total area of one). Defaults to TRUE if and only if breaks are equidistant (and probability is not specified).
 probability
 an alias for !freq, for S compatibility.
 include.lowest
 logical; if TRUE, an x[i] equal to the breaks value will be included in the first (or last, for right = FALSE) bar. This will be ignored (with a warning) unless breaks is a vector.
 right
 logical; if TRUE, the histogram cells are right-closed (left open) intervals.
 density
 the density of shading lines, in lines per inch. The default value of NULL means that no shading lines are drawn. Non-positive values of density also inhibit the drawing of shading lines.
 angle
 the slope of shading lines, given as an angle in degrees (counterclockwise).
 col
 a colour to be used to fill the bars. The default of NULL yields unfilled bars.
 border
 the color of the border around the bars. The default is to use the standard foreground color.
 main, xlab, ylab
 these arguments to title have useful defaults here.
 xlim, ylim
 the range of x and y values with sensible defaults. Note that xlim is not used to define the histogram (breaks), but only for plotting (when plot = TRUE).
 axes
 logical. If TRUE (default), axes are drawn if the plot is drawn.
 plot
 logical. If TRUE (default), a histogram is plotted, otherwise a list of breaks and counts is returned. In the latter case, a warning is issued if (typically graphical) arguments are specified that only apply to the plot = TRUE case.
 labels
 logical or character string. Additionally draw labels on top of bars, if not FALSE; see plot.histogram.
 nclass
 numeric (integer). For S(-PLUS) compatibility only, nclass is equivalent to breaks for a scalar or character argument.
 warn.unused
 logical. If plot = FALSE and warn.unused = TRUE, a warning will be issued when graphical parameters are passed to hist.default().
 ...
 further arguments and graphical parameters passed to plot.histogram and thence to title and axis (if plot = TRUE).
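The forms that breaks can take may be easier to compare side by side. The following is a minimal sketch, assuming the built-in islands data (also used in the Examples) and plot = FALSE so that only the returned object is inspected; the particular break values and the helper names h_vec, h_num and h_fun are illustrative choices, not part of the documented interface.

```r
x <- sqrt(islands)  # built-in dataset; square root taken only to tame the skew

# 1. A vector of breakpoints: used as given (it must span range(x)).
h_vec <- hist(x, breaks = seq(0, 140, by = 20), plot = FALSE)

# 2. A single number: a suggestion only, the actual breakpoints are "pretty"
#    values, so the number of cells may differ from 12.
h_num <- hist(x, breaks = 12, plot = FALSE)

# 3. A function: called with x as its only argument; here it returns the
#    breakpoints themselves (a vector of length 8, i.e. 7 cells).
h_fun <- hist(x,
              breaks = function(v) seq(0, ceiling(max(v)), length.out = 8),
              plot = FALSE)

length(h_vec$breaks) - 1  # number of cells from the explicit vector
length(h_num$breaks) - 1  # number of cells actually chosen for breaks = 12
length(h_fun$breaks) - 1  # number of cells from the function
```

Comparing the cell counts shows which forms are taken literally (the vector, and the function that returns breakpoints) and which are treated as a hint (the single number).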
Details
The definition of histogram differs by source (with country-specific biases). R's default with equi-spaced breaks (also the default) is to plot the counts in the cells defined by breaks. Thus the height of a rectangle is proportional to the number of points falling into the cell, as is the area provided the breaks are equally-spaced.
The default with non-equi-spaced breaks is to give a plot of area one, in which the area of the rectangles is the fraction of the data points falling in the cells.
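As a rough illustration of the two defaults, the sketch below compares equally spaced and unequally spaced breaks on the built-in islands data. The break vectors are arbitrary choices made for this example; only the returned components are inspected, nothing is drawn.

```r
x <- sqrt(islands)

h_eq  <- hist(x, breaks = seq(0, 140, by = 20), plot = FALSE)      # equal widths
h_neq <- hist(x, breaks = c(0, 5, 10, 20, 40, 140), plot = FALSE)  # unequal widths

h_eq$equidist    # TRUE:  counts would be plotted by default
h_neq$equidist   # FALSE: densities would be plotted by default

# For the density scale, rectangle area = height * width, which equals the
# fraction of the data falling in each cell.
area <- h_neq$density * diff(h_neq$breaks)
all.equal(area, h_neq$counts / length(x))
sum(area)        # total area of the histogram
```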
If right = TRUE (default), the histogram cells are intervals of the form (a, b], i.e., they include their right-hand endpoint, but not their left one, with the exception of the first cell when include.lowest is TRUE.
For right = FALSE, the intervals are of the form [a, b), and include.lowest means ‘include highest’.
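The effect of right is easiest to see with values that sit exactly on the breakpoints. The following sketch uses a small made-up vector (not from the original examples) and keeps include.lowest at its default of TRUE.

```r
x <- c(0, 1, 1, 2, 2, 2, 3)  # made-up values lying exactly on the breaks
b <- 0:3

# right = TRUE (default): cells are [0, 1], (1, 2], (2, 3];
# the lowest value 0 is kept in the first cell by include.lowest.
counts_right <- hist(x, breaks = b, plot = FALSE)$counts
counts_right   # 3 3 1

# right = FALSE: cells are [0, 1), [1, 2), [2, 3];
# include.lowest now keeps the highest value 3 in the last cell.
counts_left <- hist(x, breaks = b, right = FALSE, plot = FALSE)$counts
counts_left    # 1 2 4
```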
A numerical tolerance of $10^{-7}$ times the median bin size (for more than four bins, otherwise the median is substituted) is applied when counting entries on the edges of bins. This is not included in the reported breaks nor (as from R 2.11.0) in the calculation of density.
The default for breaks is "Sturges": see nclass.Sturges. Other names for which algorithms are supplied are "Scott" and "FD" / "Freedman-Diaconis" (with corresponding functions nclass.scott and nclass.FD). Case is ignored and partial matching is used.
Alternatively, a function can be supplied which will compute the intended number of breaks or the actual breakpoints as a function of x.
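A short sketch of how the named algorithms and their nclass.* functions relate, again on the built-in islands data. The variable names are illustrative; the comparisons simply check that the different spellings lead to the same breakpoints.

```r
x <- sqrt(islands)

# Suggested numbers of cells from the three supplied algorithms.
nclass.Sturges(x)
nclass.scott(x)
nclass.FD(x)

# Case is ignored and partial matching is used for the algorithm name.
h_full  <- hist(x, breaks = "Scott", plot = FALSE)
h_lower <- hist(x, breaks = "scott", plot = FALSE)
h_part  <- hist(x, breaks = "sc",    plot = FALSE)  # unambiguous prefix
identical(h_full$breaks, h_lower$breaks)
identical(h_full$breaks, h_part$breaks)

# Passing the function itself instead of its name: it is called with x and
# here returns the intended number of cells.
h_name <- hist(x, breaks = "FD",      plot = FALSE)
h_fun  <- hist(x, breaks = nclass.FD, plot = FALSE)
identical(h_name$breaks, h_fun$breaks)
```

Note that a prefix such as "s" would be ambiguous between "Sturges" and "Scott", so a partial name needs enough characters to identify a single algorithm.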
Value

an object of class "histogram" which is a list with components:
 breaks
 the $n+1$ cell boundaries (= breaks if that was a vector). These are the nominal breaks, not with the boundary fuzz.
 counts
 $n$ integers; for each cell, the number of x[] inside.
 density
 values $\hat{f}(x_i)$, as estimated density values. If all(diff(breaks) == 1), they are the relative frequencies counts/n and in general satisfy $\sum_i \hat{f}(x_i) (b_{i+1} - b_i) = 1$, where $b_i$ = breaks[i].
 mids
 the $n$ cell midpoints.
 xname
 a character string with the actual x argument name.
 equidist
 logical, indicating if the distances between breaks are all the same.
Prior to R 3.0.0 there was a component intensities, the same as density, for long-term back compatibility.
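To make the components concrete, the sketch below builds a histogram object from a small made-up vector with unit-width cells, the case in which density coincides with the relative frequencies. The data are invented for illustration only.

```r
x <- c(0.5, 1.2, 1.7, 2.1, 2.9, 3.3, 3.8, 3.9)  # made-up data, 8 values
h <- hist(x, breaks = 0:4, plot = FALSE)

class(h)     # "histogram"
h$breaks     # 0 1 2 3 4   (n + 1 = 5 boundaries)
h$counts     # 1 2 2 3     (n = 4 cells)
h$mids       # 0.5 1.5 2.5 3.5

# With all(diff(breaks) == 1), density is counts divided by the number of
# observations.
all.equal(h$density, h$counts / length(x))

# In general, density times cell width sums to one.
sum(h$density * diff(h$breaks))
```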
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Springer.
See Also
nclass.Sturges, stem, density, truehist in package MASS.
Typical plots with vertical bars are not histograms. Consider barplot or plot(*, type = "h") for such bar plots.
Examples
library(graphics)
op <- par(mfrow = c(2, 2))
hist(islands)
utils::str(hist(islands, col = "gray", labels = TRUE))
hist(sqrt(islands), breaks = 12, col = "lightblue", border = "pink")
## For non-equidistant breaks, counts should NOT be graphed unscaled:
r <- hist(sqrt(islands), breaks = c(4*0:5, 10*3:5, 70, 100, 140),
          col = "blue1")
text(r$mids, r$density, r$counts, adj = c(.5, .5), col = "blue3")
sapply(r[2:3], sum)
sum(r$density * diff(r$breaks)) # == 1
lines(r, lty = 3, border = "purple") # -> lines.histogram(*)
par(op)
require(utils) # for str
str(hist(islands, breaks = 12, plot = FALSE)) #-> 10 (~= 12) breaks
str(hist(islands, breaks = c(12,20,36,80,200,1000,17000), plot = FALSE))
hist(islands, breaks = c(12,20,36,80,200,1000,17000), freq = TRUE,
     main = "WRONG histogram") # and warning
require(stats)
set.seed(14)
x <- rchisq(100, df = 4)
## Comparing data with a model distribution should be done with qqplot()!
qqplot(x, qchisq(ppoints(x), df = 4)); abline(0, 1, col = 2, lty = 2)
## if you really insist on using hist() ... :
hist(x, freq = FALSE, ylim = c(0, 0.2))
curve(dchisq(x, df = 4), col = 2, lty = 2, lwd = 2, add = TRUE)