
hist computes a histogram of the given data values. If plot = TRUE, the resulting object of class "histogram" is plotted by plot.histogram before it is returned.

hist(x, ...)

## S3 method for class 'default':
hist(x, breaks = "Sturges",
     freq = NULL, probability = !freq,
     include.lowest = TRUE, right = TRUE,
     density = NULL, angle = 45, col = NULL, border = NULL,
     main = paste("Histogram of" , xname),
     xlim = range(breaks), ylim = NULL,
     xlab = xname, ylab,
     axes = TRUE, plot = TRUE, labels = FALSE,
     nclass = NULL, warn.unused = TRUE, ...)
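
x
a vector of values for which the histogram is desired.

breaks
one of: a vector giving the breakpoints between histogram cells; a function to compute the vector of breakpoints; a single number giving the number of cells for the histogram; a character string naming an algorithm to compute the number of cells; or a function to compute the number of cells. In the last three cases the number is a suggestion only, as the breakpoints will be set to pretty values. If breaks is a function, the x vector is supplied to it as the only argument.

freq
logical; if TRUE, the histogram graphic is a representation of frequencies, the counts component of the result; if FALSE, probability densities, component density, are plotted (so that the histogram has a total area of one). Defaults to TRUE if and only if breaks are equidistant (and probability is not specified).

probability
an alias for !freq, for S compatibility.

include.lowest
logical; if TRUE, an x[i] equal to the breaks value will be included in the first (or last, for right = FALSE) bar. This will be ignored (with a warning) unless breaks is a vector.

right
logical; if TRUE, the histogram cells are right-closed (left-open) intervals.

density
the density of shading lines, in lines per inch. The default value of NULL means that no shading lines are drawn. Non-positive values of density also inhibit the drawing of shading lines.

angle
the slope of shading lines, given as an angle in degrees (counter-clockwise).

col
a colour to be used to fill the bars. The default of NULL yields unfilled bars.

border
the colour of the border around the bars. The default is to use the standard foreground colour.

main, xlab, ylab
these arguments to title have useful defaults here.

xlim, ylim
the range of x and y values with sensible defaults. Note that xlim is not used to define the histogram (breaks), but only for plotting (when plot = TRUE).

axes
logical; if TRUE (default), axes are drawn if the plot is drawn.

plot
logical; if TRUE (default), a histogram is plotted, otherwise a list of breaks and counts is returned. In the latter case, a warning is issued if (typically graphical) arguments are specified that only apply to the plot = TRUE case.

labels
logical or character. Additionally draw labels on top of the bars if not FALSE; see plot.histogram.

nclass
numeric (integer). For S(-PLUS) compatibility only; nclass is equivalent to breaks for a scalar or character argument.

warn.unused
logical; if plot = FALSE and warn.unused = TRUE, a warning will be issued when graphical parameters are passed to hist.default().

...
further arguments and graphical parameters passed to plot.histogram and thence to title and axis (if plot = TRUE).

An object of class "histogram"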
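which is a list with components:

breaks
the n+1 cell boundaries (= breaks if that was a vector). These are the nominal breaks, not with the boundary fuzz.

counts
n integers; for each cell, the number of x[] inside.

density
values $\hat f(x_i)$, as estimated density values. If all(diff(breaks) == 1), they are the relative frequencies counts/n and in general satisfy $\sum_i \hat f(x_i) (b_{i+1}-b_i) = 1$, where $b_i$ = breaks[i].

mids
the n cell midpoints.

xname
a character string with the actual x argument name.

equidist
logical, indicating if the distances between breaks are all the same.

R's default with equi-spaced breaks (also the default) is to plot the counts in the cells defined by breaks. Thus the height of a rectangle is proportional to the number of points falling into the cell, as is the area provided the breaks are equally spaced.

The default with non-equi-spaced breaks is to give a plot of area one, in which the area of the rectangles is the fraction of the data points falling in the cells.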
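
As a rough illustration of how the returned components relate to each other, the sketch below builds a "histogram" object with unequal breaks and recomputes density from counts. The data vector and the break points are made up for this example.

```r
# Made-up data and unequal breaks, chosen only for illustration.
x <- c(1, 2, 2, 3, 5, 8, 9, 15)
h <- hist(x, breaks = c(0, 2, 4, 10, 20), plot = FALSE)

widths <- diff(h$breaks)   # cell widths b[i+1] - b[i]
n <- length(x)

h$counts     # number of points in each cell
h$mids       # cell midpoints
h$equidist   # FALSE here, because the breaks are not equally spaced

# density is counts rescaled by n and the cell width,
# so the rectangle areas sum to one
all.equal(h$density, h$counts / (n * widths))
sum(h$density * widths)
```

With equally spaced breaks the same identity holds, but heights and areas are then proportional to each other, which is why plotting counts is safe only in that case.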
If right = TRUE (default), the histogram cells are intervals of the form (a, b], i.e., they include their right-hand endpoint, but not their left one, with the exception of the first cell when include.lowest is TRUE.
For right = FALSE, the intervals are of the form [a, b), and include.lowest means 'include highest'.
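
The effect of right on values that sit exactly on a break can be seen in a small sketch. The data and breaks below are made up for illustration; include.lowest is left at its default of TRUE.

```r
# Made-up data with several values lying exactly on the breaks.
x <- c(0, 1, 1, 2, 3)
b <- 0:3

# Cells [0, 1], (1, 2], (2, 3]: boundary values go to the cell on their left
h_right <- hist(x, breaks = b, right = TRUE, plot = FALSE)
h_right$counts   # 3 1 1

# Cells [0, 1), [1, 2), [2, 3]: boundary values go to the cell on their right
h_left <- hist(x, breaks = b, right = FALSE, plot = FALSE)
h_left$counts    # 1 2 2
```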
A numerical tolerance of $10^{-7}$ times the median bin size (for more than four bins, otherwise the median is substituted) is applied when counting entries on the edges of bins. This is not included in the reported breaks nor in the calculation of density.
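
A minimal sketch of why this tolerance matters: in floating point, 0.1 + 0.2 is slightly larger than 0.3, so without any tolerance it would fall just past a break at 0.3. The data and breaks are made up, and the comparison with cut(), which bins without such a tolerance, is only there to make the contrast visible.

```r
# Made-up example: the first value is 0.3 plus a tiny floating-point error.
x <- c(0.1 + 0.2, 1.0)
b <- c(0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8)   # six bins, so the median bin size is used

(0.1 + 0.2) > 0.3                 # TRUE: strictly above the nominal break

# hist() applies the tolerance, so the value is counted in the cell (0, 0.3]
h <- hist(x, breaks = b, plot = FALSE)
h$counts

# cut() does not, so the same value lands in the second interval (0.3, 0.6]
as.integer(cut(x, breaks = b))

# The reported breaks are still the nominal ones
h$breaks
```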
The default for breaks is "Sturges": see nclass.Sturges. Other names for which algorithms are supplied are "Scott" and "FD" / "Freedman-Diaconis" (with corresponding functions nclass.scott and nclass.FD). Case is ignored and partial matching is used. Alternatively, a function can be supplied which will compute the intended number of breaks or the actual breakpoints as a function of x.
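
The different ways of specifying breaks can be sketched as follows. The simulated data and the two helper functions passed as breaks are assumptions made for this illustration, not part of the documented interface.

```r
set.seed(1)
x <- rnorm(200)   # simulated data, for illustration only

# Named algorithms; case is ignored and partial matching is used
h_sturges <- hist(x, plot = FALSE)                        # default "Sturges"
h_scott   <- hist(x, breaks = "scott", plot = FALSE)
h_fd      <- hist(x, breaks = "FD", plot = FALSE)
h_part    <- hist(x, breaks = "Freedman", plot = FALSE)   # partial match

# A function returning an intended number of cells (a suggestion only)
h_ncells <- hist(x, breaks = function(v) 2 * grDevices::nclass.Sturges(v),
                 plot = FALSE)

# A function returning the actual breakpoints
half_steps <- function(v) seq(floor(min(v)), ceiling(max(v)), by = 0.5)
h_bpts <- hist(x, breaks = half_steps, plot = FALSE)

# Compare the resulting number of cells
sapply(list(sturges = h_sturges, scott = h_scott, fd = h_fd,
            ncells = h_ncells, bpts = h_bpts),
       function(h) length(h$counts))
```

When a function returns a single number, the final breakpoints are still chosen as pretty values, so the number of cells may differ from the number requested.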
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Springer.
nclass.Sturges, stem, density, truehist in package MASS.
Typical plots with vertical bars are not histograms. Consider barplot or plot(*, type = "h") for such bar plots.
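
For discrete data, where each bar stands for one distinct value rather than an interval, the two alternatives mentioned above might be used as in this sketch. The data are made up for illustration.

```r
# Made-up discrete data (e.g. die rolls), for illustration only.
rolls <- c(1, 3, 3, 6, 2, 3, 5, 6, 1, 4)
tab <- table(rolls)   # counts per distinct value

op <- par(mfrow = c(1, 2))
# One bar per distinct value
barplot(tab, xlab = "value", ylab = "count", main = "barplot()")
# Vertical lines at the observed values
plot(as.integer(names(tab)), as.vector(tab), type = "h",
     xlab = "value", ylab = "count", main = 'plot(*, type = "h")')
par(op)
```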
op <- par(mfrow = c(2, 2))
hist(islands)
utils::str(hist(islands, col = "gray", labels = TRUE))
hist(sqrt(islands), breaks = 12, col = "lightblue", border = "pink")
##-- For non-equidistant breaks, counts should NOT be graphed unscaled:
r <- hist(sqrt(islands), breaks = c(4*0:5, 10*3:5, 70, 100, 140),
          col = "blue1")
text(r$mids, r$density, r$counts, adj = c(.5, -.5), col = "blue3")
sapply(r[2:3], sum)
sum(r$density * diff(r$breaks)) # == 1
lines(r, lty = 3, border = "purple") # -> lines.histogram(*)
par(op)
require(utils) # for str
str(hist(islands, breaks = 12, plot = FALSE)) #-> 10 (~= 12) breaks
str(hist(islands, breaks = c(12,20,36,80,200,1000,17000), plot = FALSE))
hist(islands, breaks = c(12,20,36,80,200,1000,17000), freq = TRUE,
     main = "WRONG histogram") # and warning
require(stats)
set.seed(14)
x <- rchisq(100, df = 4)
op <- par(mfrow = 2:1, mgp = c(1.5, 0.6, 0), mar = .1 + c(3,3:1))
## Comparing data with a model distribution should be done with qqplot()!
qqplot(x, qchisq(ppoints(x), df = 4)); abline(0, 1, col = 2, lty = 2)
## if you really insist on using hist() ... :
hist(x, freq = FALSE, ylim = c(0, 0.2))
curve(dchisq(x, df = 4), col = 2, lty = 2, lwd = 2, add = TRUE)
par(op)