
stat_dens1d_filter filters out (or filters in) observations located in
regions of a plot panel with a high density of observations, based on the
values mapped to one of the x and y aesthetics. stat_dens1d_filter_g does
the same filtering by group instead of by panel. This second stat is useful
for highlighting observations, while the first one tends to be most useful
when the aim is to prevent clashes among text labels.
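The difference between filtering by panel and filtering by group can be sketched in base R. This is only an illustration and not the package's implementation: `sparse_rows()` is a hypothetical helper, and reading the density at each observation with `approx()` and picking the sparsest fraction with `rank()` are assumptions made for the example.

```r
# Hypothetical helper, not part of the package: flags the observations that
# fall in the sparsest `keep.fraction` of a numeric vector.
sparse_rows <- function(values, keep.fraction = 0.1) {
  est <- stats::density(values, bw = "SJ")               # 1D kernel density estimate
  dens <- stats::approx(est$x, est$y, xout = values)$y   # density at each observation
  rank(dens, ties.method = "first") <= floor(keep.fraction * length(values))
}

set.seed(1001)
x <- c(rnorm(50), rnorm(50) + 1)
group <- rep(c("A", "B"), c(50, 50))

# By panel: a single density estimate for all 100 observations.
by_panel <- sparse_rows(x)

# By group: a separate density estimate within each group.
# ave() returns 0/1 values here, so convert back to logical.
by_group <- as.logical(ave(x, group, FUN = sparse_rows))

table(group, by_panel)  # 10 rows kept overall, not necessarily 5 per group
table(group, by_group)  # 5 rows kept in each group
```

With a single estimate for the whole panel, the retained observations can be spread unevenly across the groups, whereas the per-group version retains the same fraction within each group.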
stat_dens1d_filter(
  mapping = NULL,
  data = NULL,
  geom = "point",
  position = "identity",
  ...,
  keep.fraction = 0.1,
  keep.number = Inf,
  keep.sparse = TRUE,
  invert.selection = FALSE,
  bw = "SJ",
  kernel = "gaussian",
  adjust = 1,
  n = 512,
  orientation = "x",
  na.rm = TRUE,
  show.legend = FALSE,
  inherit.aes = TRUE
)

stat_dens1d_filter_g(
  mapping = NULL,
  data = NULL,
  geom = "point",
  position = "identity",
  keep.fraction = 0.1,
  keep.number = Inf,
  keep.sparse = TRUE,
  invert.selection = FALSE,
  na.rm = TRUE,
  show.legend = FALSE,
  inherit.aes = TRUE,
  bw = "SJ",
  adjust = 1,
  kernel = "gaussian",
  n = 512,
  orientation = "x",
  ...
)
data: A layer-specific dataset, only needed if you want to override the plot defaults.
geom: The geometric object to use to display the data.
position: The position adjustment to use for overlapping points on this layer.
keep.fraction: numeric [0..1]. The fraction of the observations (or
rows) in data to be retained.
keep.number: integer. Sets the maximum number of observations to retain;
effective only if obeying keep.fraction would result in a larger number.
keep.sparse: logical. If TRUE, the default, observations from the
sparsest regions are retained; if FALSE, those from the densest regions.
invert.selection: logical. If TRUE, the complement of the selected rows
is returned.
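How these four arguments interact can be sketched as follows. `select_rows()` is a hypothetical helper written for this illustration; it takes density values already evaluated at each observation, and the rank-based cut-off and the use of `floor()` are assumptions, so the package may handle ties and rounding differently.

```r
# Hypothetical helper, not part of the package.
select_rows <- function(dens, keep.fraction = 0.1, keep.number = Inf,
                        keep.sparse = TRUE, invert.selection = FALSE) {
  # keep.number only matters when keep.fraction would retain more rows.
  n.keep <- min(floor(keep.fraction * length(dens)), keep.number)
  # Rank 1 is the sparsest observation when keep.sparse is TRUE,
  # and the densest one otherwise.
  ranks <- rank(if (keep.sparse) dens else -dens, ties.method = "first")
  selected <- ranks <= n.keep
  if (invert.selection) !selected else selected
}

# Made-up density values for ten observations.
dens <- c(0.05, 0.40, 0.10, 0.35, 0.20, 0.30, 0.02, 0.25, 0.15, 0.38)

which(select_rows(dens, keep.fraction = 0.5))                           # 1 3 5 7 9
which(select_rows(dens, keep.fraction = 0.5, keep.number = 2))          # 1 7
which(select_rows(dens, keep.fraction = 0.5, keep.sparse = FALSE))      # 2 4 6 8 10
which(select_rows(dens, keep.fraction = 0.5, invert.selection = TRUE))  # 2 4 6 8 10
```

In this toy example keep.sparse = FALSE and invert.selection = TRUE give the same rows only because keep.fraction is 0.5; with any other fraction the densest rows and the complement of the sparsest rows differ.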
bw: numeric or character. The smoothing bandwidth to be used. If
numeric, the standard deviation of the smoothing kernel. If character, a
rule to choose the bandwidth, as listed in bw.nrd.
kernel: character. See density for details.
adjust: numeric. A multiplicative bandwidth adjustment. This makes it
possible to adjust the bandwidth while still using a bandwidth estimator
selected through the argument passed to bw. The larger the value passed to
adjust, the stronger the smoothing, hence decreasing sensitivity to local
changes in density.
n: numeric. Number of equally spaced points at which the density is to
be estimated for applying the cut point. See density for details.
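Because bw, kernel, adjust and n are described in terms of density, their effect can be explored directly with stats::density() from base R. The sketch below assumes that the stat hands these arguments to density() unchanged, which the descriptions suggest but do not state outright.

```r
set.seed(1001)
x <- rnorm(100)

# A character `bw` names a bandwidth selection rule; "SJ" is Sheather-Jones.
est1 <- stats::density(x, bw = "SJ", kernel = "gaussian", adjust = 1, n = 512)
# `adjust` rescales the selected bandwidth without changing the rule.
est2 <- stats::density(x, bw = "SJ", kernel = "gaussian", adjust = 2, n = 512)

est1$bw                              # bandwidth chosen by the rule
est2$bw                              # twice est1$bw
all.equal(est1$bw, stats::bw.SJ(x))  # same value as calling the rule directly

# A numeric `bw` is used as the standard deviation of the kernel.
stats::density(x, bw = 0.3)$bw

# `n` sets the number of equally spaced points in the returned estimate.
length(est1$x)
```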
orientation: character. The aesthetic along which density is computed. Given explicitly by setting orientation to either "x" or "y".
na.rm: logical. Indicates whether NA values should be stripped before the computation proceeds.
show.legend: logical. Should this layer be included in the legends?
NA includes the layer if any aesthetics are mapped, FALSE never includes
it, and TRUE always includes it. The usage shown for these stats sets
FALSE as the default.
inherit.aes: If FALSE, overrides the default aesthetics, rather
than combining with them. This is most useful for helper functions that
define both data and aesthetics and shouldn't inherit behaviour from the
default plot specification, e.g. borders().
A plot layer instance, using as output data the subset of the rows in the
input data retained based on a 1D filtering criterion.
density is used internally.
Other statistics returning a subset of data: stat_dens1d_labels(), stat_dens2d_filter(), stat_dens2d_labels()
# NOT RUN {
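# Assumes the package that provides stat_dens1d_filter() is already attached.
library(ggplot2)
library(ggrepel)

# Build a random lower-case string of length len.
random_string <- function(len = 6) {
  paste(sample(letters, len, replace = TRUE), collapse = "")
}

# Make random data.
set.seed(1001)
d <- tibble::tibble(
  x = rnorm(100),
  y = rnorm(100),
  group = rep(c("A", "B"), c(50, 50)),
  lab = replicate(100, { random_string() })
)
# Shift group "B" (rows 51 to 100) by one unit along x.
d$xg <- d$x
d$xg[51:100] <- d$xg[51:100] + 1
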
# highlight the 1/10 of observations in sparsest regions of the plot
ggplot(data = d, aes(x, y)) +
  geom_point() +
  geom_rug(sides = "b") +
  stat_dens1d_filter(colour = "red") +
  stat_dens1d_filter(geom = "rug", colour = "red", sides = "b")

# highlight the 1/4 of observations in densest regions of the plot
ggplot(data = d, aes(x, y)) +
  geom_point() +
  geom_rug(sides = "b") +
  stat_dens1d_filter(colour = "blue",
                     keep.fraction = 1/4, keep.sparse = FALSE) +
  stat_dens1d_filter(geom = "rug", colour = "blue",
                     keep.fraction = 1/4, keep.sparse = FALSE,
                     sides = "b")

# switching axes
ggplot(data = d, aes(x, y)) +
  geom_point() +
  geom_rug(sides = "l") +
  stat_dens1d_filter(colour = "red", orientation = "y") +
  stat_dens1d_filter(geom = "rug", colour = "red", orientation = "y",
                     sides = "l")

# highlight 1/10 plus 1/10 observations in high and low density regions
ggplot(data = d, aes(x, y)) +
  geom_point() +
  geom_rug(sides = "b") +
  stat_dens1d_filter(colour = "red") +
  stat_dens1d_filter(geom = "rug", colour = "red", sides = "b") +
  stat_dens1d_filter(colour = "blue", keep.sparse = FALSE) +
  stat_dens1d_filter(geom = "rug",
                     colour = "blue", keep.sparse = FALSE, sides = "b")

# selecting the 1/10 observations in sparsest regions and their complement
ggplot(data = d, aes(x, y)) +
  stat_dens1d_filter(colour = "red") +
  stat_dens1d_filter(geom = "rug", colour = "red", sides = "b") +
  stat_dens1d_filter(colour = "blue", invert.selection = TRUE) +
  stat_dens1d_filter(geom = "rug",
                     colour = "blue", invert.selection = TRUE, sides = "b")

# density filtering done jointly across groups
ggplot(data = d, aes(xg, y, colour = group)) +
  geom_point() +
  geom_rug(sides = "b", colour = "black") +
  stat_dens1d_filter(shape = 1, size = 3, keep.fraction = 1/4, adjust = 2)

# density filtering done independently for each group
ggplot(data = d, aes(xg, y, colour = group)) +
  geom_point() +
  geom_rug(sides = "b") +
  stat_dens1d_filter_g(shape = 1, size = 3, keep.fraction = 1/4, adjust = 2)

# density filtering done jointly across groups by overriding grouping
ggplot(data = d, aes(xg, y, colour = group)) +
  geom_point() +
  geom_rug(sides = "b") +
  stat_dens1d_filter_g(colour = "black",
                       shape = 1, size = 3, keep.fraction = 1/4, adjust = 2)

# label observations
ggplot(data = d, aes(x, y, label = lab, colour = group)) +
  geom_point() +
  stat_dens1d_filter(geom = "text", hjust = "outward")

# repulsive labels with ggrepel::geom_text_repel()
ggplot(data = d, aes(x, y, label = lab, colour = group)) +
  geom_point() +
  stat_dens1d_filter(geom = "text_repel")
# }