Hmisc (version 4.0-0)

ggfreqScatter: Frequency Scatterplot

Description

Uses ggplot2 to draw a scatterplot or dot-like chart when there is a very large number of overlapping values. This works for continuous and categorical x and y. For continuous variables it serves the same purpose as hexagonal binning. Counts of overlapping points are grouped into quantile groups, and the level of transparency and rainbow colors are used to convey the counts.

The result can also be passed to ggplotly. Actual cell frequencies are added to the hover text in that case.
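
The binning and counting idea described above can be sketched in base R. This is an illustration only and is not the code Hmisc uses: the helper `round_to_bins` and the exact rounding rule are assumptions made for the example.

```r
# Hypothetical helper (not part of Hmisc): round a continuous vector
# onto an evenly spaced grid so that nearby values coincide.
round_to_bins <- function(v, bins = 50) {
  r <- range(v)
  width <- diff(r) / bins                    # width of one bin
  r[1] + round((v - r[1]) / width) * width   # snap each value to the grid
}

set.seed(1)
x <- rnorm(5000)
y <- rnorm(5000)

xb <- round_to_bins(x)
yb <- round_to_bins(y)

# Count how many points fall on each distinct (x, y) grid location
freq <- as.data.frame(table(x = xb, y = yb), stringsAsFactors = FALSE)
freq <- freq[freq$Freq > 0, ]                # keep occupied cells only

# Group the cell counts into at most g quantile groups
g <- 10
cuts <- unique(quantile(freq$Freq, probs = seq(0, 1, length.out = g + 1)))
freq$group <- cut(freq$Freq, breaks = cuts, include.lowest = TRUE)

head(freq[order(-freq$Freq), ])
```

Each row of `freq` is one occupied cell. A plot would then draw one point per row and map `group` (or `Freq` itself, in the spirit of g=0) to color and transparency.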

Usage

ggfreqScatter(x, y, bins=50, g=10,
              xtrans = function(x) x, ytrans = function(y) y,
              xbreaks = pretty(x, 10), ybreaks = pretty(y, 10),
              xminor = NULL, yminor = NULL,
              xlab = as.character(substitute(x)),
              ylab = as.character(substitute(y)),
              fcolors = viridis::viridis(10), nsize=FALSE,
              html=FALSE, ...)

Arguments

x
x-variable
y
y-variable
bins
number of bins to create by rounding when x or y is continuous. Ignored for categorical variables. If a 2-vector, the first element corresponds to x and the second to y.
g
number of quantile groups to make for frequency counts. Use g=0 to use frequencies continuously for color and alpha coding. This is recommended only when using plotly.
xtrans,ytrans
functions specifying transformations to be made before binning and plotting
xbreaks,ybreaks
vectors of values to label on axis, on original scale
xminor,yminor
values at which to put minor tick marks, on original scale
xlab,ylab
axis labels. If not specified and variable has a label, that label will be used.
fcolors
colors argument to pass to scale_color_gradientn to color code frequencies
nsize
set to TRUE to size the symbols in relation to the number of points instead of varying color or transparency. Best when both x and y are discrete. ggplot2 size is taken as the fourth root of the frequency. If there are 15 or fewer unique frequencies, all the unique frequencies are used; otherwise g quantile groups of frequencies are used.
html
set to TRUE to use html in axis labels instead of plotmath
...
arguments to pass to geom_point such as shape and size
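
As a rough illustration of the fourth-root scaling mentioned under nsize, the base R sketch below applies it to a few made-up cell frequencies. The values are hypothetical and chosen only to make the compression easy to see; this is not Hmisc's internal code.

```r
# Hypothetical cell frequencies, for illustration only
freq <- c(1, 16, 81, 256, 625)

# Fourth-root scaling compresses a wide range of counts
# into a much narrower range of symbol sizes
size <- freq^(1/4)

print(data.frame(freq, size))
```

Here a 625-fold range of counts maps to a 5-fold range of sizes, which keeps the most frequent cells from overwhelming the plot.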

Value

ggplot object

See Also

cut2

Examples

set.seed(1)
x <- rnorm(1000)
y <- rnorm(1000)
count <- sample(1:100, 1000, TRUE)
x <- rep(x, count)
y <- rep(y, count)
g <- ggfreqScatter(x, y) +   # might add g=0 if using plotly
      ggtitle("Using Deciles of Frequency Counts, 2500 Bins")
g
# plotly::ggplotly(g, tooltip='label')  # use plotly, hover text = freq. only
# Plotly makes it somewhat interactive, with hover text tooltips

# Try with x categorical
x1 <- sample(c('cat', 'dog', 'giraffe'), length(x), TRUE)
ggfreqScatter(x1, y)

# Try with y categorical
y1 <- sample(LETTERS[1:10], length(x), TRUE)
ggfreqScatter(x, y1)

# Both categorical, larger point symbols, box instead of circle
ggfreqScatter(x1, y1, shape=15, size=7)
# Vary box size instead
ggfreqScatter(x1, y1, nsize=TRUE, shape=15)
