histboxp: Use plotly to Draw Stratified Spike Histogram and Box Plot Statistics

Description

Uses plotly to draw horizontal spike histograms stratified by group, plus the mean (solid dot) and vertical bars for these quantiles: 0.05 (red, short), 0.25 (blue, medium), 0.50 (black, long), 0.75 (blue, medium), and 0.95 (red, short). Gini's mean difference (a robust dispersion measure) and the SD may optionally be added. Each is shown as a horizontal line that starts at the minimum value of x and has a length equal to the mean difference or SD. Even when Gini's mean difference and the SD are computed, they are not drawn unless the user clicks on their legend entry.
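The summary statistics named above can be reproduced in base R. The following is a minimal sketch, not the package's own code: the helper gini_md and the variables seg_gmd and seg_sd are hypothetical names introduced for illustration, and Gini's mean difference is taken here to be the mean absolute difference over all pairs of observations.

```r
# Base-R sketch of the statistics that histboxp overlays on each histogram.
# gini_md, seg_gmd and seg_sd are illustrative names, not part of the package.
set.seed(1)
x <- rnorm(200, mean = 6, sd = 1)

# Quantiles marked by the vertical bars
q <- quantile(x, probs = c(0.05, 0.25, 0.50, 0.75, 0.95), na.rm = TRUE)

# Mean, drawn as a solid dot
m <- mean(x, na.rm = TRUE)

# Gini's mean difference: mean absolute difference over all pairs
gini_md <- function(v) {
  v <- v[!is.na(v)]
  mean(as.numeric(dist(v)))  # dist() returns |v_i - v_j| for every pair i < j
}

gmd <- gini_md(x)
s   <- sd(x, na.rm = TRUE)

# The dispersion measures are shown as horizontal segments starting at min(x)
seg_gmd <- c(from = min(x), to = min(x) + gmd)
seg_sd  <- c(from = min(x), to = min(x) + s)
```

Because dist() stores every pair, this sketch is only practical for moderate sample sizes.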

Spike histograms have the advantage of effectively showing the raw data for both small and huge datasets, and unlike box plots allow multi-modality to be easily seen.

histboxpM plots multiple histograms stacked vertically, one for each variable in a data frame, with a common group variable (if any). The individual plots are combined using plotly::subplot.

dhistboxp is like histboxp, but no plotly graphics are drawn. Instead, it returns a data frame suitable for use with plotlyM. dhistboxp also implements an additional level of stratification, strata. Here group behaves differently: it produces back-to-back histograms (in the case of two groups) for each level of strata.

Usage

histboxp(p = plotly::plot_ly(height=height), x, group = NULL,
         xlab=NULL, gmd=TRUE, sd=FALSE, bins = 100, wmax=190, mult=7,
         connect=TRUE, showlegend=TRUE)
dhistboxp(x, group = NULL, strata=NULL, xlab=NULL, 
          gmd=FALSE, sd=FALSE, bins = 100, nmin=5, ff1=1, ff2=1)
histboxpM(p=plotly::plot_ly(height=height, width=width), x, group=NULL,
          gmd=TRUE, sd=FALSE, width=NULL, nrows=NULL, ncols=NULL, ...)

Arguments

p

plotly graphics object if already begun

x

a numeric vector, or for histboxpM a numeric vector or a data frame of numeric vectors, ideally with label and units attributes

group

a discrete grouping variable. If omitted, defaults to a vector of ones

strata

a discrete numeric stratification variable. Values are also used to space out different spike histograms. Defaults to a vector of ones.

xlab

x-axis label; defaults to the label of x, including units of measurement if any

gmd

set to FALSE to not compute Gini's mean difference

sd

set to TRUE to compute the SD

width

width in pixels

nrows

number of rows for layout of multiple plots

ncols

number of columns for layout of multiple plots. At most one of nrows and ncols should be specified.

bins

number of equal-width bins to use for spike histogram. If the number of distinct values of x is less than bins, the actual values of x are used.

nmin

minimum number of non-missing observations for a group-stratum combination before the spike histogram and quantiles are drawn

ff1,ff2

fudge factors for position and bar length for spike histograms

wmax,mult

tweaks for margin to allocate

connect

set to FALSE to suppress lines connecting quantiles

showlegend

used if producing multiple plots to be combined with subplot; set to FALSE for all but one plot

...

other arguments for histboxpM that are passed to histboxp
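The rule documented for bins can be illustrated with a small base-R sketch. It is an assumption-laden illustration, not the package's implementation: spike_counts is a hypothetical helper, and snapping each value to a bin midpoint is one plausible way to form equal-width bins.

```r
# Hypothetical helper illustrating the documented `bins` rule:
# use the actual values when there are fewer distinct values than `bins`,
# otherwise group values into `bins` equal-width bins.
spike_counts <- function(x, bins = 100) {
  x  <- x[!is.na(x)]
  ux <- unique(x)
  if (length(ux) < bins) {
    # Few distinct values: one spike per actual value
    pos <- x
  } else {
    # Many distinct values: snap each value to the midpoint of its bin
    r   <- range(x)
    w   <- (r[2] - r[1]) / bins
    idx <- pmin(floor((x - r[1]) / w), bins - 1)  # keep max(x) in the last bin
    pos <- r[1] + (idx + 0.5) * w
  }
  tab <- table(pos)
  # One row per spike: its position and its height (frequency)
  data.frame(x = as.numeric(names(tab)), n = as.vector(tab))
}

# Six observations with three distinct values give three spikes
small <- spike_counts(c(1, 1, 2, 3, 3, 3), bins = 100)

# A fine grid of 1000 values is reduced to at most 10 spikes
large <- spike_counts(seq(0, 1, length.out = 1000), bins = 10)
```

The spike heights in n are what a spike histogram draws at each position in x, which is why the raw data remain visible for small samples while large samples stay readable.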

Value

a plotly object. For dhistboxp, a data frame as expected by plotlyM.

Examples

# NOT RUN {
dist <- c(rep(1, 500), rep(2, 250), rep(3, 600))
Distribution <- factor(dist, 1 : 3, c('Unimodal', 'Bimodal', 'Trimodal'))
x <- c(rnorm(500, 6, 1),
       rnorm(200, 3, .7), rnorm(50, 7, .4),
       rnorm(200, 2, .7), rnorm(300, 5.5, .4), rnorm(100, 8, .4))
histboxp(x=x, group=Distribution, sd=TRUE)
X <- data.frame(x, x2=runif(length(x)))
histboxpM(x=X, group=Distribution, ncols=2)  # separate plots
# }