- data_frame
A data_frame where columns are features and rows are observations you might wish to visualize.
- var
Single string representing the name of a column of data_frame
that contains the discrete data you wish to quantify as frequencies.
- sample.by
Single string representing the name of a column of data_frame
that contains an indicator of which sample each observation belongs to.
Note that when this is not provided, there will only be one data point per grouping.
A warning can then be expected for all `plots` options except "jitter".
- group.by
Single string representing the name of a column of data_frame
containing discrete data to use for separating the data points into groups.
- color.by
Single string representing the name of a column of data_frame
containing discrete data to use for setting data representation color fills.
This data does not need to be the same as `group.by`, which is useful for highlighting supersets or subgroups, but it defaults to `group.by` so the input can often be skipped.
- vars.use
String or string vector naming a subset of the values of `var`-data which should be shown.
If left as `NULL`, all values are shown.
Hint: use `colLevels` or `unique(data_frame[,var])` to assess options.
Note: When `var.labels.rename` is used jointly to update how the `var`-values are shown, the updated values must be used.
- scale
"count" or "percent". Sets whether data should be shown as counts versus percentage.
- max.normalize
Logical which sets whether the data for each `var`-data value (each facet) should be normalized to have the same maximum value.
When set to `TRUE`, lower frequency `var`-values will make use of just as much plot space as higher frequency ones.
Note: Similarly equal plot space utilization can be achieved by using `split.adjust = list(scales = "free_y")`, and that alternative route retains the original values of the data.
- plots
String vector which sets the types of plots to include: possibilities = "jitter", "boxplot", "vlnplot", "ridgeplot".
Order matters: c("vlnplot", "boxplot", "jitter") will put a violin plot in the back, boxplot in the middle, and then individual dots in the front.
See details section for more info.
- split.nrow, split.ncol
Integers which set the dimensions of the facet grid.
- split.adjust
A named list which allows extra parameters to be pushed through to the faceting function call.
List elements should be valid inputs to the faceting function `facet_wrap`, e.g. `list(scales = "free_y")`.
See `facet_wrap` for options.
- rows.use
String vector of rownames of `data_frame` OR an integer vector specifying the row-indices of data points which should be plotted.
Alternatively, a logical vector, the same length as the number of rows in `data_frame`, where `TRUE` values indicate which rows to plot.
- data.out
Logical. When set to `TRUE`, changes the output from the plot alone to a list containing the plot (`p`) and its underlying data (`data`).
- data.only
Logical. When set to `TRUE`, the underlying data will be returned, but not the plot itself.
- do.hover
Logical which sets whether the ggplot output should be converted to a ggplotly object with data about individual bars displayed when you hover your cursor over them.
- hover.round.digits
Integer specifying the number of decimal digits to round displayed numeric values to when `do.hover` is set to `TRUE`.
- color.panel
String vector which sets the colors to draw from for data representation fills.
Default = `dittoColors()`.
A named vector can be used if names are matched to the distinct values of the `color.by` data.
- colors
Integer vector, the indexes / order of colors from `color.panel` to actually use.
Useful for quickly swapping around colors of the default set (when not using names for color matching).
- y.breaks
Numeric vector, a set of breaks that should be used as major grid lines. c(break1,break2,break3,etc.).
- min, max
Scalars which control the zoom on the continuous axis of the plot.
- var.labels.rename
String vector for renaming the distinct identities of `var`-values.
This vector must be the same length as the number of levels or unique values in the `var`-data.
Hint: use `colLevels` or `unique(data_frame[,var])` to assess the original values.
- var.labels.reorder
Integer vector. A sequence of numbers, from 1 to the number of distinct `var`-value identities, for rearranging the order of facets within the plot space.
Method: Make a first plot without this input.
Then, treat the top-left-most facet as index 1 and the bottom-right-most as index n.
Values of `var.labels.reorder` should be these indices, but in the order that you would like them rearranged to be.
- x.labels
String vector, c("label1","label2","label3",...) which overrides the names of groupings.
- x.labels.rotate
Logical which sets whether the labels should be rotated.
Default: `TRUE` for violin and box plots, but `FALSE` for ridgeplots.
- x.reorder
Integer vector. A sequence of numbers, from 1 to the number of groupings, for rearranging the order of x-axis groupings.
Method: Make a first plot without this input.
Then, treat the leftmost grouping as index 1 and the rightmost as index n.
Values of `x.reorder` should be these indices, but in the order that you would like them rearranged to be.
Recommendation for advanced users: If you find yourself coming back to this input too many times, an alternative that can be easier long-term is to make the target data into a factor and put its levels in the desired order: `factor(data, levels = c("level1", "level2", ...))`.
- theme
A ggplot theme which will be applied before internal adjustments.
Default = `theme_classic()`.
See https://ggplot2.tidyverse.org/reference/ggtheme.html for other options and ideas.
- xlab
String which sets the grouping-axis label (=x-axis for box and violin plots, y-axis for ridgeplots).
Set to `NULL` to remove.
- ylab
String, sets the continuous-axis label (=y-axis for box and violin plots, x-axis for ridgeplots).
Default = "make"; if left as "make", this label will be automatically generated.
- main
String, sets the plot title. Default = "make"; if left as "make", a title will be automatically generated. To remove, set to `NULL`.
- sub
String, sets the plot subtitle.
- jitter.size
Scalar which sets the size of the jitter shapes.
- jitter.width
Scalar that sets the width/spread of the jitter in the x direction. Ignored in ridgeplots.
Note for when `color.by` is used to split x-axis groupings into additional bins: ggplot does not shrink jitter widths accordingly, so be sure to do so yourself!
Ideally, it should be 0.5/num_subgroups.
- jitter.color
String which sets the color of the jitter shapes.
- jitter.position.dodge
Scalar which adjusts the relative distance between jitter widths when multiple subgroups exist per `group.by` grouping (a.k.a. when `group.by` and `color.by` are not equal).
Similar to the `boxplot.position.dodge` input, and defaults to the value of that input so that BOTH will actually be adjusted when only, say, `boxplot.position.dodge = 0.3` is given.
- do.raster
Logical. When set to `TRUE`, rasterizes the jitter plot layer, changing it from individually encoded points to a flattened set of pixels.
This can be useful for editing in external programs (e.g. Illustrator) when there are many thousands of data points.
- raster.dpi
Number indicating dots/pixels per inch (dpi) to use for rasterization. Default = 300.
- boxplot.width
Scalar which sets the width/spread of the boxplot in the x direction.
- boxplot.color
String which sets the color of the lines of the boxplot.
- boxplot.show.outliers
Logical, whether outliers should be included in the boxplot.
Default is `FALSE` when there is a jitter plotted, `TRUE` if there is no jitter.
- boxplot.outlier.size
Scalar which adjusts the size of points used to mark outliers.
- boxplot.fill
Logical, whether the boxplot should be filled in or not.
Known bug: when boxplot fill is turned off, outliers do not render.
- boxplot.position.dodge
Scalar which adjusts the relative distance between boxplots when multiple are drawn per grouping (a.k.a. when `group.by` and `color.by` are not equal).
By default, this input also controls the value of `jitter.position.dodge` unless the jitter version is provided separately.
- boxplot.lineweight
Scalar which adjusts the thickness of boxplot lines.
- vlnplot.lineweight
Scalar which sets the thickness of the line that outlines the violin plots.
- vlnplot.width
Scalar which sets the width/spread of violin plots in the x direction.
- vlnplot.scaling
String which sets how the widths of the violin plots are set in relation to each other.
Options are "area", "count", and "width". If the default is not right for your data, I recommend trying "width".
For an explanation of each, see `geom_violin`.
- vlnplot.quantiles
Single number or numeric vector of values in [0,1] naming quantiles at which to draw a horizontal line within each violin plot. Example: c(0.1, 0.5, 0.9)
- ridgeplot.lineweight
Scalar which sets the thickness of the ridgeplot outline.
- ridgeplot.scale
Scalar which sets the distance/overlap between ridgeplots.
A value of 1 means the tallest density curve just touches the baseline of the next higher one.
Higher numbers lead to greater overlap. Default = 1.25
- ridgeplot.ymax.expansion
Scalar which adjusts the minimal space between the topmost grouping and the top of the plot in order to ensure the curve is not cut off by the plotting grid.
The larger the value, the greater the space requested.
When left as `NA`, dittoViz will attempt to determine an ideal value itself based on the number of groups & linear interpolation between these goal posts: #groups of 3 or fewer: 0.6; #groups=12: 0.1; #groups of 34 or greater: 0.05.
- ridgeplot.shape
Either "smooth" or "hist", sets whether ridges will be smoothed (the typical, and default) versus rectangular like a histogram.
(Note: as of the time shape "hist" was added, combination with jittered points is not supported by the `stat_binline` that dittoViz relies on.)
- ridgeplot.bins
Integer which sets how many chunks to break the x-axis into when `ridgeplot.shape = "hist"`.
Overridden by `ridgeplot.binwidth` when that input is provided.
- ridgeplot.binwidth
Integer which sets the width of chunks to break the x-axis into when `ridgeplot.shape = "hist"`.
Takes precedence over `ridgeplot.bins` when provided.
- add.line
Numeric value(s) at which one or more lines should be added.
- line.linetype
String which sets the type of line for `add.line`.
Defaults to "dashed", but any ggplot linetype will work.
- line.color
String that sets the color(s) of the `add.line` line(s).
- legend.show
Logical. Whether the legend should be displayed. Default = `TRUE`.
- legend.title
String or `NULL`, sets the title for the main legend which includes colors and data representations.
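
A few of the calculations mentioned in the argument descriptions above can be sketched in base R. This is only an illustration under stated assumptions: the variable names and the `guess_ymax_expansion` helper are hypothetical and not part of dittoViz, and the interpolation simply follows the goal posts listed for ridgeplot.ymax.expansion, so the package's actual implementation may differ.

```r
# Base R only. Nothing here is a dittoViz function; all names are hypothetical.

# x.reorder alternative: fix the grouping order through factor levels
# instead of passing reordering indices.
groups <- c("treated", "control", "treated", "baseline")
groups <- factor(groups, levels = c("baseline", "control", "treated"))
levels(groups)

# jitter.width: shrink the jitter spread by the number of color.by subgroups
# sharing each x-axis grouping (0.5 / num_subgroups).
num_subgroups <- 2
jitter_width <- 0.5 / num_subgroups

# ridgeplot.ymax.expansion: linear interpolation between the documented goal
# posts (3 groups: 0.6, 12 groups: 0.1, 34 groups: 0.05). rule = 2 reuses the
# nearest goal post for group counts outside the 3 to 34 range.
guess_ymax_expansion <- function(n_groups) {
  approx(x = c(3, 12, 34), y = c(0.6, 0.1, 0.05),
         xout = n_groups, rule = 2)$y
}
guess_ymax_expansion(c(2, 12, 40))
```

The factor approach changes the data itself, so the chosen order carries over to every later plot without needing x.reorder again.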