# grouped_ggscatterstats

##### Scatterplot with marginal distributions for all levels of a grouping variable

Grouped scatterplots from `ggplot2` combined with marginal histograms/boxplots/density plots, with statistical details added as a subtitle.

##### Usage

```
grouped_ggscatterstats(data, x, y, type = "pearson", conf.level = 0.95,
  bf.prior = 0.707, bf.message = TRUE, label.var = NULL,
  label.expression = NULL, grouping.var, title.prefix = NULL,
  xlab = NULL, ylab = NULL, method = "lm", method.args = list(),
  formula = y ~ x, point.color = "black", point.size = 3,
  point.alpha = 0.4, line.size = 1.5, point.width.jitter = 0,
  point.height.jitter = 0, line.color = "blue", marginal = TRUE,
  marginal.type = "histogram", marginal.size = 5,
  margins = c("both", "x", "y"), package = "wesanderson", palette = "Royal1",
  direction = 1, xfill = "#009E73", yfill = "#D55E00", xalpha = 1,
  yalpha = 1, xsize = 0.7, ysize = 0.7, centrality.para = NULL,
  results.subtitle = TRUE, stat.title = NULL, caption = NULL,
  subtitle = NULL, nboot = 100, beta = 0.1, k = 2,
  axes.range.restrict = FALSE, ggtheme = ggplot2::theme_bw(),
  ggstatsplot.layer = TRUE, ggplot.component = NULL, return = "plot",
  messages = TRUE, ...)
```
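
The `margins = c("both", "x", "y")` default in the signature follows a common R idiom: the default lists every allowed choice, and the first one is used when the argument is omitted. The sketch below shows that idiom with base R's `match.arg()`. `demo_margins()` is a hypothetical helper written for illustration, and whether `grouped_ggscatterstats()` resolves the argument exactly this way internally is an assumption.

```r
# Hypothetical helper illustrating the "vector of choices" default idiom.
demo_margins <- function(margins = c("both", "x", "y")) {
  # match.arg() returns the first choice when the argument is left at its
  # default, and validates any user-supplied value against the choices.
  match.arg(margins)
}

default_choice <- demo_margins()      # argument omitted
explicit_choice <- demo_margins("y")  # one choice supplied explicitly
```

Passing a value outside the listed choices, such as `demo_margins("z")`, raises an error.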

##### Arguments

- data
A dataframe (or a tibble) from which the specified variables are to be taken. A matrix or table will **not** be accepted.

- x
The column in `data` containing the explanatory variable to be plotted on the x axis. Can be entered either as a character string (e.g., `"x"`) or as a bare expression (e.g., `x`).

- y
The column in `data` containing the response (outcome) variable to be plotted on the y axis. Can be entered either as a character string (e.g., `"y"`) or as a bare expression (e.g., `y`).

- type
Type of association between paired samples required (`"parametric"`: Pearson's product moment correlation coefficient; `"nonparametric"`: Spearman's rho; `"robust"`: percentage bend correlation coefficient; `"bayes"`: Bayes Factor for Pearson's *r*). Corresponding abbreviations are also accepted: `"p"` (for parametric/Pearson's), `"np"` (nonparametric/Spearman), `"r"` (robust), `"bf"` (for Bayes factor), respectively.

- conf.level
Scalar between 0 and 1. If unspecified, the default returns `95%` lower and upper confidence intervals (`0.95`).

- bf.prior
A number between 0.5 and 2 (default `0.707`), the prior width to use in calculating Bayes factors.

- bf.message
Logical that decides whether to display the Bayes Factor in favor of the *null* hypothesis. This argument is relevant only **for parametric test** (Default: `TRUE`).

- label.var
Variable to use for point labels. Can be entered either as a character string (e.g., `"var1"`) or as a bare expression (e.g., `var1`).

- label.expression
An expression evaluating to a logical vector that determines the subset of data points to label. This argument can be entered either as a character string (e.g., `"y < 4 & z < 20"`) or as a bare expression (e.g., `y < 4 & z < 20`).

- grouping.var
A single grouping variable (can be entered either as a bare name `x` or as a string `"x"`).

- title.prefix
Character string specifying the prefix text for the fixed plot title (name of each factor level) (Default: `NULL`). If `NULL`, the variable name entered for `grouping.var` will be used.

- xlab
Labels for `x` and `y` axis variables. If `NULL` (default), variable names for `x` and `y` will be used.

- ylab
Labels for `x` and `y` axis variables. If `NULL` (default), variable names for `x` and `y` will be used.

- method
Smoothing method (function) to use. Accepts either a character vector, e.g. `"auto"`, `"lm"`, `"glm"`, `"gam"`, `"loess"`, or a function, e.g. `MASS::rlm`, `mgcv::gam`, `stats::lm`, or `stats::loess`.

For `method = "auto"` the smoothing method is chosen based on the size of the largest group (across all panels). `loess()` is used for fewer than 1,000 observations; otherwise `mgcv::gam()` is used with `formula = y ~ s(x, bs = "cs")`. Somewhat anecdotally, `loess` gives a better appearance, but is \(O(N^{2})\) in memory, so does not work for larger datasets.

If you have fewer than 1,000 observations but want to use the same `gam()` model that `method = "auto"` would use, then set `method = "gam", formula = y ~ s(x, bs = "cs")`.

- method.args
List of additional arguments passed on to the modelling function defined by `method`.

- formula
Formula to use in smoothing function, e.g. `y ~ x`, `y ~ poly(x, 2)`, `y ~ log(x)`.

- point.color
Aesthetics specifying geom point (defaults: `point.color = "black"`, `point.size = 3`, `point.alpha = 0.4`).

- point.size
Aesthetics specifying geom point (defaults: `point.color = "black"`, `point.size = 3`, `point.alpha = 0.4`).

- point.alpha
Aesthetics specifying geom point (defaults: `point.color = "black"`, `point.size = 3`, `point.alpha = 0.4`).

- line.size
Size for the regression line.

- point.width.jitter
Degree of jitter in `x` and `y` direction, respectively. Defaults to `0` (no jitter).

- point.height.jitter
Degree of jitter in `x` and `y` direction, respectively. Defaults to `0` (no jitter).

- line.color
Color for the regression line.

- marginal
Decides whether `ggExtra::ggMarginal()` plots will be displayed; the default is `TRUE`.

- marginal.type
Type of marginal distribution to be plotted on the axes (`"histogram"`, `"boxplot"`, `"density"`, `"violin"`, `"densigram"`).

- marginal.size
Integer describing the relative size of the marginal plots compared to the main plot. A size of `5` means that the main plot is 5x wider and 5x taller than the marginal plots.

- margins
Character describing along which margins to show the plots. Any of the following arguments are accepted: `"both"`, `"x"`, `"y"`.

- package
Name of package from which the palette is desired as string or symbol.

- palette
Name of palette as string or symbol.

- direction
Either `1` or `-1`. If `-1`, the palette will be reversed.

- xfill
Character describing color fill for `x` and `y` axes marginal distributions (default: `"#009E73"` (for `x`) and `"#D55E00"` (for `y`)). If set to `NULL`, manual specification of colors will be turned off and 2 colors from the specified `palette` from `package` will be selected.

- yfill
Character describing color fill for `x` and `y` axes marginal distributions (default: `"#009E73"` (for `x`) and `"#D55E00"` (for `y`)). If set to `NULL`, manual specification of colors will be turned off and 2 colors from the specified `palette` from `package` will be selected.

- xalpha
Numeric deciding transparency levels for the marginal distributions. Any number from `0` (transparent) to `1` (opaque). The default is `1` for both axes.

- yalpha
Numeric deciding transparency levels for the marginal distributions. Any number from `0` (transparent) to `1` (opaque). The default is `1` for both axes.

- xsize
Size for the marginal distribution boundaries (Default: `0.7`).

- ysize
Size for the marginal distribution boundaries (Default: `0.7`).

- centrality.para
Decides *which* measure of central tendency (`"mean"` or `"median"`) is to be displayed as vertical (for `x`) and horizontal (for `y`) lines.

- results.subtitle
Decides whether the results of statistical tests are to be displayed as a subtitle (Default: `TRUE`). If set to `FALSE`, only the plot will be returned.

- stat.title
A character describing the test being run, which will be added as a prefix in the subtitle. The default is `NULL`. An example of a `stat.title` argument will be something like `"Student's t-test: "`.

- caption
The text for the plot caption.

- subtitle
The text for the plot subtitle. Will work only if `results.subtitle = FALSE`.

- nboot
Number of bootstrap samples for computing confidence interval for the effect size (Default: `100`).

- beta
Bending constant (Default: `0.1`). For more, see `?WRS2::pbcor`.

- k
Number of digits after decimal point (should be an integer) (Default: `k = 2`).

- axes.range.restrict
Logical that decides whether to restrict the axes values ranges to `min` and `max` values of the axes variables (Default: `FALSE`), only relevant for functions where axes variables are of numeric type.

- ggtheme
A function, `ggplot2` theme name. Default value is `ggplot2::theme_bw()`. Any of the `ggplot2` themes, or themes from extension packages are allowed (e.g., `ggthemes::theme_fivethirtyeight()`, `hrbrthemes::theme_ipsum_ps()`, etc.).

- ggstatsplot.layer
Logical that decides whether `theme_ggstatsplot` theme elements are to be displayed along with the selected `ggtheme` (Default: `TRUE`).

- ggplot.component
A `ggplot` component to be added to the plot prepared by `ggstatsplot`. This argument is primarily helpful for the `grouped_` variant of the current function. Default is `NULL`. The argument should be entered as a function. If the given function has an argument `axes.range.restrict` and if it has been set to `TRUE`, the added ggplot component *might* not work as expected.

- return
Character that describes what is to be returned: can be `"plot"` (default) or `"subtitle"` or `"caption"`. Setting this to `"subtitle"` will return the expression containing statistical results, which will be `NULL` if you set `results.subtitle = FALSE`. Setting this to `"caption"` will return the expression containing details about Bayes Factor analysis, but valid only when `type = "p"` and `bf.message = TRUE`, otherwise this will return `NULL`.

- messages
Decides whether messages, references, notes, and warnings are to be displayed (Default: `TRUE`).

- ...
Arguments passed on to `combine_plots`:

  - title.text
  String or plotmath expression to be drawn as title for the *combined plot*.

  - title.color
  Text color for title.

  - title.size
  Point size of title text.

  - title.vjust
  Vertical justification for title. Default = `0.5` (centered on `y`). `0` = baseline at `y`, `1` = ascender at `y`.

  - title.hjust
  Horizontal justification for title. Default = `0.5` (centered on `x`). `0` = flush-left at `x`, `1` = flush-right.

  - title.fontface
  The font face (`"plain"`, `"bold"` (default), `"italic"`, `"bold.italic"`) for title.

  - caption.text
  String or plotmath expression to be drawn as the caption for the *combined plot*.

  - caption.color
  Text color for caption.

  - caption.size
  Point size of caption text.

  - caption.vjust
  Vertical justification for caption. Default = `0.5` (centered on `y`). `0` = baseline at `y`, `1` = ascender at `y`.

  - caption.hjust
  Horizontal justification for caption. Default = `0.5` (centered on `x`). `0` = flush-left at `x`, `1` = flush-right.

  - caption.fontface
  The font face (`"plain"` (default), `"bold"`, `"italic"`, `"bold.italic"`) for caption.

  - sub.text
  The label with which the *combined plot* should be annotated. Can be a plotmath expression.

  - sub.color
  Text color for annotation label (Default: `"black"`).

  - sub.size
  Point size of annotation text (Default: `12`).

  - sub.x
  The x position of annotation label (Default: `0.5`).

  - sub.y
  The y position of annotation label (Default: `0.5`).

  - sub.hjust
  Horizontal justification for annotation label (Default: `0.5`).

  - sub.vjust
  Vertical justification for annotation label (Default: `0.5`).

  - sub.vpadding
  Vertical padding. The total vertical space added to the label, given in grid units. By default, this is added equally above and below the label. However, by changing the y and vjust parameters, this can be changed (Default: `grid::unit(1, "lines")`).

  - sub.fontface
  The font face (`"plain"` (default), `"bold"`, `"italic"`, `"bold.italic"`) for the annotation label.

  - sub.angle
  Angle at which annotation label is to be drawn (Default: `0`).

  - sub.lineheight
  Line height of annotation label.

  - title.caption.rel.heights
  Numerical vector of relative heights while combining (title, plot, caption).

  - title.rel.heights
  Numerical vector of relative heights while combining (title, plot).

  - caption.rel.heights
  Numerical vector of relative heights while combining (plot, caption).
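
The `method = "auto"` rule described under `method` can be sketched in a few lines of base R. `choose_smoother()` is a hypothetical helper, not part of `ggstatsplot` or `ggplot2`; it only mirrors the documented threshold (fewer than 1,000 observations in the largest group selects `loess`, otherwise `gam` with `y ~ s(x, bs = "cs")`).

```r
# Hypothetical helper mirroring the documented method = "auto" rule.
# group_sizes: number of observations in each panel/group.
choose_smoother <- function(group_sizes) {
  n_max <- max(group_sizes)  # the rule looks at the largest group
  if (n_max < 1000) {
    # y ~ x is the default formula shown in the Usage section
    list(method = "loess", formula = y ~ x)
  } else {
    # Building the formula does not evaluate s(), so mgcv is not needed here.
    list(method = "gam", formula = y ~ s(x, bs = "cs"))
  }
}

small <- choose_smoother(c(120, 450))    # largest group: 450
large <- choose_smoother(c(120, 2500))   # largest group: 2500
```

The returned values could then be supplied as `method` and `formula` in a call, for example to force the `gam` model on a small dataset as the `method` entry suggests.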

##### References

https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggscatterstats.html

##### Examples

```
# NOT RUN {
# to ensure reproducibility
set.seed(123)

# basic function call
ggstatsplot::grouped_ggscatterstats(
  data = dplyr::filter(
    ggstatsplot::movies_long,
    genre == "Comedy" |
      genre == "Drama"
  ),
  x = length,
  y = rating,
  method = "lm",
  formula = y ~ x + I(x^3),
  grouping.var = genre
)

# using labeling
# (also show how to modify basic plot from within function call)
ggstatsplot::grouped_ggscatterstats(
  data = dplyr::filter(ggplot2::mpg, cyl != 5),
  x = displ,
  y = hwy,
  grouping.var = cyl,
  title.prefix = "Cylinder count",
  type = "robust",
  label.var = manufacturer,
  label.expression = hwy > 25 & displ > 2.5,
  xfill = NULL,
  ggplot.component = ggplot2::scale_y_continuous(sec.axis = ggplot2::dup_axis()),
  package = "yarrr",
  palette = "appletv",
  messages = FALSE
)

# labeling without expression
ggstatsplot::grouped_ggscatterstats(
  data = dplyr::filter(
    .data = ggstatsplot::movies_long,
    rating == 7,
    genre %in% c("Drama", "Comedy")
  ),
  x = budget,
  y = length,
  grouping.var = genre,
  bf.message = FALSE,
  label.var = "title",
  marginal = FALSE,
  title.prefix = "Genre",
  caption.text = "All movies have IMDB rating equal to 7."
)
# }
```
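
Before passing a `label.expression`, it can help to check which rows it selects. The sketch below evaluates the same kind of logical condition with base R on the built-in `mtcars` data. The column choices and thresholds are illustrative assumptions, and this is not a description of how `ggstatsplot` evaluates the argument internally.

```r
# Preview which points a labeling condition would select (base R only).
df <- mtcars
df$car <- rownames(mtcars)  # a column that could serve as label.var

# Bare-expression form: evaluated inside the data frame.
keep <- with(df, mpg > 30 & wt < 2)

# String form: parse the text, then evaluate it in the same data frame.
keep_from_string <- eval(parse(text = "mpg > 30 & wt < 2"), envir = df)

labelled <- df$car[keep]  # the labels that would be drawn
```

Both forms yield one logical value per row, which matches the requirement that the expression evaluate to a logical vector.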

*Documentation reproduced from package ggstatsplot, version 0.0.11, License: GPL-3 | file LICENSE*