visualize: Visualize statistical inference

Description

Visualize the distribution of the simulation-based inferential statistics or the theoretical distribution (or both!).

Usage

visualize(data, bins = 15, method = "simulation",
  dens_color = "black", obs_stat = NULL, obs_stat_color = "red2",
  pvalue_fill = "pink", direction = NULL, endpoints = NULL,
  endpoints_color = "mediumaquamarine", ci_fill = "turquoise", ...)
visualise(data, bins = 15, method = "simulation",
  dens_color = "black", obs_stat = NULL, obs_stat_color = "red2",
  pvalue_fill = "pink", direction = NULL, endpoints = NULL,
  endpoints_color = "mediumaquamarine", ci_fill = "turquoise", ...)

Arguments

data

The output from calculate().

bins

The number of bins in the histogram.

method

A string giving the method to display. Options are "simulation", "theoretical", or "both", where "both" overlays the theoretical distribution on the simulation-based one.

dens_color

A character or hex string specifying the color of the theoretical density curve.

obs_stat

A numeric value or 1 x 1 data frame giving the observed statistic. Deprecated (see Details).

obs_stat_color

A character or hex string specifying the color of the observed statistic as a vertical line on the plot. Deprecated (see Details).

pvalue_fill

A character or hex string specifying the color to shade the p-value. In previous versions of the package this was the shade_color argument. Deprecated (see Details).

direction

A string specifying the direction in which the shading should occur. For a p-value, the options are "less", "greater", or "two_sided"; the synonyms "left", "right", and "both" are also accepted. For confidence intervals, use "between" and give the endpoint values in endpoints. Deprecated (see Details).

endpoints

A 2-element vector or a 1 x 2 data frame containing the lower and upper values to be plotted. Most useful for visualizing confidence intervals. Deprecated (see Details).

endpoints_color

A character or hex string specifying the color of the endpoints as vertical lines on the plot. Deprecated (see Details).

ci_fill

A character or hex string specifying the color to shade the confidence interval. Deprecated (see Details).

...

Other arguments passed along to {ggplot2} functions.

Value

A ggplot object showing the simulation-based distribution as a histogram or bar graph, the theoretical distribution as a density curve, or both, depending on method.
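
Because the return value is a ggplot object, it can in principle be extended with ordinary {ggplot2} layers. The following is a minimal sketch, assuming {infer}, {dplyr}, and {ggplot2} are installed; the object names null_dist and p, the title text, and the seed are illustrative choices, not part of the package.

```r
library(infer)    # also re-exports the %>% pipe
library(ggplot2)

set.seed(1)  # permutations are random; the seed is only for reproducibility

# Simulation-based null distribution of the t statistic
null_dist <- mtcars %>%
  dplyr::mutate(am = factor(am)) %>%
  specify(mpg ~ am) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  calculate(stat = "t", order = c("1", "0"))

# visualize() returns a ggplot object, so further layers can be added with `+`
p <- visualize(null_dist, bins = 20) +
  labs(title = "Permutation null distribution", x = "t statistic") +
  theme_minimal()

p
```

Storing the plot in a variable before printing makes it easy to reuse the same base plot with different annotations.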

Details

To make the visualization workflow more straightforward and explicit, visualize() should now be used only to plot the statistics themselves. Arguments not related to this task are therefore deprecated and will be removed in a future release of {infer}.

To add p-value information to the plot, use shade_p_value(). To add confidence interval information, use shade_confidence_interval().
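
The workflow that replaces the deprecated arguments might look like the sketch below. It assumes a version of {infer} in which shade_p_value() and shade_confidence_interval() are layers added to the visualize() output with `+`, and in which get_confidence_interval() is available; the object names (mt, obs_t, null_dist, boot_dist, ci, p_pvalue, p_ci) and the seed are illustrative only.

```r
library(infer)  # also re-exports the %>% pipe

set.seed(1)  # resampling is random; the seed is only for reproducibility

mt <- dplyr::mutate(mtcars, am = factor(am))

# Observed t statistic (no hypothesize()/generate() step)
obs_t <- mt %>%
  specify(mpg ~ am) %>%
  calculate(stat = "t", order = c("1", "0"))

# Simulation-based null distribution of the same statistic
null_dist <- mt %>%
  specify(mpg ~ am) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  calculate(stat = "t", order = c("1", "0"))

# Plot the null distribution, then shade the p-value region as a separate layer
p_pvalue <- visualize(null_dist) +
  shade_p_value(obs_stat = obs_t, direction = "greater")

# Bootstrap distribution of the mean, used here for a confidence interval
boot_dist <- mt %>%
  specify(response = mpg) %>%
  generate(reps = 100, type = "bootstrap") %>%
  calculate(stat = "mean")

ci <- get_confidence_interval(boot_dist, level = 0.95)

# Plot the bootstrap distribution, then shade the interval as a separate layer
p_ci <- visualize(boot_dist) +
  shade_confidence_interval(endpoints = ci)
```

Keeping the shading in its own layer means the observed statistic and the interval endpoints are computed explicitly rather than passed through visualize().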

Examples

# NOT RUN {
library(infer) # also re-exports the %>% pipe used below

# Permutations to create a simulation-based null distribution for
# one numerical response and one categorical predictor
# using t statistic
mtcars %>%
  dplyr::mutate(am = factor(am)) %>%
  specify(mpg ~ am) %>% # alt: response = mpg, explanatory = am
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  calculate(stat = "t", order = c("1", "0")) %>%
  visualize(method = "simulation") # default method

# Theoretical t distribution for
# one numerical response and one categorical predictor
# using t statistic
mtcars %>%
  dplyr::mutate(am = factor(am)) %>%
  specify(mpg ~ am) %>% # alt: response = mpg, explanatory = am
  hypothesize(null = "independence") %>%
  # generate() is not needed since we are not doing simulation
  calculate(stat = "t", order = c("1", "0")) %>%
  visualize(method = "theoretical")

# Overlay theoretical distribution on top of randomized t-statistics
mtcars %>%
  dplyr::mutate(am = factor(am)) %>%
  specify(mpg ~ am) %>% # alt: response = mpg, explanatory = am
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  calculate(stat = "t", order = c("1", "0")) %>%
  visualize(method = "both")

# }