Creates a scatter plot and calculates a correlation between two variables.
scatterplot(
data = NULL,
x_var_name = NULL,
y_var_name = NULL,
dot_label_var_name = NULL,
weight_var_name = NULL,
alpha = 1,
annotate_stats = TRUE,
annotate_y_pos = 5,
annotated_stats_color = "green4",
annotated_stats_font_size = 6,
annotated_stats_font_face = "bold",
line_of_fit_type = "lm",
ci_for_line_of_fit = FALSE,
line_of_fit_color = "blue",
dot_color = "black",
x_axis_label = NULL,
y_axis_label = NULL,
dot_size = 2,
dot_label_size = NULL,
dot_size_range = c(3, 12),
jitter_x_percent = 0,
jitter_y_percent = 0,
cap_axis_lines = FALSE,
color_dots_by = NULL
)
the output will be a scatter plot, a ggplot object.
data: a data object (a data frame or a data.table)
x_var_name: name of the variable that will go on the x axis
y_var_name: name of the variable that will go on the y axis
dot_label_var_name: name of the variable that will be used to label individual observations
weight_var_name: name of the variable by which to weight the individual observations for calculating correlation and plotting the line of fit
alpha: opacity of the dots (0 = completely transparent, 1 = completely opaque)
annotate_stats: if TRUE, the correlation and p-value will be annotated at the top of the plot (default = TRUE)
annotate_y_pos: position of the annotated stats, expressed as a percentage of the range of y values by which the annotated stats will be placed above the maximum value of y in the data set (default = 5). If annotate_y_pos = 5, and the minimum and maximum y values in the data set are 0 and 100, respectively, the annotated stats will be placed at 5% of the y range (100 - 0) above the maximum y value, y = 0.05 * (100 - 0) + 100 = 105.
annotated_stats_color: color of the annotated stats (default = "green4").
annotated_stats_font_size: font size of the annotated stats (default = 6).
annotated_stats_font_face: font face of the annotated stats (default = "bold").
line_of_fit_type: if line_of_fit_type = "lm", a regression line will be fit; if line_of_fit_type = "loess", a local regression line will be fit; if line_of_fit_type = "none", no line will be fit
ci_for_line_of_fit: if ci_for_line_of_fit = TRUE, the confidence interval for the line of fit will be shaded
line_of_fit_color: color of the line of fit (default = "blue")
dot_color: color of the dots (default = "black")
x_axis_label: alternative label for the x axis
y_axis_label: alternative label for the y axis
dot_size: size of the dots on the plot (default = 2)
dot_label_size: size for dots' labels on the plot. If no input is entered for this argument, it will be set as dot_label_size = 5 by default. If the plot is to be weighted by some variable, this argument will be ignored, and dot sizes will be determined by the argument dot_size_range
dot_size_range: minimum and maximum size for dots on the plot when they are weighted
jitter_x_percent: horizontally jitter dots by a percentage of the range of x values
jitter_y_percent: vertically jitter dots by a percentage of the range of y values
cap_axis_lines: logical. Should the axis lines be capped at the outer tick marks? (default = FALSE)
color_dots_by: name of the variable that will determine colors of the dots
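The arithmetic behind annotate_y_pos can be checked with a few lines of base R. This is only a sketch of the formula stated in the argument description; the helper name annotation_y is hypothetical and is not part of the package, and the package's internal implementation may differ.

```r
# Hypothetical helper (not a package function): y coordinate at which the
# annotated stats would be placed, following the formula in the description:
# max(y) + annotate_y_pos / 100 * (max(y) - min(y))
annotation_y <- function(y, annotate_y_pos = 5) {
  y_max <- max(y, na.rm = TRUE)
  y_min <- min(y, na.rm = TRUE)
  y_max + annotate_y_pos / 100 * (y_max - y_min)
}

# Worked example from the description: y ranges from 0 to 100
annotation_y(c(0, 25, 100))  # 105

# Same calculation for the y variable used in the examples
annotation_y(mtcars$mpg)
```

A larger annotate_y_pos moves the annotation further above the highest data point, which may help when the stats overlap dots or labels near the top of the plot.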
If a weighted correlation is to be calculated, the following package(s) must be installed prior to running the function: Package 'weights' v1.0 (or possibly a higher version) by John Pasek (2018), https://cran.r-project.org/package=weights
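As a rough illustration of what a weighted correlation means, the sketch below first checks whether the 'weights' package is available and then computes a weighted Pearson correlation with stats::cov.wt from base R. The helper weighted_cor is hypothetical and is a stand-in for illustration only; it is not claimed to be how scatterplot computes its statistic, and it does not produce a p-value.

```r
# Check for the required package without attaching it
if (!requireNamespace("weights", quietly = TRUE)) {
  message("Package 'weights' is not installed; ",
          "install.packages(\"weights\") would add it.")
}

# Hypothetical helper (not a package function): weighted Pearson
# correlation via stats::cov.wt, with weights normalized to sum to 1
weighted_cor <- function(x, y, w) {
  fit <- stats::cov.wt(cbind(x, y), wt = w / sum(w), cor = TRUE)
  fit$cor[1, 2]
}

# Correlation of wt and mpg, weighting each car by drat
weighted_cor(mtcars$wt, mtcars$mpg, mtcars$drat)

# With equal weights the result matches the ordinary correlation
weighted_cor(mtcars$wt, mtcars$mpg, rep(1, nrow(mtcars)))
cor(mtcars$wt, mtcars$mpg)
```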
if (FALSE) {
scatterplot(data = mtcars, x_var_name = "wt", y_var_name = "mpg")
scatterplot(
data = mtcars, x_var_name = "wt", y_var_name = "mpg",
dot_label_var_name = "hp", weight_var_name = "drat",
annotate_stats = TRUE)
scatterplot(
data = mtcars, x_var_name = "wt", y_var_name = "mpg",
dot_label_var_name = "hp", weight_var_name = "cyl",
dot_label_size = 7, annotate_stats = TRUE)
}