rfriend (version 1.0.0)

f_boxplot: Generate a Boxplot Report of a data.frame

Description

Generates boxplots for all numeric variables in a given dataset, grouped by factor variables. The function automatically detects numeric and factor variables. It produces a 'pdf' or 'Word' report (output_type also accepts "rmd" and "png") and includes an option to add a general explanation about interpreting boxplots.

Usage

f_boxplot(
  data = NULL,
  formula = NULL,
  fancy_names = NULL,
  output_type = "pdf",
  output_file = NULL,
  output_dir = NULL,
  save_in_wdir = FALSE,
  close_generated_files = FALSE,
  open_generated_files = TRUE,
  boxplot_explanation = TRUE,
  detect_factors = TRUE,
  jitter = FALSE,
  width = 8,
  height = 7,
  units = "in",
  res = 300,
  las = 2
)

Value

Generates a report file ('pdf' or 'Word') with boxplots and, optionally, opens it with the default program. Returns NULL (no R object) when generating 'pdf' or 'Word' files. Depending on the output format, it can also return R Markdown code or produce 'PNG' files.

Arguments

data

A data.frame containing the data to be used for creating boxplots.

formula

A formula specifying the factor to be plotted. More response variables can be added using - or + (e.g., response1 + response2 ~ predictor) to generate multiple boxplots. If the formula is omitted and only data is provided, all data will be used for creating boxplots.
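
As a rough illustration of how a multi-response formula splits into response and predictor names, the base R sketch below inspects one with all.vars(). It only examines the formula object; it is not taken from the rfriend source and may differ from how f_boxplot() parses its input.

```r
# Illustrative only: inspect a multi-response formula with base R.
f <- hp + disp ~ gear * cyl

responses  <- all.vars(f[[2]])  # variables on the left-hand side
predictors <- all.vars(f[[3]])  # variables on the right-hand side

print(responses)
print(predictors)
```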

fancy_names

An optional named vector mapping column names in data to more readable names for display in plots (name map). Defaults to NULL.
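
A name map of this kind is an ordinary named character vector. The base R sketch below shows one way such a lookup could work; falling back to the original column name when no entry is supplied is an assumption made for illustration, not documented behaviour of f_boxplot().

```r
# Illustrative only: look up display names from a named character vector.
data(iris)
new_names <- c(
  "Sepal.Length" = "Sepal length (cm)",
  "Species"      = "Cultivar"
)

cols <- names(iris)
# Use the readable name where one is supplied, otherwise keep the column name
# (assumed fallback, for illustration).
labels <- ifelse(cols %in% names(new_names), new_names[cols], cols)
names(labels) <- cols
print(labels)
```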

output_type

Character string, specifying the output format: "pdf", "word", "rmd" or "png". Default is "pdf".

output_file

A character string, specifying the name of the output file (without extension). If NULL, a default name based on the dataset is generated.

output_dir

Character string specifying the directory of the output file. Default is tempdir(). If output_file already contains a directory name, output_dir can be omitted; if used, it overrides the directory specified in output_file.

save_in_wdir

Logical. If TRUE, saves the file in the working directory. Default is FALSE, to avoid unintended changes to the global environment. If output_dir is specified, it overrides save_in_wdir.

close_generated_files

Logical. If TRUE, closes open 'Word' files, depending on the output format, so that the newly generated files can be saved. 'Pdf' files cannot be closed automatically and should be closed manually before using the function. Default is FALSE.

open_generated_files

Logical. If TRUE, opens the generated output file ('pdf', 'Word' or 'png', depending on the output format) so the results can be viewed directly after creation. Files are stored in tempdir(). Default is TRUE.

boxplot_explanation

A logical value indicating whether to include an explanation of how to interpret boxplots in the report. Defaults to TRUE.

detect_factors

A logical value indicating whether to automatically detect factor variables in the dataset. Defaults to TRUE.

jitter

A logical value. If TRUE, all data points are shown for each boxplot; if FALSE (default), individual data points (except for outliers) are omitted.
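
For readers unfamiliar with jittered points, the base R sketch below overlays all observations on a boxplot using stripchart(). It illustrates the general idea only; how f_boxplot() draws its jitter layer internally is not described here and may differ.

```r
# Illustrative only: overlay jittered points on a base R boxplot.
data(iris)
pdf(NULL)  # null device, so no file is written

# outline = FALSE hides outlier symbols so points are not drawn twice.
bp <- boxplot(Sepal.Length ~ Species, data = iris, outline = FALSE)
stripchart(Sepal.Length ~ Species, data = iris,
           method = "jitter", vertical = TRUE,
           pch = 16, add = TRUE)

invisible(dev.off())
```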

width

Numeric. Width of the png figure. Default is 8 (inches).

height

Numeric. Height of the png figure. Default is 7 (inches).

units

Character string. Units of the png figure. Default is "in" (inches); other options are "px" (pixels), "cm" (centimeters) and "mm" (millimeters).

res

Numeric. Resolution of the png figure. Default is 300 dpi.

las

An integer (0 to 3). las = 0: axis labels are parallel to the axis. las = 1: axis labels are always horizontal. las = 2: axis labels are perpendicular to the axis (default). las = 3: axis labels are always vertical.
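
The width, height, units, res and las arguments share their names with base R's png() device and the las graphics parameter. The sketch below uses those base functions with the documented defaults to show what the settings mean; that f_boxplot() passes them through in exactly this way is an assumption, and the file name is hypothetical.

```r
# Illustrative only: a base R PNG using the documented default settings.
out <- file.path(tempdir(), "boxplot_sketch.png")  # hypothetical file name

if (capabilities("png")) {
  png(out, width = 8, height = 7, units = "in", res = 300)
  # las = 2 draws axis labels perpendicular to the axis.
  boxplot(Sepal.Length ~ Species, data = iris, las = 2)
  invisible(dev.off())
}
```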

Author

Sander H. van Delden plantmind@proton.me

Details

The function performs the following steps:

  • Detects numeric and factor variables in the dataset.

  • Generates boxplots for each numeric variable grouped by each factor variable.

  • Outputs the report in the specified format ('pdf', 'Word' or 'Rmd').

If output_type = "rmd" is used, it is advised to call the function in a chunk with {r, echo=FALSE, results='asis'}.

If no factor variables are detected, the function stops with an error message since factors are required for creating boxplots.
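
The steps above can be sketched in base R as follows. This is a minimal illustration assuming that detection amounts to is.numeric() and is.factor() checks; the package's own detection (see detect_factors) may be more elaborate.

```r
# Illustrative only: detection and grouping steps in base R.
data(iris)
num_vars <- names(iris)[sapply(iris, is.numeric)]
fac_vars <- names(iris)[sapply(iris, is.factor)]

# Mirror the documented behaviour: stop when there is nothing to group by.
if (length(fac_vars) == 0) {
  stop("No factor variables found; boxplots need a grouping factor.")
}

pdf(NULL)  # null device, so no file is written
for (fac in fac_vars) {
  for (num in num_vars) {
    # Build e.g. Sepal.Length ~ Species and draw one boxplot per pair.
    boxplot(reformulate(fac, response = num), data = iris, main = num)
  }
}
invisible(dev.off())
```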

This function plots all numeric and factor candidates. Use subset() to select columns before passing the data to f_boxplot().
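
A short sketch of such a selection with base R's subset() is given below. Converting gear to a factor is an illustrative choice, and the final f_boxplot() call is commented out because it needs rfriend and Pandoc to be installed.

```r
# Illustrative only: keep a few columns before plotting.
data(mtcars)
sel <- subset(mtcars, select = c(hp, disp, gear))
sel$gear <- factor(sel$gear)  # treat gear as a grouping factor
str(sel)

# f_boxplot(sel, open_generated_files = FALSE)  # requires rfriend and Pandoc
```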

Note that there is an optional jitter option to plot all individual data points over the boxplots.

This function requires [Pandoc](https://github.com/jgm/pandoc/releases/tag) (version 1.12.3 or higher), a universal document converter.

Windows: Install Pandoc and ensure the installation folder
(e.g., "C:/Users/your_username/AppData/Local/Pandoc") is added to your system PATH.

macOS: If using Homebrew, Pandoc is typically installed in "/usr/local/bin". Alternatively, download the .pkg installer and verify that the binary’s location is in your PATH.

Linux: Install Pandoc through your distribution’s package manager (commonly installed in "/usr/bin" or "/usr/local/bin") or manually, and ensure the directory containing Pandoc is in your PATH.

If Pandoc is not found, this function may not work as intended.
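
One way to check beforehand whether Pandoc is reachable from R is sketched below. Sys.which() only searches the system PATH; the optional rmarkdown::pandoc_available() check is used only if the rmarkdown package happens to be installed, and may also find a Pandoc copy bundled with RStudio. Whether f_boxplot() performs a similar check itself is not stated here.

```r
# Illustrative only: check whether Pandoc is reachable from this R session.
pandoc_path <- Sys.which("pandoc")

if (nzchar(pandoc_path)) {
  message("Pandoc found on PATH: ", pandoc_path)
} else {
  message("Pandoc not found on PATH.")
}

# Optional version check, only if the rmarkdown package is installed.
if (requireNamespace("rmarkdown", quietly = TRUE)) {
  message("Pandoc >= 1.12.3 available: ",
          rmarkdown::pandoc_available("1.12.3"))
}
```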

Examples

# \donttest{
# Example usage:
data(iris)

new_names = c(
  "Sepal.Length" = "Sepal length (cm)" ,
  "Sepal.Width" = "Sepal width (cm)",
  "Petal.Length" = "Petal length (cm)",
  "Petal.Width" = "Petal width (cm)",
  "Species" = "Cultivar"
)

# Use the whole data.frame to generate a pdf report and don't open the pdf.
f_boxplot(iris, fancy_names = new_names, output_type = "pdf", open_generated_files = FALSE)

# Use a formula to plot several response parameters (response 1 + response 2 etc)
# and generate a 'Word' report without boxplot_explanation.
data(mtcars)
f_boxplot(hp + disp ~ gear*cyl,
           data=mtcars,
           boxplot_explanation = FALSE,
           output_type = "word",
           open_generated_files = FALSE) # Do not automatically open the 'Word' file.
# }
