quick_ttest: Quick t-test with Automatic Visualization

Description

Performs a t-test or a Wilcoxon test (selected automatically from data characteristics and sample size) and returns a publication-ready visualization. Designed for comparing exactly two groups.

Usage

quick_ttest(
  data,
  group,
  value,
  method = c("auto", "t.test", "wilcox.test"),
  paired = FALSE,
  id,
  alternative = c("two.sided", "less", "greater"),
  var.equal = NULL,
  conf.level = 0.95,
  plot_type = c("boxplot", "violin", "both"),
  add_jitter = TRUE,
  point_size = 2,
  point_alpha = 0.6,
  show_p_value = TRUE,
  p_label = c("p.signif", "p.format"),
  palette = "qual_vivid",
  verbose = TRUE,
  ...
)

Value

An object of class quick_ttest_result containing:

plot: A ggplot object with the comparison visualization
test_result: The htest object from t.test() or wilcox.test()
method_used: Character string of the test method used
normality_tests: List of Shapiro-Wilk test results for each group
variance_test: Levene's test result (if applicable)
descriptive_stats: Data frame with descriptive statistics by group
auto_decision: Details about automatic method selection
timestamp: POSIXct timestamp of analysis

Arguments

data: A data frame containing the variables.
group: Column name for the grouping variable (must have exactly 2 levels). Supports both quoted and unquoted names via NSE.
value: Column name for the numeric values to compare. Supports both quoted and unquoted names via NSE.
method: Character. Test method: "auto" (default), "t.test", or "wilcox.test". When "auto", the function chooses between the two tests based on normality and sample size (see Details).
paired: Logical. Whether to perform a paired test. Default is FALSE. If TRUE, the id parameter must be specified to match pairs.
id: Column name for the pairing ID variable (required when paired = TRUE). Each unique ID should appear exactly once in each group. Supports both quoted and unquoted names via NSE.
alternative: Character. Alternative hypothesis: "two.sided" (default), "less", or "greater".
var.equal: Logical or NULL. Assume equal variances? If NULL (default), equality of variances is checked with Levene's test. Ignored for paired tests.
conf.level: Numeric. Confidence level for the interval. Default is 0.95.
plot_type: Character. Type of plot: "boxplot" (default), "violin", or "both".
add_jitter: Logical. Add jittered points to the plot? Default is TRUE.
point_size: Numeric. Size of the points. Default is 2.
point_alpha: Numeric. Transparency of points (0-1). Default is 0.6.
show_p_value: Logical. Display p-value on the plot? Default is TRUE.
p_label: Character. P-value label format: "p.signif" (stars, default) or "p.format" (numeric p-value).
palette: Character. Color palette name from evanverse palettes. Default is "qual_vivid". Set to NULL to use ggplot2 defaults.
verbose: Logical. Print diagnostic messages? Default is TRUE.
...: Additional arguments (currently unused, reserved for future extensions).
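
The pairing requirement on id (each unique ID appears exactly once in each group) can be checked before calling the function. The following is a minimal sketch in base R; check_pairing is a hypothetical helper written for illustration, not part of the package, and it takes column names as strings rather than via NSE.

```r
# Hypothetical helper: TRUE only if every ID occurs exactly once in each group.
check_pairing <- function(data, group, id) {
  counts <- table(data[[id]], data[[group]])  # rows = IDs, columns = groups
  all(counts == 1)
}

paired_ok <- data.frame(
  patient   = rep(1:3, 2),
  timepoint = rep(c("Before", "After"), each = 3)
)
paired_bad <- paired_ok[-6, ]  # drops the "After" row for patient 3

check_pairing(paired_ok,  "timepoint", "patient")  # TRUE
check_pairing(paired_bad, "timepoint", "patient")  # FALSE
```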

Important Notes

Two groups only: This function requires exactly 2 levels in the grouping variable.
Sample size warnings: The function will warn if sample sizes are very small (< 5) or highly unbalanced (ratio > 3:1).
Missing values: Automatically removed with a warning.
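
As a rough illustration of the checks listed above, the sketch below drops missing values and then flags small or unbalanced groups using the stated thresholds (< 5 and ratio > 3:1). check_samples is a hypothetical helper; how the package itself implements these checks is an assumption.

```r
# Hypothetical helper mirroring the documented checks.
check_samples <- function(values, groups) {
  keep <- !is.na(values) & !is.na(groups)  # rows with no missing data
  n <- table(groups[keep])                 # per-group sample sizes
  list(
    n_removed  = sum(!keep),
    n          = n,
    too_small  = any(n < 5),               # "very small" per the notes
    unbalanced = max(n) / min(n) > 3       # ratio above 3:1
  )
}

res <- check_samples(
  values = c(1.2, NA, 3.1, 2.8, 4.0, 3.3, 2.2, 5.1, 4.7, 3.9),
  groups = rep(c("A", "B"), times = c(3, 7))
)
res$n_removed   # 1
res$too_small   # TRUE  (group A has 2 usable values)
res$unbalanced  # TRUE  (7 / 2 = 3.5)
```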

Details

"Quick" means easy to use, not simplified or inaccurate.

This function performs full statistical testing with proper assumption checking:

Automatic Method Selection (method = "auto")

The selection rule considers both normality and sample size:

Large samples (n >= 100 per group): Prefers the t-test because of the Central Limit Theorem, even if Shapiro-Wilk rejects normality (the test becomes overly sensitive in large samples).
Medium samples (30 <= n < 100): Uses the Shapiro-Wilk test with a stricter rejection threshold (normality is rejected only at p < 0.01) to avoid false positives.
Small samples (n < 30): Checks normality with Shapiro-Wilk at the standard threshold (normality is rejected at p < 0.05).

This approach avoids the common pitfall of automatically switching to non-parametric tests for large samples where t-test is actually more appropriate.
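
The three tiers above can be sketched in base R as follows. This is an illustration of the described rule, not the package's source: choose_method is a hypothetical helper, and it assumes that the smaller group's size determines the tier and that both groups must pass Shapiro-Wilk for the t-test to be kept. Note that shapiro.test() requires between 3 and 5000 observations.

```r
# Hypothetical sketch of the sample-size-aware selection rule.
choose_method <- function(x, y) {
  n_min <- min(length(x), length(y))        # assumption: smaller group sets the tier
  if (n_min >= 100) return("t.test")        # large samples: rely on the CLT
  alpha <- if (n_min >= 30) 0.01 else 0.05  # stricter threshold for medium samples
  p <- c(shapiro.test(x)$p.value, shapiro.test(y)$p.value)
  if (all(p >= alpha)) "t.test" else "wilcox.test"
}

# Deterministic inputs: normal quantiles versus an exponentially growing sequence.
normal_like <- qnorm(ppoints(20))
skewed      <- 2^(1:20)

choose_method(normal_like, normal_like + 1)  # both groups look normal
choose_method(normal_like, skewed)           # one group is heavily skewed
choose_method(rexp(150), rexp(150))          # "t.test": n >= 100 skips the normality check
```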

Variance Equality Check

When var.equal = NULL and t-test is selected, Levene's test is performed. If variances are unequal (p < 0.05), Welch's t-test is used automatically.
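
The variance check can be approximated in base R. Levene's test is an ANOVA on absolute deviations from each group's center; the sketch below uses the median as the center, which is an assumption about the variant used here. levene_p is a hypothetical helper, not a package function.

```r
# Hypothetical helper: p-value of a median-centered Levene's test.
levene_p <- function(values, groups) {
  groups  <- factor(groups)
  abs_dev <- abs(values - ave(values, groups, FUN = median))
  anova(lm(abs_dev ~ groups))[["Pr(>F)"]][1]
}

# Deterministic data: same shape, but the second group has 4x the spread.
z      <- qnorm(ppoints(30))
values <- c(5 + z, 6 + 4 * z)
groups <- rep(c("Control", "Treatment"), each = 30)

p_var     <- levene_p(values, groups)
var_equal <- p_var >= 0.05  # unequal variances if p < 0.05

# var.equal = FALSE makes t.test() apply Welch's correction.
fit <- t.test(values ~ groups, var.equal = var_equal)
fit$method
```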

Visualization

The plot includes:

Boxplot, violin plot, or both (based on plot_type)
Individual data points (if add_jitter = TRUE)
Statistical comparison with p-value
Publication-ready styling

Examples


# Example 1: Basic usage with automatic method selection
set.seed(123)
data <- data.frame(
  group = rep(c("Control", "Treatment"), each = 30),
  expression = c(rnorm(30, mean = 5), rnorm(30, mean = 6))
)

result <- quick_ttest(data, group = group, value = expression)
print(result)

# Example 2: Paired samples (e.g., before/after)
paired_data <- data.frame(
  patient = rep(1:20, 2),
  timepoint = rep(c("Before", "After"), each = 20),
  score = c(rnorm(20, 50, 10), rnorm(20, 55, 10))
)

result <- quick_ttest(paired_data,
                      group = timepoint,
                      value = score,
                      paired = TRUE,
                      id = patient)

# Example 3: Non-normal data with manual method selection
skewed_data <- data.frame(
  group = rep(c("A", "B"), each = 25),
  value = c(rexp(25, rate = 0.5), rexp(25, rate = 1))
)

result <- quick_ttest(skewed_data,
                      group = group,
                      value = value,
                      method = "wilcox.test",
                      verbose = TRUE)

# Example 4: Customize visualization
result <- quick_ttest(data,
                      group = group,
                      value = expression,
                      plot_type = "both",
                      palette = "qual_balanced",
                      p_label = "p.format")

# Access components
result$plot              # ggplot object
result$test_result       # htest object
summary(result)          # Detailed summary
