
sumvar (version 0.1)

dist_sum: Explore a continuous variable.

Description

Summarises a continuous variable: the median, interquartile range, mean, standard deviation and confidence intervals of the mean. Also produces a density plot. Results can be stratified by a second grouping variable.

Provides frequentist hypothesis tests for comparison between groups: t-test and Wilcoxon rank sum test for 2 groups, ANOVA and Kruskal-Wallis test for 3 or more groups.
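The four tests named above all have base R counterparts in the stats package. The sketch below shows what such comparisons look like when run by hand. It is not taken from the sumvar source: the data frame `dat` and its columns are hypothetical, and base R defaults (for example the Welch correction in `t.test`) are an assumption that may differ from what dist_sum uses internally.

```r
# Illustrative data only; `dat`, `age`, `sex` and `group` are hypothetical names.
set.seed(1)
dat <- data.frame(
  age   = rnorm(100, mean = 30, sd = 10),
  sex   = sample(c("male", "female"), size = 100, replace = TRUE),
  group = sample(c("a", "b", "c", "d"), size = 100, replace = TRUE)
)

# Two groups: t-test (Welch by default in base R) and Wilcoxon rank sum test.
p_ttest  <- t.test(age ~ sex, data = dat)$p.value
p_wilcox <- wilcox.test(age ~ sex, data = dat)$p.value

# Three or more groups: one-way ANOVA and Kruskal-Wallis test.
p_anova   <- summary(aov(age ~ group, data = dat))[[1]][["Pr(>F)"]][1]
p_kruskal <- kruskal.test(age ~ group, data = dat)$p.value
```

The variable names mirror the p_ttest, p_wilcox, p_anova and p_kruskal columns described under Value, purely for readability.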

The function accepts input from a dplyr pipe ("%>%") and returns the results as a tibble.
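A piped call might look like the sketch below, where the data frame on the left of "%>%" fills the data argument. The object names `example_data` and `age_by_sex` are illustrative, and the call is wrapped in a `requireNamespace()` check so the snippet still runs where sumvar is not installed.

```r
library(dplyr)  # provides tibble() and the %>% pipe

set.seed(42)
example_data <- tibble(
  id  = 1:100,
  age = rnorm(100, mean = 30, sd = 10),
  sex = sample(c("male", "female"), size = 100, replace = TRUE)
)

# Only call dist_sum() when the sumvar package is available.
if (requireNamespace("sumvar", quietly = TRUE)) {
  # The piped data frame is passed as the first argument (data).
  age_by_sex <- example_data %>% sumvar::dist_sum(age, sex)
  print(age_by_sex)
}
```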

Usage

dist_sum(data, var, by = NULL)

Value

A tibble with a summary of the variable frequency (n), number of missing observations (n_miss), median, interquartile range, mean, SD, 95% confidence intervals of the mean (using the Z distribution), and density plots.
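For reference, a Z-based 95% confidence interval of the mean is the mean plus or minus qnorm(0.975) (about 1.96) standard errors. The sketch below computes the listed statistics by hand in base R. It is an illustration, not the package's code: the vector `x` is hypothetical, and treating n as the count of non-missing values is an assumption.

```r
set.seed(1)
# Hypothetical continuous variable with two missing observations.
x <- c(rnorm(50, mean = 30, sd = 10), NA, NA)

n_miss <- sum(is.na(x))   # number of missing observations
x_obs  <- x[!is.na(x)]
n      <- length(x_obs)   # assumption: n counts non-missing values only

med <- median(x_obs)
iqr <- quantile(x_obs, c(0.25, 0.75))  # lower and upper quartiles
avg <- mean(x_obs)
s   <- sd(x_obs)

# 95% confidence interval of the mean using the Z (standard normal) distribution.
z        <- qnorm(0.975)
se       <- s / sqrt(n)
ci_lower <- avg - z * se
ci_upper <- avg + z * se
```

With small samples a t-based interval would be slightly wider; the Z-based version is shown because it is the one this page describes.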

Shows the t-test (p_ttest) and Wilcoxon rank sum (p_wilcox) hypothesis tests when there are two groups, and an ANOVA test (p_anova) and Kruskal-Wallis test (p_kruskal) when there are three or more groups.

Arguments

data

The data frame or tibble

var

The variable you would like to summarise

by

The grouping variable (optional; defaults to NULL)

Examples

library(sumvar)

# Four groups: the ANOVA and Kruskal-Wallis p-values are reported.
example_data <- dplyr::tibble(id = 1:100, age = rnorm(100, mean = 30, sd = 10),
                              group = sample(c("a", "b", "c", "d"),
                                             size = 100, replace = TRUE))
dist_sum(example_data, age, group)

# Two groups: the t-test and Wilcoxon rank sum p-values are reported.
example_data <- dplyr::tibble(id = 1:100, age = rnorm(100, mean = 30, sd = 10),
                              sex = sample(c("male", "female"),
                                           size = 100, replace = TRUE))
dist_sum(example_data, age, sex)

# Save the summary statistics as a tibble.
age_summary <- dist_sum(example_data, age, sex)
