distributional

The distributional package allows distributions to be used in a vectorised context. It provides methods that are minimal wrappers around the standard d, p, q, and r distribution functions, applied to each distribution in the vector. Additional distributional statistics can be computed, including mean(), median(), variance(), and intervals with hilo().

The distributional nature of a model’s predictions is often understated, with the defaults of predict() methods usually producing only point predictions. The forecast() function from the forecast package goes further in illustrating uncertainty by producing point forecasts and intervals by default; however, the user’s ability to interact with them is limited. This package vectorises distributions and provides methods for working with them, making entire distributions suitable prediction outputs for model functions.
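As a rough sketch of what a distribution-valued prediction could look like, the code below turns the output of predict() for a linear model into a vector of normal distributions. The normal approximation, the built-in mtcars data, and the names new_cars, pred_sd and pred_dist are assumptions made for illustration; this is not a method provided by the package itself.

```r
library(distributional)

# Fit a simple linear model on a built-in dataset.
fit <- lm(mpg ~ wt, data = mtcars)
new_cars <- data.frame(wt = c(2.5, 3.0, 3.5))  # hypothetical new observations

# predict() returns point predictions; se.fit = TRUE adds standard errors.
pred <- predict(fit, newdata = new_cars, se.fit = TRUE)

# Assumption: approximate each predictive distribution as normal, combining
# the uncertainty of the fitted mean with the residual standard deviation.
pred_sd <- sqrt(pred$se.fit^2 + pred$residual.scale^2)
pred_dist <- dist_normal(mu = unname(pred$fit), sigma = unname(pred_sd))

# The whole distribution is now available, not just a point prediction.
pred_dist
mean(pred_dist)           # recovers the point predictions
quantile(pred_dist, 0.9)  # 90th percentile of each predictive distribution
```

Keeping the predictions as a distribution vector means any interval or quantile can be derived later, rather than being fixed at prediction time.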

Installation

You can install the released version of distributional from CRAN with:

install.packages("distributional")

The development version can be installed from GitHub with:

# install.packages("remotes")
remotes::install_github("mitchelloharawild/distributional")

Examples

Distributions are created using dist_*() functions. Many distributions are available (see the function index at the end of this page); the example below combines a normal and a Student t distribution in one vector.

library(distributional)
#> 
#> Attaching package: 'distributional'
#> The following object is masked from 'package:grDevices':
#> 
#>     pdf
my_dist <- c(dist_normal(mu = 0, sigma = 1), dist_student_t(df = 10))
my_dist
#> <distribution[2]>
#> [1] N(0, 1)     t(10, 0, 1)

The standard four distribution functions in R are usable via these generics:

density(my_dist, 0) # c(dnorm(0, mean = 0, sd = 1), dt(0, df = 10))
#> [1] 0.3989423 0.3891084
cdf(my_dist, 5) # c(pnorm(5, mean = 0, sd = 1), pt(5, df = 10))
#> [1] 0.9999997 0.9997313
quantile(my_dist, 0.1) # c(qnorm(0.1, mean = 0, sd = 1), qt(0.1, df = 10))
#> [1] -1.281552 -1.372184
generate(my_dist, 10) # list(rnorm(10, mean = 0, sd = 1), rt(10, df = 10))
#> [[1]]
#>  [1]  1.262954285 -0.326233361  1.329799263  1.272429321  0.414641434
#>  [6] -1.539950042 -0.928567035 -0.294720447 -0.005767173  2.404653389
#> 
#> [[2]]
#>  [1]  0.99165484 -1.36999677 -0.40943004 -0.85261144 -1.37728388  0.81020460
#>  [7] -1.82965813 -0.06142032 -1.33933588 -0.28491414

You can also compute intervals using hilo(), where the interval size is given as a percentage:

hilo(my_dist, 0.95)
#> <hilo[2]>
#> [1] [-0.01190677, 0.01190677]0.95 [-0.01220773, 0.01220773]0.95
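Because hilo() reads its size argument as a percentage, the call above requests a 0.95% interval, which is why the bounds sit so close to zero. A minimal sketch of requesting the more common 95% interval, and of checking it against the matching quantiles, might look like this (interval_95 is a name introduced here for illustration):

```r
library(distributional)

my_dist <- c(dist_normal(mu = 0, sigma = 1), dist_student_t(df = 10))

# size is a percentage: 95 requests a central 95% interval.
interval_95 <- hilo(my_dist, 95)
interval_95

# The interval bounds should agree with the 2.5% and 97.5% quantiles.
quantile(my_dist, 0.025)
quantile(my_dist, 0.975)
```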

Additionally, some distributions support other methods such as mathematical operations and summary measures. If an operation isn’t supported for a distribution, a transformed distribution is created instead, as happens for the Student t distribution below.

my_dist
#> <distribution[2]>
#> [1] N(0, 1)     t(10, 0, 1)
my_dist*3 + 2
#> <distribution[2]>
#> [1] N(2, 9)        t(t(10, 0, 1))
mean(my_dist)
#> [1] 0 0
variance(my_dist)
#> [1] 1.00 1.25
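Transformed distributions can also be built explicitly with dist_transformed(), which is listed in the function index. The sketch below assumes the function takes the base distribution, the transformation, and its inverse, in that order; the name exp_normal is introduced here for illustration.

```r
library(distributional)

# exp() applied to a standard normal; log() is supplied as the inverse
# so that the cdf of the transformed distribution can be evaluated.
exp_normal <- dist_transformed(dist_normal(mu = 0, sigma = 1), exp, log)
exp_normal

# Quantiles are transformed quantiles of the base distribution,
# so the median should be exp(0).
quantile(exp_normal, 0.5)

# The cdf maps the argument back through the inverse,
# so evaluating at exp(0) should give the base median probability.
cdf(exp_normal, 1)
```

Supplying the inverse explicitly is what lets cdf() work without any numerical root finding.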

You can also visualise the distribution(s) using the ggdist package.

library(ggdist)
library(ggplot2)

df <- data.frame(
  name = c("Gamma(2,1)", "Normal(5,1)", "Mixture"),
  dist = c(dist_gamma(2,1), dist_normal(5,1),
           dist_mixture(dist_gamma(2,1), dist_normal(5, 1), weights = c(0.4, 0.6)))
)

ggplot(df, aes(y = factor(name, levels = rev(name)))) +
  stat_dist_halfeye(aes(dist = dist)) + 
  labs(title = "Density function for a mixture of distributions", y = NULL, x = NULL)

Related work

There are several packages which unify interfaces for distributions in R:

  • stats provides functions to work with possibly multiple distributions (the comments in the examples above show the equivalent stats calls).
  • distributions3 represents singular distributions using S3, with particularly nice documentation. distributional makes use of some code and documentation from distributions3.
  • distr represents singular distributions using S4.
  • distr6 represents singular distributions using R6.
  • Many more in the CRAN task view

This package differs from the above libraries by storing the distributions in a vectorised format. It does this using vctrs, so it should play nicely with the tidyverse (try putting distributions into a tibble!).
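As a small illustration of the tibble suggestion, the sketch below stores a distribution vector as a column and derives further columns from it. The tibble package is assumed to be installed, and the table name, column names and numbers are invented for the example.

```r
library(distributional)
library(tibble)

# Hypothetical daily temperature forecasts, one distribution per row.
forecasts <- tibble(
  day  = 1:3,
  temp = dist_normal(mu = c(18, 20, 23), sigma = c(1, 1.5, 2))
)

# Summaries and intervals are computed row-wise from the distribution column.
forecasts$temp_mean   <- mean(forecasts$temp)
forecasts$temp_80pct  <- hilo(forecasts$temp, 80)

forecasts
```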

Monthly Downloads: 80,367

Version: 0.2.1

License: GPL-3

Maintainer: Mitchell O'Hara-Wild

Last Published: October 6th, 2020

Functions in distributional (0.2.1)

  • guide_level: Level shade bar guide
  • dist_burr: The Burr distribution
  • dist_student_t: The (non-central) location-scale Student t Distribution
  • dist_gumbel: The Gumbel distribution
  • dist_hypergeometric: The Hypergeometric distribution
  • dist_cauchy: The Cauchy distribution
  • dist_exponential: The Exponential Distribution
  • dist_multivariate_normal: The multivariate normal distribution
  • dist_negative_binomial: The Negative Binomial distribution
  • dist_chisq: The (non-central) Chi-Squared Distribution
  • dist_degenerate: The degenerate distribution
  • dist_f: The F Distribution
  • dist_inflated: Inflate a value of a probability distribution
  • autoplot.distribution: Plot a distribution
  • dist_inverse_exponential: The Inverse Exponential distribution
  • dist_poisson_inverse_gaussian: The Poisson-Inverse Gaussian distribution
  • dist_poisson: The Poisson Distribution
  • distributional-package: distributional: Vectorised Probability Distributions
  • dist_logarithmic: The Logarithmic distribution
  • dist_multinomial: The Multinomial distribution
  • variance.distribution: Variance of a probability distribution
  • new_hilo: Construct hilo intervals
  • guide_train.level_guide: Helper methods for guides
  • quantile.distribution: Distribution Quantiles
  • dist_sample: Sampling distribution
  • variance: Variance
  • dist_logistic: The Logistic distribution
  • generate.distribution: Randomly sample values from a distribution
  • mean.distribution: Mean of a probability distribution
  • hilo: Compute intervals
  • kurtosis: Kurtosis of a probability distribution
  • median.distribution: Median of a probability distribution
  • dist_studentized_range: The Studentized Range distribution
  • dist_transformed: Modify a distribution with a transformation
  • hilo.distribution: Probability intervals of a probability distribution
  • likelihood: The (log) likelihood of a sample matching a distribution
  • dist_binomial: The Binomial distribution
  • dist_beta: The Beta distribution
  • dist_pareto: The Pareto distribution
  • dist_percentile: Percentile distribution
  • dist_inverse_gamma: The Inverse Gamma distribution
  • dist_uniform: The Uniform distribution
  • hdr: Compute highest density regions
  • dist_normal: The Normal distribution
  • cdf: The cumulative distribution function
  • dist_inverse_gaussian: The Inverse Gaussian distribution
  • new_hdr: Construct hdr intervals
  • hdr.distribution: Highest density regions of probability distributions
  • new_dist: Create a new distribution
  • scale_level: level luminance scales
  • dist_truncated: Truncate a distribution
  • skewness: Skewness of a probability distribution
  • density.distribution: The probability density/mass function
  • dist_gamma: The Gamma distribution
  • dist_missing: Missing distribution
  • dist_bernoulli: The Bernoulli distribution
  • dist_geometric: The Geometric Distribution
  • dist_mixture: Create a mixture of distributions
  • dist_weibull: The Weibull distribution
  • dist_wrap: Create a distribution from p/d/q/r style functions
  • geom_hilo_linerange: Line ranges for hilo intervals
  • geom_hilo_ribbon: Ribbon plots for hilo intervals
  • is_hdr: Is the object a hdr
  • is_hilo: Is the object a hilo
  • reexports: Objects exported from other packages
  • scale_hilo_continuous: Hilo interval scales