distributional (version 0.3.0)

dist_categorical: The Categorical distribution

Description

Lifecycle: stable

Usage

dist_categorical(prob, outcomes = NULL)

Arguments

prob

A list of probabilities of observing each outcome category.

outcomes

The values used to represent each outcome.

Details

Categorical distributions are used to represent events with multiple outcomes, such as what number appears on the roll of a die. This is also referred to as the 'generalised Bernoulli' or 'multinoulli' distribution. The Categorical distribution is a special case of the Multinomial() distribution with n = 1.
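The relationship to the Multinomial distribution can be sketched in base R. This is an illustration only, not how the package implements it: the probabilities `p` are made-up values, and `rmultinom()` from the stats package stands in for a single multinomial trial.

```r
# Assumed example probabilities for three categories (illustrative values only).
p <- c(0.2, 0.5, 0.3)

set.seed(1)

# One multinomial trial (size = 1) yields a one-hot vector of length k:
# exactly one entry is 1 and the rest are 0.
one_hot <- rmultinom(n = 1, size = 1, prob = p)[, 1]

# The position of the 1 is the categorical outcome in {1, ..., k}.
outcome <- which(one_hot == 1)

one_hot
outcome
```

Reading off the position of the single 1 turns the multinomial draw into a categorical draw.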

We recommend reading this documentation on https://pkg.mitchelloharawild.com/distributional/, where the math will render nicely.

In the following, let \(X\) be a Categorical random variable with probability parameters \(p = \{p_1, p_2, \ldots, p_k\}\).

The Categorical probability distribution is widely used to model events with several possible outcomes. A simple example is the roll of a die, where \(p = \{1/6, 1/6, 1/6, 1/6, 1/6, 1/6\}\), giving an equal chance of observing each number on a six-sided die.
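As a minimal sketch, the fair die described here could be set up with dist_categorical() as follows. The variable names `die`, `p_three`, and `rolls` are illustrative, and the calls mirror those used in the Examples section.

```r
library(distributional)

# A fair six-sided die: six equally likely outcomes.
die <- dist_categorical(prob = list(rep(1/6, 6)))

# Probability of rolling a 3.
p_three <- density(die, 3)

# Ten simulated rolls. generate() returns a list with one element
# per distribution in the vector, so take the first element.
rolls <- generate(die, 10)[[1]]

p_three
rolls
```

Because no outcomes are supplied, the categories are labelled by their positions 1 to 6.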

Support: \(\{1, \ldots, k\}\)

Mean: \(p\)

Variance: \(p \cdot (1 - p) = p \cdot q\), where \(q = 1 - p\)
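One way to read the mean and variance above is elementwise, under the assumption that each outcome \(i\) is encoded as a 0/1 indicator of the event \(X = i\). That encoding is an interpretation, not something the package documentation states. The base R sketch below checks it by simulation, using the fair-die probabilities.

```r
# Fair-die probabilities.
p <- rep(1/6, 6)

set.seed(42)
draws <- sample(seq_along(p), size = 1e5, replace = TRUE, prob = p)

# Indicator matrix: column i is 1 where the draw equals i, else 0.
ind <- outer(draws, seq_along(p), "==") * 1

# Empirical mean and variance of each indicator column.
emp_mean <- colMeans(ind)
emp_var <- apply(ind, 2, var)

# Compare against p and p * (1 - p).
round(rbind(emp_mean, p), 3)
round(rbind(emp_var, p * (1 - p)), 3)
```

Under this reading, the mean of the \(i\)-th indicator approaches \(p_i\) and its variance approaches \(p_i (1 - p_i)\).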

Probability mass function (p.m.f.):

$$ P(X = i) = p_i $$

Cumulative distribution function (c.d.f.):

The cdf() of a categorical distribution is undefined as the outcome categories aren't ordered.

Examples

dist <- dist_categorical(prob = list(c(0.05, 0.5, 0.15, 0.2, 0.1), c(0.3, 0.1, 0.6)))

dist

generate(dist, 10)

density(dist, 2)
density(dist, 2, log = TRUE)

# The outcomes aren't ordered, so many statistics are not applicable.
cdf(dist, 4)
quantile(dist, 0.7)
mean(dist)
variance(dist)
skewness(dist)
kurtosis(dist)

dist <- dist_categorical(
  prob = list(c(0.05, 0.5, 0.15, 0.2, 0.1), c(0.3, 0.1, 0.6)),
  outcomes = list(letters[1:5], letters[24:26])
)

generate(dist, 10)

density(dist, "a")
density(dist, "z", log = TRUE)
