distributional (version 0.6.0)

dist_mixture: Create a mixture of distributions

Description

[Maturing]

A mixture distribution combines multiple component distributions with specified weights. The resulting distribution can model complex, multimodal data because its density (or mass) function is a weighted sum of the density functions of simpler distributions.

Usage

dist_mixture(..., weights = numeric())

Arguments

...

Distributions to be used in the mixture. Can be any distributional objects.

weights

A numeric vector of non-negative weights that sum to 1. The length must match the number of distributions passed to .... Each weight \(w_i\) represents the probability that a random draw comes from the \(i\)-th component distribution.
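The interpretation of the weights as component probabilities can be illustrated with a two-stage draw. The following is a minimal base R sketch, not the package's own sampler; the names `w`, `mu`, `sigma`, `component` and `draws` are illustrative and the two normal components mirror those in the Examples section.

```r
set.seed(1)

# Illustrative components: N(mean = 0, sd = 1) and N(mean = 5, sd = 2)
w     <- c(0.3, 0.7)
mu    <- c(0, 5)
sigma <- c(1, 2)
n     <- 1e5

# Stage 1: pick a component index for each draw with probability w_i
component <- sample(seq_along(w), size = n, replace = TRUE, prob = w)

# Stage 2: draw from the selected component
draws <- rnorm(n, mean = mu[component], sd = sigma[component])

mean(component == 2)  # proportion of draws from component 2, close to 0.7
mean(draws)           # sample mean, close to the weighted mean of mu
```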

Details

In the following, let \(X\) be a mixture random variable composed of \(K\) component distributions \(F_1, F_2, \ldots, F_K\) with corresponding weights \(w_1, w_2, \ldots, w_K\) where \(\sum_{i=1}^K w_i = 1\) and \(w_i \geq 0\) for all \(i\).

Support: The union of the supports of all component distributions

Mean:

For univariate mixtures: $$ E(X) = \sum_{i=1}^K w_i \mu_i $$

where \(\mu_i\) is the mean of the \(i\)-th component distribution.

For multivariate mixtures: $$ E(\mathbf{X}) = \sum_{i=1}^K w_i \boldsymbol{\mu}_i $$

where \(\boldsymbol{\mu}_i\) is the mean vector of the \(i\)-th component distribution.

Variance:

For univariate mixtures: $$ \text{Var}(X) = \sum_{i=1}^K w_i (\mu_i^2 + \sigma_i^2) - \left(\sum_{i=1}^K w_i \mu_i\right)^2 $$

where \(\sigma_i^2\) is the variance of the \(i\)-th component distribution.
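As a worked check of the univariate mean and variance formulas, the sketch below evaluates them in base R for a mixture of N(0, 1) and N(5, 2) (second parameter taken as a standard deviation) with weights 0.3 and 0.7. The variable names are illustrative and this is not necessarily how the package computes these quantities.

```r
w     <- c(0.3, 0.7)  # mixture weights
mu    <- c(0, 5)      # component means
sigma <- c(1, 2)      # component standard deviations

# E(X) = sum_i w_i * mu_i
mix_mean <- sum(w * mu)

# Var(X) = sum_i w_i * (mu_i^2 + sigma_i^2) - E(X)^2
mix_var <- sum(w * (mu^2 + sigma^2)) - mix_mean^2

mix_mean  # 3.5
mix_var   # 8.35
```

The variance exceeds the weighted average of the component variances (0.3 * 1 + 0.7 * 4 = 3.1) because the spread between the component means also contributes.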

Covariance:

For multivariate mixtures: $$ \text{Cov}(\mathbf{X}) = \sum_{i=1}^K w_i \left[ (\boldsymbol{\mu}_i - \bar{\boldsymbol{\mu}})(\boldsymbol{\mu}_i - \bar{\boldsymbol{\mu}})^T + \boldsymbol{\Sigma}_i \right] $$

where \(\bar{\boldsymbol{\mu}} = \sum_{i=1}^K w_i \boldsymbol{\mu}_i\) is the overall mean vector and \(\boldsymbol{\Sigma}_i\) is the covariance matrix of the \(i\)-th component distribution.
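The covariance formula can be evaluated directly with base R matrix operations. The two bivariate components below are made-up values chosen only for illustration, and the object names are hypothetical.

```r
w     <- c(0.4, 0.6)
mu    <- list(c(0, 0), c(2, 1))                  # component mean vectors
Sigma <- list(diag(2),                           # component covariance matrices
              matrix(c(2, 0.5, 0.5, 1), nrow = 2))

# Overall mean vector: weighted sum of the component means
mu_bar <- Reduce(`+`, Map(function(wi, mi) wi * mi, w, mu))

# Cov(X) = sum_i w_i * [ (mu_i - mu_bar)(mu_i - mu_bar)^T + Sigma_i ]
cov_mix <- Reduce(`+`, Map(function(wi, mi, Si) {
  d <- mi - mu_bar
  wi * (outer(d, d) + Si)
}, w, mu, Sigma))

mu_bar
cov_mix
```

The first term inside the brackets captures the spread between component means; the second is the within-component covariance.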

Probability density/mass function (p.d.f./p.m.f.):

$$ f(x) = \sum_{i=1}^K w_i f_i(x) $$

where \(f_i(x)\) is the density or mass function of the \(i\)-th component distribution.

Cumulative distribution function (c.d.f.):

For univariate mixtures: $$ F(x) = \sum_{i=1}^K w_i F_i(x) $$

where \(F_i(x)\) is the c.d.f. of the \(i\)-th component distribution.
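The weighted-sum form of the density and the univariate c.d.f. can be sketched in base R for normal components. `mix_pdf` and `mix_cdf` are hypothetical helper names, not functions exported by the package.

```r
w     <- c(0.3, 0.7)
mu    <- c(0, 5)
sigma <- c(1, 2)

# f(x) = sum_i w_i * f_i(x), vectorised over x
mix_pdf <- function(x) {
  vapply(x, function(xi) sum(w * dnorm(xi, mean = mu, sd = sigma)), numeric(1))
}

# F(x) = sum_i w_i * F_i(x), vectorised over x
mix_cdf <- function(x) {
  vapply(x, function(xi) sum(w * pnorm(xi, mean = mu, sd = sigma)), numeric(1))
}

mix_pdf(2)
mix_cdf(2)

# Sanity check: the mixture density integrates to 1
integrate(mix_pdf, -Inf, Inf)$value
```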

For multivariate mixtures, the c.d.f. is approximated numerically.

Quantile function:

For univariate mixtures, the quantile function has no closed form and is computed numerically by inverting the c.d.f. using root-finding (stats::uniroot()).
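The root-finding approach can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: `mix_cdf` and `mix_quantile` are hypothetical helpers, and the search bracket (the smallest and largest component quantiles at the same probability) is one convenient choice that always contains the mixture quantile.

```r
w     <- c(0.3, 0.7)
mu    <- c(0, 5)
sigma <- c(1, 2)

# Mixture c.d.f. for a single value x
mix_cdf <- function(x) sum(w * pnorm(x, mean = mu, sd = sigma))

# Invert the c.d.f. numerically for a single probability p in (0, 1).
# Assumes the component quantiles at p differ, so the bracket is non-degenerate.
mix_quantile <- function(p) {
  q_comp <- qnorm(p, mean = mu, sd = sigma)
  uniroot(function(x) mix_cdf(x) - p,
          lower = min(q_comp), upper = max(q_comp),
          tol = 1e-9)$root
}

q50 <- mix_quantile(0.5)
q50            # median of the mixture
mix_cdf(q50)   # approximately 0.5
```

The bracket works because at the smallest component quantile every component c.d.f. is at most p, and at the largest every component c.d.f. is at least p, so the weighted sum crosses p in between.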

For multivariate mixtures, quantiles are not yet implemented.

Examples

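library(distributional)

# Univariate mixture of two normal distributions:
# 30% N(mean = 0, sd = 1) and 70% N(mean = 5, sd = 2)
dist <- dist_mixture(dist_normal(0, 1), dist_normal(5, 2), weights = c(0.3, 0.7))
dist

# Moments of the mixture
mean(dist)
variance(dist)

# Density and c.d.f. at x = 2, and the median
density(dist, 2)
cdf(dist, 2)
quantile(dist, 0.5)

# Draw 10 random values from the mixture
generate(dist, 10)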
