Learn R Programming

VGAM (version 1.1-14)

mix2normal: Mixture of Two Univariate Normal Distributions

Description

Estimates the five parameters of a mixture of two univariate normal distributions by maximum likelihood estimation.

Usage

mix2normal(lphi = "logitlink", lmu = "identitylink", lsd =
   "loglink", iphi = 0.5, imu1 = NULL, imu2 = NULL, isd1 =
   NULL, isd2 = NULL, qmu = c(0.2, 0.8), eq.sd = TRUE,
   nsimEIM = 100, zero = "phi")

Arguments

Value

An object of class "vglmff"

(see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

Details

The probability density function can be loosely written as $$f(y) = \phi \, N(\mu_1,\sigma_1) + (1-\phi) \, N(\mu_2, \sigma_2)$$ where \(\phi\) is the probability an observation belongs to the first group. The parameters \(\mu_1\) and \(\mu_2\) are the means, and \(\sigma_1\) and \(\sigma_2\) are the standard deviations. The parameter \(\phi\) satisfies \(0 < \phi < 1\). The mean of \(Y\) is \(\phi \mu_1 + (1-\phi) \mu_2\) and this is returned as the fitted values. By default, the five linear/additive predictors are \((logit(\phi),\mu_1,\log(\sigma_1),\mu_2,\log(\sigma_2))^T\). If eq.sd = TRUE then \(\sigma_1 = \sigma_2\) is enforced.

References

McLachlan, G. J. and Peel, D. (2000). Finite Mixture Models. New York: Wiley.

Everitt, B. S. and Hand, D. J. (1981). Finite Mixture Distributions. London: Chapman & Hall.

See Also

uninormal, Normal, mix2poisson.

Examples

Run this code
if (FALSE)  mu1 <-  99; mu2 <- 150; nn <- 1000
sd1 <- sd2 <- exp(3)
(phi <- logitlink(-1, inverse = TRUE))
rrn <- runif(nn)
mdata <- data.frame(y = ifelse(rrn < phi, rnorm(nn, mu1, sd1),
                                          rnorm(nn, mu2, sd2)))
fit <- vglm(y ~ 1, mix2normal(eq.sd = TRUE), data = mdata)

# Compare the results
cfit <- coef(fit)
round(rbind('Estimated' = c(logitlink(cfit[1], inverse = TRUE),
            cfit[2], exp(cfit[3]), cfit[4]),
            'Truth' = c(phi, mu1, sd1, mu2)), digits = 2)

# Plot the results
xx <- with(mdata, seq(min(y), max(y), len = 200))
plot(xx, (1-phi) * dnorm(xx, mu2, sd2), type = "l", xlab = "y",
     main = "red = estimate, blue = truth",
     col = "blue", ylab = "Density")
phi.est <- logitlink(coef(fit)[1], inverse = TRUE)
sd.est <- exp(coef(fit)[3])
lines(xx, phi*dnorm(xx, mu1, sd1), col = "blue")
lines(xx, phi.est * dnorm(xx, Coef(fit)[2], sd.est), col = "red")
lines(xx, (1-phi.est)*dnorm(xx, Coef(fit)[4], sd.est), col="red")
abline(v = Coef(fit)[c(2,4)], lty = 2, col = "red")
abline(v = c(mu1, mu2), lty = 2, col = "blue")

Run the code above in your browser using DataLab