watson: Fit Mixtures of Watson Distributions

Description

watson fits a finite mixture of multivariate Watson distributions.

Usage

watson(x, k, control = list(), ...)

Value

An object of class "watfit" representing the fitted mixture, which is a list containing the fitted weights, concentrations parameters (kappa), concentrations directions (mu) and further metadata.

Arguments

x: a numeric data matrix, with rows corresponding to observations. Can be a dense matrix, or any of the supported sparse matrices from Matrix package by Rcpp.
k: an integer giving the number of mixture components.
control: a list of control parameters. See Details.
...: a list of control parameters (overriding those specified in control).

Details

watson returns an object of class "watfit" representing the fitted mixture of Watson distributions model. Available methods for such objects include coef, logLik, print and predict. predict has an extra type argument with possible values "class_ids" (default) and "memberships" for indicating hard or soft prediction, respectively.

The mixture of Watson distributions is fitted using EM variants as specified by control option E (specifying the E-step employed), with possible values "softmax" (default), "hardmax" or "stochmax". For "stochmax", class assignments are drawn from the posteriors for each observation in the E-step as outlined as SEM in Celeux and Govaert (1992). The stopping criterion for this algorithm is by default changed to not check for convergence (logical control option converge), but to return the parameters with the maximum likelihood encountered.

In the M-step, the parameters of the respective component distributions are estimated via maximum likelihood, which is accomplished by solving the equation $$g(\alpha,\beta, \kappa)=r,$$ where $$0<\alpha<\beta, \ 0\leq r\leq 1$$ and $$g(\alpha, \beta, \kappa) = (\alpha/\beta)M(\alpha+1, \beta+1, \kappa)/M(\alpha, \beta, \kappa),$$ with M being the Kummer's function. Via control argument M, one can specify how to (approximately) solve these equations. The possible methods are:

"Sra_2007" uses the approximation of Sra (2007).

"BBG" uses the approximation of Bijral et al. (2007), without the correction term.

"BBG_c" uses the approximation of Bijral et al. (2007), with the correction term.

"Sra_Karp_2013" uses the bounds derived in Sra and Karp (2013), with the decision rule .

"bisection" uses a bisection to solve the problem using evaluation proposed in Writeup1 (2018).

"newton" uses a bracketet type of Neton algorithm to solve the problem using evaluation proposed in Writeup1 (2018). (Default.)

"lognewton" uses a bracketet type of Neton algorithm to solve the problem $log(g((\alpha, \beta, \kappa)) = log(r)$ using evaluation proposed in Writeup1 (2018).

Additional control parameters are as follows.

maxiter an integer giving the maximal number of EM iterations to be performed, default: 100.

reltol the minimum relative improvement of the objective function per iteration. If improvement is less, the EM algorithm will stop under the assumption that no further significant improvement can be made, defaults to sqrt(.Machine$double.eps).

ids either a vector of class memberships or TRUE which implies that the class memberships are obtained from the attribute named "id" of the input data; these class memberships are used for initializing the EM algorithm and the algorithm is stopped after the first iteration.

init_iter a numeric vector setting the number of diametrical clustering iterations to do, before the EM starts, default: 0.

start a specification of the starting values to be employed. Can be a list of matrices giving the memberships of objects to components. This information is combined with the init_iter parameter and together form the initialization procedure. If nothing is given, the starting values are drwan randomly.

If several starting values are specified, the EM algorithm is performed individually to each starting value, and the best solution found is returned.

nruns an integer giving the number of EM runs to be performed. Default: 1. Only used if start is not given.

minweight a numeric indicating the minimum prior probability. Components falling below this threshold are removed during the iteration. If is greater than 1, the value is taken as the minimal number of observations in a component, default: 0 if E = "softmax" and 2 if other type of E-method is used .

converge a logical, if TRUE the EM algorithm is stopped if the reltol criterion is met and the current parameter estimate is returned. If FALSE the EM algorithm is run for maxiter iterations and the parametrizations with the maximum likelihood encountered during the EM algorithm is returned. Default: TRUE, changed to FALSE if E="stochmax".

N an integer indicating number of iteration used when the Kummer function is approximate, default: 30.

verbose a logical indicating whether to provide some output on algorithmic progress, default: FALSE.

References

Sablica, L., Hornik, K. and Leydold, J. (2025). watson: An R Package for Fitting Mixtures of Watson Distributions. Journal of Statistical Software, 115(4), 1--30. tools:::Rd_expr_doi("10.18637/jss.v115.i04")

Examples

Run this code

# \donttest{
## Generate a sample with two orthogonal circles (negative kappas)
a <- rmwat(n = 200, weights = c(0.5,0.5), kappa = c(-200,-200),
                        mu = matrix(c(1,1,1,-1,1,1),nrow = 3))
## Fit basic model
q <- watson(a, 2)
## Fit the models, giving the true categories
q <- watson(a, 2, ids=TRUE)
## Fit a model with hard-assignment, and 50 runs
q <- watson(a, 2, E = "hard", nruns = 50)
## Print details
q
## Extract coefficients
coef(q)
## Calculate likelihood for new data
a2 <- rmwat(n = 20, weights = c(0.5,0.5), kappa = c(-200,-200),
                        mu = matrix(c(1,1,1,-1,1,1),nrow = 3))
logLik(q, a2)
## Compare the fitted classes to the true ones:
table(True = attr(a, "id"), Fitted = predict(q))
# }

Run the code above in your browser using DataLab