mst.fit: Fitting multivariate skew-t distributions

Description

Fits a multivariate skew-t (MST) distribution to data, or fits a linear regression model with multivariate skew-t errors, using maximum likelihood estimation. The outcome is then displayed in graphical form.

Usage

mst.fit(X, y, freq, start, fixed.df=NA, plot.it=TRUE,
        trace=FALSE, ...)

Arguments

y

a matrix or a vector. If y is a matrix, its rows refer to observations, and its columns to components of the multivariate distribution. If y is a vector, it is converted to a one-column matrix, and a scalar skew-t distribution is fitted.

X

a matrix of covariate values. If missing, a one-column matrix of 1's is created; otherwise, it must have the same number of rows as y.

freq

a vector of weights. If missing, a vector of 1's is created; otherwise, its length must equal the number of rows of y.

fixed.df

a scalar value containing the degrees of freedom (df), if these must be taken as fixed, or NA (default value) if df is a parameter to be estimated.

start

a list containing the components beta, Omega, alpha, df of the type described below. The dp component of the list returned by a previous call has the required format.

plot.it

logical value which controls the graphical output (default=TRUE); see below for description.

trace

logical value which controls whether details of the algorithm's convergence are printed. If trace=TRUE, details are printed. Default value is FALSE.

...

additional parameters passed to mst.mle; in practice, the start parameter can be passed.
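
As an illustration of the format expected by the start argument, the following base R sketch builds a list with components beta, Omega, alpha and df. The helper make_start is hypothetical (it is not part of sn), and the values it produces are crude placeholders chosen only to show the shapes of the components, assuming beta has one row per column of X and one column per column of y.

```r
# Hypothetical helper, not part of sn: builds a list with the same
# component names and shapes as the 'dp' element returned by mst.fit.
make_start <- function(X, y, df = 10) {
  y <- as.matrix(y)
  X <- as.matrix(X)
  # least-squares coefficients, one row per covariate, one column per response
  beta <- matrix(qr.coef(qr(X), y), nrow = ncol(X), ncol = ncol(y))
  res <- y - X %*% beta
  # scale matrix of order ncol(y), taken here from the residual cross-products
  Omega <- crossprod(res) / nrow(y)
  # crude placeholder for the shape vector: sign of the residual skewness
  alpha <- sign(colMeans(res^3))
  list(beta = beta, Omega = Omega, alpha = alpha, df = df)
}

set.seed(1)
y <- cbind(rnorm(50), rnorm(50))
X <- cbind(1, runif(50))
st <- make_start(X, y)
str(st)
```

A list of this form could then be supplied as start=st; whether these particular values are a sensible starting point for a given data set is not guaranteed.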

Value

A list containing the following components:

call

a string containing the calling statement.

dp

a list containing the direct parameters beta, Omega, alpha, df. Here, beta is a matrix of regression coefficients with dim(beta)=c(ncol(X),ncol(y)), Omega is a covariance matrix of order ncol(y), alpha is a vector of shape parameters of length ncol(y), df is a positive scalar.

logL

log-likelihood evaluated at dp.

se

a list containing the components beta, alpha, info. Here, beta and alpha are the standard errors for the corresponding point estimates; info is the observed information matrix for the working parameter, as explained below.

optim

the list returned by the optimizer optim; see the documentation of this function for explanation of its components.

test.normality

a list with elements test and p.value, which are the value of the likelihood ratio test statistic for normality (i.e. test that all components of the shape parameter are 0), and the corresponding p-value.

Side Effects

Graphical output is produced if plot.it is TRUE, freq is missing, and a suitable graphics device is active. Three plots are produced, and the program pauses between successive plots, waiting for a key to be pressed.

The first plot uses the variable y if X is missing; otherwise it uses the residuals from the regression. The form of this plot depends on the value of d=ncol(y). If d=1, a histogram is plotted with the fitted distribution superimposed. If d>1, a matrix of scatterplots is produced, with the corresponding bivariate densities of the fitted distribution superimposed.

The second plot has two panels, each representing a QQ-plot of Mahalanobis distances. The first of these refers to the fitting of a multivariate normal distribution, a standard statistical procedure; the second panel gives the corresponding QQ-plot of suitable Mahalanobis distances for the multivariate skew-t fit.

The third plot is similar to the previous one, except that PP-plots are produced.
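
The normal-fit panels described above can be sketched in base R as follows. This is only an illustration of the standard procedure, not the code used by mst.fit: it assumes the built-in iris measurements as example data, uses the sample mean and covariance as the normal fit, and takes the chi-squared distribution with d degrees of freedom as the reference distribution for the squared distances. The skew-t panels are not reproduced here.

```r
# Example data (an assumption for illustration): four numeric columns of iris
Y <- as.matrix(iris[, 1:4])
n <- nrow(Y)
d <- ncol(Y)

# squared Mahalanobis distances from the fitted multivariate normal
d2 <- mahalanobis(Y, center = colMeans(Y), cov = cov(Y))
p <- ppoints(n)

op <- par(mfrow = c(1, 2))
# QQ-plot: ordered distances against chi-squared quantiles
plot(qchisq(p, df = d), sort(d2),
     xlab = "Chi-squared quantiles", ylab = "Ordered Mahalanobis distances",
     main = "QQ-plot, normal fit")
abline(0, 1, lty = 2)
# PP-plot: fitted probabilities against plotting positions
plot(p, pchisq(sort(d2), df = d),
     xlab = "Plotting positions", ylab = "Chi-squared probabilities",
     main = "PP-plot, normal fit")
abline(0, 1, lty = 2)
par(op)
```

Points falling close to the dashed identity line suggest that the normal fit is adequate; systematic departures are what the skew-t panels are meant to be compared against.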

Background

The family of multivariate skew-t distributions is an extension of the multivariate Student's t family, via the introduction of a shape parameter which regulates skewness; when the shape parameter is 0, the skew-t distribution reduces to the usual t distribution. When df=Inf, the distribution reduces to the multivariate skew-normal one; see dmsn. See the reference below for additional information.
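
The two limiting cases mentioned above can be checked numerically in the scalar case (d=1). The sketch below is a base R illustration, not the sn implementation: dst_sketch is a hypothetical helper that codes the scalar skew-t density in the form given by Azzalini and Capitanio (2003), with location xi, scale omega, shape alpha and degrees of freedom df, and it assumes a finite df.

```r
# Hypothetical helper, not part of sn: scalar skew-t density, finite df only
dst_sketch <- function(x, xi = 0, omega = 1, alpha = 0, df = 5) {
  z <- (x - xi) / omega
  2 / omega * dt(z, df) * pt(alpha * z * sqrt((df + 1) / (z^2 + df)), df + 1)
}

x <- seq(-4, 4, by = 0.5)

# shape 0: the density coincides with the Student's t density
stopifnot(isTRUE(all.equal(dst_sketch(x, alpha = 0, df = 5), dt(x, 5))))

# very large df: the density is close to the skew-normal density
sn_dens <- 2 * dnorm(x) * pnorm(3 * x)
stopifnot(isTRUE(all.equal(dst_sketch(x, alpha = 3, df = 1e6), sn_dens,
                           tolerance = 1e-4)))

# the density integrates to 1 (up to numerical error)
total <- integrate(dst_sketch, -Inf, Inf, alpha = 3, df = 5)$value
total
```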

Details

For computing the maximum likelihood estimates, mst.fit invokes mst.mle which does the actual computational work; then, mst.fit displays the results in graphical form. The documentation of mst.mle gives details of the numerical procedure for maximum likelihood estimation.

References

Azzalini, A. and Capitanio, A. (2003). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution. J. Roy. Statist. Soc. B 65, 367-389.

Examples

data(ais, package="sn")
attach(ais)
# a simple-sample case
b <- mst.fit(y=cbind(Ht,Wt))
#
# a regression case:
a <- mst.fit(X=cbind(1,Ht,Wt), y=bmi, control=list(x.tol=1e-6))
#
# refine the previous outcome
a1 <- mst.fit(X=cbind(1,Ht,Wt), y=bmi, control=list(x.tol=1e-9), start=a$dp)