sn.em: Fitting Skew-normal variables using the EM algorithm

Description

Fits a skew-normal (SN) distribution to data, or fits a linear regression model with skew-normal errors, using the EM algorithm to locate the maximum likelihood estimate (MLE). Estimation can be global, or some components of the parameter vector can be held fixed.

Usage

sn.em(X, y, fixed, p.eps=0.0001, l.eps=0.01, trace=FALSE, data=FALSE)

Arguments

y

a vector containing the observed variable. This is the response variable in the case of linear regression.

X

a matrix of explanatory variables. If X is missing, a one-column matrix of 1's is created. If X is supplied and an intercept term is required, X must include a column of 1's.

fixed

a vector of length 3, indicating which components of the parameter vector must be regarded as fixed. With fixed=c(NA,NA,NA), which is the default setting, a global maximization is performed. If the 3rd component is given a value, then maximizat

p.eps

numerical value which regulates the parameter convergence tolerance.

l.eps

numerical value which regulates the log-likelihood convergence tolerance.

trace

logical value which controls printing of the algorithm's convergence. If trace=TRUE, details are printed. Default value is trace=FALSE.

data

logical value. If data=TRUE, the returned list includes the original data. Default value is data=FALSE.

Value

a list with the following components:
dp

a vector of the direct parameters, as explained in the references below.

cp

a vector of the centred parameters, as explained in the references below.

logL

the log-likelihood at convergence.

data

optionally (if data=TRUE), a list containing X and y, as supplied on input, and a vector of residuals, which should have an approximate SN distribution with location=0 and scale=1 in the direct parametrization.

Background

Background information on the SN distribution is given by Azzalini (1985). See Azzalini and Capitanio (1999) for a more detailed discussion of the direct and centred parametrizations.
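As a rough illustration of the distribution being fitted, the sketch below writes out the SN density of Azzalini (1985) in the direct parametrization using base R only. The names dsn_sketch and loglik_sketch are hypothetical helpers introduced here for illustration; they are not part of the sn package and are not how sn.em is implemented internally. The argument order (location, scale, shape) is an assumption about how the three direct parameters are labelled.

```r
# Hypothetical helper (not part of the sn package): SN density in the
# direct parametrization, with xi = location, omega = scale, alpha = shape.
dsn_sketch <- function(x, xi = 0, omega = 1, alpha = 0) {
  z <- (x - xi) / omega
  2 / omega * dnorm(z) * pnorm(alpha * z)
}

# Hypothetical helper: log-likelihood of a sample y at a given parameter triple.
loglik_sketch <- function(y, xi, omega, alpha) {
  sum(log(dsn_sketch(y, xi, omega, alpha)))
}

# The density should integrate to 1 for any valid parameter choice.
total <- integrate(dsn_sketch, -Inf, Inf, xi = 1, omega = 2, alpha = 3)$value

# With alpha = 0 the SN density reduces to the normal density.
gap <- max(abs(dsn_sketch(-3:3) - dnorm(-3:3)))
```

A function such as loglik_sketch can serve as an independent check on a fitted model with no covariates: evaluated at the estimated direct parameters, it should be close to the reported log-likelihood.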

Details

The function works using the direct parametrization; on convergence, the output is then given in both parametrizations.
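The relation between the two parametrizations can be sketched for the simple case with no covariates. The function dp_to_cp below is a hypothetical helper, not part of the sn package; it assumes the direct parameters are ordered (location, scale, shape) and returns the mean, standard deviation and skewness index implied by the standard SN moment formulas. The names given to the returned components are also an assumption.

```r
# Hypothetical helper (not part of the sn package): map direct parameters
# (xi = location, omega = scale, alpha = shape) to centred parameters
# (mean, standard deviation, skewness index) for a scalar SN variable.
dp_to_cp <- function(dp) {
  xi <- dp[1]
  omega <- dp[2]
  alpha <- dp[3]
  delta <- alpha / sqrt(1 + alpha^2)    # lies strictly between -1 and 1
  m <- sqrt(2 / pi) * delta             # mean of the standardized SN variable
  mu <- xi + omega * m
  sigma <- omega * sqrt(1 - m^2)
  gamma1 <- (4 - pi) / 2 * m^3 / (1 - m^2)^1.5
  c(mean = unname(mu), s.d. = unname(sigma), skewness = unname(gamma1))
}

# With shape 0 the SN distribution is normal, so location and scale
# coincide with the mean and standard deviation, and the skewness is 0.
cp_normal <- dp_to_cp(c(0, 1, 0))

# A skewed case: location 1, scale 2, shape 3.
cp_skewed <- dp_to_cp(c(1, 2, 3))
```

Because delta is bounded by 1 in absolute value, the skewness index returned by this mapping stays within roughly (-0.995, 0.995), which is one reason the centred parameters behave differently from the direct ones near the normal case.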

This function is based on the EM algorithm; it is generally quite slow, but it appears to be very robust. See sn.mle for an alternative method, which also returns standard errors.

References

Azzalini, A. (1985). A class of distributions which includes the normal ones. Scand. J. Statist. 12, 171-178.

Azzalini, A. and Capitanio, A. (1999). Statistical applications of the multivariate skew-normal distribution. J. Roy. Statist. Soc. B 61, 579-602.

Examples


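# Load the Australian Institute of Sport data shipped with the sn package
data(ais, package="sn")
attach(ais)
# Fit an SN distribution to bmi (no covariates)
a <- sn.em(y=bmi)
# Regress bmi on lbm and lbm^2; the column of 1's supplies the intercept
a <- sn.em(X=cbind(1,lbm,lbm^2), y=bmi)
# Build the design matrix with model.matrix, which adds the intercept column
M <- model.matrix(~lbm+I(ais$sex))
b <- sn.em(M, bmi)
# Hold the 2nd and 3rd components of the parameter vector fixed at 2 and 3
fit <- sn.em(y=bmi, fixed=c(NA, 2, 3), l.eps=0.001)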
