selm: Fitting linear models with skew-elliptical error term

Description

Function selm fits a linear model with skew-elliptical error term. The term skew-elliptical distribution is an abbreviated equivalent of skew-elliptically contoured (SEC) distribution. The function works for univariate and multivariate response variables.

Usage

selm(formula, family = "SN", data, weights, subset, na.action, 
  start = NULL, fixed.param = list(), method = "MLE",  penalty=NULL, 
  offset, model = TRUE, x = FALSE, y = FALSE,  ...)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted, using the same syntax used for the similar parameter of e.g. "

family

a character string which selects the parametric family of SEC type assumed for the error term. It must one of "SN" (default), "ST" or "SC", which correspond to the skew-normal, the skew-<

data

an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which selm is called.

weights

a numeric vector of weights associated to individual observations. Weights are supposed to represent frequencies, hence must be non-negative integers (not all 0) and length(weights) must equal the number of observations. If not assi

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options. The factory-fresh

start

a vector (in the univariate case) or a list (in the multivariate case) of initial values for the search of the parameter estimates. If start=NULL (default), initial values are selected by the procedure.

fixed.param

a list of assignments of parameter values which must be kept fixed in the estimation process. Currently, there only two types of admissible constraint: one is to set alpha=0 to impose a symmetry condition of the distribution; th

method

a character string which selects the estimation method to be used for fitting. Currently two options exist: "MLE" (default) and "MPLE", corresponding to standard maximum likelihood and maximum penalized likekelihood esti

penalty

a character string which denotes the penalty function to be subtracted to the log-likelihood function, when method="MPLE"; if penalty=NULL (default), a pre-defined function is adopted. See Details for a

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more

model, x, y

logicals. If TRUE, the corresponding components of the fit are returned.

...

optional control parameters, as follows.

trace: a logical value which indicates whether intermediate evaluations of the optimization process are printed (default:FALSE).
info.type: a charac

Value

an S4 object of class selm or mselm, depending on whether the response variable of the fitted model is univariate or multivariate. These objects are described in the selm class.

encoding

UTF-8

concept

regression
skew-elliptical distribution

Warning

The estimates are obtained by numerical optimization methods and, as usual in similar cases, there is no guarantee that the maximum of the objective function is achieved. Both consideration of model simplicity and numerical experience indicate that models with SN error terms generally produce more reliable results compared to those with the ST family. Take into account that models involving a traditional Student's $t$ distribution with unknown degres of freedom can already be problematic; the presence of the (multivariate) slant parameter $\alpha$ in the ST family cannot make things any simpler. Consequently, care must be exercised, especially so if one works with the (multivariate) ST family. Consider re-fitting a model with different starting values and, in the ST case, building the profile log-likelihood for a range of $\nu$ values. Details on the numerical optimization which has produced object obj can be estracted with slot(obj, "opt.method"). Be aware that occasionally optim and nlminb declare successful completion of a regular minimization problem at a point where the Hessian matrix is not positive-definite. A case of this sort is presented in the final portion of the examples below.

Details

By default, selm fits the selected model by maximum likelihood estimation (MLE), making use of some numerical optimization method. Maximization is performed in one parameterization, usually DP, and then the estimates are mapped to other parameter sets, CP and pseudo-CP; see dp2cp for more information on parameterizations. These parameter transformations are carried out trasparently to the user. The observed information matrix is used to obtain the estimated variance matrix of the MLE's and from this the standard errors. Background information on MLE in the context of SEC distributions is provided by Azzalini and Capitanio (2014); see specifically Chapter 3, Sections 4.3, 5.2, 6.2.5--6. For additional information, see the original research work referenced therein. Although the density functionof SEC distributions are expressed using DP parameter sets, the methods associated to the objects created by this function communicate, by default, their outcomes in the CP parameter set, or its variant form pseudo-CP when CP does not exist; the Note at summary.selm explains why. A more detailed discussion is available in Sections 3.1.4--6 and 5.2.3 of Azzalini and Capitanio (2014) and in Section 4 of Arellano-Valle and Azzalini (2008). There is a known open issue which affects computation of the information matrix of the multivariate skew-normal distribution when the slant parameter $\alpha$ approaches the null vector; see p.149 of Azzalini and Capitanio (2014). Consequently, if a model with multivariate response is fitted with family="SN" and the estimate alpha of $\alpha$ is at the origin or neary so, the information matrix and the standard errors are not computed and a warning message is issued. In this unusual circumstance, a simple work-around is to re-fit the model with family="ST", which will work except in remote cases when (i) the estimated degrees of freedom nu diverge and (ii) still alpha remains at the origin.

The optional argument fixed.param=list(alpha=0) imposes the constraint $\alpha=0$ in the estimation process; in the multivariate case, the expression is interpreted in the sense that all components of the vector $alpha$ are zero, which implies symmetry of the error distribution, irrespectively of the parameterization subsequently adopted for summaries and diagnostics. When this restriction is selected, the estimation method cannot be set to "MPLE". Under the constraint $\alpha=0$, if family="SN", the model is fitted similarly to lm, except that here MLE is used for estimation of the covariance matrix. If family="ST" or family="SC", a symmetric Student's $t$ of Cauchy distribution is adopted. Under the constraint $\alpha=0$, the location parameter $\xi$ coincides with the mode and the mean of the distribution, when the latter exists; in addition, when the covariance matrix exists, it differs from $\Omega$ only by a multiplicative factor. For this reason, the summaries of a model of this sort automatically adopt the DP parametrization. The other possible form of constraint allows to fix the degrees of freedom when family="ST". The two constraints can be combined writing, for instance, fixed.param=list(alpha=0, nu=6). The constraint nu=1 is equivalent to select family="SC". In practice, an expression of type fixed.param=list(..) can be abbreviated to fixed=list(..). In some cases, especially for small sample size, the MLE occurs on the frontier of the parameter space, leading to DP estimates with alpha=Inf or to a similar situation in the multivariate case or in an alternative parameterization. Such outcome is regared by many as unsatisfactory; surely it prevents using the observed information matrix to compute standard errors. This problem motivates the use of maximum penalized likelihood estimation (MPLE), where the regular log-likelihood function $\log~L$ is penalized by subtracting an amount $Q$, say, increasingly large as $|\alpha|$ increases. Hence the function which is maximized at the optimization stage is now $\log\,L~-~Q$. If method="MPLE" and penalty=NULL, the default function Qpenalty is used, which implements the penalization: $$Q(\alpha) = c_1 \log(1 + c_2 \alpha_*^2)$$ where $c_1$ and $c_2$ are positive constants, which depends on the degrees of freedom nu in the ST case, $$\alpha_*^2 = \alpha^\top \bar\Omega \alpha$$ and $\bar\Omega$ denotes the correlation matrix associated to the scale matrix Omega described in connection with makeSECdistr. In the univariate case $\bar\Omega=1$, so that $\alpha_*^2=\alpha^2$. Further information on MPLE and this choice of the penalty function is given in Section 3.1.8 and p.111 of Azzalini and Capitanio (2014); for a more detailed account, see Azzalini and Arellano-Valle (2013) and references therein.

It is possible to change the penalty function, to be declared via the argument penalty. For instance, if the calling statement includes penalty="anotherQ", the user must have defined

anotherQ <- function(alpha_etc, nu = NULL, der = 0)

with the following arguments.

alpha_etc: in the univariate case, a single valuealpha; in the multivariate case, a two-component list whose first component is the vectoralpha, the second one is matrix equal tocov2cor(Omega). % \eqn{\bar\Omega}{corOmega}.
nu: degrees of freedom, only relevant iffamily="ST".
der: a numeric value which indicates the required order of derivation; ifder=0(default value), only the penaltyQneeds to be retuned by the function; ifder=1,attr(Q, "der1")must represent the first order derivative ofQwith respect toalpha; ifder=2, alsoattr(Q, "der2")must be assigned, containing the second derivative (only required in the univariate case).

This function must return a single numeric value, possibly with required attributes when is called with der>1. Since sn imports functions grad and hessian from package numDeriv, one can rely on them for numerical evaluation of the derivatives, if they are not available in an explicit form.

This penalization scheme allows to introduce a prior distribution $\pi$ for $\alpha$ by setting $Q=-\log\pi$, leading to a maximum a posteriori estimate in the stated sense. See Qpenalty for more information and an illustration. The actual computations are not performed within selm which only sets-up ingredients for work of selm.fit and other functions further below this one. See selm.fit for more information.

References

Arellano-Valle, R. B., and Azzalini, A. (2008). The centred parametrization for the multivariate skew-normal distribution. J. Multiv. Anal. 99, 1362--1382. Corrigendum: 100 (2009), 816.

Azzalini, A. with the collaboration of Capitanio, A. (2014). The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.

Azzalini, A. and Arellano Valle, R. V. (2013, available on line 30 June 2012). Maximum penalized likelihood estimation for skew-normal and skew-t distributions. J. Stat. Planning & Inference 143, 419--433.

Examples

Run this code

data(ais)
m1 <- selm(log(Fe) ~ BMI + LBM, family="SN", data=ais)
print(m1)
summary(m1)
s<- summary(m1, "DP", cov=TRUE, cor=TRUE)
plot(m1)
plot(m1, param.type="DP")
logLik(m1)
coef(m1)
coef(m1, "DP")
var <- vcov(m1)
#
m1a <- selm(log(Fe) ~ BMI + LBM, family="SN", method="MPLE", data=ais)
m1b <- selm(log(Fe) ~ BMI + LBM, family="ST", fixed.param=list(nu=8), data=ais)
#
data(barolo)
attach(barolo)
A75 <- (reseller=="A" & volume==75)
logPrice <- log(price[A75],10) 
m <- selm(logPrice ~ 1, family="ST")
summary(m)
plot(m, which=2, col=4, main="Barolo log10(price)")
# cfr Figure 4.7 of Azzalini & Capitanio (2014), p.107
detach(barolo)
#-----
# examples with multivariate response
#
m3 <- selm(cbind(BMI, LBM) ~ WCC + RCC, family="SN", data=ais)
plot(m3, col=2, which=2)
summary(m3, "dp")
coef(m3)
coef(m3, vector=FALSE)
#
data(wines)
m28 <- selm(cbind(chloride, glycerol, magnesium) ~ 1, family="ST", 
            subset=(wine=="Grignolino"), data=wines)            
dp28 <- coef(m28, "DP", vector=FALSE) 
pcp28 <- coef(m28, "pseudo-CP", vector=FALSE) 
plot(m28)
#
m4 <- selm(cbind(alcohol,sugar)~1, family="ST", data=wines)
m5 <- selm(cbind(alcohol,sugar)~1, family="ST", data=wines, fixed=list(alpha=0))
print(1 - pchisq(2*as.numeric(logLik(m4)-logLik(m5)), 2)) # test for symmetry
#
# illustrate final passage of 'Warning' section above
m31 <- selm(cbind(BMI, LBM)~ Ht + Wt, family="ST", data=ais)
# Warning message...
slot(m31, "opt.method")$convergence
m32 <- selm(cbind(BMI, LBM) ~ Ht + Wt, family="ST", data=ais, opt.method="BFGS")
# Warning message...
slot(m32, "opt.method")$convergence

Run the code above in your browser using DataLab