selm fits a linear model
with skew-elliptical error term.
The term selm(formula, family = "SN", data, weights, subset, na.action,
start = NULL, fixed.param = list(), method = "MLE", penalty=NULL,
offset, model = TRUE, x = FALSE, y = FALSE, ...)"formula"
(or one that can be coerced to that class): a symbolic description of the
model to be fitted, using the same syntax used for the similar parameter of
e.g. ""SN" (default), "ST" or "SC", which correspond to the
skew-normal, the skew-<data, the variables are taken from
environment(formula), typically the environment from which
selm is called.length(weights) must equal the
number of observations. If not assiNAs. The default is set by the
na.action setting of options.
The start=NULL (default), initial values are selected by
the procedure.alpha=0 to impose a symmetry condition of the distribution;
th"MLE" (default) and
"MPLE", corresponding to standard maximum likelihood and maximum
penalized likekelihood estimethod="MPLE"; if
penalty=NULL (default), a pre-defined function is adopted. See
NULL or a numeric vector of length equal to the number of
cases. One or more TRUE, the corresponding components
of the fit are returned.trace: a logical value which indicates whether intermediate
evaluations of the optimization process are printed (default:FALSE).info.type: a characselm or mselm, depending on whether
the response variable of the fitted model is univariate or multivariate.
These objects are described in the selm class.obj can be estracted with slot(obj, "opt.method").
Be aware that occasionally optim and nlminb declare successful
completion of a regular minimization problem at a point where the Hessian
matrix is not positive-definite. A case of this sort is presented in the
final portion of the examples below.selm fits the selected model by maximum
likelihood estimation (MLE), making use of some numerical
optimization method. Maximization is performed in one
parameterization, usually DP, and then the estimates are mapped to
other parameter sets, CP and pseudo-CP;
see dp2cp for more information on parameterizations.
These parameter transformations are carried out trasparently to the user.
The observed information matrix is used to obtain the estimated variance
matrix of the MLE's and from this the standard errors.
Background information on MLE in the context of SEC
distributions is provided by Azzalini and Capitanio (2014);
see specifically Chapter 3, Sections 4.3, 5.2, 6.2.5--6. For additional
information, see the original research work referenced therein.
Although the density functionof SEC distributions are expressed using
DP parameter sets, the methods associated to the objects created
by this function communicate, by default, their outcomes in the CP
parameter set, or its variant form pseudo-CP when CP
does not exist; the summary.selm explains why.
A more detailed discussion is available in Sections 3.1.4--6 and 5.2.3 of
Azzalini and Capitanio (2014) and in Section 4 of Arellano-Valle and
Azzalini (2008).
There is a known open issue which affects computation of the information
matrix of the multivariate skew-normal distribution when the slant
parameter $\alpha$ approaches the null vector; see p.149 of
Azzalini and Capitanio (2014). Consequently, if a model with
multivariate response is fitted with family="SN" and the estimate
alpha of $\alpha$ is at the origin or neary so, the
information matrix and the standard errors are not computed and a
warning message is issued. In this unusual circumstance, a simple
work-around is to re-fit the model with family="ST", which will
work except in remote cases when (i) the estimated degrees of freedom
nu diverge and (ii) still alpha remains at the origin. The optional argument fixed.param=list(alpha=0) imposes the
constraint $\alpha=0$ in the estimation process; in the multivariate
case, the expression is interpreted in the sense that all components
of the vector $alpha$ are zero, which implies symmetry of the
error distribution, irrespectively of the parameterization
subsequently adopted for summaries and diagnostics.
When this restriction is selected, the estimation method cannot be
set to "MPLE". Under the constraint $\alpha=0$,
if family="SN", the model is fitted similarly to lm, except
that here MLE is used for estimation of the covariance matrix.
If family="ST" or family="SC", a symmetric Student's $t$
of Cauchy distribution is adopted.
Under the constraint $\alpha=0$, the location parameter $\xi$
coincides with the mode and the mean of the distribution, when the latter
exists; in addition, when the covariance matrix exists, it differs from
$\Omega$ only by a multiplicative factor. For this reason,
the summaries of a model of this sort automatically adopt the DP
parametrization.
The other possible form of constraint allows to fix the degrees of
freedom when family="ST". The two constraints can be combined
writing, for instance, fixed.param=list(alpha=0, nu=6).
The constraint nu=1 is equivalent to select family="SC".
In practice, an expression of type fixed.param=list(..) can be
abbreviated to fixed=list(..).
In some cases, especially for small sample size, the MLE occurs on
the frontier of the parameter space, leading to DP estimates with
alpha=Inf or to a similar situation in the multivariate case or in an
alternative parameterization. Such outcome is regared by many as
unsatisfactory; surely it prevents using the observed information matrix to
compute standard errors. This problem motivates the use of maximum penalized
likelihood estimation (MPLE), where the regular log-likelihood
function $\log~L$ is penalized by subtracting an amount
$Q$, say, increasingly large as $|\alpha|$ increases.
Hence the function which is maximized at the optimization stage is now
$\log\,L~-~Q$. If method="MPLE" and
penalty=NULL, the default function Qpenalty is used,
which implements the penalization:
$$Q(\alpha) = c_1 \log(1 + c_2 \alpha_*^2)$$
where $c_1$ and $c_2$ are positive constants, which
depends on the degrees of freedom nu in the ST case,
$$\alpha_*^2 = \alpha^\top \bar\Omega \alpha$$
and $\bar\Omega$ denotes the correlation matrix
associated to the scale matrix Omega described in connection with
makeSECdistr. In the univariate case
$\bar\Omega=1$,
so that $\alpha_*^2=\alpha^2$. Further information
on MPLE and this choice of the penalty function is given in
Section 3.1.8 and p.111 of Azzalini and Capitanio (2014); for a more
detailed account, see Azzalini and Arellano-Valle (2013) and references
therein.
It is possible to change the penalty function, to be declared via the
argument penalty. For instance, if the calling statement includes
penalty="anotherQ", the user must have defined
anotherQ <- function(alpha_etc, nu = NULL, der = 0)
with the following arguments.
alpha_etc: in the univariate case, a single valuealpha;
in the multivariate case, a two-component list whose first component is
the vectoralpha, the second one is matrix equal tocov2cor(Omega).
% \eqn{\bar\Omega}{corOmega}.nu: degrees of freedom, only relevant iffamily="ST".der: a numeric value which indicates the required order of
derivation; ifder=0(default value), only the penaltyQneeds to be retuned by the function;
ifder=1,attr(Q, "der1")must represent the
first order derivative ofQwith respect toalpha; ifder=2, alsoattr(Q, "der2")must be assigned, containing
the second derivative (only required in the univariate case).der>1.
Since grad and
hessian from package This penalization scheme allows to introduce a prior distribution
$\pi$ for $\alpha$ by setting $Q=-\log\pi$,
leading to a maximum a posteriori estimate in the stated sense.
See Qpenalty for more information and an illustration.
The actual computations are not performed within selm which only
sets-up ingredients for work of selm.fit and other functions
further below this one. See selm.fit for more information.
Azzalini, A. with the collaboration of Capitanio, A. (2014). The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.
Azzalini, A. and Arellano Valle, R. V. (2013, available on line 30 June 2012). Maximum penalized likelihood estimation for skew-normal and skew-t distributions. J. Stat. Planning & Inference 143, 419--433.
selm -classfor classes"selm"and"mselm",summary.selmfor summaries,plot.selmfor plots,residuals.selmfor residuals and fitted valuescoef,logLik,vcov.selm.fitand those further downQpenaltyextractSECdistrto extract theSECerror distribution from an object returned byselmdata(ais)
m1 <- selm(log(Fe) ~ BMI + LBM, family="SN", data=ais)
print(m1)
summary(m1)
s<- summary(m1, "DP", cov=TRUE, cor=TRUE)
plot(m1)
plot(m1, param.type="DP")
logLik(m1)
coef(m1)
coef(m1, "DP")
var <- vcov(m1)
#
m1a <- selm(log(Fe) ~ BMI + LBM, family="SN", method="MPLE", data=ais)
m1b <- selm(log(Fe) ~ BMI + LBM, family="ST", fixed.param=list(nu=8), data=ais)
#
data(barolo)
attach(barolo)
A75 <- (reseller=="A" & volume==75)
logPrice <- log(price[A75],10)
m <- selm(logPrice ~ 1, family="ST")
summary(m)
plot(m, which=2, col=4, main="Barolo log10(price)")
# cfr Figure 4.7 of Azzalini & Capitanio (2014), p.107
detach(barolo)
#-----
# examples with multivariate response
#
m3 <- selm(cbind(BMI, LBM) ~ WCC + RCC, family="SN", data=ais)
plot(m3, col=2, which=2)
summary(m3, "dp")
coef(m3)
coef(m3, vector=FALSE)
#
data(wines)
m28 <- selm(cbind(chloride, glycerol, magnesium) ~ 1, family="ST",
subset=(wine=="Grignolino"), data=wines)
dp28 <- coef(m28, "DP", vector=FALSE)
pcp28 <- coef(m28, "pseudo-CP", vector=FALSE)
plot(m28)
#
m4 <- selm(cbind(alcohol,sugar)~1, family="ST", data=wines)
m5 <- selm(cbind(alcohol,sugar)~1, family="ST", data=wines, fixed=list(alpha=0))
print(1 - pchisq(2*as.numeric(logLik(m4)-logLik(m5)), 2)) # test for symmetry
#
# illustrate final passage of 'Warning' section above
m31 <- selm(cbind(BMI, LBM)~ Ht + Wt, family="ST", data=ais)
# Warning message...
slot(m31, "opt.method")$convergence
m32 <- selm(cbind(BMI, LBM) ~ Ht + Wt, family="ST", data=ais, opt.method="BFGS")
# Warning message...
slot(m32, "opt.method")$convergenceRun the code above in your browser using DataLab