greybox (version 0.6.3)

alm: Advanced Linear Model

Description

This function estimates a model based on the selected distribution.

Usage

alm(formula, data, subset, na.action, distribution = c("dnorm", "dlaplace",
  "ds", "dgnorm", "dlogis", "dt", "dalaplace", "dlnorm", "dllaplace", "dls",
  "dlgnorm", "dbcnorm", "dfnorm", "dinvgauss", "dpois", "dnbinom", "dbeta",
  "plogis", "pnorm"), loss = c("likelihood", "MSE", "MAE", "HAM", "LASSO",
  "RIDGE"), occurrence = c("none", "plogis", "pnorm"), ar = 0,
  parameters = NULL, fast = FALSE, ...)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.

data

a data frame or a matrix, containing the variables in the model.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The factory-fresh default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.

distribution

what density function to use in the process. The full name of the distribution should be provided here. Values beginning with "d" refer to the density function, while those beginning with "p" stand for "probability" (the cumulative distribution function). The names align with the names of the distribution functions in R. For example, see dnorm.

loss

the type of loss function used in the optimisation. loss can be:

  • likelihood - the model is estimated via maximisation of the likelihood of the function specified in distribution;

  • MSE (Mean Squared Error);

  • MAE (Mean Absolute Error);

  • HAM (Half Absolute Moment);

  • LASSO - use LASSO to shrink the parameters of the model;

  • RIDGE - use RIDGE to shrink the parameters of the model.

In case of LASSO / RIDGE, the variables are not normalised prior to the estimation, but the parameters are divided by the mean values of explanatory variables. Note that you are expected to provide the parameter lambda.

A user can also provide their own function here, making sure that it accepts the parameters actual, fitted and B. For example:

lossFunction <- function(actual, fitted, B, xreg)
    return(mean(abs(actual - fitted)))

and then pass it via loss=lossFunction.

See vignette("alm","greybox") for some details on losses and distributions.
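Putting this together, a minimal sketch of fitting with a custom loss (here a hand-rolled MAE, applied to the built-in mtcars data) might look as follows:

```r
library(greybox)

# A hand-rolled MAE loss; alm() calls it with actual, fitted, B (and xreg).
lossFunction <- function(actual, fitted, B, xreg){
    return(mean(abs(actual - fitted)))
}

ourModel <- alm(mpg ~ ., mtcars, loss=lossFunction)
summary(ourModel)
```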

occurrence

what distribution to use for the occurrence variable. Can be "none", in which case nothing happens; "plogis", in which case a logistic regression is estimated via alm() for the occurrence part; or "pnorm", in which case a probit model is constructed via alm() for the occurrence part. In both of the latter cases, the formula used is the same as the formula for the sizes. Finally, an "alm" model can be provided, and its estimates will be used in the model construction.

If this is not "none", then the model is estimated in two steps: 1. Occurrence part of the model; 2. Sizes part of the model (excluding zeroes from the data).
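As a sketch of the two-step setup (the data frame someData and its variables are hypothetical, and the response is assumed to contain zeroes):

```r
# Mixture model: zeroes are modelled via logistic regression,
# non-zero sizes via the log-normal distribution.
ourModel <- alm(y ~ x1 + x2, someData, distribution="dlnorm", occurrence="plogis")

# Alternatively, estimate the occurrence part separately and pass it in:
occurrenceModel <- alm(y ~ x1 + x2, someData, distribution="plogis")
ourModel <- alm(y ~ x1 + x2, someData, distribution="dlnorm",
                occurrence=occurrenceModel)
```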

ar

the order of AR to include in the model. Only non-seasonal orders are accepted.

parameters

vector of parameters of the linear model. If NULL, the parameters are estimated.

fast

if TRUE, then the function won't check whether the data has variability or whether the regressors are correlated. This might cause trouble, especially in cases of multicollinearity.

...

additional parameters to pass to distribution functions. This includes:

  • alpha - value for Asymmetric Laplace distribution;

  • size - the size for the Negative Binomial distribution;

  • nu - the number of degrees of freedom for Chi-Squared and Student's t;

  • lambda - the meta parameter for LASSO / RIDGE. Should be between 0 and 1, regulating the strength of shrinkage, where 0 means don't shrink parameters (use MSE) and 1 means shrink everything (ignore MSE);

  • lambdaBC - the lambda of the Box-Cox transform, used in the case of the Box-Cox Normal distribution;

  • FI=TRUE will make the function also produce the Fisher Information matrix, which can then be used to calculate the variances of the parameters of the model. This is used in the vcov method.

You can also pass parameters to the optimiser:

  1. B - the vector of starting values of parameters for the optimiser, should correspond to the ordering of the explanatory variables;

  2. algorithm - the algorithm to use in the optimisation ("NLOPT_LN_SBPLX" by default);

  3. maxeval - maximum number of evaluations to carry out (default is 100);

  4. maxtime - stop, when the optimisation time (in seconds) exceeds this;

  5. xtol_rel - the precision of the optimiser (the default is 1E-6);

  6. xtol_abs - the absolute precision of the optimiser (the default is 1E-8);

  7. ftol_rel - the stopping criterion in case of the relative change in the loss function (the default is 1E-4);

  8. ftol_abs - the stopping criterion in case of the absolute change in the loss function (the default is 0 - not used);

  9. print_level - the level of output for the optimiser (0 by default). If equal to 41, then the detailed results of the optimisation are returned.

You can read more about these parameters by running nloptr.print.options().
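For instance, the optimiser settings might be tweaked like this (someData is a hypothetical data frame, and the values are purely illustrative):

```r
# Allow more evaluations, tighten the tolerance and print optimiser details.
ourModel <- alm(y ~ x1 + x2, someData, distribution="dlaplace",
                maxeval=1000, xtol_rel=1e-8, print_level=41)
```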

Value

The function returns the final model of class "alm", which contains:

  • coefficients - estimated parameters of the model,

  • FI - Fisher Information of parameters of the model. Returned only when FI=TRUE,

  • fitted - fitted values,

  • residuals - residuals of the model,

  • mu - the estimated location parameter of the distribution,

  • scale - the estimated scale parameter of the distribution,

  • distribution - distribution used in the estimation,

  • logLik - log-likelihood of the model. Only returned when loss="likelihood" and in several other special cases of distribution and loss combinations (e.g. loss="MSE" with distribution="dnorm"),

  • loss - the type of the loss function used in the estimation,

  • lossFunction - the loss function, if the custom is provided by the user,

  • lossValue - the value of the loss function,

  • df.residual - number of degrees of freedom of the residuals of the model,

  • df - number of degrees of freedom of the model,

  • call - how the model was called,

  • rank - rank of the model,

  • data - data used for the model construction,

  • occurrence - the occurrence model used in the estimation,

  • B - the value of the optimised parameters. Typically, this is a duplicate of coefficients,

  • other - the list of all the other parameters either passed to the function or estimated in the process, but not included in the standard output (e.g. alpha for Asymmetric Laplace).
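The returned object can be inspected either directly or via the standard methods; a brief sketch using the built-in mtcars data:

```r
library(greybox)

ourModel <- alm(mpg ~ wt + hp, mtcars, distribution="dnorm")
coef(ourModel)        # same as ourModel$coefficients
ourModel$scale        # the estimated scale of the distribution
logLik(ourModel)      # log-likelihood, when available
vcov(ourModel)        # covariance matrix of the parameters
```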

Details

This is a function similar to lm, but it supports several distributions in addition to the normal. These include:

  1. dnorm - Normal distribution,

  2. dlaplace - Laplace distribution,

  3. ds - S-distribution,

  4. dgnorm - Generalised Normal distribution,

  5. dlogis - Logistic Distribution,

  6. dt - T-distribution,

  7. dalaplace - Asymmetric Laplace distribution,

  8. dlnorm - Log normal distribution,

  9. dllaplace - Log Laplace distribution,

  10. dls - Log S-distribution,

  11. dlgnorm - Log Generalised Normal distribution,

  12. dfnorm - Folded normal distribution,

  13. dbcnorm - Box-Cox normal distribution,

  14. dinvgauss - Inverse Gaussian distribution,

  15. dbeta - Beta distribution,

  16. dpois - Poisson Distribution,

  17. dnbinom - Negative Binomial Distribution,

  18. plogis - Cumulative Logistic Distribution,

  19. pnorm - Cumulative Normal distribution.

This function can be considered an analogue of glm, but with a focus on time series. This is why, for example, the function has the ar parameter and produces time series analysis plots with plot(alm(...)).

This function is slower than lm, because it relies on likelihood estimation of parameters, Hessian calculation and matrix multiplication. So think twice before using distribution="dnorm" here.

The estimation is done using the likelihood of the respective distributions.

See more details and examples in the vignette "ALM": vignette("alm","greybox")

See Also

stepwise, lmCombine, xregTransformer

Examples

### An example with mtcars data and factors
mtcars2 <- within(mtcars, {
   vs <- factor(vs, labels = c("V", "S"))
   am <- factor(am, labels = c("automatic", "manual"))
   cyl  <- factor(cyl)
   gear <- factor(gear)
   carb <- factor(carb)
})
# The standard model with Log Normal distribution
ourModel <- alm(mpg~., mtcars2[1:30,], distribution="dlnorm")
summary(ourModel)
plot(ourModel)
# Produce predictions with the one sided interval (upper bound)
predict(ourModel, mtcars2[-c(1:30),], interval="p", side="u")


### Artificial data for the other examples
xreg <- cbind(rlaplace(100,10,3),rnorm(100,50,5))
xreg <- cbind(100+0.5*xreg[,1]-0.75*xreg[,2]+rlaplace(100,0,3),xreg,rnorm(100,300,10))
colnames(xreg) <- c("y","x1","x2","Noise")

# An example with Laplace distribution
ourModel <- alm(y~x1+x2, xreg, subset=c(1:80), distribution="dlaplace")
summary(ourModel)
plot(predict(ourModel,xreg[-c(1:80),]))

# And another one with Asymmetric Laplace distribution (quantile regression)
# with optimised alpha
ourModel <- alm(y~x1+x2, xreg, subset=c(1:80), distribution="dalaplace")
summary(ourModel)
plot(predict(ourModel,xreg[-c(1:80),]))

# An example with AR(1) order
ourModel <- alm(y~x1+x2, xreg, subset=c(1:80), distribution="dnorm", ar=1)
summary(ourModel)
plot(predict(ourModel,xreg[-c(1:80),]))

### Examples with the count data
xreg[,1] <- round(exp(xreg[,1]-70),0)

# Negative Binomial distribution
ourModel <- alm(y~x1+x2, xreg, subset=c(1:80), distribution="dnbinom")
summary(ourModel)
predict(ourModel,xreg[-c(1:80),],interval="p",side="u")

# Poisson distribution
ourModel <- alm(y~x1+x2, xreg, subset=c(1:80), distribution="dpois")
summary(ourModel)
predict(ourModel,xreg[-c(1:80),],interval="p",side="u")


### Examples with binary response variable
xreg[,1] <- round(xreg[,1] / (1 + xreg[,1]),0)

# Logistic distribution (logit regression)
ourModel <- alm(y~x1+x2, xreg, subset=c(1:80), distribution="plogis")
summary(ourModel)
plot(predict(ourModel,xreg[-c(1:80),],interval="c"))

# Normal distribution (probit regression)
ourModel <- alm(y~x1+x2, xreg, subset=c(1:80), distribution="pnorm")
summary(ourModel)
plot(predict(ourModel,xreg[-c(1:80),],interval="p"))

