greybox (version 0.3.3)

lmCombine: Combine regressions based on information criteria

Description

The function combines the parameters of linear regressions of the first variable on all the other variables in the provided data.

Usage

lmCombine(data, ic = c("AICc", "AIC", "BIC", "BICc"),
  bruteForce = FALSE, silent = TRUE, distribution = c("dnorm",
  "dfnorm", "dlnorm", "dlaplace", "ds", "dchisq", "dlogis", "plogis",
  "pnorm"))

Arguments

data

Data frame containing the dependent variable in the first column and the explanatory variables in the rest.

ic

Information criterion to use.

bruteForce

If TRUE, then all the possible models are generated and combined. Otherwise, the best model is found first, after which the models in its neighbourhood are produced and combined.

silent

If FALSE, then everything is printed out. If TRUE, then nothing is printed.

distribution

Distribution to pass to alm().
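
For example, the function can be asked to use the log-normal distribution for a strictly positive response (an illustration only, using the in-sample data generated in the Examples below):

ourModel <- lmCombine(inSample, ic="AICc", distribution="dlnorm")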

Value

The function returns the final model of the class "greyboxC" - a list containing the following variables:

  • coefficients - combined parameters of the model,

  • se - combined standard errors of the parameters of the model,

  • actuals - actual values of the response variable,

  • fitted.values - the fitted values,

  • residuals - residuals of the model,

  • distribution - distribution used in the estimation,

  • logLik - combined log-likelihood of the model,

  • IC - the values of the combined information criterion,

  • df.residual - number of degrees of freedom of the residuals of the combined model,

  • df - number of degrees of freedom of the combined model,

  • importance - importance of the parameters,

  • call - call used in the function,

  • rank - rank of the combined model,

  • data - the data used in the model,

  • mu - the location parameter of the distribution,

  • combination - the table indicating which variables were used in each of the models and what weight each model received.
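
These elements can be accessed in the usual way once the model is estimated (a minimal sketch, assuming "ourModel" is the object produced in the Examples below):

ourModel$importance      # weights-based importance of each variable
ourModel$coefficients    # the combined parameters of the model
summary(ourModel)        # summary based on the combined standard errors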

Details

The algorithm uses alm() to fit different models and then combines them based on the selected IC. The parameters are combined under the assumption that any parameter absent from a model is equal to zero in that model. Thus, there is a shrinkage effect in the combination.
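
The weights in such combinations are typically calculated from IC differences, following Burnham and Anderson (2002). A minimal sketch of this idea (hypothetical IC values; this is not the internal greybox code):

ICs <- c(102.1, 103.5, 107.9)              # ICs of three hypothetical models
icWeights <- exp(-0.5 * (ICs - min(ICs)))  # relative likelihoods of the models
icWeights <- icWeights / sum(icWeights)    # normalised IC weights
# Each parameter is then the weighted sum of its estimates across the models,
# with zeros used for the models that do not include it.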

References

  • Burnham, Kenneth P. and Anderson, David R. (2002). Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer-Verlag New York. DOI: 10.1007/b97636 (https://doi.org/10.1007/b97636).

See Also

step, xregExpander, stepwise

Examples

### Simple example
# Generate two explanatory variables, a response that depends on them
# and a pure noise variable
xreg <- cbind(rnorm(100, 10, 3), rnorm(100, 50, 5))
xreg <- cbind(100 + 0.5*xreg[,1] - 0.75*xreg[,2] + rnorm(100, 0, 3), xreg, rnorm(100, 300, 10))
colnames(xreg) <- c("y", "x1", "x2", "Noise")
inSample <- xreg[1:80, ]
outSample <- xreg[-c(1:80), ]
# Combine all the possible models
ourModel <- lmCombine(inSample, bruteForce=TRUE)
predict(ourModel, outSample)
plot(predict(ourModel, outSample))

### Fat regression example
# 100 candidate variables with only 40 in-sample observations
xreg <- matrix(rnorm(5000, 10, 3), 50, 100)
xreg <- cbind(100 + 0.5*xreg[,1] - 0.75*xreg[,2] + rnorm(50, 0, 3), xreg, rnorm(50, 300, 10))
colnames(xreg) <- c("y", paste0("x", c(1:100)), "Noise")
inSample <- xreg[1:40, ]
outSample <- xreg[-c(1:40), ]
# Combine only the models close to the optimal
ourModel <- lmCombine(inSample, ic="BICc", bruteForce=FALSE)
summary(ourModel)
plot(predict(ourModel, outSample))
