rsm: Fit a Regression-Scale Model

Description

Produces an object of class rsm which is a regression-scale model fit of the data.

Usage

rsm(formula = formula(data), family = gaussian, 
    data = sys.frame(sys.parent()), dispersion = NULL, 
    weights = NULL, subset = NULL, na.action = na.fail, 
    offset = NULL, method = "rsm.surv", 
    control = glm.control(maxit=100, trace=FALSE), 
    model = FALSE, x = FALSE, y = TRUE, contrasts = NULL, ...)

Arguments

formula

a formula expression as for other linear regression models, of the form response ~ predictors where the predictors are separated by suitable operators. See the documentation of lm

family

a family.rsm object, i.e. a list of functions and expressions characterizing the error distribution. Families supported are gaussian, student (Student's t), extreme (Gumbel or extreme value)

data

an optional data frame in which to interpret the variables occurring in the model formula, or in the subset and the weights arguments. If this is missing, then the variables in the formula should be on the search

dispersion

if NULL, the scale parameter is taken to be unknown. If known, the numerical value can be passed. The default is NULL. Huber's least favourable distribution represents a special case. If dispersion

weights

the optional weights for the fitting criterion. If supplied, the response variable and the covariates are multiplied by the weights in the IRLS algorithm. The length of the weights argument must be the same as

subset

expression saying which subset of the rows of the data should be used in the fit. This can be a logical vector (which is replicated to have length equal to the number of observations), or a numeric vector indicating which observation numbe

na.action

a function to filter missing data. This is applied to the model frame after any subset argument has been used. The default (with na.fail) is to create an error if any missing value is found. A possible alternati

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. An offset term can be included in the formula instead or as well, and if both are specified their sum is us

method

the fitting method to be used; the default is rsm.fit. The method model.frame simply returns the model frame.

control

a list of iteration and algorithmic constants. See glm.control for their names and default values.

model

if TRUE, the model frame is returned; default is FALSE.

if TRUE, the model matrix is returned; default is FALSE.

if TRUE, the response variable is returned; default is TRUE.

contrasts

a list of contrasts to be used for some or all of the factors appearing as variables in the model formula. The names of the list should be the names of the corresponding variables, and the elements should either be contrast-type matrices (m

...

absorbs any additional argument.

Value

an object of class rsm is returned which inherits from glm and lm. See rsm.object for details.
The output can be examined by print, summary, rsm.diag.plots and anova. Components can be extracted using fitted, residuals, formula and family. It can be modified using update. It has most of the components of a glm object, with a few more. Use rsm.object for further details.

Details

The model is fitted using Iteratively Reweighted Least Squares, IRLS for short (Green, 1984, Jorgensen, 1984). The working response and iterative weights are computed using the functions contained in the family.rsm object.

The two workhorses of rsm are rsm.fit and rsm.surv, which expect an X and Y argument rather then a formula. The first function is used for the families student with df $<$ 3="" and="" Huber; the second one, based on the survreg.fit routine for fitting parametric survival models, is used in case of extreme, logistic, logWeibull, logExponential, logRayleigh and student (with df > 2) error distributions. In the presence of a user-defined error distribution the rsm.fit routine is used. The rsm.null function is invoked to fit an empty (null) model.

The details are given in Brazzale (2000, Section 6.3.1).

References

Brazzale, A. R. (2000) Practical Small-Sample Parametric Inference. Ph.D. Thesis N. 2230, Department of Mathematics, Swiss Federal Institute of Technology Lausanne.

Green, P. J. (1984) Iteratively reweighted least squares for maximum likelihood estimation, and some robust and resistant alternatives (with Discussion). J. R. Statist. Soc. B, 46, 149--192. Jorgensen, B. (1984) The delta algorithm and GLIM. Int. Stat. Rev., 52, 283--300.

Examples

Run this code

## House Price Data
data(houses)
houses.rsm <- rsm(price ~ ., family = student(5), data = houses)
## model fit including all covariates
houses.rsm <- rsm(price ~ ., family = student(5), data = houses, 
                  method = "rsm.fit", control = glm.control(trace = TRUE))
## prints information about the iterative procedure at each iteration
update(houses.rsm, ~ . - bdroom + offset(7 * bdroom))
## "bdroom" is included as offset variable with fixed (= 7) coefficient

## Sea Level Data
data(venice)
attach(venice)
Year <- 1:51/51
venice.2.rsm <- rsm(sea ~ Year + I(Year^2), family = extreme)
## quadratic model fitted to sea level data
venice.1.rsm <- update(venice.2.rsm, ~. - I(Year^2))
## linear model fit
##
c11 <- cos(2*pi*1:51/11) ; s11 <- sin(2*pi*1:51/11)
c19 <- cos(2*pi*1:51/18.62) ; s19 <- sin(2*pi*1:51/18.62)
venice.rsm <- rsm(sea ~ Year + I(Year^2) + c11 + s11 + c19 + s19, 
                  family = extreme)
## includes 18.62-year astronomical tidal cycle and 11-year sunspot cycle
venice.11.rsm <- rsm(sea ~ Year + I(Year^2) + c11 + s11, family = extreme)
venice.19.rsm <- rsm(sea ~ Year + I(Year^2) + c19 + s19, family = extreme)
## includes either astronomical cycle
##
## comparison of linear, quadratic and periodic (11-year, 19-year) models 
plot(year, sea, ylab = "sea level") 
lines(year, fitted(venice.1.rsm))
lines(year, fitted(venice.2.rsm), col="red")
lines(year, fitted(venice.11.rsm), col="blue")
lines(year, fitted(venice.19.rsm), col="green")
##
detach()

## Darwin's Data on Growth Rates of Plants
data(darwin)
darwin.rsm <- rsm(cross - self ~ pot - 1, family  =  student(3), 
                  data = darwin)
## Maximum likelihood estimates
darwin.rsm <- rsm(cross - self ~ pot - 1, family = Huber, data = darwin)
## M-estimates

Run the code above in your browser using DataLab