normal.vcm: Univariate Normal Distribution as a Varying-Coefficient Model

Description

Maximum likelihood estimation of all the coefficients of a LM where each of the usual regression coefficients is modelled with other explanatory variables via parameter link functions. Thus this is a basic varying-coefficient model.

Usage

normal.vcm(link.list = list("(Default)" = "identitylink"),
           earg.list = list("(Default)" = list()),
           lsd = "loge", lvar = "loge",
           esd = list(), evar = list(),
           var.arg = FALSE, imethod = 1,
           icoefficients = NULL, isd = NULL, zero = "sd",
           sd.inflation.factor = 2.5)

Arguments

link.list, earg.list

Link functions and extra arguments applied to the coefficients of the LM, excluding the standard deviation/variance. See CommonVGAMffArguments for more information. The default is for an identity link to be applied to each of the regression coefficients.

lsd, esd, lvar, evar

Link function and extra argument applied to the standard deviation/variance. See CommonVGAMffArguments for more information. Same as uninormal.

icoefficients

Optional initial values for the coefficients. Recycled to length $M - 1$ (does not include the standard deviation/variance). Try using this argument if there is a link function that is not programmed explicitly to handle range restrictions in the initialize slot.

var.arg, imethod, isd

Same as, or similar to, uninormal.

zero

See CommonVGAMffArguments for more information. The default applies to the last one, viz. the standard deviation/variance parameter.

sd.inflation.factor

Numeric, should be greater than 1. The initial value of the standard deviation is multiplied by this, unless isd is inputted. Experience has shown that it is safer to start off with a larger value rather than a smaller one.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

Warning

This VGAM family function is fragile. One should monitor convergence, and possibly enter initial values especially when there are non-identity-link functions. If the initial value of the standard deviation/variance is too small then numerical problems may occur. One trick is to fit an intercept-only only model and feed its predict() output into argument etastart of a more complicated model. The use of the zero argument is recommended in order to keep models as simple as possible.

Details

This function allows all the usual LM regression coefficients to be modelled as functions of other explanatory variables via parameter link functions. For example, we may want some of them to be positive. Or we may want a subset of them to be positive and add to unity. So a class of such models have been named varying-coefficient models (VCMs).

The usual linear model is specified through argument form2. As with all other VGAM family functions, the linear/additive predictors are specified through argument formula.

The multilogit link allows a subset of the coefficients to be positive and add to unity. Either none or more than one call to multilogit is allowed. The last variable will be used as the baseline/reference group, and therefore excluded from the estimation.

By default, the log of the standard deviation is the last linear/additive predictor. It is recommended that this parameter be estimated as intercept-only, for numerical stability.

Technically, the Fisher information matrix is of unit-rank for all but the last parameter (the standard deviation/variance). Hence an approximation is used that pools over all the observations.

This VGAM family function cannot handle multiple responses. Also, this function will probably not have the full capabilities of the class of varying-coefficient models as described by Hastie and Tibshirani (1993). However, it should be able to manage some simple models, especially involving the following links: identity, loge, logoff, loglog, logit, probit, cauchit. cloglog, rhobit, fisherz.

References

Hastie, T. and Tibshirani, R. (1993) Varying-coefficient models. J. Roy. Statist. Soc. Ser. B, 55, 757--796.

Examples

Run this code

# NOT RUN {
ndata <- data.frame(x2 = runif(nn <- 2000))
# Note that coeff1 + coeff2 + coeff5 == 1. So try a "multilogit" link.
myoffset <- 10
ndata <- transform(ndata,
           coeff1 = 0.25,  # "multilogit" link
           coeff2 = 0.25,  # "multilogit" link
           coeff3 = exp(-0.5),  # "loge" link
# "logoff" link:
           coeff4 = logoff(+0.5, offset = myoffset, inverse = TRUE),
           coeff5 = 0.50,  # "multilogit" link
           coeff6 = 1.00,  # "identitylink" link
           v2 = runif(nn),
           v3 = runif(nn),
           v4 = runif(nn),
           v5 = rnorm(nn),
           v6 = rnorm(nn))
ndata <- transform(ndata,
           Coeff1 =          0.25 - 0 * x2,
           Coeff2 =          0.25 - 0 * x2,
           Coeff3 =   logit(-0.5  - 1 * x2, inverse = TRUE),
           Coeff4 =  loglog( 0.5  - 1 * x2, inverse = TRUE),
           Coeff5 =          0.50 - 0 * x2,
           Coeff6 =          1.00 + 1 * x2)
ndata <- transform(ndata,
                   y1 = coeff1 * 1 +
                        coeff2 * v2 +
                        coeff3 * v3 +
                        coeff4 * v4 +
                        coeff5 * v5 +
                        coeff6 * v6 + rnorm(nn, sd = exp(0)),
                   y2 = Coeff1 * 1 +
                        Coeff2 * v2 +
                        Coeff3 * v3 +
                        Coeff4 * v4 +
                        Coeff5 * v5 +
                        Coeff6 * v6 + rnorm(nn, sd = exp(0)))

# An intercept-only model
fit1 <- vglm(y1 ~ 1,
             form2 = ~ 1 + v2 + v3 + v4 + v5 + v6,
             normal.vcm(link.list = list("(Intercept)" = "multilogit",
                                         "v2"          = "multilogit",
                                         "v3"          = "loge",
                                         "v4"          = "logoff",
                                         "(Default)"   = "identitylink",
                                         "v5"          = "multilogit"),
                        earg.list = list("(Intercept)" = list(),
                                         "v2"          = list(),
                                         "v4"          = list(offset = myoffset),
                                         "v3"          = list(),
                                         "(Default)"   = list(),
                                         "v5"          = list()),
                        zero = c(1:2, 6)),
             data = ndata, trace = TRUE)
coef(fit1, matrix = TRUE)
summary(fit1)
# This works only for intercept-only models:
multilogit(rbind(coef(fit1, matrix = TRUE)[1, c(1, 2)]), inverse = TRUE)

# A model with covariate x2 for the regression coefficients
fit2 <- vglm(y2 ~ 1 + x2,
             form2 = ~ 1 + v2 + v3 + v4 + v5 + v6,
             normal.vcm(link.list = list("(Intercept)" = "multilogit",
                                         "v2"          = "multilogit",
                                         "v3"          = "logit",
                                         "v4"          = "loglog",
                                         "(Default)"   = "identitylink",
                                         "v5"          = "multilogit"),
                        earg.list = list("(Intercept)" = list(),
                                         "v2"          = list(),
                                         "v3"          = list(),
                                         "v4"          = list(),
                                         "(Default)"   = list(),
                                         "v5"          = list()),
                        zero = c(1:2, 6)),
             data = ndata, trace = TRUE)

coef(fit2, matrix = TRUE)
summary(fit2)
# }

Run the code above in your browser using DataLab

Last chance! 50% off unlimited learning