scglr: Function that fits the scglr model

Description

Calculates the components used to predict all the dependent variables.

Usage

scglr(formula, data, family, K = 1, size = NULL,
    offset = NULL, subset = NULL, na.action = na.omit,
    crit = list())

Arguments

formula

an object of class "Formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted

data

a data frame to be modeled

family

a character vector of the same length as the number of dependent variables. Allowed values are "bernoulli", "binomial", "poisson" and "gaussian".
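A minimal sketch of building the family vector when the dependent variables are of mixed types. The response names below are hypothetical and do not come from any dataset shipped with SCGLR; only the four family names are taken from the description above.

```r
# Hypothetical response names, for illustration only
ny <- c("count_a", "count_b", "presence_c")

# One family per dependent variable, in the same order as ny
fam <- c("poisson", "poisson", "bernoulli")

# family must have the same length as the number of dependent variables
stopifnot(length(fam) == length(ny))

# only the four documented family names are allowed
allowed <- c("bernoulli", "binomial", "poisson", "gaussian")
stopifnot(all(fam %in% allowed))
```

When all responses share one family, rep("poisson", length(ny)) gives the same kind of vector more compactly.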

K

number of components, default is one.

size

describes the number of trials for the binomial dependent variables. A matrix of size (number of statistical units * number of binomial dependent variables) is expected.
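As an illustration of the expected shape, the following sketch builds a size matrix for two binomial dependent variables. All values and the column names y1 and y2 are made up for the example.

```r
# Hypothetical data: 5 statistical units, 2 binomial dependent variables
n_units   <- 5
trials_y1 <- c(10, 12, 8, 10, 15)   # trials for the first binomial response
trials_y2 <- rep(20, n_units)       # constant number of trials for the second

# one row per statistical unit, one column per binomial response
size <- cbind(y1 = trials_y1, y2 = trials_y2)

dim(size)
```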

offset

used for the Poisson dependent variables. A vector or a matrix of size (number of observations * number of Poisson dependent variables) is expected.
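The matrix form is useful when the offset differs between Poisson responses. The sketch below only shows how such a matrix could be laid out; the values and the column names are hypothetical.

```r
# Hypothetical offsets for 4 observations and 2 Poisson dependent variables
surface_a <- c(1.0, 0.5, 2.0, 1.5)   # e.g. area sampled for the first response
surface_b <- c(2.0, 2.0, 1.0, 3.0)   # e.g. area sampled for the second response

# one row per observation, one column per Poisson response
offset_mat <- cbind(resp_a = surface_a, resp_b = surface_b)

dim(offset_mat)
```

The example at the end of this page passes a plain vector (genus$surface) instead, which the description above also allows.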

subset

an optional vector specifying a subset of observations to be used in the fitting process

na.action

a function which indicates what should happen when the data contain NAs. The default is set to na.omit.

crit

a list of two elements, maxit and tol, giving respectively the maximum number of iterations and the convergence tolerance for the Fisher scoring algorithm. Defaults are 50 and 10e-6 respectively.
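A short sketch of a crit list that overrides both settings. The particular values are arbitrary choices for illustration, not recommendations.

```r
# Allow more iterations and use a stricter tolerance than the defaults
crit <- list(maxit = 100, tol = 1e-8)

# The list would then be passed through the crit argument, for instance:
#   scglr(formula = form, data = genus, family = fam, K = 4,
#         offset = genus$surface, crit = crit)
str(crit)
```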

Value

an object of the SCGLR class, with the following elements:

u

matrix of size (number of regressors * number of components), contains the component-loadings, i.e. the coefficients of the regressors in the linear combination giving each component

comp

matrix of size (number of statistical units * number of components) having the components as column vectors

compr

matrix of size (number of statistical units * number of components) having the standardized components as column vectors

gamma

matrix of size (number of components * number of dependent variables), contains the coefficients of the regression on the components

beta

matrix of size ((number of regressors + 1 (intercept)) * number of dependent variables), contains the coefficients of the regression on the original regressors $X$

lin.pred

data.frame of size (number of statistical units * number of dependent variables), the fitted linear predictor

xFactors

data.frame containing the nominal regressors

xNumeric

data.frame containing the quantitative regressors

inertia

matrix of size (number of components * 2), contains the percentage and cumulative percentage of the overall regressors' variance, captured by each component

deviance

vector of length (number of dependent variables), gives the deviance of each $y_k$'s GLM on the components
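To make the stated dimensions concrete, the sketch below multiplies random stand-in matrices of the documented shapes. It is not the package's internal computation: the data are simulated, the standardization of the regressors is an assumption, and the intercept is left out.

```r
set.seed(1)
n_units <- 6; p <- 4; K <- 2; q <- 3   # units, regressors, components, responses

# stand-in for the (assumed standardized) numeric regressors
X <- scale(matrix(rnorm(n_units * p), n_units, p))

u     <- matrix(rnorm(p * K), p, K)   # loadings: regressors x components
gamma <- matrix(rnorm(K * q), K, q)   # coefficients: components x responses

comp <- X %*% u          # components: units x components
eta  <- comp %*% gamma   # linear predictor without intercept: units x responses

dim(comp)
dim(eta)
```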

References

Bry X., Trottier C., Verron T. and Mortier F. (2013) Supervised Component Generalized Linear Regression using a PLS-extension of the Fisher scoring algorithm. Journal of Multivariate Analysis.

Examples

library(SCGLR)

# load sample data
data(genus)

# get variable names from dataset
n <- names(genus)
ny <- n[grep("^gen", n)]    # Y <- names that begin with "gen"
nx <- n[-grep("^gen", n)]   # X <- remaining names

# remove "geology" and "surface" from nx
# as surface is the offset and we want to use geology as an additional covariate
nx <- nx[!nx %in% c("geology", "surface")]

# build multivariate formula
# we also add "lat*lon" as computed covariate
form <- multivariateFormula(ny,c(nx,"I(lat*lon)"),c("geology"))

# define family
fam <- rep("poisson",length(ny))

genus.scglr <- scglr(formula = form, data = genus, family = fam, K = 4,
    offset = genus$surface)

summary(genus.scglr)
