# family

##### Family Objects for Models

Family objects provide a convenient way to specify the details of the
models used by functions such as `glm`. See the documentation for `glm`
for the details on how such model fitting takes place.

- Keywords
- models

##### Usage

`family(object, ...)`

binomial(link = "logit")
gaussian(link = "identity")
Gamma(link = "inverse")
inverse.gaussian(link = "1/mu^2")
poisson(link = "log")
quasi(link = "identity", variance = "constant")
quasibinomial(link = "logit")
quasipoisson(link = "log")

##### Arguments

- link
a specification for the model link function. This can be a name/expression, a literal character string, a length-one character vector, or an object of class `"link-glm"` (such as generated by `make.link`) provided it is not specified *via* one of the standard names given next.

  The `gaussian` family accepts the links (as names) `identity`, `log` and `inverse`; the `binomial` family the links `logit`, `probit`, `cauchit` (corresponding to logistic, normal and Cauchy CDFs respectively), `log` and `cloglog` (complementary log-log); the `Gamma` family the links `inverse`, `identity` and `log`; the `poisson` family the links `log`, `identity` and `sqrt`; and the `inverse.gaussian` family the links `1/mu^2`, `inverse`, `identity` and `log`.

  The `quasi` family accepts the links `logit`, `probit`, `cloglog`, `identity`, `inverse`, `log`, `1/mu^2` and `sqrt`, and the function `power` can be used to create a power link function.
- variance
for all families other than `quasi`, the variance function is determined by the family. The `quasi` family will accept the literal character string (or unquoted as a name/expression) specifications `"constant"`, `"mu(1-mu)"`, `"mu"`, `"mu^2"` and `"mu^3"`, a length-one character vector taking one of those values, or a list containing components `varfun`, `validmu`, `dev.resids`, `initialize` and `name`.
- object
the function `family` accesses the `family` objects which are stored within objects created by modelling functions (e.g., `glm`).
- ...
further arguments passed to methods.
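
The different ways of supplying `link` and `variance` described above can be sketched as follows. This is a minimal illustration, not an exhaustive list; the variable names (`f_string`, `f_name`, `f_object`, `p_half`, `q`) are chosen here for the example only.

```r
# Three ways of requesting the probit link for the binomial family.
f_string <- binomial(link = "probit")             # quoted character string
f_name   <- binomial(link = probit)               # unquoted name
f_object <- binomial(link = make.link("probit"))  # object of class "link-glm"

# All three report the same link name.
c(f_string$link, f_name$link, f_object$link)

# power() creates a power link; lambda = 0.5 gives a square-root link.
p_half <- power(0.5)
class(p_half)        # "link-glm"
p_half$linkfun(4)    # 4^0.5
p_half$linkinv(2)    # 2^(1/0.5)

# quasi() takes a variance specification in addition to the link.
q <- quasi(link = power(0.5), variance = "mu")
q$variance(c(1, 4, 9))   # variance equal to the mean
```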

##### Details

`family` is a generic function with methods for classes `"glm"` and
`"lm"` (the latter returning `gaussian()`).
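
As a small illustration of the generic, the sketch below extracts the family from an `lm` fit and from a `glm` fit. It assumes the built-in `cars` data set; the object names `fit_lm` and `fit_glm` are made up for the example.

```r
# Fits on the built-in 'cars' data (stopping distance vs. speed).
fit_lm  <- lm(dist ~ speed, data = cars)
fit_glm <- glm(dist ~ speed, data = cars, family = Gamma(link = "log"))

# The "lm" method always returns gaussian().
family(fit_lm)$family    # "gaussian"

# The "glm" method returns the family stored in the fitted object.
family(fit_glm)$family   # "Gamma"
family(fit_glm)$link     # "log"
```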

For the `binomial` and `quasibinomial` families the response
can be specified in one of three ways:

- As a factor: ‘success’ is interpreted as the factor not having the first level (and hence usually of having the second level).
- As a numerical vector with values between `0` and `1`, interpreted as the proportion of successful cases (with the total number of cases given by the `weights`).
- As a two-column integer matrix: the first column gives the number of successes and the second the number of failures.
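
The three response formats can be compared on one small data set. The sketch below uses made-up counts (`successes`, `failures`, `n_trials`, `x` are all illustrative names and values) and checks that each specification leads to the same coefficient estimates.

```r
# Made-up grouped data: 10 trials at each of 6 values of x.
x         <- 1:6
n_trials  <- rep(10, 6)
successes <- c(1, 3, 4, 6, 8, 9)
failures  <- n_trials - successes

# Two-column matrix: successes first, failures second.
fit_matrix <- glm(cbind(successes, failures) ~ x, family = binomial())

# Proportions, with the number of cases supplied as weights.
fit_prop <- glm(successes / n_trials ~ x, weights = n_trials,
                family = binomial())

# Factor: one row per case; "failure" is the first level, so
# "success" is the event being modelled.
long <- data.frame(
  x = rep(rep(x, 2), times = c(successes, failures)),
  outcome = factor(rep(c("success", "failure"),
                       times = c(sum(successes), sum(failures))),
                   levels = c("failure", "success"))
)
fit_factor <- glm(outcome ~ x, data = long, family = binomial())

# The coefficient estimates agree across the three specifications.
all.equal(coef(fit_matrix), coef(fit_prop),   tolerance = 1e-6)
all.equal(coef(fit_matrix), coef(fit_factor), tolerance = 1e-6)
```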

The `quasibinomial` and `quasipoisson` families differ from
the `binomial` and `poisson` families only in that the
dispersion parameter is not fixed at one, so they can model
over-dispersion. For the binomial case see McCullagh and Nelder
(1989, pp. 124-8). Although they show that there is (under some
restrictions) a model with
variance proportional to mean as in the quasi-binomial model, note
that `glm` does not compute maximum-likelihood estimates in that
model. The behaviour of S is closer to the quasi- variants.
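
The effect of freeing the dispersion parameter can be seen by fitting the same model with `poisson()` and `quasipoisson()`. The sketch below is illustrative only: it uses a small count data set typed in directly, and the names `fit_pois`, `fit_quasi`, `disp` and `se_ratio` are introduced here for the example.

```r
# Small count data set with two three-level factors.
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

fit_pois  <- glm(counts ~ outcome + treatment, family = poisson())
fit_quasi <- glm(counts ~ outcome + treatment, family = quasipoisson())

# The point estimates are the same for both families.
all.equal(coef(fit_pois), coef(fit_quasi))

# Only the dispersion differs: fixed at 1 versus estimated from the data.
summary(fit_pois)$dispersion
disp <- summary(fit_quasi)$dispersion
disp

# Standard errors are inflated by sqrt(dispersion) in the quasi fit.
se_ratio <- summary(fit_quasi)$coefficients[, "Std. Error"] /
            summary(fit_pois)$coefficients[, "Std. Error"]
all.equal(unname(se_ratio), rep(sqrt(disp), length(se_ratio)))
```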

##### Value

An object of class `"family"` (which has a concise print method).
This is a list with elements

- family
character: the family name.
- link
character: the link name.
- linkfun
function: the link.
- linkinv
function: the inverse of the link function.
- variance
function: the variance as a function of the mean.
- dev.resids
function giving the deviance for each observation as a function of `(y, mu, wt)`, used by the `residuals` method when computing deviance residuals.
- aic
function giving the AIC value if appropriate (but `NA` for the quasi- families). More precisely, this function returns \(-2\ell + 2 s\), where \(\ell\) is the log-likelihood and \(s\) is the number of estimated scale parameters. Note that the penalty term for the location parameters (typically the “regression coefficients”) is added elsewhere, e.g., in `glm.fit()`, or `AIC()`, see the AIC example in `glm`. See `logLik` for the assumptions made about the dispersion parameter.
- mu.eta
function: derivative of the inverse-link function with respect to the linear predictor. If the inverse-link function is \(\mu = g^{-1}(\eta)\) where \(\eta\) is the value of the linear predictor, then this function returns \(d(g^{-1})/d\eta = d\mu/d\eta\).
- initialize
expression. This needs to set up whatever data objects are needed for the family as well as `n` (needed for AIC in the binomial family) and `mustart` (see `glm`).
- validmu
logical function. Returns `TRUE` if a mean vector `mu` is within the domain of `variance`.
- valideta
logical function. Returns `TRUE` if a linear predictor `eta` is within the domain of `linkinv`.
- simulate
(optional) function `simulate(object, nsim)` to be called by the `"lm"` method of `simulate`. It will normally return a matrix with `nsim` columns and one row for each fitted value, but it can also return a list of length `nsim`. Clearly this will be missing for ‘quasi-’ families.
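
The elements of a family object can be called directly. The sketch below inspects a `poisson()` family and checks the AIC relationship described above on made-up counts. The positional call to the `aic` element assumes the argument order `(y, n, mu, wt, dev)`, which is not spelled out in this page, so treat that part as an assumption.

```r
fam <- poisson()

# Link, inverse link and variance are plain functions.
fam$linkfun(10)          # log(10)
fam$linkinv(log(10))     # maps back to the mean scale
fam$variance(c(1, 5))    # Poisson variance equals the mean

# validmu() checks that a mean vector lies in the domain of the variance.
fam$validmu(c(1, 5))     # TRUE
fam$validmu(-1)          # FALSE

# The aic element returns -2 * log-likelihood for the Poisson family
# (no scale parameter is estimated); the penalty of 2 per regression
# coefficient is added outside the family object.
y <- c(2, 3, 6, 7, 8, 9, 10, 12, 15)   # made-up counts
x <- seq_along(y)
fit <- glm(y ~ x, family = fam)

# Assumed argument order: y, n, mu, wt, dev.
ones <- rep(1, length(y))
minus2loglik <- fam$aic(y, ones, fitted(fit), ones, deviance(fit))
all.equal(minus2loglik + 2 * fit$rank, AIC(fit))
```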

##### Note

The `link` and `variance` arguments have rather awkward
semantics for back-compatibility. The recommended way is to supply
them as quoted character strings, but they can also be supplied
unquoted (as names or expressions). Additionally, they can be
supplied as a length-one character vector giving the name of one of
the options, or as a list (for `link`, of class `"link-glm"`).
The restrictions apply only to links given as
names: when given as a character string all the links known to
`make.link` are accepted.

This is potentially ambiguous: supplying `link = logit` could mean
the unquoted name of a link or the value of object `logit`. It
is interpreted if possible as the name of an allowed link, then
as an object. (You can force the interpretation to always be the value of
an object via `logit[1]`.)
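
The ambiguity can be demonstrated with a variable that happens to share its name with an allowed link. In this sketch the variable `logit` and the result names `by_name`, `by_value` and `by_string` are made up for illustration.

```r
# A character variable whose name collides with an allowed link name.
logit <- "probit"

# Interpreted as the name of the allowed link, not as the variable's value.
by_name <- binomial(link = logit)$link
by_name      # "logit"

# Indexing forces evaluation, so the value "probit" is used instead.
by_value <- binomial(link = logit[1])$link
by_value     # "probit"

# A quoted string may name any link known to make.link(), even one
# outside the list of names accepted for the family.
by_string <- binomial(link = "identity")$link
by_string    # "identity"

rm(logit)    # remove the illustrative variable again
```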

##### References

McCullagh, P. and Nelder, J. A. (1989)
*Generalized Linear Models.*
London: Chapman and Hall.

Dobson, A. J. (1983)
*An Introduction to Statistical Modelling.*
London: Chapman and Hall.

Cox, D. R. and Snell, E. J. (1981)
*Applied Statistics; Principles and Examples.*
London: Chapman and Hall.

Hastie, T. J. and Pregibon, D. (1992)
*Generalized linear models.*
Chapter 6 of *Statistical Models in S*,
eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

##### See Also

For binomial *coefficients*, `choose`;
the binomial and negative binomial *distributions*,
`Binomial`, and `NegBinomial`.

##### Examples

`library(stats)`

```
require(utils) # for str

nf <- gaussian() # Normal family
nf
str(nf)

gf <- Gamma()
gf
str(gf)
gf$linkinv
gf$variance(-3:4) #- == (.)^2

## Binomial with default 'logit' link: Check some properties visually:
bi <- binomial()
et <- seq(-10,10, by=1/8)
plot(et, bi$mu.eta(et), type="l")
## show that mu.eta() is derivative of linkinv() :
lines((et[-1]+et[-length(et)])/2, col=adjustcolor("red", 1/4),
      diff(bi$linkinv(et))/diff(et), type="l", lwd=4)
## which here is the logistic density:
lines(et, dlogis(et), lwd=3, col=adjustcolor("blue", 1/4))
stopifnot(exprs = {
    all.equal(bi$ mu.eta(et), dlogis(et))
    all.equal(bi$linkinv(et), plogis(et) -> m)
    all.equal(bi$linkfun(m ), qlogis(m)) # logit(.) == qlogis(.) !
})

## Data from example(glm) :
d.AD <- data.frame(treatment = gl(3,3),
                   outcome = gl(3,1,9),
                   counts = c(18,17,15, 20,10,20, 25,13,12))
glm.D93 <- glm(counts ~ outcome + treatment, d.AD, family = poisson())
## Quasipoisson: compare with above / example(glm) :
glm.qD93 <- glm(counts ~ outcome + treatment, d.AD, family = quasipoisson())

glm.qD93
anova (glm.qD93, test = "F")
summary(glm.qD93)
## for Poisson results (same as from 'glm.D93' !) use
anova (glm.qD93, dispersion = 1, test = "Chisq")
summary(glm.qD93, dispersion = 1)

## Example of user-specified link, a logit model for p^days
## See Shaffer, T. 2004. Auk 121(2): 526-540.
logexp <- function(days = 1)
{
    linkfun <- function(mu) qlogis(mu^(1/days))
    linkinv <- function(eta) plogis(eta)^days
    mu.eta <- function(eta) days * plogis(eta)^(days-1) *
        binomial()$mu.eta(eta)
    valideta <- function(eta) TRUE
    link <- paste0("logexp(", days, ")")
    structure(list(linkfun = linkfun, linkinv = linkinv,
                   mu.eta = mu.eta, valideta = valideta, name = link),
              class = "link-glm")
}
(bil3 <- binomial(logexp(3)))
## in practice this would be used with a vector of 'days', in
## which case use an offset of 0 in the corresponding formula
## to get the null deviance right.

## Binomial with identity link: often not a good idea, as both
## computationally and conceptually difficult:
binomial(link = "identity") ## is exactly the same as
binomial(link = make.link("identity"))

## tests of quasi
x <- rnorm(100)
y <- rpois(100, exp(1+x))
glm(y ~ x, family = quasi(variance = "mu", link = "log"))
# which is the same as
glm(y ~ x, family = poisson)
glm(y ~ x, family = quasi(variance = "mu^2", link = "log"))
## Not run: the next call fails, so it is left commented out
# glm(y ~ x, family = quasi(variance = "mu^3", link = "log"))

y <- rbinom(100, 1, plogis(x))
# need to set a starting value for the next fit
glm(y ~ x, family = quasi(variance = "mu(1-mu)", link = "logit"), start = c(0,1))
```

*Documentation reproduced from package stats, version 3.6.0, License: Part of R 3.6.0*