R2.resid: Calculate R2.resid

Description

Calculate partial and total R2s for LMM, GLMM, PGLS, and PGLMM using R2.resid, an extension of ordinary least-squares (OLS) R2s. For LMMs and GLMMs, R2.resid is related to the method proposed by Nakagawa and Schielzeth (2013).

Usage

R2.resid(mod = NULL, mod.r = NULL, phy = NULL, sigma2_d = c("s2w",
  "NS", "rNS"))

Arguments

mod

A regression model with one of the following classes: 'lm', 'glm', 'lmerMod', 'glmerMod', 'phylolm', or 'binaryPGLMM'. For 'glmerMod', only family = c('binomial', 'poisson') are supported.

mod.r

A reduced model; if not provided, the total R2 will be given by setting 'mod.r' to the model corresponding to 'mod' with the intercept as the only predictor.

phy

The phylogeny for phylogenetic models (as a 'phylo' object), which must be specified for models of class `phylolm`.

sigma2_d

Distribution-specific variance $\sigma^2_d$ (see Details). For binomial GLMs, GLMMs and PGLMMs with logit link functions, options are c('s2w', 'NS', 'rNS'). For binomial GLMs, GLMMs and PGLMMs with probit link functions, options are c('s2w', 'NS'). Other families use 's2w'.

Value

R2.resid value.

Details

R2.resid works with classes 'lm', 'glm', 'lmerMod', 'glmerMod', 'phylolm', and 'binaryPGLMM'.

LMM (lmerMod):

$$partial R^2 = 1 - \sigma^2_{e.f}/\sigma^2_{e.r}$$

$$total R^2 = 1 - \sigma^2_{e.f}/var(y)$$

where $\sigma^2_{e.f}$ and $\sigma^2_{e.r}$ are the estimated residual variances from the full and reduced LMM, and var(y) is the total variance of the response (dependent) variable.

GLMM (glmerMod):

$$total R^2 = 1 - \sigma^2_d/(\sigma^2_x + \sigma^2_b + \sigma^2_d)$$

where $\sigma^2_x$ and $\sigma^2_b$ are the estimated variances associated with the fixed and random effects. $\sigma^2_d$ is a term that scales the implied 'residual variance' of the GLMM (see Ives 2018, Appendix 1). The default used for $\sigma^2_d$ is $\sigma^2_w$ which is computed from the iterative weights of the GLMM. Specifically,

$$\sigma_{w}^{2}=var(g'(\mu)*(y-\mu))$$

where g'() is the derivative of the link function, and $(y-\mu)$ is the difference between the data y and their predicted values $\mu$. This is the default option specified by sigma2_d = 's2w'. For binomial models with a logit link function, sigma2_d = 'NS' gives the scaling $\sigma^2_d = \pi^2/3$, and sigma2_d = 'rNS' gives $\sigma^2_d = 0.8768809 * \pi^2/3$. For binomial models with a probit link function, sigma2_d = 'NS' gives the scaling $\sigma^2_d = 1$. In general option sigma2_d = 's2w' will give values lower than sigma2_d = 'NS' and 'rNS', but the values will be closer to R2.lik() and R2.pred(). For other forms of sigma2_d from Nakagawa and Schielzeth (2013) and Nakagawa et al. (2017), see the MuMIn package.

Partial R2s are given by the standard formula

$$partial R^2 = 1 - (1 - R^2_{.f})/(1 - R^2_{.r})$$

where R2.f and R2.r are the total R2s for full and reduced models, respectively.

PGLS (phylolm):

$$partial R^2 = 1 - c.f * \sigma^2_{.f}/(c.r * \sigma^2_{.r})$$

where $\sigma^2_{.f}$ and $\sigma^2_{.r}$ are the variances estimated for the PGLS full and reduced models, and c.f and c.r are the scaling values for full and reduce models that equal the total sum of phylogenetic branch length estimates. Note that the phylogeny needs to be specified in R2.resid.

Note that phylolm() can have difficulties in finding solutions when there is no phylogenetic signal; when the estimate indicates no phylogenetic signal, you should refit the model with the corresponding LM.

PGLMM (binaryPGLMM):

The binary PGLMM is computed in the same way as the binomial GLMM, with options sigma_d = c('s2w', 'NS', 'rNS'). The estimated variance of the random effect associated with the phylogeny, $\sigma^2_b$, is multiplied by the diagonal elements of the phylogenetic covariance matrix. For binary models, this covariance matrix should be standardized so that all diagonal elements are the same (a contemporaneous or ultrametric phylogenetic tree) (Ives and Garland 2014). In case this is not done, however, the code takes the geometric average of the diagonal elements.

Note that the version of binaryPGLMM() in the package ape is replaced by a version contained within rr2 that outputs all of the required information for the calculation of R2.resid()

LM (lm) and GLM (glm):

For compatibility and generating reduced models, rr2 will compute R2.resid() for LM and GLM that correspond to LMM/PGLS and GLMM/PGLMM.

References

Ives A.R. and Li D. 2018. rr2: An R package to calculate R2s for regression models. Journal of Open Source Software. DOI:10.21105/joss.01028

Ives A.R. 2018. R2s for Correlated Data: Phylogenetic Models, LMMs, and GLMMs. Systematic Biology. DOI:10.1093/sysbio/syy060

Ives A. R., Garland T., Jr. 2014. Phylogenetic regression for binary dependent variables. In: Garamszegi LZ editor. Modern Phylogenetic Comparative Methods and Their Application in Evolutionary Biology. Berlin Heidelberg, Springer-Verlag, p. 231-261.

Nakagawa S., Schielzeth H. 2013. A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution, 4:133-142.

Nakagawa S., Johnson P. C. D., Schielzeth H. 2017. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. Journal of the Royal Society Interface, 14.

Examples

Run this code

# NOT RUN {
library(ape)
library(phylolm)
library(lme4)

#################
# LMM with two fixed and two random effects 
p1 <- 10
nsample <- 10
n <- p1 * nsample

d <- data.frame(x1 = 0, x2 = 0, y = 0, u1 = rep(1:p1, each = nsample), 
                u2 = rep(1:p1, times = nsample))
d$u1 <- as.factor(d$u1)
d$u2 <- as.factor(d$u2)

b1 <- 1
b2 <- -1
sd1 <- 1.5

d$x1 <- rnorm(n = n)
d$x2 <- rnorm(n = n)
d$y <- b1 * d$x1 + b2 * d$x2 + rep(rnorm(n = p1, sd = sd1), each = nsample) + 
       rep(rnorm(n = p1, sd = sd1), times = nsample) + rnorm(n = n)

z.f <- lmer(y ~ x1 + x2 + (1 | u1) + (1 | u2), data = d, REML = FALSE)
z.x <- lmer(y ~ x1 + (1 | u1) + (1 | u2), data = d, REML = FALSE)
z.v <- lmer(y ~ 1 + (1 | u2), data = d, REML = FALSE)
z.0 <- lm(y ~ 1, data = d)

R2.resid(z.f, z.x)
R2.resid(z.f, z.v)
R2.resid(z.f)

#################
# GLMM with one fixed and one random effect

p1 <- 10
nsample <- 10
n <- p1 * nsample

d <- data.frame(x = 0, y = 0, u = rep(1:p1, each = nsample))
d$u <- as.factor(d$u)

b1 <- 1
sd1 <- 1.5

d$x <- rnorm(n = n)
prob <- inv.logit(b1 * d$x + rep(rnorm(n = p1, sd = sd1), each = nsample))
d$y <- rbinom(n = n, size = 1, prob = prob)

z.f <- glmer(y ~ x + (1 | u), data = d, family = 'binomial')
z.x <- glmer(y ~ 1 + (1 | u), data = d, family = 'binomial')
z.v <- glm(y ~ x, data = d, family = 'binomial')

R2.resid(z.f, z.x)
R2.resid(z.f, z.v)
R2.resid(z.f)

#################
# PGLS with a single fixed effect

n <- 100
d <- data.frame(x = array(0, dim = n), y = 0)

b1 <- 1.5
signal <- 0.7

phy <- compute.brlen(rtree(n = n), method = 'Grafen', power = 1)
phy.x <- compute.brlen(phy, method = 'Grafen', power = .0001)

# Generate random data
x <- rTraitCont(phy.x, model = 'BM', sigma = 1)
e <- signal^0.5 * rTraitCont(phy, model = 'BM', sigma = 1) + (1-signal)^0.5 * rnorm(n = n)
d$x <- x[match(names(e), names(x))]
d$y <- b1 * x + e
rownames(d) <- phy$tip.label

z.x <- phylolm(y ~ 1, phy = phy, data = d, model = 'lambda')
lam.x <- round(z.x$optpar, digits = 4)
z.f <- phylolm(y ~ x, phy = phy, data = d, model = 'lambda')
z.v <- lm(y ~ x, data = d)

R2.resid(z.f, z.x, phy = phy)
R2.resid(z.f, z.v, phy = phy)
R2.resid(z.f, phy = phy)

#################
# PGLMM with one fixed effect

n <- 100
b1 <- 1.5
signal <- 2

phy <- compute.brlen(rtree(n = n), method = 'Grafen', power = 1)
phy.x <- compute.brlen(phy, method = 'Grafen', power = .0001)

# Generate random data
x <- rnorm(n)
d <- data.frame(x = x, y = 0)

e <- signal * rTraitCont(phy, model = 'BM', sigma = 1)
e <- e[match(phy$tip.label, names(e))]

d$y <- rbinom(n = n, size = 1, prob = inv.logit(b1 * d$x + e))
rownames(d) <- phy$tip.label

# Use the function binaryPGLMM() from the rr2 package rather than ape.
z.f <- rr2::binaryPGLMM(y ~ x, data = d, phy = phy)
z.x <- rr2::binaryPGLMM(y ~ 1, data = d, phy = phy)
z.v <- glm(y ~ x, data = d, family = 'binomial')

R2.resid(z.f, z.x, phy = phy)
R2.resid(z.f, z.v, phy = phy)
R2.resid(z.f, phy = phy)

# }

Run the code above in your browser using DataLab