ace: Ancestral Character Estimation

Description

This function estimates ancestral character states, and the associated uncertainty, for continuous and discrete characters.

logLik, deviance, and AIC are generic functions used to extract the log-likelihood, the deviance (-2*logLik), or the Akaike information criterion of a tree. If no such values are available, NULL is returned.

anova is another generic function that is used to compare nested models: the significance of the additional parameter(s) is tested with likelihood ratio tests. You must ensure that the models are effectively nested (if they are not, the results will be meaningless). It is better to list the models from the smallest to the largest.

Usage

ace(x, phy, type = "continuous", method = "ML", CI = TRUE,
    model = if (type == "continuous") "BM" else "ER",
    scaled = TRUE, kappa = 1, corStruct = NULL, ip = 0.1)
## S3 method for class 'ace':
print(x, digits = 4, ...)
## S3 method for class 'ace':
logLik(object, ...)
## S3 method for class 'ace':
deviance(object, ...)
## S3 method for class 'ace':
AIC(object, ..., k = 2)
## S3 method for class 'ace':
anova(object, ...)

Arguments

a vector or a factor; an object of class "ace" in the case of print.

phy

an object of class "phylo".

type

the variable type; either "continuous" or "discrete" (or an abbreviation of these).

method

a character specifying the method used for estimation. Four choices are possible: "ML", "REML", "pic", or "GLS".

a logical specifying whether to return the 95% confidence intervals of the ancestral state estimates (for continuous characters) or the likelihood of the different states (for discrete ones).

model

a character specifying the model (ignored if

method
      = "GLS"

), or a numeric matrix if type = "discrete" (see details).

scaled

a logical specifying whether to scale the contrast estimate (used only if method = "pic").

kappa

a positive value giving the exponent transformation of the branch lengths (see details).

corStruct

if method = "GLS", specifies the correlation structure to be used (this also gives the assumed model).

the initial value(s) used for the ML estimation procedure when type == "discrete" (possibly recycled).

digits

the number of digits to be printed.

object

an object of class "ace".

a numeric value giving the penalty per estimated parameter; the default is k = 2 which is the classical Akaike information criterion.

...

further arguments passed to or from other methods.

Value

a list with the following elements:
aceif type = "continuous", the estimates of the ancestral character values.
CI95if type = "continuous", the estimated 95% confidence intervals.
sigma2if type = "continuous", model = "BM", and method = "ML", the maximum likelihood estimate of the Brownian parameter.
ratesif type = "discrete", the maximum likelihood estimates of the transition rates.
seif type = "discrete", the standard-errors of estimated rates.
index.matrixif type = "discrete", gives the indices of the rates in the rate matrix.
loglikif method = "ML", the maximum log-likelihood.
lik.ancif type = "discrete", the scaled likelihoods of each ancestral state.
callthe function call.

Details

If type = "continuous", the default model is Brownian motion where characters evolve randomly following a random walk. This model can be fitted by maximum likelihood (the default, Schluter et al. 1997), least squares (method = "pic", Felsenstein 1985), or generalized least squares (method = "GLS", Martins and Hansen 1997, Cunningham et al. 1998). In the latter case, the specification of phy and model are actually ignored: it is instead given through a correlation structure with the option corStruct.

In the default setting (i.e., method = "ML" and model = "BM") the maximum likelihood estimation is done simultaneously on the ancestral values and the variance of the Brownian motion process; these estimates are then used to compute the confidence intervals in the standard way. The REML method first estimates the ancestral value at the root (aka, the phylogenetic mean), then the variance of the Brownian motion process is estimated by optimizing the residual log-likelihood. The ancestral values are finally inferred from the likelihood function giving these two parameters. If method = "pic" or "GLS", the confidence intervals are computed using the expected variances under the model, so they depend only on the tree.

It could be shown that, with a continous character, REML results in unbiased estimates of the variance of the Brownian motion process while ML gives a downward bias. Therefore, the former is recommanded over the latter, even though it is not the default.

For discrete characters (type = "discrete"), only maximum likelihood estimation is available (Pagel 1994). The model is specified through a numeric matrix with integer values taken as indices of the parameters. The numbers of rows and of columns of this matrix must be equal, and are taken to give the number of states of the character. For instance, matrix(c(0, 1, 1, 0), 2) will represent a model with two character states and equal rates of transition, matrix(c(0, 1, 2, 0), 2) a model with unequal rates, matrix(c(0, 1, 1, 1, 0, 1, 1, 1, 0), 3) a model with three states and equal rates of transition (the diagonal is always ignored). There are short-cuts to specify these models: "ER" is an equal-rates model (e.g., the first and third examples above), "ARD" is an all-rates-different model (the second example), and "SYM" is a symmetrical model (e.g., matrix(c(0, 1, 2, 1, 0, 3, 2, 3, 0), 3)). If a short-cut is used, the number of states is determined from the data.

References

Cunningham, C. W., Omland, K. E. and Oakley, T. H. (1998) Reconstructing ancestral character states: a critical reappraisal. Trends in Ecology & Evolution, 13, 361--366.

Felsenstein, J. (1985) Phylogenies and the comparative method. American Naturalist, 125, 1--15.

Martins, E. P. and Hansen, T. F. (1997) Phylogenies and the comparative method: a general approach to incorporating phylogenetic information into the analysis of interspecific data. American Naturalist, 149, 646--667.

Pagel, M. (1994) Detecting correlated evolution on phylogenies: a general method for the comparative analysis of discrete characters. Proceedings of the Royal Society of London. Series B. Biological Sciences, 255, 37--45.

Schluter, D., Price, T., Mooers, A. O. and Ludwig, D. (1997) Likelihood of ancestor states in adaptive radiation. Evolution, 51, 1699--1711.

Examples

Run this code

### Just some random data...
data(bird.orders)
x <- rnorm(23)
### Compare the three methods for continuous characters:
ace(x, bird.orders)
ace(x, bird.orders, method = "pic")
ace(x, bird.orders, method = "GLS",
    corStruct = corBrownian(1, bird.orders))
### For discrete characters:
x <- factor(c(rep(0, 5), rep(1, 18)))
ans <- ace(x, bird.orders, type = "d")
#### Showing the likelihoods on each node:
plot(bird.orders, type = "c", FALSE, label.offset = 1)
co <- c("blue", "yellow")
tiplabels(pch = 22, bg = co[as.numeric(x)], cex = 2, adj = 1)
nodelabels(thermo = ans$lik.anc, piecol = co, cex = 0.75)
### An example of the use of the argument `ip':
tr <- character(4)
tr[1] <- "((((t10:5.03,t2:5.03):2.74,(t9:4.17,"
tr[2] <- "t5:4.17):3.60):2.80,(t3:4.05,t7:"
tr[3] <- "4.05):6.53):2.32,((t6:4.38,t1:4.38):"
tr[4] <- "2.18,(t8:2.17,t4:2.17):4.39):6.33);"
tr <- read.tree(text = paste(tr, collapse = ""))
y <- c(rep(1, 6), rep(2, 4))
### The default `ip = 0.1' makes ace fails:
ace(y, tr, type = "d")
ace(y, tr, type = "d", ip = 0.01)
### Surprisingly, using an initial value farther to the
### MLE than the default one works:
ace(y, tr, type = "d", ip = 0.3)

Run the code above in your browser using DataLab