phylobase (version 0.6.8)

phylo4d: Combine a phylogenetic tree with data

Description

phylo4d is a generic constructor that merges a phylogenetic tree with data frames to create a combined object of class phylo4d.

Usage

## S4 method for signature 'phylo'
phylo4d(x, tip.data = NULL, node.data = NULL,
        all.data = NULL, check.node.labels = c("keep", "drop", "asdata"),
        annote = list(), metadata = list(), ...)
## S4 method for signature 'phylo4'
phylo4d(x, tip.data = NULL, node.data = NULL,
        all.data = NULL, merge.data = TRUE, metadata = list(), ...)
## S4 method for signature 'matrix'
phylo4d(x, tip.data = NULL, node.data = NULL,
        all.data = NULL, merge.data = TRUE, metadata = list(),
        edge.length = NULL, tip.label = NULL, node.label = NULL,
        edge.label = NULL, order = "unknown", annote = list(), ...)

Arguments

x
an object of class phylo4, phylo or a matrix of edges (see above)
tip.data
a data frame (or object to be coerced to one) containing only tip data (Optional)
node.data
a data frame (or object to be coerced to one) containing only node data (Optional)
all.data
a data frame (or object to be coerced to one) containing both tip and node data (Optional)
merge.data
if both tip.data and node.data are provided, should columns with common names be merged together (TRUE, the default) or kept separate (FALSE)? See Details.
metadata
any additional metadata to be passed to the new object
edge.length
Edge (branch) length. (Optional)
tip.label
A character vector of species names (names of "tip" nodes). (Optional)
node.label
A character vector of internal node names. (Optional)
edge.label
A character vector of edge (branch) names. (Optional)
order
character: tree ordering (allowable values are listed in phylo4_orderings, currently "unknown", "preorder" (="cladewise" in ape), and "postorder", with "cladewise" and "pruningwise" also allowed for compatibility)
annote
any additional annotation data to be passed to the new object
check.node.labels
if x is of class phylo, use either keep (the default) to retain internal node labels, drop to drop them, or asdata to convert them to numeric tree data. This argument
...
further arguments to be passed to formatData. Notably, these additional arguments control the behavior of the constructor in the case of missing/extra data and where to look for labels in the

Value

  • An object of class phylo4d.

Details

You can provide several data frames to define traits associated with tip and/or internal nodes. By default, data row names are used to link data to nodes in the tree, with any number-like names (e.g., 10) matched against node ID numbers, and any non-number-like names (e.g., n10) matched against node labels. Alternative matching rules can be specified by passing additional arguments to formatData; these include positional matching, matching exclusively on node labels, and matching based on a column of data rather than on row names. See formatData for more information.
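The default matching rules described above can be illustrated with a minimal sketch. The three-tip tree, the `size` column, and the object names are made up for illustration; the sketch also assumes that ape numbers the tips in the order they appear in the Newick string (A = 1, B = 2, C = 3).

```r
library(phylobase)

## Small illustrative tree (hypothetical taxa A, B, C)
tree <- as(ape::read.tree(text = "((A:1,B:1):1,C:2);"), "phylo4")

## Non-number-like row names are matched against tip labels,
## so the order of the rows does not matter.
byLabel <- data.frame(size = c(3, 1, 2), row.names = c("C", "A", "B"))
x1 <- phylo4d(tree, tip.data = byLabel)

## Number-like row names are matched against node ID numbers
## (assumed here: A = 1, B = 2, C = 3).
byId <- data.frame(size = c(1, 2, 3), row.names = c("1", "2", "3"))
x2 <- phylo4d(tree, tip.data = byId)

## Both objects should attach the same value to each tip.
tipData(x1)[c("A", "B", "C"), "size"]
tipData(x2)[c("A", "B", "C"), "size"]
```

If the row names carry no information at all, positional matching with match.data = FALSE (used in the Examples section) is the alternative.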

Matching rules will apply the same way to all supplied data frames. This means that you need to be consistent with the row names of your data frames. It is good practice to use tip and node labels (or node numbers) when you combine data with a tree.

If you provide both tip.data and node.data, the treatment of columns with common names will depend on the merge.data argument. If TRUE, columns with the same name in both data frames will be merged; when merging columns of different data types, coercion to a common type will follow standard R rules. If merge.data is FALSE, columns with common names will be preserved independently, with .tip and .node appended to the names. This argument has no effect if tip.data and node.data have no column names in common.
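As a rough sketch of the two behaviours, the column names of the combined data can be inspected after each call. The tree, the values, and the object names below are illustrative only.

```r
library(phylobase)

## Small illustrative tree with three tips and two internal nodes
tree <- as(ape::read.tree(text = "((A:1,B:1):1,C:2);"), "phylo4")

## Tip and node data that share the column name 'a'
tipDt  <- data.frame(a = c(1, 2, 3), row.names = nodeId(tree, "tip"))
nodeDt <- data.frame(a = c(10, 20),  row.names = nodeId(tree, "internal"))

merged   <- phylo4d(tree, tip.data = tipDt, node.data = nodeDt, merge.data = TRUE)
separate <- phylo4d(tree, tip.data = tipDt, node.data = nodeDt, merge.data = FALSE)

## With merge.data = TRUE a single column 'a' is expected;
## with merge.data = FALSE the columns are expected to carry
## the .tip and .node suffixes described above.
names(tdata(merged, "all"))
names(tdata(separate, "all"))
```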

If you provide all.data along with either tip.data or node.data, it must have distinct column names; otherwise an error will result. Additionally, although duplicated column names within a single data frame are not illegal, automatic renaming for uniqueness may lead to surprising results, so this practice should be avoided.
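The following sketch combines all.data with tip.data using distinct column names, then attempts the same call with a clashing name. The tree, the column names, and the values are hypothetical; the second call is wrapped in try() because, according to the paragraph above, it is expected to fail.

```r
library(phylobase)

## Small illustrative tree: tips have IDs 1-3, internal nodes 4-5
tree <- as(ape::read.tree(text = "((A:1,B:1):1,C:2);"), "phylo4")

## One row per node (tips and internal nodes), keyed by node ID
allDt <- data.frame(trait = c(0.1, 0.2, 0.3, 0.4, 0.5),
                    row.names = nodeId(tree, "all"))

## Tip-only data with a column name that does not occur in allDt
tipDt <- data.frame(size = c(1, 2, 3), row.names = nodeId(tree, "tip"))

ok <- phylo4d(tree, all.data = allDt, tip.data = tipDt)
names(tdata(ok, "all"))

## Same call with a clashing column name; expected to raise an error
clash <- data.frame(trait = c(1, 2, 3), row.names = nodeId(tree, "tip"))
res <- try(phylo4d(tree, all.data = allDt, tip.data = clash), silent = TRUE)
```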

See Also

coerce-methods for translation functions. The phylo4d class, the formatData function to check the validity of phylo4d objects; phylo4 class and phylo4 constructor.

Examples

library(phylobase)

treeOwls <- "((Strix_aluco:4.2,Asio_otus:4.2):3.1,Athene_noctua:7.3);"
tree.owls.bis <- ape::read.tree(text=treeOwls)
try(phylo4d(as(tree.owls.bis,"phylo4"),data.frame(wing=1:3)), silent=TRUE)
obj <- phylo4d(as(tree.owls.bis,"phylo4"),data.frame(wing=1:3), match.data=FALSE)
obj
print(obj)

####

data(geospiza_raw)
geoTree <- geospiza_raw$tree
geoData <- geospiza_raw$data

## fix differences in tip names between the tree and the data
geoData <- rbind(geoData, array(, dim = c(1,ncol(geoData)),
                  dimnames = list("olivacea", colnames(geoData))))

### Example using a tree of class 'phylo'
exGeo1 <- phylo4d(geoTree, tip.data = geoData)

### Example using a tree of class 'phylo4'
geoTree <- as(geoTree, "phylo4")

## some random node data
rNodeData <- data.frame(randomTrait = rnorm(nNodes(geoTree)),
                        row.names = nodeId(geoTree, "internal"))

exGeo2 <- phylo4d(geoTree, tip.data = geoData, node.data = rNodeData)

### Example using 'merge.data'
data(geospiza)
trGeo <- extractTree(geospiza)
tDt <- data.frame(a=rnorm(nTips(trGeo)), row.names=nodeId(trGeo, "tip"))
nDt <- data.frame(a=rnorm(nNodes(trGeo)), row.names=nodeId(trGeo, "internal"))

(matchData1 <- phylo4d(trGeo, tip.data=tDt, node.data=nDt, merge.data=FALSE))
(matchData2 <- phylo4d(trGeo, tip.data=tDt, node.data=nDt, merge.data=TRUE))

## Example with 'all.data'
nodeLabels(geoTree) <- as.character(nodeId(geoTree, "internal"))
rAllData <- data.frame(randomTrait = rnorm(nTips(geoTree) + nNodes(geoTree)),
                       row.names = labels(geoTree, 'all'))

exGeo5 <- phylo4d(geoTree, all.data = rAllData)

## Examples using 'rownamesAsLabels' and comparing with match.data=FALSE
tDt <- data.frame(x=letters[1:nTips(trGeo)],
                  row.names=sample(nodeId(trGeo, "tip")))
tipLabels(trGeo) <- as.character(sample(1:nTips(trGeo)))
(exGeo6 <- phylo4d(trGeo, tip.data=tDt, rownamesAsLabels=TRUE))
(exGeo7 <- phylo4d(trGeo, tip.data=tDt, rownamesAsLabels=FALSE))
(exGeo8 <- phylo4d(trGeo, tip.data=tDt, match.data=FALSE))

## generate a tree and some data
set.seed(1)
p3 <- ape::rcoal(5)
dat <- data.frame(a = rnorm(5), b = rnorm(5), row.names = p3$tip.label)
dat.defaultnames <- dat
row.names(dat.defaultnames) <- NULL
dat.superset <- rbind(dat, rnorm(2))
dat.subset <- dat[-1, ]

## create a phylo4 object from a phylo object
p4 <- as(p3, "phylo4")

## create phylo4d objects with tip data
p4d <- phylo4d(p4, dat)
###checkData(p4d)
p4d.sorted <- phylo4d(p4, dat[5:1, ])
try(p4d.nonames <- phylo4d(p4, dat.defaultnames))
p4d.nonames <- phylo4d(p4, dat.defaultnames, match.data=FALSE)

p4d.subset <- phylo4d(p4, dat.subset)
try(p4d.superset <- phylo4d(p4, dat.superset))
p4d.superset <- phylo4d(p4, dat.superset)

## create phylo4d objects with node data
nod.dat <- data.frame(a = rnorm(4), b = rnorm(4))
p4d.nod <- phylo4d(p4, node.data = nod.dat, match.data=FALSE)


## create phylo4d objects with node and tip data
p4d.all1 <- phylo4d(p4, node.data = nod.dat, tip.data = dat, match.data=FALSE)
nodeLabels(p4) <- as.character(nodeId(p4, "internal"))
## match.data belongs to phylo4d(), not to rbind()
p4d.all2 <- phylo4d(p4, all.data = rbind(dat, nod.dat), match.data=FALSE)
