TRAMPknowns: TRAMPknowns Objects

Description

These functions create and interact with TRAMPknowns objects (collections of known TRFLP patterns). Knowns contrast with “samples” (see TRAMPsamples) in that knowns contain identified profiles, while samples contain unidentified profiles. Knowns must have at most one peak per enzyme/primer combination (see Details).

Usage

TRAMPknowns(data, info, cluster.pars=list(), file.pat=NULL,
            warn.factors=TRUE, ...)

# S3 method for TRAMPknowns
labels(object, ...)
# S3 method for TRAMPknowns
summary(object, include.info=FALSE, ...)

Arguments

data

data.frame containing peak information.

info

data.frame, describing individual samples (see Details for definitions of both data.frames).

cluster.pars

Parameters used when clustering the knowns database. See Details.

file.pat

Optional partial filename in which to store knowns database after modification. Files <file.pat>_info.csv and <file.pat>_data.csv will be created.

warn.factors

Logical: Should a warning be given if any columns in info or data are converted into factors?

object

A TRAMPknowns object.

include.info

Logical: Should the output be augmented with the contents of the info component of the TRAMPknowns object?

...

TRAMPknowns: Additional objects to incorporate into a TRAMPknowns object. Other methods: Further arguments passed to or from other methods.

Value

TRAMPknowns

A new TRAMPknowns object: a list with components info, data (the provided data.frames, with clustering information added to info), cluster.pars and file.pat, plus any extra objects passed as ....

labels.TRAMPknowns

A sorted vector of the unique knowns present in object (from info$knowns.pk).

summary.TRAMPknowns

A data.frame, with the size of the peak (if present) for each enzyme/primer combination, with each known (indicated by knowns.pk) as rows and each combination (in the format <primer>_<enzyme>) as columns.
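
The layout described for summary can be pictured with a small base-R sketch. It does not call the TRAMP package and is not the package's implementation: the peaks data.frame and the use of tapply are assumptions made purely for illustration, and the result here is a matrix rather than a data.frame.

```r
## Hypothetical peak table, using the column names of the data component.
peaks <- data.frame(
  knowns.fk = c(1, 1, 2),
  primer    = c("ITS1F", "ITS4", "ITS1F"),
  enzyme    = c("BsuRI", "BsuRI", "BsuRI"),
  size      = c(100, 200, 150),
  stringsAsFactors = FALSE
)

## Column labels in the <primer>_<enzyme> format.
combo <- paste(peaks$primer, peaks$enzyme, sep = "_")

## One row per known, one column per combination.
## A combination with no peak for a given known is left as NA.
wide <- tapply(peaks$size,
               list(knowns.pk = peaks$knowns.fk, combination = combo),
               function(x) x[1])
wide
```

Known 2 has no ITS4/BsuRI peak in this made-up table, so that cell is NA, which corresponds to the "if present" wording above.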

Details

The object has at least two components, which relate to each other (in the sense of a relational database). info holds information about the individual samples, and data holds information about individual peaks (many of which may belong to a single sample).

Column definitions:

info:

knowns.pk:
Unique positive integer, used to identify individual knowns (i.e. a “primary key”).

species:
Character, giving species name.

data:

knowns.fk:
Positive integer, indicating which sample the peak belongs to, by matching against info$knowns.pk (i.e. a “foreign key”).

primer:
Character, giving the name of the primer used.

enzyme:
Character, giving the name of the restriction digest enzyme used.

size:
Numeric, giving size (in base pairs) of the peak.

In addition, TRAMPknowns will create additional columns holding clustering information (see group.knowns). Additional columns are allowed (and retained, but ignored) in both data.frames. Additional objects are allowed as part of the TRAMPknowns object, but these will not be written by write.TRAMPknowns; any extra objects passed (via ...) will be included in the final TRAMPknowns object.
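
The relational link between the two data.frames can be checked before constructing an object. The following is a minimal base-R sketch under stated assumptions: demo.info and demo.data are hypothetical names with made-up values, and the checks shown are not necessarily the ones TRAMPknowns performs internally.

```r
## Hypothetical info table: one row per known.
demo.info <- data.frame(
  knowns.pk = 1:3,
  species   = c("Species a", "Species b", "Species c"),
  stringsAsFactors = FALSE
)

## Hypothetical data table: one row per peak.
demo.data <- data.frame(
  knowns.fk = c(1, 1, 2, 3),
  primer    = c("ITS1F", "ITS4", "ITS1F", "ITS1F"),
  enzyme    = "BsuRI",
  size      = c(120.5, 310, 98.2, 455),
  stringsAsFactors = FALSE
)

## Primary key: every knowns.pk value occurs exactly once in info.
pk.unique <- anyDuplicated(demo.info$knowns.pk) == 0

## Foreign key: every peak points at an existing row of info.
fk.valid <- all(demo.data$knowns.fk %in% demo.info$knowns.pk)

## Joining on the key attaches the species name to each peak.
peaks.with.species <- merge(demo.data, demo.info,
                            by.x = "knowns.fk", by.y = "knowns.pk")
peaks.with.species
```

Both checks return TRUE for these tables, and the merged result has one row per peak.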

The cluster.pars argument controls how knowns will be clustered (this will happen automatically as needed). Elements of the list cluster.pars may be any of the three arguments to group.knowns, and will be used as defaults in subsequent calls to group.knowns. If not provided, default values are: dist.method="maximum", hclust.method="complete", cut.height=2.5 (if only some elements of cluster.pars are provided, the remaining elements default to the values above). To change values of clustering parameters in an existing TRAMPknowns object, use group.knowns.
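
The way a partial cluster.pars list falls back on the defaults can be sketched in base R. This is an illustration only: the use of modifyList, and the mapping of the three parameters onto stats::dist, stats::hclust and stats::cutree, are assumptions suggested by the parameter names, not a description of how group.knowns is implemented. The sizes matrix is made up.

```r
## Default values as listed above.
default.pars <- list(dist.method   = "maximum",
                     hclust.method = "complete",
                     cut.height    = 2.5)

## The user supplies only one element; the rest come from the defaults.
user.pars <- list(cut.height = 5)
pars <- utils::modifyList(default.pars, user.pars)

## Hypothetical peak sizes: rows are knowns, columns are
## enzyme/primer combinations.
sizes <- rbind(c(100, 200),
               c(101, 201.5),
               c(150, 300))

## Assumed interpretation of the parameters (see the note above).
d      <- dist(sizes, method = pars$dist.method)
hc     <- hclust(d, method = pars$hclust.method)
groups <- cutree(hc, h = pars$cut.height)
groups
```

Under the "maximum" distance the first two rows differ by at most 1.5 base pairs, so they fall in one group at this cut height, while the third row forms its own group.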

A known contains at most one peak per enzyme/primer combination. Where a species is known to have multiple TRFLP profiles, these should be treated as separate knowns with different, unique, knowns.pk values, but with identical species values. A sample containing either pattern will then be recorded as having that species present (see group.knowns).
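
The one-peak-per-combination rule can be tested on a peak table before it is passed to TRAMPknowns. The sketch below uses base R only; the peaks table is made up, and the check is an illustration rather than the validation the package itself performs.

```r
## Hypothetical peak table in which known 2 has two ITS1F/BsuRI peaks.
peaks <- data.frame(
  knowns.fk = c(1, 1, 2, 2),
  primer    = c("ITS1F", "ITS4", "ITS1F", "ITS1F"),
  enzyme    = c("BsuRI", "BsuRI", "BsuRI", "BsuRI"),
  size      = c(100, 200, 150, 152),
  stringsAsFactors = FALSE
)

## Flag every row that shares its known/primer/enzyme combination
## with another row.
key <- peaks[c("knowns.fk", "primer", "enzyme")]
dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

## Knowns that break the rule.
offending <- unique(peaks$knowns.fk[dup])
offending
```

Here known 2 is flagged. Following the paragraph above, its two profiles would be entered as separate knowns with different knowns.pk values and the same species value.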

Examples

# NOT RUN {
## Load the package that provides TRAMPknowns:
library(TRAMP)

## This example builds a TRAMPknowns object from completely artificial
## data:

## The info data.frame:
knowns.info <-
  data.frame(knowns.pk=1:8,
             species=rep(paste("Species", letters[1:5]), length.out=8))
knowns.info

## The data data.frame:
knowns.data <- expand.grid(knowns.fk=1:8,
                           primer=c("ITS1F", "ITS4"),
                           enzyme=c("BsuRI", "HpyCH4IV"))
## Random peak sizes; fix the seed so the example is reproducible:
set.seed(1)
knowns.data$size <- runif(nrow(knowns.data), min=40, max=800)

## Construct the TRAMPknowns object:
demo.knowns <- TRAMPknowns(knowns.data, knowns.info, warn.factors=FALSE)

## A plot of the pretend knowns:
plot(demo.knowns, cex=1, group.clusters=TRUE)
# }