read.tridas: Function to read Tree Ring Data Standard (TRiDaS) files

Description

This function reads in a TRiDaS format XML file. Measurements, derived series and various kinds of metadata are supported.

Usage

read.tridas(fname, ids.from.titles = FALSE, ids.from.identifiers = TRUE, combine.series = TRUE, trim.whitespace = TRUE, warn.units = TRUE)

Arguments

fname

character vector giving the file name of the TRiDaS file.

ids.from.titles

logical flag indicating whether to override the (tree, core, radius, measurement) structure imposed by the element hierarchy (element, sample, radius, measurementSeries) of the file. If TRUE, measurement series will be rearranged

ids.from.identifiers

logical flag indicating whether to (partially) override the element hierarchy of the file. If TRUE, measurement series will be grouped according to matching identifiers at the measurementSeries level, where identifiers are availa

combine.series

logical flag indicating whether to combine two or more measurement series with the same set of (tree, core, radius, measurement) ID numbers. Each set of combined measurement series will be represented by one column of a resulting data.frame.

trim.whitespace

logical flag indicating whether to collapse runs of repeated white space in the text content of the file into a single space. Defaults to TRUE, i.e. excess white space will be trimmed from the text.

warn.units

logical flag indicating whether to warn about unitless measurements and strange units. The function expects measurements in units that can be converted to millimetres. Defaults to TRUE: warnings will be given.
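
A minimal sketch of a call with some arguments set explicitly, assuming the dplR package (which provides read.tridas) is installed. The file name example.xml is a placeholder chosen for illustration, not a file shipped with any package, so the call is guarded and skipped when the file is absent.

```r
# Placeholder path (an assumption); replace with a real TRiDaS file.
fname <- "example.xml"

dat <- NULL
if (requireNamespace("dplR", quietly = TRUE) && file.exists(fname)) {
  dat <- dplR::read.tridas(fname,
                           combine.series = TRUE,  # one column per id set
                           warn.units = FALSE)     # silence unit warnings
  # The components present depend on the contents of the file.
  print(names(dat))
}
```

When the file or the package is missing, dat simply stays NULL and nothing is printed.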

Value

A list with a variable number of components according to the contents of the input file. The possible list components are:
measurements: A data.frame or a list of data.frames with the series in columns and the years as rows. Contains measurements (measurementSeries) with known years. The series ids are the column names and the years are the row names. The series ids are derived from elements in the input file. Each unique combination of project, object, unit, taxon, and variable gets a separate data.frame.
ids: A data.frame or a list of data.frames with columns named tree, core, radius, and measurement, together giving a unique numeric id for each column of the data.frame(s) in measurements. If !combine.series && (ids.from.titles || ids.from.identifiers), some rows may be non-unique.
titles: A data.frame or a list of data.frames with columns named tree, core, radius, and measurement, containing the title hierarchy of each column of the data.frame(s) in measurements.
wood.completeness: A data.frame or a list of data.frames containing wood completeness information. Column names are a subset of the following, almost self-explanatory set: pith.presence, heartwood.presence, sapwood.presence, last.ring.presence, last.ring.details, bark.presence, n.sapwood, n.missing.heartwood, n.missing.sapwood, missing.heartwood.foundation, missing.sapwood.foundation, n.unmeasured.inner, n.unmeasured.outer.
unit: A character vector giving the unit of the measurements. Length equals the number of data.frames in measurements.
project.id: A numeric vector giving the project id (i.e. the position of the corresponding project element) of the measurements in each data.frame in measurements. Length equals the number of data.frames.
project.title: A character vector giving the title of the project of each data.frame in measurements. Length equals the number of data.frames.
site.id: A data.frame giving the site id (position of object element(s) within a project) of each data.frame in measurements. May have several columns to reflect the possibly nested object elements.
site.title: A data.frame giving the site (object) title of each data.frame in measurements. May have several columns to reflect the possibly nested object elements.
taxon: A data.frame showing the taxonomic name for each data.frame in measurements. Contains some of the following columns: text, lang, normal, normalId, normalStd. The first two are a free-form name and its language, and the rest are related to a normalized name.
variable: A data.frame showing the measured variable of each data.frame in measurements. Contains some of the following columns: text, lang, normal, normalId, normalStd, normalTridas. The first two are a free-form name and its language, and the rest are related to a normalized name.
undated: A list of measurements with unknown years, together with metadata.
derived: A list of calculated series of values, together with metadata.
type: A data.frame containing the type of various entities, and metadata related to each type element. Contents are NA where the metadata is not applicable (e.g. no tree.id when the type element refers to a project).
comments: A data.frame containing comments on various entities, and metadata related to each comments element. Contents are NA where the metadata is not applicable.
identifier: A data.frame containing identifiers of various entities, and metadata related to each identifier element. Contents are NA where the metadata is not applicable.
remark: A list of remarks concerning individual measured or derived values.
laboratory: A data.frame or a list of data.frames with one item per project. Each data.frame contains information about the research laboratories involved in the project.
research: A data.frame or a list of data.frames with one item per project. Each data.frame contains information about the systems in which the research project is registered.
altitude: A data.frame containing the altitude of trees.
preferred: A data.frame containing links to preferred measurement series.
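
Several components can be either a single data.frame or a list of data.frames, so code that consumes the result may want to normalize them first. The following is a sketch only: as_df_list is a hypothetical helper (not part of any package), and res is a mock object with made-up values that merely imitates the documented shape of the measurements, ids and unit components.

```r
# Hypothetical helper: wrap a lone data.frame in a list so that callers
# can always iterate over data.frames.
as_df_list <- function(x) {
  if (is.data.frame(x)) list(x) else x
}

# Mock result imitating the documented shape; all values are made up.
res <- list(
  measurements = data.frame(S1 = c(1.2, 0.9), S2 = c(1.1, NA),
                            row.names = c("1900", "1901")),
  ids = data.frame(tree = c(1, 1), core = c(1, 2),
                   radius = c(1, 1), measurement = c(1, 1)),
  unit = "millimetres"
)

meas <- as_df_list(res$measurements)
ids <- as_df_list(res$ids)

# unit is documented to have one entry per data.frame in measurements.
stopifnot(length(res$unit) == length(meas))

for (i in seq_along(meas)) {
  cat("data.frame", i, ":", ncol(meas[[i]]), "series,",
      nrow(meas[[i]]), "years, unit", res$unit[i], "\n")
  # Rows of ids may repeat when series are not combined (see ids).
  cat("  duplicated id rows:", sum(duplicated(ids[[i]])), "\n")
}
```

With a real result from read.tridas, res would be replaced by the returned list; the loop itself relies only on base R.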

Details

The parameters used for rearranging (ids.from.titles, ids.from.identifiers) and combining (combine.series) measurement series only affect the four lowest levels of document structure: element, sample, radius, measurementSeries. Series are not reorganized or combined at the upper structural levels (project, object).
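
One way to observe the effect of combine.series is to read the same file twice and compare the number of series columns. This is a sketch under assumptions: dplR is installed, example.xml is a placeholder path, and n_series is a hypothetical helper written for this illustration.

```r
fname <- "example.xml"  # placeholder path, assumed for illustration

# Hypothetical helper: count series columns whether measurements is a
# single data.frame or a list of data.frames.
n_series <- function(x) {
  if (is.data.frame(x)) ncol(x) else sum(vapply(x, ncol, integer(1)))
}

if (requireNamespace("dplR", quietly = TRUE) && file.exists(fname)) {
  combined <- dplR::read.tridas(fname, combine.series = TRUE)
  separate <- dplR::read.tridas(fname, combine.series = FALSE)
  cat("columns with combine.series = TRUE: ",
      n_series(combined$measurements), "\n")
  cat("columns with combine.series = FALSE:",
      n_series(separate$measurements), "\n")
}
```

The two counts differ only if the file contains measurement series that share the same set of (tree, core, radius, measurement) ID numbers.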

References

TRiDaS - The Tree Ring Data Standard, http://www.tridas.org/