read.cep: Reads a CEP (Canoco) data file

Description

read.cep reads a file in the relaxed strict CEP format used by Canoco software, among others.

Usage

read.cep(file, maxdata=10000, positive=TRUE, trace=FALSE, force=FALSE)

Arguments

file

File name (character variable).

maxdata

Maximum number of non-zero entries.

positive

Only positive entries, like in community data.

trace

Work verbosely.

force

Run function, even if R refuses first.

Value

Returns a data frame, where columns are species and rows are sites. Column and row names are taken from the CEP file, and changed into unique R names by make.names after stripping the blanks.
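The name handling described above can be sketched in base R. This is only an illustration, not the function's internal code: the raw names are invented, and it assumes that all blanks (not only leading and trailing ones) are removed before make.names is applied with unique = TRUE.

```r
# Hypothetical raw names, as they might appear in 8-column name fields
raw_names <- c("Bel per ", "Poa pra ", "Poa pra ", "1       ")

# Assumption: every blank is stripped before the names are cleaned
stripped <- gsub(" ", "", raw_names)

# make.names() produces syntactically valid names; unique = TRUE
# also resolves duplicates by appending a numeric suffix
clean_names <- make.names(stripped, unique = TRUE)
clean_names
```

Names that start with a digit get an "X" prefix, and repeated names get a numeric suffix, so names in the returned data frame can differ slightly from those in the file.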

Details

Cornell Ecology Programs (CEP) introduced several data formats designed for punched cards. One of these was the `condensed strict' format, which was adopted by the popular programs DECORANA and TWINSPAN. Later, Cajo ter Braak wrote Canoco based on DECORANA, where he adopted the format but relaxed it somewhat (that's why I call it a `relaxed strict' format). Further, he introduced a more ordinary `free' format, and allowed the use of the classical Fortran-style `open' format with fixed field widths. This function should be able to deal with all these Canoco formats, whereas it cannot read many of the traditional CEP alternatives.

All variants of CEP formats have:

Two or three title cards, most importantly specifying the format (or the word FREE) and the number of items per record (the number of species and sites for FREE format).

Data in one of three accepted formats:
  1. Condensed format: The first number on the line is the site identifier, and it is followed by pairs (`couplets') of numbers identifying the species and its abundance (an integer and a floating point number).
  2. Open Fortran format: The first number on the line must be the site number, and it is followed by abundance values in fields of fixed widths. Empty fields are interpreted as zeros.
  3. `Free' format: The numbers are interpreted as abundance values. These numbers must be separated by blank space, and zeros must be written as zeros.

Species and site names, given in Fortran format (10A8): Ten names per line, eight columns for each.
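To make the condensed couplets and the (10A8) name layout more concrete, here is a minimal base R sketch. It is not how read.cep is implemented: the record and the names are invented, and the data line is assumed to be whitespace-separated for simplicity, whereas a real file is laid out according to the Fortran format given on the title cards.

```r
# Hypothetical condensed record: site 3, then (species, abundance) couplets
line <- "  3   1  2.0   4  1.0  11  5.0"
tokens <- scan(text = line, quiet = TRUE)

site <- as.integer(tokens[1])
# The remaining tokens pair up as species number and abundance
couplets <- matrix(tokens[-1], ncol = 2, byrow = TRUE)
record <- data.frame(site = site,
                     species = as.integer(couplets[, 1]),
                     abundance = couplets[, 2])

# Hypothetical name line in (10A8) layout: eight columns per name
name_line <- paste(sprintf("%-8s", c("Bel per", "Emp nig", "Jun buf")),
                   collapse = "")
starts <- seq(1, nchar(name_line), by = 8)
name_fields <- substring(name_line, starts, starts + 7)
```

Each row of record is one non-zero entry of the site-by-species table, which is why the condensed format is compact for sparse community data.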

With option positive = TRUE the function removes all rows and columns with zero or negative marginal sums. In community data with only positive entries, this removes empty sites and species. If data entries can be negative, this ruins the data, and such data sets should be read in with option positive = FALSE.
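The effect of this filtering can be imitated on an ordinary data frame. The sketch below is an assumption about the behaviour as described above, not the function's own code, and both data frames are invented for illustration.

```r
# Hypothetical community data: Site2 and Sp2 are empty
comm <- data.frame(Sp1 = c(1, 0, 2), Sp2 = c(0, 0, 0), Sp3 = c(3, 0, 1),
                   row.names = c("Site1", "Site2", "Site3"))

# Keep only rows and columns with positive marginal sums
kept <- comm[rowSums(comm) > 0, colSums(comm) > 0, drop = FALSE]

# Hypothetical data with negative entries: a centred variable sums to
# zero, so the same rule would drop it although it holds real data
env <- data.frame(centred = c(-1, 0, 1), other = c(2, 3, 4))
survives <- colSums(env) > 0
```

Here kept retains Site1 and Site3 with Sp1 and Sp3, while survives shows that the centred column would be lost, which is the situation positive = FALSE is meant for.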

References

Ter Braak, C.J.F. (1984--): CANOCO -- a FORTRAN program for canonical community ordination by [partial] [detrended] [canonical] correspondence analysis, principal components analysis and redundancy analysis. TNO Inst. of Applied Computer Sci., Stat. Dept. Wageningen, The Netherlands.

Examples


## Provided that you have the file `dune.spe'
## Not run: 
# theclassic <- read.cep("dune.spe", force=TRUE)
## End(Not run)
