Reading and Writing zoo Series
read.zoo and write.zoo are convenience functions for reading and writing
"zoo" series from/to text files. They are convenience interfaces to
read.table and write.table, respectively.
read.zoo(file, format = "", tz = "", FUN = NULL,
  regular = FALSE, index.column = 1, drop = TRUE, FUN2 = NULL,
  split = NULL, aggregate = FALSE, ..., text)
write.zoo(x, file = "", index.name = "Index", row.names = FALSE,
  col.names = NULL, ...)
- file: character string or strings giving the name of the file(s) which
the data are to be read from/written to. See read.table and write.table
for more information. Alternatively, in read.zoo, file can be a connection
or a data.frame (e.g., resulting from a previous read.table call) that is
subsequently processed to a "zoo" series.
- format: date format argument passed to FUN.
- tz: time zone argument passed to as.POSIXct.
- FUN: a function for computing the index from the first column of the data. See details.
- regular: logical. Should the series be coerced to class "zooreg" (if the
series is regular)?
- index.column: numeric vector or list. The column names or numbers of the
data frame in which the index/time is stored. If the read.table colClasses
argument is used and "NULL" is among its components then index.column
refers to the column numbers after the columns corresponding to "NULL" in
colClasses have been removed. If specified as a list then one argument
will be passed to FUN per component so that, for example,
index.column = list(1, 2) will cause FUN(x[,1], x[,2], ...) to be called
whereas index.column = list(1:2) will cause FUN(x[,1:2], ...) to be
called, where x is a data frame of character data. Here ... refers to
format and/or tz, if they are specified as arguments.
index.column = 0 can be used to specify that the row names be used as the
index. If no row names were input, sequential numbering is used. If
index.column is specified as an ordinary vector and it has the same length
as the number of arguments of FUN (or of FUN2, in the event that FUN2 is
specified and FUN is not), then index.column is converted to a list. It is
also always converted to a list if it has length 1.
- drop: logical. If the data frame contains just a single data column, should the second dimension be dropped?
- index.name: character with name of the index column in the written data file.
- row.names: logical. Should row names be written? Default is FALSE
because the row names are just character representations of the index.
- col.names: logical. Should column names be written? Default is to
write column names only if x has column names.
- FUN2: function. It is applied to the time index after FUN and before
aggregate. If FUN is not specified but FUN2 is specified then only FUN2
is applied.
- split: NULL or column number or name or vector of numbers or names. If
not NULL then the data is assumed to be in long format and is split
according to the indicated columns. See the R reshape command for a
description of long data. If split = Inf then the first of each run among
the times is made into a separate series, then the second of each run, and
so on. If split = -Inf then the last of each run is made into a separate
series, then the second last, and so on.
- aggregate: logical or function. If set to TRUE, then aggregate.zoo is
applied to the zoo object created to compute the mean of all values with
the same time index. Alternatively, aggregate can be set to any other
function that should be used for aggregation. If FALSE (the default), no
aggregation is performed and a warning is given if there are any
duplicated time indexes. Note that most zoo functions do not accept
objects with duplicate time indexes. See aggregate.zoo.
- ...: further arguments passed to read.table or write.table, respectively.
- text: character. If file is not supplied and text is, then data are read
from the value of text via a text connection. See below for an example.
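
To illustrate the split and aggregate arguments described above, here is a minimal sketch on small made-up inputs. The column names Date, Name and Value and all data values are hypothetical and chosen only for illustration.

```r
library(zoo)

## Long-format input (hypothetical data): one row per date and series name.
Long <- "Date Name Value
2013-01-01 a 1
2013-01-01 b 2
2013-01-02 a 3
2013-01-02 b 4"

## split = 2 treats the second column as the series identifier, so each
## distinct name becomes its own column of the resulting zoo object.
z_wide <- read.zoo(text = Long, header = TRUE, split = 2, format = "%Y-%m-%d")

## Input with a duplicated time index (hypothetical data).
Dup <- "2013-01-01 1
2013-01-01 3
2013-01-02 5"

## aggregate = mean averages all values that share the same time index.
z_agg <- read.zoo(text = Dup, format = "%Y-%m-%d", aggregate = mean)
```

Passing split as a column number rather than a name keeps the sketch independent of the header names; the argument description above states that a name is accepted as well.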
read.zoo is a convenience function which should make it easier
to read data from a text file and turn it into a "zoo" series immediately.
read.zoo reads the data file via read.table(file, ...). The column
index.column (by default the first) of the resulting data is
interpreted to be the index/time, the remaining columns the corresponding data.
(If the file has only one column then that is assumed to be the data column and
1, 2, ... are used for the index.) To assign the appropriate class
to the index, FUN can be specified and is applied to the first column.
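
As a sketch of how index.column selects the index, the following uses made-up text input. The helper to_time and the choice of tz = "UTC" are assumptions made for illustration and are not part of zoo.

```r
library(zoo)

## Index stored in the second column rather than the first (hypothetical data).
Lines1 <- "10 2013-12-24 2
20 2013-12-25 3"
z1 <- read.zoo(text = Lines1, index.column = 2, format = "%Y-%m-%d")

## Date and time in separate columns. Giving index.column as a list passes
## one argument per component, so FUN is called as FUN(x[, 1], x[, 2]).
Lines2 <- "2013-12-24 12:00:00 2
2013-12-25 13:30:00 3"

## Hypothetical helper: combine date and time strings into POSIXct.
to_time <- function(d, t, ...) as.POSIXct(paste(d, t), tz = "UTC")

z2 <- read.zoo(text = Lines2, index.column = list(1, 2), FUN = to_time)
```

In z1 the two remaining columns become the data; in z2 only one data column is left, so with the default drop = TRUE the result is a plain vector series.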
To process the index, read.zoo calls FUN with the index as the
first argument. If FUN is not specified, the following default is employed:
(a) If file is a data frame with a single
index column that appears to be a time index already, then
FUN = identity is used.
The conditions for a readily produced time index are: It is not
character or factor (and the arguments tz and
format must not be specified).
(b) If the conditions from (a) do not hold then the following strategy is used.
If there are multiple index columns they are pasted together with a space between each.
Using the (pasted) index column: (1) If tz is specified then the
index column is converted to POSIXct. (2) If format is specified
then the index column is converted to Date. (3) Otherwise, a heuristic
attempts to decide between "numeric", "POSIXct", and "Date" by
trying them in that order (which may not always succeed though). By default,
only the standard date/time format is used. Hence, supplying format and/or tz
is necessary if some date/time format is used that is not the default. And even
if the default format is appropriate for the index, explicitly supplying
FUN or at least format and/or tz typically leads to more
reliable results than the heuristic.
If regular is set to TRUE and the resulting series has an
underlying regularity, it is coerced to a "zooreg" series.
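
The effect of format, tz and regular on the result can be checked directly. This is a minimal sketch on made-up data; the time zone "UTC" is an arbitrary choice for illustration.

```r
library(zoo)

## Three consecutive daily observations (hypothetical data).
Lines <- "2013-12-24 2
2013-12-25 3
2013-12-26 8"

## Supplying format (and no FUN) converts the index to Date.
z_date <- read.zoo(text = Lines, format = "%Y-%m-%d")
class(index(z_date))

## Supplying tz (and no FUN) converts the index to POSIXct.
z_time <- read.zoo(text = Lines, tz = "UTC")
class(index(z_time))

## With regular = TRUE a series with underlying regularity becomes "zooreg".
z_reg <- read.zoo(text = Lines, format = "%Y-%m-%d", regular = TRUE)
class(z_reg)
```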
write.zoo is a convenience function for writing "zoo" series
to text files. It first coerces its argument to a "data.frame", adds
a column with the index and then calls write.table.
See also vignette("zoo-read", package = "zoo") for detailed examples.
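
A round trip through a temporary file shows how write.zoo and read.zoo fit together. This is a sketch under assumed settings: the series z, the comma separator and the temporary file are made up for illustration.

```r
library(zoo)

## A small two-column daily series (hypothetical data).
z <- zoo(cbind(a = 1:3, b = 4:6), as.Date("2013-12-24") + 0:2)

## Write to a temporary CSV file; sep is passed on to write.table.
## The index is written as a column named "Index" (the index.name default).
tmp <- tempfile(fileext = ".csv")
write.zoo(z, file = tmp, sep = ",")

## Read it back; header and sep are passed on to read.table.
z_back <- read.zoo(tmp, header = TRUE, sep = ",", format = "%Y-%m-%d")
unlink(tmp)
```

Supplying format when reading back avoids relying on the index heuristic.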
read.zoo returns an object of class "zoo" (or "zooreg").
read.zoo works by first reading the data in using read.table
and then processing it. This implies that
if the index field is entirely numeric the default is to pass it to FUN
or the built-in date conversion routine as
a number, rather than a character string.
Thus, a date field such as 09122007 intended
to represent December 9, 2007 (in day-month-year order) would be seen as
9122007 and interpreted as the 91st day,
thereby generating an error.
This comment also applies to trailing decimals: if
2000.10 were intended to represent the 10th month of 2000, in fact
FUN would receive
2000.1 and regard it as the first month of 2000
unless similar precautions were taken.
In the above cases the index field should be specified to be
"character" so that leading or trailing zeros
are not dropped. This can be done by specifying a "character"
index column in the
"colClasses" argument, which is passed to read.table.
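
The precaution described in this note can be sketched as follows, assuming a made-up two-line input whose first column holds dates in ddmmyyyy form.

```r
library(zoo)

## Dates written as ddmmyyyy; the first one has a leading zero (hypothetical data).
Lines <- "09122007 1
10122007 2"

## Reading the index column as "character" keeps the leading zero, so the
## format string is applied to the field exactly as it appears in the input.
z <- read.zoo(text = Lines, colClasses = c("character", "numeric"),
  format = "%d%m%Y")
index(z)
```

Without the colClasses setting, read.table would read the first column as a number and the first field would lose its leading zero before the format is applied.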
## this manual page provides a few typical examples, many more cases
## are covered in vignette("zoo-read", package = "zoo")

## read text lines with a single date column
Lines <- "2013-12-24 2
2013-12-25 3
2013-12-26 8"
read.zoo(text = Lines, FUN = as.Date)       # explicit coercion
read.zoo(text = Lines, format = "%Y-%m-%d") # same
read.zoo(text = Lines)                      # same, via heuristic

## read text lines with date/time in separate columns
Lines <- "2013-11-24 12:41:21 2
2013-12-25 12:41:22.25 3
2013-12-26 12:41:22.75 8"
read.zoo(text = Lines, index = 1:2,
  FUN = paste, FUN2 = as.POSIXct)            # explicit coercion
read.zoo(text = Lines, index = 1:2, tz = "") # same
read.zoo(text = Lines, index = 1:2)          # same, via heuristic

## read directly from a data.frame (artificial and built-in BOD)
dat <- data.frame(date = paste("2000-01-", 10:15, sep = ""),
  a = sin(1:6), b = cos(1:6))
read.zoo(dat)
data("BOD", package = "datasets")
read.zoo(BOD)

## Not run:
# ## descriptions of typical examples
#
# ## turn *numeric* first column into yearmon index
# ## where number is year + fraction of year represented by month
# z <- read.zoo("foo.csv", sep = ",", FUN = as.yearmon)
#
# ## first column is of form yyyy.mm
# ## (Here we use format in place of as.character so that final zero
# ## is not dropped in dates like 2001.10 which as.character would do.)
# f <- function(x) as.yearmon(format(x, nsmall = 2), "%Y.%m")
# z <- read.zoo("foo.csv", header = TRUE, FUN = f)
#
# ## turn *character* first column into "Date" index
# ## Assume lines look like: 12/22/2007 1 2
# z <- read.zoo("foo.tab", format = "%m/%d/%Y")
#
# # Suppose lines look like: 09112007 1 2 and there is no header
# z <- read.zoo("foo.txt", format = "%d%m%Y")
#
# ## csv file with first column of form YYYY-mm-dd HH:MM:SS
# ## Read in times as "chron" class. Requires chron 2.3-22 or later.
# z <- read.zoo("foo.csv", header = TRUE, sep = ",", FUN = as.chron)
#
# ## same but with custom format. Note as.chron uses POSIXt-style
# ## Read in times as "chron" class. Requires chron 2.3-24 or later.
# z <- read.zoo("foo.csv", header = TRUE, sep = ",", FUN = as.chron,
#   format = "
#
# ## same file format but read it in times as "POSIXct" class.
# z <- read.zoo("foo.csv", header = TRUE, sep = ",", tz = "")
#
# ## csv file with first column mm-dd-yyyy. Read times as "Date" class.
# z <- read.zoo("foo.csv", header = TRUE, sep = ",", format = "%m-%d-%Y")
#
# ## whitespace separated file with first column of form YYYY-mm-ddTHH:MM:SS
# ## and no headers. T appears literally. Requires chron 2.3-22 or later.
# z <- read.zoo("foo.csv", FUN = as.chron)
#
# # read in all csv files in the current directory and merge them
# read.zoo(Sys.glob("*.csv"), header = TRUE, sep = ",")
#
# # We use "NULL" in colClasses for those columns we don't need but in
# # col.names we still have to include dummy names for them. Of what
# # is left the index is the first three columns (1:3) which we convert
# # to chron class times in FUN and then truncate to 5 seconds in FUN2.
# # Finally we use aggregate = mean to average over the 5 second intervals.
# library("chron")
#
# Lines <- "CVX 20070201 9 30 51 73.25 81400 0
# CVX 20070201 9 30 51 73.25 100 0
# CVX 20070201 9 30 51 73.25 100 0
# CVX 20070201 9 30 51 73.25 300 0
# CVX 20070201 9 30 51 73.25 81400 0
# CVX 20070201 9 40 51 73.25 100 0
# CVX 20070201 9 40 52 73.25 100 0
# CVX 20070201 9 40 53 73.25 300 0"
#
# z <- read.zoo(text = Lines,
#   colClasses = c("NULL", "NULL", "numeric", "numeric", "numeric",
#     "numeric", "numeric", "NULL"),
#   col.names = c("Symbol", "Date", "Hour", "Minute", "Second", "Price", "Volume", "junk"),
#   index = 1:3, # do not count columns that are "NULL" in colClasses
#   FUN = function(h, m, s) times(paste(h, m, s, sep = ":")),
#   FUN2 = function(tt) trunc(tt, "00:00:05"),
#   aggregate = mean)
#
## End(Not run)