Reads a file in table format
Reads a file in table format and creates a data frame from it, with cases corresponding to lines and variables to fields in the file.
WARNING: This method is very much in an alpha stage. Expect it to change.
This method is an extension to the default read.table function in R. It is possible to specify a map from column names to column classes, so that the column classes are assigned automatically from the column header in the file.
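The idea of a column name to column class map can be sketched in base R. This is only an illustration under stated assumptions, not the code used by this method: the helper resolveColClasses() and the sample data are hypothetical, and the resolved classes are simply handed to read.table().

```r
# Hypothetical helper (not part of this package): resolve a named class map
# against the column names found in a header line.
resolveColClasses <- function(colNames, classMap, isPatterns = FALSE,
                              defColClass = NA) {
  # Every column starts with the default class
  classes <- rep(defColClass, length(colNames))
  names(classes) <- colNames
  for (key in names(classMap)) {
    # Match by regular expression or by exact name
    hit <- if (isPatterns) grepl(key, colNames) else (colNames == key)
    classes[hit] <- classMap[[key]]
  }
  classes
}

# Made-up tab-delimited data
lines <- c("id\tname\tscore1\tscore2",
           "1\ta\t0.5\t1.5",
           "2\tb\t2.5\t3.5")

# Read the header and resolve the classes from it
header <- strsplit(lines[1], "\t", fixed = TRUE)[[1]]
classMap <- c(id = "integer", "^score" = "numeric")
colClasses <- resolveColClasses(header, classMap, isPatterns = TRUE,
                                defColClass = "character")

# Pass the resolved classes, in column order, to read.table()
df <- read.table(text = lines, header = TRUE, sep = "\t",
                 colClasses = unname(colClasses))
str(df)
```

Here the pattern "^score" covers both score columns, and the unmatched column name falls back to the default class.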
In addition, it is possible to read any subset of rows. The method is optimized such that only columns and rows that are of interest are parsed and read into R's memory. This minimizes memory usage at the same time as it speeds up the reading.
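Reading only a subset of rows can be illustrated with base R alone. The sketch below is an assumption about the general approach, not this method's implementation: it reads the raw lines with readLines(), keeps only the requested ones, and lets read.table() parse just those. The file contents and the variable names are made up for the example.

```r
# Write a small demo file (made-up data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("x\ty", "1\ta", "2\tb", "3\tc", "4\td", "5\te"), tmp)

# Requested rows; row 1 is the first row after the header.
# Row 99 does not exist and row 2 is requested twice.
rows <- c(4, 2, 2, 99)

allLines <- readLines(tmp)
header <- allLines[1]
body <- allLines[-1]

# Ignore non-existing rows, keep the requested order and duplicates
rows <- rows[rows >= 1 & rows <= length(body)]

# Only the selected lines are parsed into a data frame
picked <- read.table(text = c(header, body[rows]), header = TRUE,
                     sep = "\t", stringsAsFactors = FALSE)
print(picked)
```

Note that this simplified version still holds all raw lines in memory before selecting; only the parsing step is restricted to the rows of interest.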
"readTable"(file, colClasses=NULL, isPatterns=FALSE, defColClass=NA, header=FALSE, skip=0, nrows=-1, rows=NULL, col.names=NULL, check.names=FALSE, path=NULL, ..., stripQuotes=TRUE, method=c("readLines", "intervals"), verbose=FALSE)
- file: A connection or a filename. If a filename, the path specified by path is added to the front of the filename. Unopened files are opened and closed at the end.
- colClasses: Either a named or an unnamed character vector. If unnamed, it specifies the column classes just as in read.table. If named, names(colClasses) are matched against the column names read (this requires header=TRUE) and the column classes are set to the corresponding values.
- isPatterns: If TRUE, names(colClasses) are matched to the read column names by regular expression matching.
- defColClass: If the column class map specified by a named colClasses argument does not match some of the read column names, the class of those columns is set to this class. The default is to read the columns in an "as is" way.
- header: If TRUE, column names are read from the file.
- skip: The number of lines (commented or non-commented) to skip before trying to read the header or, if there is no header, the data table.
- nrows: The number of rows of the data table to read.
- rows: A row index vector specifying which rows of the table to read, where row one is the row following the header. Non-existing rows are ignored. Note that rows are returned in the order they are requested, and duplicated rows are returned as well.
- col.names: Same as in read.table().
- check.names: Same as in read.table(), but the default value is FALSE.
- path: If file is a filename, this path is added to it, otherwise ignored.
- ...: Arguments passed to read.table(), which is used internally.
- stripQuotes: If TRUE, quotes are stripped from values before being parsed. This argument is only effective when method="readLines".
- method: If "readLines", readLines() is used internally to first read only the rows of interest, which are then passed to read.table(). If "intervals", contiguous intervals are first identified in the rows of interest. These intervals are then read one by one using read.table(). The latter method is faster and especially more memory efficient if the intervals are not too many, whereas the former is preferred if many "scattered" rows are to be read.
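The "intervals" strategy described above can be sketched in base R as follows. This is a minimal illustration, not the internal implementation: findIntervals() is a hypothetical helper, the demo file is made up, and the sketch returns rows in sorted order without duplicates rather than in the requested order.

```r
# Hypothetical helper: group row indices into contiguous intervals
findIntervals <- function(rows) {
  rows <- sort(unique(rows))
  if (length(rows) == 0L) {
    return(data.frame(from = integer(0), to = integer(0)))
  }
  # A new interval starts wherever the gap to the previous index exceeds 1
  grp <- cumsum(c(TRUE, diff(rows) > 1))
  data.frame(from = as.integer(tapply(rows, grp, min)),
             to = as.integer(tapply(rows, grp, max)))
}

# Demo file with a header and ten data rows (made-up data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("x\ty", paste(1:10, letters[1:10], sep = "\t")), tmp)

rows <- c(2, 3, 4, 8, 9)
iv <- findIntervals(rows)   # two intervals: 2-4 and 8-9

# Column names come from the header line
colNames <- strsplit(readLines(tmp, n = 1), "\t", fixed = TRUE)[[1]]

# One read.table() call per interval
parts <- lapply(seq_len(nrow(iv)), function(k) {
  read.table(tmp, sep = "\t", header = FALSE, col.names = colNames,
             skip = iv$from[k],                   # header + (from - 1) rows
             nrows = iv$to[k] - iv$from[k] + 1,
             stringsAsFactors = FALSE)
})
res <- do.call(rbind, parts)
print(res)
```

With few intervals this needs only a handful of read.table() calls, which fits the trade-off stated above: many scattered rows would produce many short intervals and many separate reads.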