A Table is a sequence of chunked arrays. Tables have a similar interface to record batches, but they can be composed from multiple record batches or chunked arrays.
The Table$create() function takes the following arguments:
...
    arrays, chunked arrays, or R vectors, with names; alternatively, an unnamed series of record batches may also be provided, which will be stacked as rows in the table.
schema
    a Schema, or NULL (the default) to infer the schema from the data in ...
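As a minimal sketch of the two calling styles described above (the column names x and y and their values are made up for illustration, and RecordBatch$create() is assumed to be available in the installed version of arrow):

```r
library(arrow)

# Named R vectors become columns; the schema is inferred from the data.
tab <- Table$create(x = 1:3, y = c("a", "b", "c"))
tab$schema

# The same data with an explicit schema that matches the inferred types.
tab_explicit <- Table$create(
  x = 1:3,
  y = c("a", "b", "c"),
  schema = schema(x = int32(), y = utf8())
)

# An unnamed series of record batches is stacked as rows.
batch1 <- RecordBatch$create(x = 1:2, y = c("a", "b"))
batch2 <- RecordBatch$create(x = 3:4, y = c("c", "d"))
stacked <- Table$create(batch1, batch2)
stacked$num_rows
```

The record batches need compatible schemas to be stacked this way.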
Tables are data-frame-like, and many methods you expect to work on a data.frame are implemented for Table. This includes [, [[, $, names, dim, nrow, ncol, head, and tail. You can also pull the data from an Arrow table into R with as.data.frame(). See the examples.
A caveat about the $ method: because Table is an R6 object, $ is also used to access the object's methods (see below). Methods take precedence over the table's columns. So, tab$Slice would return the "Slice" method function even if there were a column in the table called "Slice".
A caveat about the [ method for row operations: only "slicing" is currently supported. That is, you can select a contiguous range of rows from the table, but you can't filter with a logical vector or take an arbitrary selection of rows by integer indices.
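A short sketch of what this means in practice, using the built-in mtcars data. The row range and the mpg threshold are arbitrary choices for illustration. For a selection that is not a contiguous range, the $Filter() R6 method documented on this page is one alternative to [.

```r
library(arrow)

tab <- Table$create(mtcars)

# A contiguous range of rows works with [ and is a "slice".
rows <- tab[4:8, ]
dim(rows)

# For a logical selection, call the R6 method rather than [.
keep <- mtcars$mpg > 25
filtered <- tab$Filter(keep)
filtered$num_rows
```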
In addition to the more R-friendly S3 methods, a Table object has the following R6 methods that map onto the underlying C++ methods:
$column(i): Extract a ChunkedArray by integer position from the table
$ColumnNames(): Get all column names (called by names(tab))
$GetColumnByName(name): Extract a ChunkedArray by string name
$field(i): Extract a Field from the table schema by integer position
$select(spec): Return a new table with a selection of columns. This supports the usual character, numeric, and logical selection methods as well as "tidy select" expressions.
$Slice(offset, length = NULL): Create a zero-copy view starting at the indicated integer offset and going for the given length, or to the end of the table if NULL, the default.
$Take(i): Return a Table with rows at positions given by integers i. If i is an Arrow Array or ChunkedArray, it will be coerced to an R vector before taking.
$Filter(i): Return a Table with rows at positions where logical vector or Arrow boolean-type (Chunked)Array i is TRUE.
$serialize(output_stream, ...): Write the table to the given OutputStream
$cast(target_schema, safe = TRUE, options = cast_options(safe)): Alter the schema of the table.
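The following sketch exercises several of the R6 methods listed above on a small made-up table. One assumption to check against your installed version: because these methods map onto the C++ API, integer positions and offsets are treated here as zero-based, unlike the one-based S3 methods.

```r
library(arrow)

tab <- Table$create(x = 1:5, y = c("a", "b", "c", "d", "e"))

tab$ColumnNames()
tab$GetColumnByName("y")

# Assumed zero-based: position 0 is the first column or field.
first_col <- tab$column(0)
second_field <- tab$field(1)

# Zero-copy view: skip one row, then keep three.
sliced <- tab$Slice(1, 3)

# Rows at (assumed zero-based) positions 0 and 4.
taken <- tab$Take(c(0L, 4L))

# Rows where the logical vector is TRUE.
filtered <- tab$Filter(c(TRUE, FALSE, TRUE, FALSE, TRUE))

# Cast column x from int32 to int64, leaving y unchanged.
widened <- tab$cast(schema(x = int64(), y = utf8()))
widened$schema
```

$Slice() returns a view without copying, so it is the cheaper option when the rows you need are adjacent.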
There are also some active bindings:
$num_columns
$num_rows
$schema
$columns: Returns a list of ChunkedArrays
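Active bindings are read like fields, without parentheses. A brief sketch on a made-up two-column table:

```r
library(arrow)

tab <- Table$create(x = 1:3, y = c(1.5, 2.5, 3.5))

tab$num_columns
tab$num_rows
tab$schema

# $columns returns a plain R list with one ChunkedArray per column.
cols <- tab$columns
length(cols)
class(cols[[1]])
```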
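# NOT RUN {
library(arrow)

# Build a Table from mtcars, adding the row names as a "name" column
tab <- Table$create(name = rownames(mtcars), mtcars)
dim(tab)
dim(head(tab))
names(tab)
# Extract single columns as ChunkedArrays
tab$mpg
tab[["cyl"]]
# Slice a contiguous range of rows, select columns, and pull the result into R
as.data.frame(tab[4:8, c("gear", "hp", "wt")])
# }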