arrow (version 0.16.0.2)

Table: Table class

Description

A Table is a sequence of chunked arrays. Tables have a similar interface to record batches, but they can be composed from multiple record batches or chunked arrays.

Arguments

Factory

The Table$create() function takes the following arguments:

  • ...: arrays, chunked arrays, or R vectors, with names. Alternatively, an unnamed series of record batches may be provided, which will be stacked as rows in the table.

  • schema: a Schema, or NULL (the default) to infer the schema from the data in ...
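
The calling styles described above can be sketched as follows. This is a minimal illustration that assumes the arrow package is installed; the column names x and y, their values, and the chosen types are made up for the example and are not part of the original documentation.

```r
library(arrow)

# Named R vectors become the columns of the table
tab <- Table$create(x = 1:3, y = c("a", "b", "c"))

# Supplying an explicit schema instead of letting it be inferred
# (the int32/utf8 types are hypothetical choices for illustration)
sch <- schema(x = int32(), y = utf8())
tab_with_schema <- Table$create(x = 1:3, y = c("a", "b", "c"), schema = sch)

# An unnamed series of record batches is stacked as rows
batch1 <- record_batch(x = 1:2, y = c("a", "b"))
batch2 <- record_batch(x = 3:4, y = c("c", "d"))
stacked <- Table$create(batch1, batch2)
dim(stacked)
```

Passing a schema is mainly useful when the inferred types are not the ones you want; otherwise the default of NULL is usually enough.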

S3 Methods and Usage

Tables are data-frame-like, and many methods you expect to work on a data.frame are implemented for Table. This includes [, [[, $, names, dim, nrow, ncol, head, and tail. You can also pull the data from an Arrow table into R with as.data.frame(). See the examples.

A caveat about the $ method: because Table is an R6 object, $ is also used to access the object's methods (see below). Methods take precedence over the table's columns. So, tab$Slice would return the "Slice" method function even if there were a column in the table called "Slice".
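
A small sketch of this precedence rule, assuming the arrow package is installed. The column name "Slice" is a deliberately contrived choice to trigger the name clash.

```r
library(arrow)

# A table with a column that shares its name with an R6 method
tab <- Table$create(Slice = 1:3, other = c("a", "b", "c"))

# `$` resolves to the R6 method, not the column
is.function(tab$Slice)

# `[[` and $GetColumnByName() reach the column unambiguously
col <- tab[["Slice"]]
same_col <- tab$GetColumnByName("Slice")
```

When column names are not known in advance, `[[` is the safer accessor for this reason.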

A caveat about the [ method for row operations: only "slicing" is currently supported. That is, you can select a contiguous range of rows from the table, but you can't filter with a logical vector or take an arbitrary selection of rows by integer indices.
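
The following sketch shows a contiguous row slice with [ and, as one possible workaround for logical filtering, the $Filter() R6 method listed under R6 Methods. It assumes the arrow package is installed; the id and flag columns are hypothetical.

```r
library(arrow)

tab <- Table$create(id = 1:5, flag = c(TRUE, FALSE, TRUE, FALSE, TRUE))

# Supported by `[`: a contiguous range of rows
middle <- tab[2:4, ]
nrow(middle)

# For a logical filter, fall back to the R6 method instead of `[`
kept <- tab$Filter(c(TRUE, FALSE, TRUE, FALSE, TRUE))
nrow(kept)
```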

R6 Methods

In addition to the more R-friendly S3 methods, a Table object has the following R6 methods that map onto the underlying C++ methods:

  • $column(i): Extract a ChunkedArray by integer position from the table

  • $ColumnNames(): Get all column names (called by names(tab))

  • $GetColumnByName(name): Extract a ChunkedArray by string name

  • $field(i): Extract a Field from the table schema by integer position

  • $select(spec): Return a new table with a selection of columns. This supports the usual character, numeric, and logical selection methods as well as "tidy select" expressions.

  • $Slice(offset, length = NULL): Create a zero-copy view starting at the indicated integer offset and going for the given length, or to the end of the table if NULL, the default.

  • $Take(i): Return a Table with rows at positions given by integers i. If i is an Arrow Array or ChunkedArray, it will be coerced to an R vector before taking.

  • $Filter(i): Return a Table with rows at positions where logical vector or Arrow boolean-type (Chunked)Array i is TRUE.

  • $serialize(output_stream, ...): Write the table to the given OutputStream

  • $cast(target_schema, safe = TRUE, options = cast_options(safe)): Alter the schema of the table.
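
A few of these methods can be exercised as in the sketch below, which assumes the arrow package is installed and uses hypothetical id and letter columns. Because the R6 methods map onto the C++ API, the offset passed to $Slice() is assumed here to be zero-based, unlike the one-based indices used by [.

```r
library(arrow)

tab <- Table$create(id = 1:5, letter = c("a", "b", "c", "d", "e"))

# Column names and a single column by name
tab$ColumnNames()
letters_col <- tab$GetColumnByName("letter")

# Zero-copy view: offset 1 with length 2 is assumed to cover
# the second and third rows of the table
view <- tab$Slice(1, 2)
as.data.frame(view)

# Omitting length runs to the end of the table
rest <- tab$Slice(3)
nrow(rest)
```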

There are also some active bindings:

  • $num_columns

  • $num_rows

  • $schema

  • $columns: Returns a list of ChunkedArrays
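
Active bindings are read like fields, without parentheses. A minimal sketch, assuming the arrow package is installed and using hypothetical columns:

```r
library(arrow)

tab <- Table$create(id = 1:3, letter = c("a", "b", "c"))

tab$num_rows      # number of rows
tab$num_columns   # number of columns
tab$schema        # the table's Schema object

# $columns returns one ChunkedArray per column
cols <- tab$columns
length(cols)
```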

Examples

# NOT RUN {
tab <- Table$create(name = rownames(mtcars), mtcars)
dim(tab)
dim(head(tab))
names(tab)
tab$mpg
tab[["cyl"]]
as.data.frame(tab[4:8, c("gear", "hp", "wt")])
# }
