arrow v0.16.0.2





Integration to 'Apache' 'Arrow'

'Apache' 'Arrow' is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. This package provides an interface to the 'Arrow C++' library.




Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication.

The arrow package exposes an interface to the Arrow C++ library to access many of its features in R. This includes support for analyzing large, multi-file datasets (open_dataset()), working with individual Parquet (read_parquet(), write_parquet()) and Feather (read_feather(), write_feather()) files, as well as lower-level access to Arrow memory and messages.
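As a rough illustration of the file-level functions named above, the following sketch writes a small data frame to Parquet and Feather files in a temporary location and reads each back. The data frame and the variable names are made up for the example, and it assumes the installed arrow build includes Parquet support.

```r
library(arrow)

# Illustrative data; stringsAsFactors = FALSE keeps `label` a character column
df <- data.frame(
  id = 1:3,
  label = c("a", "b", "c"),
  score = c(0.1, 0.2, 0.3),
  stringsAsFactors = FALSE
)

# Round trip through a Parquet file
parquet_path <- tempfile(fileext = ".parquet")
write_parquet(df, parquet_path)
from_parquet <- read_parquet(parquet_path)

# Round trip through a Feather file
feather_path <- tempfile(fileext = ".feather")
write_feather(df, feather_path)
from_feather <- read_feather(feather_path)
```

Both readers return a data frame, so the results can be used with ordinary R tooling.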


Install the latest release of arrow from CRAN with

install.packages("arrow")

Installing a released version of the arrow package should require no additional system dependencies. For macOS and Windows, CRAN hosts binary packages that contain the Arrow C++ library. On Linux, source package installation will download necessary C++ dependencies if you set the environment variable LIBARROW_DOWNLOAD=true. See vignette("install", package = "arrow") for details.
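One possible way to set LIBARROW_DOWNLOAD from within R is sketched below. The helper name install_with_libarrow_download() and its installer argument are hypothetical, introduced only for this example. The helper sets the variable for the duration of the installation call and restores the previous state afterwards.

```r
# Hypothetical helper (not part of arrow): run an installer with
# LIBARROW_DOWNLOAD=true set, then restore the previous environment.
install_with_libarrow_download <- function(pkg = "arrow",
                                           installer = utils::install.packages,
                                           ...) {
  old <- Sys.getenv("LIBARROW_DOWNLOAD", unset = NA)
  Sys.setenv(LIBARROW_DOWNLOAD = "true")
  on.exit(
    if (is.na(old)) {
      Sys.unsetenv("LIBARROW_DOWNLOAD")
    } else {
      Sys.setenv(LIBARROW_DOWNLOAD = old)
    },
    add = TRUE
  )
  # The installer runs in a child process that inherits the variable
  installer(pkg, ...)
}
```

Calling install_with_libarrow_download() on a Linux machine would then perform a source installation with the download enabled. Setting the variable in the shell before starting R should work equally well.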

If you install the arrow package from source and the C++ library is not found, the R package functions will notify you that Arrow is not available. Call

arrow::install_arrow()

to retry installation with dependencies.
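A script that depends on arrow could guard against a missing C++ library before calling Arrow-backed functions. The sketch below uses arrow_available(), which this version of the package documents as the check for whether the C++ library is available. The variable name arrow_ready is chosen only for this example.

```r
# TRUE only if the R package is installed and reports that the
# Arrow C++ library was found at build time
arrow_ready <- requireNamespace("arrow", quietly = TRUE) &&
  isTRUE(arrow::arrow_available())

if (arrow_ready) {
  message("Arrow C++ library found; arrow functions are ready to use.")
} else {
  message("Arrow is not available; consider running arrow::install_arrow().")
}
```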

Note that install_arrow() is also available as a standalone script, so you can use it without first installing the package.


Conda users on Linux and macOS can install arrow from conda-forge with

conda install -c conda-forge r-arrow

Installing a development version

Binary R packages for macOS and Windows are built daily and hosted at To install from there:

install.packages("arrow", repos = "")


Or

install_arrow(nightly = TRUE)

These daily package builds are not official Apache releases and are not recommended for production use. They may be useful for testing bug fixes and new features under active development.


Windows and macOS users who wish to contribute to the R package and don’t need to alter the Arrow C++ library may be able to obtain a recent version of the library without building from source. On macOS, you may install the C++ library using Homebrew:

# For the released version:
brew install apache-arrow
# Or for a development version, you can try:
brew install apache-arrow --HEAD

On Windows, you can download a .zip file with the arrow dependencies from the rwinlib project, and then set the RWINLIB_LOCAL environment variable to point to that zip file before installing the arrow R package. That project contains released versions of the C++ library; for a development version, Windows users may be able to find a binary by going to the Apache Arrow project’s Appveyor, selecting an R job from a recent build, and downloading the build\arrow-*.zip file from the “Artifacts” tab.

If you need to alter both the Arrow C++ library and the R package code, or if you can’t get a binary version of the latest C++ library elsewhere, you’ll need to build it from source too.

First, install the C++ library. See the developer guide for details.

Note that after any change to the C++ library, you must reinstall it and run make clean or git clean -fdx . to remove any cached object code in the r/src/ directory before reinstalling the R package. This is only necessary if you make changes to the C++ library source; you do not need to manually purge object files if you are only editing R or Rcpp code inside r/.

Once you’ve built the C++ library, you can install the R package and its dependencies, along with additional dev dependencies, from the git checkout:

cd ../../r
R -e 'install.packages(c("devtools", "roxygen2", "pkgdown", "covr")); devtools::install_dev_deps()'

If you need to set any compilation flags while building the Rcpp extensions, you can use the ARROW_R_CXXFLAGS environment variable. For example, if you are using perf to profile the R extensions, you may need to set

export ARROW_R_CXXFLAGS=-fno-omit-frame-pointer

If the package fails to install/load with an error like this:

** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/Users/you/R/00LOCK-r/00new/arrow/libs/':
dlopen(/Users/you/R/00LOCK-r/00new/arrow/libs/, 6): Library not loaded: @rpath/libarrow.14.dylib

try setting the environment variable R_LD_LIBRARY_PATH to wherever Arrow C++ was put in make install, e.g. export R_LD_LIBRARY_PATH=/usr/local/lib, and retry installing the R package.

When installing from source, if the R and C++ library versions do not match, installation may fail. If you’ve previously installed the libraries and want to upgrade the R package, you’ll need to update the Arrow C++ library first.

For any other build/configuration challenges, see the C++ developer guide and vignette("install", package = "arrow").

Editing Rcpp code

The arrow package uses some customized tools on top of Rcpp to prepare its C++ code in src/. If you change C++ code in the R package, you will need to set the ARROW_R_DEV environment variable to TRUE (optionally, add it to your ~/.Renviron file to persist across sessions) so that the data-raw/codegen.R file is used for code generation.
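A minimal sketch of both options follows: setting ARROW_R_DEV for the current session, and a hypothetical helper, persist_arrow_r_dev(), that adds the entry to an Renviron file if it is not already present. The helper is not part of arrow, and how the build scripts read the variable is not shown here.

```r
# Set the variable for the current R session only
Sys.setenv(ARROW_R_DEV = "TRUE")

# Hypothetical helper: add ARROW_R_DEV=TRUE to an Renviron file once.
# The default path follows the ~/.Renviron location mentioned above.
persist_arrow_r_dev <- function(renviron = path.expand("~/.Renviron")) {
  entry <- "ARROW_R_DEV=TRUE"
  existing <- if (file.exists(renviron)) {
    readLines(renviron, warn = FALSE)
  } else {
    character()
  }
  # Leave the file untouched if any ARROW_R_DEV entry already exists
  if (!any(grepl("^ARROW_R_DEV=", existing))) {
    writeLines(c(existing, entry), renviron)
  }
  invisible(renviron)
}
```

Calling persist_arrow_r_dev() with no arguments would modify your own ~/.Renviron, so it may be worth trying it on a temporary file first.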

The codegen.R script has these additional dependencies:


We use Google C++ style in our C++ code. Check for style errors with


Fix any style issues before committing with

./ --fix

The lint script requires Python 3 and clang-format-7. If the command isn’t found, you can explicitly provide the path to it like CLANG_FORMAT=$(which clang-format-7) ./ On macOS, you can get this by installing LLVM via Homebrew and running the script as CLANG_FORMAT=$(brew --prefix llvm@7)/bin/clang-format ./

Useful functions

Within an R session, these can help with package development:

devtools::load_all() # Load the dev package
devtools::test(filter="^regexp$") # Run the test suite, optionally filtering file names
devtools::document() # Update roxygen documentation
pkgdown::build_site() # To preview the documentation website
devtools::check() # All package checks; see also below
covr::package_coverage() # See test coverage statistics

Any of those can be run from the command line by wrapping them in R -e '$COMMAND'. There’s also a Makefile to help with some common tasks from the command line (make test, make doc, make clean, etc.)

Full package validation

R CMD build .
R CMD check arrow_*.tar.gz --as-cran

Functions in arrow

Name                     Description
Dataset                  Multi-file datasets
FeatherTableReader       FeatherTableReader class
CsvTableReader           Arrow CSV and JSON table reader classes
DictionaryType           class DictionaryType
DataType                 class arrow::DataType
CsvReadOptions           File reader options
Expression               Arrow expressions
ArrayData                ArrayData class
Codec                    Compression Codec class
FixedWidthType           class arrow::FixedWidthType
InputStream              InputStream classes
FileFormat               Dataset file formats
ParquetWriterProperties  ParquetWriterProperties class
mmap_open                Open a memory mapped file
mmap_create              Create a new read/write memory mapped file of a given size
Scanner                  Scan the contents of a dataset
ParquetReaderProperties  ParquetReaderProperties class
Schema                   Schema class
FileSystem               FileSystem classes
FileStats                FileSystem entry stats
FeatherTableWriter       FeatherTableWriter class
ChunkedArray             ChunkedArray class
MemoryPool               class arrow::MemoryPool
Message                  class arrow::Message
Field                    Field class
OutputStream             OutputStream classes
MessageReader            class arrow::MessageReader
ParquetFileReader        ParquetFileReader class
ParquetFileWriter        ParquetFileWriter class
RecordBatchReader        RecordBatchReader classes
read_json_arrow          Read a JSON file
read_message             Read a Message from a stream
RecordBatchWriter        RecordBatchWriter classes
buffer                   Buffer class
reexports                Objects exported from other packages
arrow_available          Is the C++ Arrow library available?
type                     infer the arrow Array type from an R vector
default_memory_pool      default arrow::MemoryPool
cast_options             Cast options
install_arrow            Install or upgrade the Arrow library
codec_is_available       Check whether a compression codec is available
make_readable_file       Handle a range of possible input sources
read_schema              read a Schema from a stream
data-type                Apache Arrow data types
array                    Arrow Arrays
read_record_batch        read arrow::RecordBatch as encapsulated IPC message, given a known arrow::Schema
arrow-package            arrow: Integration to 'Apache' 'Arrow'
compression              Compressed stream classes
RecordBatch              RecordBatch class
read_parquet             Read a Parquet file
FileSelector             file selector
Partitioning             Define Partitioning for a Source
read_table               Read an arrow::Table from a stream
Source                   Sources for a Dataset
read_delim_arrow         Read a CSV or other delimited file with Arrow
dictionary               Create a dictionary type
read_feather             Read a Feather file
Table                    Table class
write_feather            Write data in the Feather format
write_arrow              Write Arrow formatted data
hive_partition           Construct Hive partitioning
enums                    Arrow enums
open_source              Create a Source for a Dataset
open_dataset             Open a multi-file dataset
write_parquet            Write Parquet file to disk
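
To illustrate how a few of the functions listed above might fit together, the sketch below writes two small Parquet files into Hive-style `year=` directories and opens them as one dataset. The directory layout, column names, and values are invented for the example, and the exact printed form of the schema may differ between versions.

```r
library(arrow)

# Build a tiny partitioned layout: <root>/year=2019/..., <root>/year=2020/...
root <- tempfile("dataset_")
for (yr in c(2019L, 2020L)) {
  part_dir <- file.path(root, paste0("year=", yr))
  dir.create(part_dir, recursive = TRUE)
  write_parquet(
    data.frame(id = 1:3, value = c(0.5, 1.5, 2.5) * yr),
    file.path(part_dir, "part.parquet")
  )
}

# Open all files under `root` as a single dataset, declaring the
# partition key and its type with hive_partition()
ds <- open_dataset(root, partitioning = hive_partition(year = int32()))

# Inspect the unified schema
ds$schema
```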


License Apache License (>= 2.0)
Encoding UTF-8
Language en-US
LazyData true
SystemRequirements C++11
Biarch true
LinkingTo Rcpp (>= 1.0.1)
RoxygenNote 7.0.2
VignetteBuilder knitr
Collate 'enums.R' 'arrow-package.R' 'type.R' 'array-data.R' 'array.R' 'arrowExports.R' 'buffer.R' 'chunked-array.R' 'io.R' 'compression.R' 'compute.R' 'csv.R' 'dataset.R' 'dictionary.R' 'record-batch.R' 'table.R' 'expression.R' 'dplyr.R' 'feather.R' 'field.R' 'filesystem.R' 'install-arrow.R' 'json.R' 'list.R' 'memory-pool.R' 'message.R' 'parquet.R' 'read-record-batch.R' 'read-table.R' 'record-batch-reader.R' 'record-batch-writer.R' 'reexports-bit64.R' 'reexports-tidyselect.R' 'schema.R' 'struct.R' 'util.R' 'write-arrow.R'
NeedsCompilation yes
Packaged 2020-02-14 00:59:58 UTC; enpiar
Repository CRAN
Date/Publication 2020-02-14 12:20:05 UTC
