These functions use the Arrow C++ CSV reader to read into a data.frame.
Arrow C++ options have been mapped to argument names that follow those of
readr::read_delim(), and col_select was inspired by vroom::vroom().
read_delim_arrow(file, delim = ",", quote = "\"",
  escape_double = TRUE, escape_backslash = FALSE, col_select = NULL,
  skip_empty_rows = TRUE, parse_options = NULL,
  convert_options = NULL, read_options = csv_read_options(),
  as_tibble = TRUE)

read_csv_arrow(file, quote = "\"", escape_double = TRUE,
  escape_backslash = FALSE, col_select = NULL,
  skip_empty_rows = TRUE, parse_options = NULL,
  convert_options = NULL, read_options = csv_read_options(),
  as_tibble = TRUE)

read_tsv_arrow(file, quote = "\"", escape_double = TRUE,
  escape_backslash = FALSE, col_select = NULL,
  skip_empty_rows = TRUE, parse_options = NULL,
  convert_options = NULL, read_options = csv_read_options(),
  as_tibble = TRUE)
file: A character path to a local file, or an Arrow input stream.
delim: Single character used to separate fields within a record.
quote: Single character used to quote strings.
escape_double: Does the file escape quotes by doubling them?
That is, if this option is TRUE, the value """" represents
a single quote, \".
escape_backslash: Does the file use backslashes to escape special
characters? This is more general than escape_double, as backslashes
can be used to escape the delimiter character, the quote character, or
to add special characters like \n.
col_select: A tidy selection specification
of columns, as used in dplyr::select().
skip_empty_rows: Should blank rows be ignored altogether? If
TRUE, blank rows will not be represented at all. If FALSE, they will be
filled with missings.
parse_options: See csv_parse_options(). If given, this overrides any
parsing options provided in other arguments (e.g. delim, quote, etc.).
as_tibble: Should the function return a data.frame or an
arrow::Table?
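The two quote-escaping arguments described above can be illustrated with a minimal sketch. The file contents and the variable names (tf_double, tf_backslash) are made up for this illustration; the sketch assumes the arrow package is installed and relies only on the arguments listed on this page.

```r
library(arrow)

# File 1: the inner quotes are escaped by doubling them ("").
tf_double <- tempfile(fileext = ".csv")
writeLines(c("a,b", "\"say \"\"hi\"\"\",1"), tf_double)

# File 2: the inner quotes are escaped with a backslash (\").
tf_backslash <- tempfile(fileext = ".csv")
writeLines(c("a,b", "\"say \\\"hi\\\"\",1"), tf_backslash)

# Defaults (escape_double = TRUE) handle the doubled-quote file.
df_double <- read_csv_arrow(tf_double)

# The backslash file needs escape_backslash = TRUE; escape_double is
# switched off here because this file does not use doubled quotes.
df_backslash <- read_csv_arrow(
  tf_backslash,
  escape_double = FALSE,
  escape_backslash = TRUE
)

# Both reads should yield the same field value.
df_double$a
df_backslash$a

unlink(c(tf_double, tf_backslash))
```

Which setting applies depends on how the file was written, so it is worth inspecting a few raw lines before choosing.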
A data.frame, or an arrow::Table if as_tibble = FALSE.
read_csv_arrow() and read_tsv_arrow() are wrappers around
read_delim_arrow() that specify a delimiter.
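As a rough illustration of the wrapper relationship, the sketch below reads the same tab-separated file once through read_tsv_arrow() and once through read_delim_arrow() with an explicit delimiter. The sample data (head(mtcars)) and the variable names are chosen for this example only, and the arrow package is assumed to be installed.

```r
library(arrow)

# Write a small tab-separated file without row names or quoting.
tf <- tempfile(fileext = ".tsv")
write.table(head(mtcars), file = tf, sep = "\t",
            row.names = FALSE, quote = FALSE)

# The wrapper supplies the tab delimiter itself ...
via_wrapper <- read_tsv_arrow(tf)

# ... which should match passing delim = "\t" explicitly.
via_delim <- read_delim_arrow(tf, delim = "\t")

all.equal(as.data.frame(via_wrapper), as.data.frame(via_delim))

unlink(tf)
```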
Note that not all readr options are currently implemented here. Please file
an issue if you encounter one that arrow should support.
If you need to control Arrow-specific reader parameters that don't have an
equivalent in readr::read_csv(), you can either provide them in the
parse_options, convert_options, or read_options arguments, or you can
call csv_table_reader() directly for lower-level access.
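A minimal sketch of passing Arrow-level options through parse_options follows. It assumes that the installed arrow version exports csv_parse_options() and that the function accepts a delimiter argument (mirroring the Arrow C++ parse option of the same name); both have varied across arrow releases, so check the help page for your version. The semicolon-separated file is invented for the example.

```r
library(arrow)

# A semicolon-separated file with a header row.
tf <- tempfile(fileext = ".txt")
writeLines(c("x;y", "1;a", "2;b"), tf)

# Assumption: csv_parse_options() exists in this arrow version and takes
# `delimiter`. When parse_options is given, it overrides delim, quote, etc.
opts <- csv_parse_options(delimiter = ";")
df <- read_delim_arrow(tf, parse_options = opts)

names(df)

unlink(tf)
```

Passing the delimiter through delim = ";" would be the simpler route here; parse_options is mainly useful for settings that have no readr-style argument.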
# NOT RUN {
try({
  # Write a sample dataset to a temporary CSV file
  tf <- tempfile()
  on.exit(unlink(tf))
  write.csv(iris, file = tf)
  # Read the whole file and check its dimensions
  df <- read_csv_arrow(tf)
  dim(df)
  # Can select columns
  df <- read_csv_arrow(tf, col_select = starts_with("Sepal"))
})
# }