- sources
A character vector of paths to the dataset files.
- schema
The schema for the dataset. If NULL, the schema will be
inferred from the dataset files.
- hive_style
A logical value indicating whether the dataset uses
Hive-style partitioning.
- unify_schemas
A logical value indicating whether to unify the schemas
of the dataset files (union_by_name). If TRUE, will execute a UNION by
column name across all files (note: this can add considerably to
the initial execution time).
- format
The format of the dataset files. One of "parquet", "csv", "tsv", or "sf"
(spatial vector files supported by the sf package / GDAL).
If no argument is provided, the function will try to guess the type based
on minimal heuristics (see the first example following this list).
- conn
A connection to a database.
- tblname
The name of the table to create in the database.
- mode
The mode to create the table in. One of "VIEW" or "TABLE".
Creating a VIEW, the default, will execute more quickly because it
does not create a local copy of the dataset. TABLE will create a local
copy in duckdb's native format, downloading the full dataset if necessary.
When using TABLE mode with large data, please be sure to use a conn
connection with disk-based storage, e.g. by calling
cached_connection("storage_path"); otherwise the full data must fit
into RAM (a sketch follows this list). Using TABLE
assumes familiarity with R's DBI-based interface.
- filename
A logical value indicating whether to include the filename in
the table name.
- recursive
Should the path be searched recursively? Default TRUE. Set to FALSE
when opening a single, un-partitioned file.
- ...
Optional additional arguments passed to duckdb_s3_config().
Note these apply after those set by the URI notation and thus may be used
to override or provide settings not supported in that format (an example
follows this list).
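
For instance, a single un-partitioned file can be opened by naming the
format explicitly and turning off recursive path handling. A minimal
sketch; the file path is a hypothetical placeholder:

```r
library(duckdbfs)

# A single, un-partitioned CSV file: state the format explicitly and
# set recursive = FALSE so the path is not treated as a directory tree.
df <- open_dataset("mydata.csv", format = "csv", recursive = FALSE)
```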
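
TABLE mode with a disk-backed connection might look like the following
sketch; the storage path and S3 URI are hypothetical placeholders:

```r
library(duckdbfs)

# A connection backed by on-disk storage, so the materialized TABLE copy
# is written to duckdb's native format rather than held in RAM.
con <- cached_connection("storage_path")

# Materialize a local copy of the dataset (downloads it if remote).
tbl <- open_dataset("s3://bucket/dataset/", conn = con, mode = "TABLE")
```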
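
Similarly, S3 settings can be supplied through the extra arguments. A
sketch, assuming duckdb_s3_config() accepts the s3_endpoint and
s3_url_style arguments (argument names follow duckdb's S3 configuration
options; the endpoint and bucket are hypothetical):

```r
library(duckdbfs)

# Arguments in ... are forwarded to duckdb_s3_config() and override any
# settings implied by the URI (hypothetical MinIO-style endpoint shown).
df <- open_dataset("s3://bucket/dataset/",
                   s3_endpoint = "minio.example.com",
                   s3_url_style = "path")
```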