R interface to TensorFlow Dataset API

The TensorFlow Dataset API provides various facilities for creating scalable input pipelines for TensorFlow models, including:

  • Reading data from a variety of formats including CSV files and TFRecords files (the standard binary format for TensorFlow training data).

  • Transforming datasets in a variety of ways including mapping arbitrary functions against them.

  • Shuffling, batching, and repeating datasets over a number of epochs.

  • Streaming interface to data for reading arbitrarily large datasets.

  • Reading and transforming data are TensorFlow graph operations, so they are executed in C++ and in parallel with model training.
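A typical pipeline chains these operations together with the pipe. As a minimal sketch (using the built-in mtcars data frame as a stand-in for real training data read from CSV or TFRecord files):

```r
library(tfdatasets)

# A minimal pipeline sketch: slice a data frame into elements,
# then shuffle, batch, repeat over epochs, and prefetch.
dataset <- tensor_slices_dataset(mtcars) %>%
  dataset_shuffle(buffer_size = 32) %>%   # randomize element order
  dataset_batch(8) %>%                    # combine elements into batches
  dataset_repeat(count = 5) %>%           # stream over 5 epochs
  dataset_prefetch(buffer_size = 1)       # overlap input prep with training
```

The same chain applies unchanged when the source is `make_csv_dataset()` or `tfrecord_dataset()` instead of in-memory tensors.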

The R interface to TensorFlow datasets provides access to the Dataset API, including high-level convenience functions for easy integration with the keras package.
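For example, a tf_dataset yielding (features, response) batches can be passed directly to keras `fit()`. This is a hedged sketch; the data, layer sizes, and epoch count are made up for illustration:

```r
library(keras)
library(tfdatasets)

# Toy data: 100 rows of 4 random features and a numeric response
dataset <- tensor_slices_dataset(list(
  matrix(runif(400), ncol = 4),  # features
  runif(100)                     # response
)) %>%
  dataset_batch(25)

model <- keras_model_sequential() %>%
  layer_dense(units = 8, activation = "relu", input_shape = 4) %>%
  layer_dense(units = 1)

model %>% compile(optimizer = "adam", loss = "mse")

# fit() consumes the dataset directly; no separate x/y arguments needed
model %>% fit(dataset, epochs = 2)
```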

For documentation on using tfdatasets, see the package website at https://tensorflow.rstudio.com/tools/tfdatasets/.

Install

install.packages('tfdatasets')

Monthly Downloads

3,814

Version

2.9.0

License

Apache License 2.0

Maintainer

Tomasz Kalinowski

Last Published

June 29th, 2022

Functions in tfdatasets (2.9.0)

dataset_cache

Caches the elements in this dataset.
as_tensor.tensorflow.python.data.ops.dataset_ops.DatasetV2

Get the single element of the dataset.
dataset_collect

Collects a dataset
all_nominal

Find all nominal variables.
all_numeric

Specify all numeric variables.
as_tf_dataset

Add the tf_dataset class to a dataset
dataset_bucket_by_sequence_length

A transformation that buckets elements in a Dataset by length
dataset_batch

Combines consecutive elements of this dataset into batches.
as_array_iterator

Convert tf_dataset to an iterator that yields R arrays.
choose_from_datasets

Creates a dataset that deterministically chooses elements from datasets.
dataset_options

Get or Set Dataset Options
dataset_map_and_batch

Fused implementation of dataset_map() and dataset_batch()
dataset_concatenate

Creates a dataset by concatenating the given dataset with this dataset.
dataset_decode_delim

Transform a dataset with delimited text lines into a dataset with named columns
dataset_filter

Filter a dataset by a predicate
dataset_enumerate

Enumerates the elements of this dataset
dataset_snapshot

Persist the output of a dataset
dataset_scan

A transformation that scans a function across an input dataset
dataset_repeat

Repeats a dataset count times.
dataset_prefetch

Creates a Dataset that prefetches elements from this dataset.
dataset_padded_batch

Combines consecutive elements of this dataset into padded batches.
dataset_take

Creates a dataset with at most count elements from this dataset
dataset_reduce

Reduces the input dataset to a single element.
dataset_interleave

Maps map_func across this dataset, and interleaves the results
dataset_prepare

Prepare a dataset for analysis
dataset_map

Map a function across a dataset.
dataset_skip

Creates a dataset that skips count elements from this dataset
dataset_shuffle_and_repeat

Shuffles and repeats a dataset returning a new permutation for each epoch.
dataset_prefetch_to_device

A transformation that prefetches dataset values to the given device
dense_features

Dense Features
dataset_take_while

A transformation that stops dataset iteration based on a predicate.
dataset_window

Combines input elements into a dataset of windows.
dataset_group_by_window

Group windows of elements by key and reduce them
dataset_flat_map

Maps map_func across this dataset and flattens the result.
dataset_shard

Creates a dataset that includes only 1 / num_shards of this dataset.
dataset_rejection_resample

A transformation that resamples a dataset to a target distribution.
dataset_unique

A transformation that discards duplicate elements of a Dataset.
dataset_unbatch

Unbatch a dataset
dataset_shuffle

Randomly shuffles the elements of this dataset.
iterator_get_next

Get next element from iterator
iterator_initializer

An operation that should be run to initialize this iterator.
make-iterator

Creates an iterator for enumerating the elements of this dataset.
has_type

Identify the type of the variable.
%>%

Pipe operator
fixed_length_record_dataset

A dataset of fixed-length records from one or more binary files.
random_integer_dataset

Creates a Dataset of pseudorandom values
make_csv_dataset

Reads CSV files into a batched dataset
delim_record_spec

Specification for reading a record from a text file with delimited values
reexports

Objects exported from other packages
sample_from_datasets

Samples elements at random from the given list of datasets.
next_batch

Tensor(s) for retrieving the next batch from a dataset
feature_spec

Creates a feature specification.
iterator_string_handle

String-valued tensor that represents this iterator
iterator_make_initializer

Create an operation that can be run to initialize this iterator
output_types

Output types and shapes
step_bucketized_column

Creates bucketized columns
hearts

Heart Disease Data Set
sparse_tensor_slices_dataset

Splits each rank-N tf$SparseTensor in this dataset row-wise.
step_numeric_column

Creates a numeric column specification
sql_record_spec

A dataset consisting of the results from a SQL query
step_remove_column

Creates a step that can remove columns
input_fn.tf_dataset

Construct a tfestimators input function from a dataset
step_categorical_column_with_vocabulary_list

Creates a categorical column specification
scaler

List of pre-made scalers
step_embedding_column

Creates embeddings columns
selectors

Selectors
step_shared_embeddings_column

Creates shared embeddings for categorical columns
scaler_min_max

Creates an instance of a min max scaler
scaler_standard

Creates an instance of a standard scaler
dataset_use_spec

Transform the dataset using the provided spec.
step_indicator_column

Creates Indicator Columns
step_crossed_column

Creates crosses of categorical columns
file_list_dataset

A dataset of all files matching a pattern
steps

Steps for feature columns specification.
tensor_slices_dataset

Creates a dataset whose elements are slices of the given tensors.
tensors_dataset

Creates a dataset with a single element, comprising the given tensors.
layer_input_from_dataset

Creates a list of inputs from a dataset
fit.FeatureSpec

Fits a feature specification.
length.tf_dataset

Get Dataset length
step_categorical_column_with_hash_bucket

Creates a categorical column with hash buckets specification
tfrecord_dataset

A dataset comprising records from one or more TFRecord files.
text_line_dataset

A dataset comprising lines from one or more text files.
step_categorical_column_with_vocabulary_file

Creates a categorical column with vocabulary file
read_files

Read a dataset from a set of files
step_categorical_column_with_identity

Create a categorical column with identity
range_dataset

Creates a dataset of a step-separated range of values.
until_out_of_range

Execute code that traverses a dataset until an out-of-range condition occurs
zip_datasets

Creates a dataset by zipping together the given datasets.
with_dataset

Execute code that traverses a dataset