sparktf (version 0.1.0)

spark_read_tfrecord: Read a TFRecord File

Description

Read a TFRecord file as a Spark DataFrame.

Usage

spark_read_tfrecord(sc, name = NULL, path = name, schema = NULL,
  record_type = c("Example", "SequenceExample"), overwrite = TRUE)

Arguments

sc

A Spark connection.

name

The name to assign to the newly generated table, or the path to the file. If a path is supplied as the `name` argument, a separate table name cannot be specified.

path

The path to the file. Needs to be accessible from the cluster. Supports the "hdfs://", "s3a://" and "file://" protocols.
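
The `path = name` default in the Usage signature relies on R's lazy evaluation of default arguments: when only `name` is supplied, `path` takes the same value. The sketch below illustrates just that defaulting. `show_args()` and the "hdfs://data/iris" location are hypothetical stand-ins, not part of sparktf, and nothing here touches Spark.

```r
# Hypothetical stand-in that mirrors only the `name`/`path` defaults from the
# Usage signature. It is not part of sparktf and does not read any file.
show_args <- function(name = NULL, path = name) {
  list(name = name, path = path)
}

# One value supplied: it is used for both `name` and `path`.
one_arg <- show_args("hdfs://data/iris")

# Both supplied: the table name and the file location are independent.
two_args <- show_args(name = "iris_tf", path = "hdfs://data/iris")
```

Passing `name` and `path` explicitly, as in the second call, is the way to give the table a name of its own.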

schema

(Currently unsupported.) Schema of TensorFlow records. If not provided, the schema is inferred from TensorFlow records.

record_type

Input format of the TensorFlow records, either "Example" or "SequenceExample". Defaults to "Example".
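
A vector default such as `c("Example", "SequenceExample")` is the usual R idiom for a choice argument. Assuming the function resolves it with `match.arg()` (an assumption; this page does not say how it is resolved), omitting the argument selects the first element. The helper `resolve_record_type()` below is hypothetical and only demonstrates the idiom.

```r
# Hypothetical helper showing how a choice argument is typically resolved.
# Assumption: sparktf handles `record_type` in a similar way.
resolve_record_type <- function(record_type = c("Example", "SequenceExample")) {
  match.arg(record_type)
}

# Argument omitted: the first choice is used.
default_type <- resolve_record_type()

# Argument supplied: it must match one of the allowed choices.
sequence_type <- resolve_record_type("SequenceExample")
```

Under this idiom, a value outside the two choices raises an error instead of being silently accepted.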

overwrite

Boolean; overwrite the table with the given name if it already exists?

Examples

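# NOT RUN {
library(sparklyr)
library(sparktf)
library(dplyr)

# Connect to a local Spark instance.
sc <- spark_connect(master = "local")

iris_tbl <- copy_to(sc, iris)
data_path <- file.path(tempdir(), "iris")

# Encode Species as a numeric label column.
df1 <- iris_tbl %>%
  ft_string_indexer_model(
    "Species", "label",
    labels = c("setosa", "versicolor", "virginica")
  )

# Write the data as TFRecords, then read it back.
df1 %>%
  spark_write_tfrecord(
    path = data_path,
    write_locality = "local"
  )

spark_read_tfrecord(sc, data_path)
# }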