text_line_dataset

A dataset comprising lines from one or more text files.

Interface to 'TensorFlow' Datasets, a high-level library for
building complex input pipelines from simple, re-usable pieces.
See <https://www.tensorflow.org/guide> for additional
details.

Tomasz Kalinowski

tfdatasets

Interface to 'TensorFlow' Datasets

Daniel Falbel

JJ Allaire

Yuan Tang

Kevin Ushey

RStudio 

 Google Inc.

text_line_dataset function

<dl><dt>filenames</dt>
<dd>String(s) specifying one or more filenames.</dd>
<dt>compression_type</dt>
<dd>A string, one of: <code>NULL</code> (no compression), <code>"ZLIB"</code>,
or <code>"GZIP"</code>.</dd>
<dt>...</dt>
<dd>Unused; must be empty.</dd>
<dt>buffer_size</dt>
<dd>(Optional.) A tf.int64 scalar denoting the number of bytes
to buffer. A value of 0 uses the default buffer size, which is chosen
based on the compression type.</dd>
<dt>num_parallel_reads</dt>
<dd>(Optional.) A tf.int64 scalar representing the
number of files to read in parallel. If greater than one, the records of
files read in parallel are emitted in an interleaved order. If your input
pipeline is I/O-bound, consider setting this parameter to a value
greater than one to parallelize the I/O. If <code>NULL</code>, files are read
sequentially.</dd>
<dt>name</dt>
<dd>(Optional.) A name for the tf.data operation.</dd>
<dt>record_spec</dt>
<dd>(Optional.) Specification used to decode delimited text
lines into records (see <code>delim_record_spec()</code>).</dd>
<dt>parallel_records</dt>
<dd>(Optional.) An integer representing the number of
records to decode in parallel. If not specified, records are processed
sequentially. This applies only if <code>record_spec</code> is provided.</dd></dl>
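
The arguments above can be tried out with a minimal sketch like the following. It assumes the tfdatasets and reticulate packages are installed with a working TensorFlow 2.x backend running in eager mode. The file names and contents are made up for illustration, and iterating with `reticulate::as_iterator()` is one possible way to inspect elements, not necessarily the recommended one.

```r
library(tfdatasets)

# Write two small text files to a temporary directory (illustrative data only).
paths <- file.path(tempdir(), c("part-1.txt", "part-2.txt"))
writeLines(c("alpha", "beta"), paths[1])
writeLines(c("gamma", "delta"), paths[2])

# Each line of each file becomes one dataset element.
# With the defaults, the files are read one after another.
dataset <- text_line_dataset(paths)

# Ask for both files to be read in parallel. Per the argument description,
# records from files read in parallel come out in an interleaved order.
interleaved <- text_line_dataset(paths, num_parallel_reads = 2L)

# Inspect the first element (assumes eager execution, as in TensorFlow 2.x).
it <- reticulate::as_iterator(dataset)
first_line <- reticulate::iter_next(it)
print(first_line)
```

For compressed input, the same call would take `compression_type = "GZIP"` or `"ZLIB"`. For delimited text, a specification built with `delim_record_spec()` can be passed as `record_spec`.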

