Use this when your inputs are in string or integer format and you have an in-memory vocabulary mapping each value to an integer ID. By default, out-of-vocabulary values are ignored. Use default_value or num_oov_buckets (but not both) to specify how to include out-of-vocabulary values. For the input dictionary features, features$key is either a tensor or a sparse tensor object. If it is a tensor object, missing values can be represented by -1 for int and '' for string.
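
As a rough illustration of the lookup described above, the following plain-Python sketch maps each input value to its position in the vocabulary and falls back to default_value for anything not found. It does not call the package; the helper name lookup_ids and the sample vocabulary are hypothetical, and the real column operates on tensors rather than lists.

```python
def lookup_ids(values, vocabulary_list, default_value=-1):
    """Map each value to its index in vocabulary_list (hypothetical helper).

    Values not present in the vocabulary receive default_value.
    """
    if not vocabulary_list:
        raise ValueError("vocabulary_list must not be empty")
    if len(set(vocabulary_list)) != len(vocabulary_list):
        raise ValueError("vocabulary_list contains duplicate keys")
    # Position in the ordered vocabulary becomes the integer ID.
    index = {value: i for i, value in enumerate(vocabulary_list)}
    return [index.get(value, default_value) for value in values]


vocabulary = ["red", "green", "blue"]
ids = lookup_ids(["green", "purple", "red"], vocabulary)
print(ids)  # [1, -1, 0]
```

Here "purple" is out of vocabulary, so it receives the default ID of -1.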
column_categorical_with_vocabulary_list(..., vocabulary_list,
dtype = NULL, default_value = -1L, num_oov_buckets = 0L)
...: Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.
vocabulary_list: An ordered iterable defining the vocabulary. Each feature is mapped to the index of its value (if present) in vocabulary_list. Must be castable to dtype.
dtype: The type of features. Only string and integer types are supported. If NULL, it will be inferred from vocabulary_list.
default_value: The integer ID to return for values not in vocabulary_list.
num_oov_buckets: Non-negative integer, the number of out-of-vocabulary buckets. All out-of-vocabulary inputs will be assigned IDs in the range [vocabulary_size, vocabulary_size+num_oov_buckets) based on a hash of the input value. A positive num_oov_buckets cannot be specified together with default_value.
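
The bucket assignment for out-of-vocabulary inputs can be sketched in plain Python as follows. This is only an approximation under stated assumptions: the helper name lookup_with_oov is hypothetical, and zlib.crc32 stands in for whatever hash TensorFlow applies internally, so the specific bucket a value lands in will differ from the real column. What the sketch preserves is the ID range and the fact that the same input always maps to the same bucket.

```python
import zlib


def lookup_with_oov(values, vocabulary_list, num_oov_buckets):
    """Assign vocabulary indices, hashing unknown values into OOV buckets.

    Hypothetical helper; crc32 is a stand-in hash, not TensorFlow's.
    """
    if num_oov_buckets <= 0:
        raise ValueError("num_oov_buckets must be positive in this sketch")
    index = {value: i for i, value in enumerate(vocabulary_list)}
    vocabulary_size = len(vocabulary_list)
    ids = []
    for value in values:
        if value in index:
            ids.append(index[value])
        else:
            # Deterministic hash of the input, reduced to a bucket number.
            bucket = zlib.crc32(str(value).encode("utf-8")) % num_oov_buckets
            # OOV IDs start right after the last in-vocabulary ID.
            ids.append(vocabulary_size + bucket)
    return ids


oov_vocabulary = ["red", "green", "blue"]
oov_ids = lookup_with_oov(["green", "purple", "teal"], oov_vocabulary, 2)
print(oov_ids)
```

With a vocabulary of size 3 and two buckets, "green" keeps its index 1, while "purple" and "teal" each receive an ID of 3 or 4.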
A categorical column with in-memory vocabulary.
ValueError: if vocabulary_list is empty, or contains duplicate keys.
ValueError: if dtype is not integer or string.
Note that the missing-value markers (-1 for int and '' for string) are independent of the default_value argument.
Other feature column constructors: column_bucketized, column_categorical_weighted, column_categorical_with_hash_bucket, column_categorical_with_identity, column_categorical_with_vocabulary_file, column_crossed, column_embedding, column_numeric, input_layer