Use this when your sparse features are in string or integer format and you want to distribute your inputs into a finite number of buckets by hashing:

output_id = Hash(input_feature_string) % bucket_size

For the input dictionary features, features[key] is either a tensor or a sparse tensor object. If it is a tensor object, missing values can be represented by -1 for int and '' for string. Note that these values are independent of the default_value argument.
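The bucketing rule above can be illustrated with a toy hash in plain R. Note that simple_hash below is a stand-in invented for this sketch; TensorFlow uses its own fingerprint hash internally, so real bucket ids produced by this column will differ.

```r
# Illustrative sketch of: output_id = Hash(input_feature_string) %% bucket_size.
# simple_hash is a toy polynomial rolling hash over the string's bytes;
# it is NOT the hash TensorFlow uses.
simple_hash <- function(s) {
  h <- 0
  for (b in utf8ToInt(s)) h <- (h * 31 + b) %% .Machine$integer.max
  h
}

hash_bucket_size <- 100
output_id <- simple_hash("kitchenware") %% hash_bucket_size
# output_id is a deterministic value in [0, hash_bucket_size - 1]
```

The key property is that the same input string always lands in the same bucket, while distinct strings may collide in a bucket, which is the trade-off this column accepts in exchange for a bounded feature space.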
Usage:

column_categorical_with_hash_bucket(..., hash_bucket_size, dtype = tf$string)
Arguments:

...: Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.

hash_bucket_size: An integer greater than 1; the number of buckets.

dtype: The type of features. Only string and integer types are supported.
Value:

A _HashedCategoricalColumn.
Raises:

ValueError: hash_bucket_size is not greater than 1.

ValueError: dtype is neither string nor integer.
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_identity(), column_categorical_with_vocabulary_file(), column_categorical_with_vocabulary_list(), column_crossed(), column_embedding(), column_numeric(), input_layer()