column_categorical_with_identity: Construct a Categorical Column that Returns Identity Values

Description

Use this when your inputs are integers in the range [0, num_buckets) and you want to use the input value itself as the categorical ID. Values outside this range are mapped to default_value if it is specified; otherwise, the column's graph operations fail.

Usage

column_categorical_with_identity(..., num_buckets, default_value = NULL)

Arguments

...

Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns.

num_buckets

Number of buckets. Inputs and outputs are in the range [0, num_buckets).

default_value

If NULL, this column's graph operations will fail for out-of-range inputs. Otherwise, this value must be in the range [0, num_buckets), and will replace out-of-range inputs.

Value

A categorical column that returns identity values.

Raises

ValueError: if num_buckets is less than one.
ValueError: if default_value is not in range [0, num_buckets).
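
The range handling and the two argument checks described above can be sketched in base R. This is an illustration of the documented semantics only, not the package's implementation: identity_ids is a hypothetical helper, it works on plain numeric vectors rather than tensors, and it does not model missing-value handling.

```r
# Hypothetical helper that mimics the documented behaviour on a plain vector.
identity_ids <- function(x, num_buckets, default_value = NULL) {
  # Mirrors: "ValueError: if num_buckets is less than one."
  if (num_buckets < 1) {
    stop("num_buckets must be at least 1")
  }
  # Mirrors: "ValueError: if default_value is not in range [0, num_buckets)."
  if (!is.null(default_value) &&
      (default_value < 0 || default_value >= num_buckets)) {
    stop("default_value must be in [0, num_buckets)")
  }
  out_of_range <- x < 0 | x >= num_buckets
  if (any(out_of_range)) {
    # Without a default, out-of-range inputs are an error.
    if (is.null(default_value)) {
      stop("input outside [0, num_buckets) and no default_value given")
    }
    x[out_of_range] <- default_value
  }
  # In-range values are returned unchanged: the value is its own ID.
  as.integer(x)
}

# 7 is outside [0, 4), so it is replaced by the default value 3.
ids <- identity_ids(c(0, 2, 7, 1), num_buckets = 4, default_value = 3)
```

Calling identity_ids(5, num_buckets = 4) without a default_value stops with an error, which corresponds to the failure case for out-of-range inputs.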

Details

Typically, this is used for contiguous ranges of integer indexes, but it doesn't have to be. It can be inefficient, however, if many IDs in the range are unused, because num_buckets must still cover the whole range. Consider column_categorical_with_hash_bucket() in that case.
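
As a rough illustration of when the ID space is too sparse for an identity column, the base R sketch below computes the share of buckets that would never be used. The ID values are made up for the example.

```r
# Hypothetical sparse ID set: four distinct IDs, the largest being 999999.
ids <- c(3, 17, 42, 999999)

# An identity column needs a bucket for every value from 0 up to the largest ID.
num_buckets <- max(ids) + 1

# Fraction of buckets that no input ever maps to.
unused_fraction <- 1 - length(unique(ids)) / num_buckets
```

Here almost every bucket is unused, which is the situation where hashing the IDs into a smaller number of buckets is likely the better fit.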

For input dictionary features, features$key is either a tensor or a sparse tensor object. If it is a tensor object, missing values can be represented by -1 for int and '' for string. Note that these values are independent of the default_value argument.