Feature hashing, or the hashing trick, is a transformation of a
text variable into a new set of numerical variables. This is done by
applying a hashing function over the tokens and using the hash values
as feature indices. This allows for a low memory representation of the
text. This implementation uses the MurmurHash3 hash function.
The argument `num_terms` controls the number of indices that the hashing
function will map to. This is the tuning parameter for this
transformation. Since the hashing function can map two different tokens
to the same index, a higher value of `num_terms` will result in a lower
chance of collision.
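
As a rough illustration of the idea, the sketch below hashes each token to one of `num_terms` indices and counts how many tokens land on each index. It is not this implementation: the helper names `hash_index` and `hash_features` are hypothetical, MD5 from the Python standard library stands in for MurmurHash3, and plain counts are assumed as the feature values.

```python
import hashlib


def hash_index(token, num_terms):
    """Map a token to an index in [0, num_terms).

    Assumption: MD5 is used here only as a deterministic stand-in for
    MurmurHash3, which is not available in the standard library.
    """
    digest = hashlib.md5(token.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an unsigned integer, then fold it
    # into the available index range.
    return int.from_bytes(digest[:4], "little") % num_terms


def hash_features(tokens, num_terms):
    """Return a list of length num_terms with per-index token counts."""
    counts = [0] * num_terms
    for token in tokens:
        counts[hash_index(token, num_terms)] += 1
    return counts


tokens = ["the", "cat", "sat", "on", "the", "mat"]
features = hash_features(tokens, num_terms=8)
```

The output always has `num_terms` entries regardless of vocabulary size, which is where the memory saving comes from. Two different tokens that hash to the same index are simply added together, so a smaller `num_terms` makes such collisions more likely.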
The new components will have names that begin with `prefix`, then
the name of the variable, followed by the tokens all separated by
`-`. The variable names are padded with zeros. For example,
if `num_terms < 10`, their names will be `hash1` - `hash9`.
If `num_terms = 101`, their names will be `hash001` - `hash101`.
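
The zero-padding rule can be sketched as follows. This only reproduces the numbering shown in the examples above, with `hash` assumed as the leading text; the function name `hashed_names` is hypothetical and the variable-name part of the full column name is left out.

```python
def hashed_names(num_terms, prefix="hash"):
    """Build zero-padded names numbered 1..num_terms.

    The padding width is the number of digits in num_terms, so every
    name has the same length and the names sort in numeric order.
    """
    width = len(str(num_terms))
    return [f"{prefix}{i:0{width}d}" for i in range(1, num_terms + 1)]


short_names = hashed_names(9)    # single-digit width, no padding needed
long_names = hashed_names(101)   # three-digit width
```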