
text2vec (version 0.2.0)

feature_hasher: Creates meta information about feature hashing

Description

Creates a text2vec_feature_hasher object (actually a simple list) that contains meta-information about feature hashing parameters. The result of this function is usually passed to the create_hash_corpus function.

Usage

feature_hasher(hash_size = 2^18, ngram = c(ngram_min = 1L, ngram_max = 1L),
  signed_hash = FALSE)

Arguments

hash_size
integer > 0 - number of hash buckets for the hashing trick (feature hashing). Preferably a power of 2.
ngram
integer vector. The lower and upper boundaries of the range of n-values for the n-grams to be extracted. All values of n such that ngram_min <= n <= ngram_max will be used.
signed_hash
logical, indicating whether to use a second hash function to reduce the impact of collisions.
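
To make the ngram argument concrete, here is a minimal base-R sketch of which n-grams a given range selects, assuming the range is inclusive at both ends. The make_ngrams helper and the "_" separator are hypothetical and only for illustration; they are not part of text2vec and need not match how the package builds n-grams internally.

```r
# Hypothetical helper (not a text2vec function): build every n-gram with
# ngram_min <= n <= ngram_max from a character vector of tokens.
make_ngrams <- function(tokens,
                        ngram = c(ngram_min = 1L, ngram_max = 1L),
                        sep = "_") {
  out <- character(0)
  for (n in seq(ngram[[1]], ngram[[2]])) {
    if (n > length(tokens)) break          # not enough tokens for this n
    starts <- seq_len(length(tokens) - n + 1)
    grams <- vapply(
      starts,
      function(i) paste(tokens[i:(i + n - 1)], collapse = sep),
      character(1)
    )
    out <- c(out, grams)
  }
  out
}

# Unigrams and bigrams of a three-token document
res <- make_ngrams(c("the", "cat", "sat"), c(1L, 2L))
print(res)  # "the" "cat" "sat" "the_cat" "cat_sat"
```

With ngram = c(1L, 1L), the default, only the single tokens would be produced.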

See Also

create_hash_corpus

Examples

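# 2^16 hash buckets, unigrams and bigrams, signed hashing enabled
fh <- feature_hasher(hash_size = 2^16, ngram = c(ngram_min = 1L, ngram_max = 2L), signed_hash = TRUE)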
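
The role of hash_size and signed_hash can be pictured with a small base-R sketch of the hashing trick in its usual formulation: one hash picks the bucket, and an optional second hash picks a sign so that colliding tokens tend to cancel rather than accumulate. Everything below is an assumption made for illustration. toy_hash and hash_features are hypothetical names, and the polynomial hash is not the hash function text2vec uses.

```r
# Toy string hash (hypothetical, NOT text2vec's hash): polynomial rolling
# hash over the character code points, kept below 2^31 - 1.
toy_hash <- function(token, seed = 31) {
  h <- 7
  for (code in utf8ToInt(token)) {
    h <- (h * seed + code) %% 2147483647
  }
  h
}

# Map tokens into a fixed-length vector of hash_size buckets.
hash_features <- function(tokens, hash_size = 2^18, signed_hash = FALSE) {
  counts <- numeric(hash_size)
  for (token in tokens) {
    bucket <- toy_hash(token) %% hash_size + 1   # +1 because R is 1-indexed
    s <- 1
    # A second, differently seeded hash decides the sign of the update.
    if (signed_hash && toy_hash(token, seed = 37) %% 2 == 1) s <- -1
    counts[bucket] <- counts[bucket] + s
  }
  counts
}

tokens <- c("the", "cat", "sat", "on", "the", "mat")
x <- hash_features(tokens, hash_size = 2^4)
length(x)  # 16: the vector length is fixed by hash_size, not by the vocabulary
sum(x)     # 6: without signing, every token adds 1 to some bucket
```

A power of 2 for hash_size lets an implementation replace the modulo with a bit mask, which is one common reason for preferring it.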
