Learn R Programming

jiebaR (version 0.8.1)

simhash: Simhash computation

Description

Simhash worker uses the keyword extraction worker to find the keywords and uses simhash algorithm to compute simhash. dict hmm, idf and stop_word should be provided when initializing jiebaR worker.

Usage

simhash(code, jiebar)
vector_simhash(code, jiebar)

Arguments

code
A Chinese sentence or the path of a text file.
jiebar
jiebaR Worker.

Details

There is a symbol <=< code=""> for this function.

References

MS Charikar - Similarity Estimation Techniques from Rounding Algorithms

See Also

<=.simhash< a=""> worker

Examples

Run this code
## Not run: 
# ### Simhash
# words = "hello world"
# simhasher = worker("simhash",topn=1)
# simhasher <= words
# distance("hello world" , "hello world!" , simhasher)
# ## End(Not run)

Run the code above in your browser using DataLab