hunspell_check: Hunspell Spell Checking

Description

Various tools for spell checking. The hunspell_check function takes a vector of words and tests each individual word for correctness. The hunspell_find function takes a character vector with text (sentences) and returns only the incorrect words. Finally, hunspell_suggest suggests correct words for each (incorrect) input word.

Usage

hunspell_check(words, ignore = character(), lang = "en_US")
hunspell_find(text, ignore = character(), delim = " .!?:;,.",
  lang = "en_US")
hunspell_suggest(words, lang = "en_US")
hunspell_analyze(words, lang = "en_US")
hunspell_stem(words, lang = "en_US")

Arguments

words

character vector with individual words to spellcheck

ignore

character vector with additional approved words to add to the dictionary

lang

which dictionary to use. Currently only en_US is supported

text

character vector with text of arbitrary length

delim

string with characters used to delimit words

Details

The functions hunspell_analyze and hunspell_stem try to break down a word and return its structure or stem word(s).
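
As a minimal sketch of how these two functions might be called, the snippet below passes a few words to each using the default dictionary. The sample words are illustrative only. This page does not describe the exact shape of the returned values, so the code only inspects them with str() rather than assuming a particular result.

```r
library(hunspell)

# Illustrative input; any character vector of individual words should work
words <- c("loving", "loved", "lovely")

# Stem word(s) for each input word
stems <- hunspell_stem(words)
str(stems)

# Structural breakdown of each input word
analysis <- hunspell_analyze(words)
str(analysis)
```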

Currently only the US English dictionary is included with the package. Additional dictionaries can be downloaded from an OpenOffice mirror (http://ftp.snt.utwente.nl/pub/software/openoffice/contrib/dictionaries/) or bundle (http://archive.ubuntu.com/ubuntu/pool/main/libr/libreoffice-dictionaries/?C=S;O=D).

Examples

library(hunspell)

# check individual words
words <- c("beer", "wiskey", "wine")
correct <- hunspell_check(words)
print(correct)

# find suggestions for incorrect words
hunspell_suggest(words[!correct])

# find incorrect words in piece of text
bad <- hunspell_find("spell checkers are not neccessairy for langauge ninja's")
print(bad)
hunspell_suggest(bad)
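
The check and suggest steps above can be combined so that each misspelled word is paired with its suggestions. The following is only a sketch: it assumes hunspell_suggest returns one element of suggestions per input word, and the "take the first suggestion" policy and the names misspelled, suggestions and first_pick are hypothetical choices made for illustration, not part of the package.

```r
library(hunspell)

words <- c("beer", "wiskey", "wine")
correct <- hunspell_check(words)
misspelled <- words[!correct]

# One element of suggestions per misspelled word, named by that word
suggestions <- setNames(hunspell_suggest(misspelled), misspelled)

# Hypothetical policy: take the first suggestion, or keep the word if none
first_pick <- vapply(
  misspelled,
  function(w) {
    s <- suggestions[[w]]
    if (length(s) > 0) s[[1]] else w
  },
  character(1)
)
print(first_pick)
```

Suggestions are best treated as candidates to review, since the first one is not guaranteed to be the intended word.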