tm (version 0.6-2)

stemCompletion: Complete Stems

Description

Heuristically complete stemmed words.

Usage

stemCompletion(x, dictionary,
               type = c("prevalent", "first", "longest",
                        "none", "random", "shortest"))

Arguments

x
A character vector of stems to be completed.
dictionary
A Corpus or character vector to be searched for possible completions.
type
A character string naming the heuristic to be used:
prevalent
Default. Takes the most frequent match as completion.

first
Takes the first completion found.

longest
Takes the longest completion in terms of characters.

none
Is the identity: the stems are returned unchanged.

random
Takes a randomly chosen completion.

shortest
Takes the shortest completion in terms of characters.
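
The heuristics listed above can be illustrated with a small base R sketch. This is not the tm implementation: the function name complete_stems_sketch and the toy dictionary words are hypothetical, and it assumes that a completion is any dictionary entry beginning with the stem and that a stem without a match yields NA.

```r
# Illustrative sketch only; NOT the tm source code.
# Assumption: a completion is any dictionary entry that starts with the stem.
complete_stems_sketch <- function(stems, dictionary,
                                  type = c("prevalent", "first", "longest",
                                           "none", "random", "shortest")) {
  type <- match.arg(type)
  pick <- function(stem) {
    # "none" is the identity: return the stem as is.
    if (type == "none") return(stem)
    candidates <- dictionary[startsWith(dictionary, stem)]
    # Assumption: no matching entry gives NA.
    if (length(candidates) == 0L) return(NA_character_)
    switch(type,
           prevalent = names(which.max(table(candidates))),      # most frequent
           first     = candidates[1L],                           # first found
           longest   = candidates[which.max(nchar(candidates))], # most characters
           random    = sample(candidates, 1L),                   # any candidate
           shortest  = candidates[which.min(nchar(candidates))]) # fewest characters
  }
  vapply(stems, pick, character(1), USE.NAMES = FALSE)
}

# Toy dictionary (made up); "company" occurs twice, so it is the prevalent match.
words <- c("company", "companies", "company", "supply", "supplies", "supplier")
complete_stems_sketch("compan", words, type = "prevalent")
complete_stems_sketch("compan", words, type = "longest")
complete_stems_sketch("suppl", words, type = "shortest")
```

In this sketch ties are broken by whichever candidate which.max or which.min encounters first; the package may resolve ties differently.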

Value

A character vector with completed words.

References

Ingo Feinerer (2010). Analysis and Algorithms for Stemming Inversion. Information Retrieval Technology - 6th Asia Information Retrieval Societies Conference, AIRS 2010, Taipei, Taiwan, December 1–3, 2010. Proceedings, volume 6458 of Lecture Notes in Computer Science, pages 290–299. Springer-Verlag, December 2010.

Examples

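library("tm")
data("crude")
# Complete three stems against the crude corpus with the default "prevalent" heuristic
stemCompletion(c("compan", "entit", "suppl"), crude)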
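
Since dictionary may also be a character vector, the following sketch compares two heuristics on a small, made-up word list. It assumes the tm package is installed; the names dict, stems, by_shortest and by_longest are chosen for illustration only, and no output is shown because results may vary between versions.

```r
# Assumes the tm package is installed; the dictionary below is made up.
if (requireNamespace("tm", quietly = TRUE)) {
  dict  <- c("company", "companies", "supply", "supplies")
  stems <- c("compan", "suppl")
  # Run the same stems through two different heuristics.
  by_shortest <- tm::stemCompletion(stems, dict, type = "shortest")
  by_longest  <- tm::stemCompletion(stems, dict, type = "longest")
  print(by_shortest)
  print(by_longest)
}
```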