tm (version 0.7-8)

stemCompletion: Complete Stems

Description

Heuristically complete stemmed words.

Usage

stemCompletion(x,
               dictionary,
               type = c("prevalent", "first", "longest",
                        "none", "random", "shortest"))

Value

A character vector with completed words.

Arguments

x

A character vector of stems to be completed.

dictionary

A Corpus or character vector to be searched for possible completions.

type

A character string naming the heuristic to be used:

prevalent

Default. Takes the most frequent match as completion.

first

Takes the first found completion.

longest

Takes the longest completion in terms of characters.

none

Is the identity: returns the stems unchanged.

random

Takes a randomly chosen completion.

shortest

Takes the shortest completion in terms of characters.

References

Ingo Feinerer (2010). Analysis and Algorithms for Stemming Inversion. Information Retrieval Technology – 6th Asia Information Retrieval Societies Conference, AIRS 2010, Taipei, Taiwan, December 1–3, 2010. Proceedings, volume 6458 of Lecture Notes in Computer Science, pages 290–299. Springer-Verlag, December 2010.

Examples

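library("tm")

# Load the example corpus of Reuters crude-oil articles shipped with tm
data("crude")

# Complete three stems against the words found in the corpus
# (type defaults to "prevalent")
stemCompletion(c("compan", "entit", "suppl"), crude)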
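
The heuristics listed under type can be compared side by side by passing a plain character vector as the dictionary. The following is a minimal sketch: the dictionary words and the variable names are invented for illustration and do not come from the package documentation.

```r
library("tm")

# Hypothetical dictionary, made up for this illustration
dict  <- c("company", "companies", "entity", "entities", "supply", "supplier")
stems <- c("compan", "entit", "suppl")

# Each call returns a character vector with one completion per stem
longest  <- stemCompletion(stems, dict, type = "longest")
shortest <- stemCompletion(stems, dict, type = "shortest")
identity <- stemCompletion(stems, dict, type = "none")

print(longest)
print(shortest)
print(identity)
```

With a dictionary this small, every stem has exactly two candidate completions, so the difference between "longest" and "shortest" is easy to see. The "random" heuristic may return a different result on each call unless the random seed is fixed with set.seed().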