CorrectS: Function to remove some forms of pluralization.
Description
This function takes a character vector as input and removes some
forms of pluralization from the ends of the words.
Usage
CorrectS(term_vec)
Arguments
term_vec
A character vector
Value
Returns an object of class data.frame with three columns. The first
column contains the input term_vec. The second column is the depluralized
version of the words in term_vec. The third column is a logical vector
indicating whether the word in term_vec was changed.
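As a rough usage sketch, the three columns can be read by position, since their names are not stated here. The call is guarded with exists() on the assumption that CorrectS may not be loaded in the current session, and the input words are purely illustrative.

```r
# Illustrative input; the words are arbitrary examples.
term_vec <- c("cats", "dogs", "fish")

# Guarded so the snippet still runs when CorrectS is not loaded.
if (exists("CorrectS", mode = "function")) {
  result <- CorrectS(term_vec)

  # Columns by position: original terms, depluralized terms, changed flag.
  original <- result[[1]]
  adjusted <- result[[2]]
  changed  <- result[[3]]

  # Show only the rows where the word was altered.
  print(result[changed, ])
}
```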
Details
The entries of the vector should be single words or short n-grams
without punctuation, as the function only looks at the ends of strings. In
other words, if an entry is a paragraph of text, only its final word will
be depluralized. (Even then, if the final character is a period, as would
be the case with paragraphs, it's likely that nothing will be depluralized.)
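The effect of matching only at the end of a string can be illustrated with a toy stand-in. The helper toy_depluralize below is hypothetical and uses a single made-up rule (drop one trailing "s"); it is not the set of rules CorrectS applies, and its column names are assumptions. It only shows why a trailing period prevents any change.

```r
# Hypothetical stand-in, NOT the rules CorrectS uses:
# drop one trailing "s" when it is the very last character of the string.
toy_depluralize <- function(term_vec) {
  adjusted <- sub("s$", "", term_vec)
  data.frame(
    term = term_vec,                 # original input
    adjusted = adjusted,             # input after the toy rule
    changed = term_vec != adjusted,  # TRUE where the rule fired
    stringsAsFactors = FALSE
  )
}

examples <- c("cats", "black cats", "The cats sat near the dogs.")
toy <- toy_depluralize(examples)
print(toy)
```

With this toy rule, "cats" and "black cats" lose their final "s", while the sentence is left untouched: its last character is a period, and the "cats" in the middle of the string is never examined.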