hash_lemmas: Lemmatization List
Description
A dataset based on Mechura's (2016) English lemmatization list. This
dataset can be useful for join-style lemma replacement of inflected token
forms with their root lemmas. While this is not a true morphological analysis,
this style of lemma replacement is fast and typically still robust.
Format
A data frame with 41,533 rows and 2 variables
Details
- token. An inflected token with affixes
- lemma. A base form
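
The join-style replacement described above can be sketched in base R. This is a minimal illustration, not the package's own implementation: `lemma_table` is a small stand-in with the same two columns (`token`, `lemma`), its rows are made up for the example rather than copied from the dataset, and `lemmatize_tokens` is a hypothetical helper name.

```r
# Tiny stand-in table with the same two columns as hash_lemmas
# (illustrative rows only, not taken from the dataset).
lemma_table <- data.frame(
  token = c("running", "ran", "geese", "better"),
  lemma = c("run", "run", "goose", "good"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: swap each token for its lemma when the token
# appears in the table; tokens with no match are returned unchanged.
lemmatize_tokens <- function(tokens, table) {
  idx <- match(tokens, table$token)  # NA where the token has no row
  ifelse(is.na(idx), tokens, table$lemma[idx])
}

tokens <- c("the", "geese", "ran", "home")
lemmatized <- lemmatize_tokens(tokens, lemma_table)
print(lemmatized)
```

Because the lookup is a plain table match, it does not use context: a token that could map to more than one lemma gets whichever row matches first, which is the trade-off against a true morphological analysis.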