A dataset based on Mechura's (2016) English lemmatization list. It is useful for join-style lemma replacement, in which inflected tokens are mapped to their lemmas by table lookup. Although this is not a true morphological analysis, lookup-based replacement is fast and typically still robust.
data(hash_lemmas)
A data frame with 41,532 rows and 2 variables:
token. An inflected token with affixes
lemma. A base form
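The join-style replacement described above can be sketched in base R as follows. This is a minimal illustration, not the package's own method: `lemma_table` is a hypothetical stand-in with the same two columns as `hash_lemmas`, and its rows are made up for the example rather than copied from the dataset. Tokens with no entry in the table are assumed to pass through unchanged.

```r
# Hypothetical stand-in for hash_lemmas (same column names, illustrative rows).
lemma_table <- data.frame(
  token = c("running", "ran", "cats", "better"),
  lemma = c("run", "run", "cat", "good"),
  stringsAsFactors = FALSE
)

tokens <- c("the", "cats", "ran", "home")

# Look up each token in the table; match() gives NA when there is no entry.
idx <- match(tokens, lemma_table$token)

# Replace matched tokens with their lemma and keep unmatched tokens as they are.
lemmatized <- ifelse(is.na(idx), tokens, lemma_table$lemma[idx])

print(lemmatized)
```

With the real data, the same lookup should work by substituting `hash_lemmas` for `lemma_table` after calling `data(hash_lemmas)`, assuming the tokens have already been lowercased to match the table.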
Mechura, M. B. (2016). Lemmatization list: English (en) [Data file]. Retrieved from http://www.lexiconista.com