RSLP

Removedor de Sufixos da Língua Portuguesa (Suffix Remover for the Portuguese Language)

This package implements the algorithm described in the article "A Stemming Algorithm for the Portuguese Language" by Viviane Moreira Orengo and Christian Huyck.

The following schema summarizes how the stemmer works.

[Figure: schema of the stemming steps]
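
To make the idea of suffix removal concrete, here is a minimal base-R sketch of how a single stemming rule might be applied. It assumes a rule made of a suffix, a minimum stem length, an optional replacement, and a list of exception words. The function apply_rule_sketch and the rule values are hypothetical illustrations, not the package's internal implementation.

```r
# Hypothetical helper: applies ONE suffix rule to ONE word.
# Assumed rule format: suffix, minimum stem length, replacement, exceptions.
apply_rule_sketch <- function(word, suffix, min_stem,
                              replacement = "", exceptions = character()) {
  # Exception words are never touched by the rule.
  if (word %in% exceptions) return(word)
  # The rule only fires when the word ends with the suffix.
  if (!endsWith(word, suffix)) return(word)
  stem <- substr(word, 1, nchar(word) - nchar(suffix))
  # Refuse to strip if what remains would be too short.
  if (nchar(stem) < min_stem) return(word)
  paste0(stem, replacement)
}

apply_rule_sketch("gostaram", "aram", 2)               # "gost"
apply_rule_sketch("param", "aram", 2)                  # "param" (stem too short)
apply_rule_sketch("lapis", "s", 2, exceptions = "lapis") # "lapis" (exception)
```

The minimum stem length and the exception list are what keep a rule from mangling short words or words that only look inflected.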

Installing

To install the package you can use the following:

devtools::install_github("dfalbel/rslp")

Using

The main function of the package is rslp. Call it on a character vector of words:

library(rslp)
words <- c("balões", "aviões", "avião", "gostou", "gosto", "gostaram")
rslp(words)
#> [1] "bal"  "avi"  "avi"  "gost" "gost" "gost"
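
One common reason to stem is to group inflected forms of the same word. The sketch below uses only base R and, so that it runs without the package installed, hard-codes the stems from the output shown above rather than calling rslp again.

```r
words <- c("baloes", "avioes", "aviao", "gostou", "gosto", "gostaram")
# Stems copied from the rslp(words) output above (accents dropped in
# the input here only to keep the snippet ASCII-safe).
stems <- c("bal", "avi", "avi", "gost", "gost", "gost")

# Group the original words by their stem.
groups <- split(words, stems)
groups$gost
#> [1] "gostou"   "gosto"    "gostaram"
```

In real use, stems would be replaced by the result of rslp(words).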

To stem whole texts, use the rslp_doc function on a character vector of documents:

docs <- c(
  "coma frutas pois elas fazem bem para a saúde.",
  "não coma doces, eles fazem mal para os dentes."
)
rslp_doc(docs)
#> [1] "com frut poi ela faz bem par a saud." 
#> [2] "nao com doc, ele faz mal par os dent."

Install

install.packages('rslp')

Monthly Downloads

143

Version

0.2.0

License

MIT + file LICENSE

Maintainer

Daniel Falbel

Last Published

May 11th, 2020

Functions in rslp (0.2.0)

extract_replacement_rules: Extract replacement rules
extract_rule_info: Extract Rule Info
rslp_doc: RSLP Document
verify_sufix: Verify
remove_accents: Remove Accents
%>%: Pipe operator
extract_rules: Extract Rules from file
extract_rules_info: Extract Rules Info
rslp: RSLP
rslp_: RSLP_
apply_rules: Apply rules
extract_raw_rules: Extract raw rules