This class guesses the language of a text using the language detector of the cld2 library. It creates the language property, which indicates the language of the text. Optionally, for tweets, the language provided by Twitter can be used instead.
This class inherits from GenericPipe and implements the
pipe abstract function.
bdpar::GenericPipe -> GuessLanguagePipe
new()
Creates a GuessLanguagePipe object.
GuessLanguagePipe$new(
propertyName = "language",
alwaysBeforeDeps = list("StoreFileExtPipe", "TargetAssigningPipe"),
notAfterDeps = list(),
languageTwitter = TRUE
)
propertyName
A character value. Name of the property associated with the GenericPipe.
alwaysBeforeDeps
A list value. The alwaysBefore dependencies (GenericPipes that must be executed before this one).
notAfterDeps
A list value. The notAfter dependencies (GenericPipes that cannot be executed after this one).
languageTwitter
A logical value. Indicates whether, for Instances of type twtid, the language returned by the API is used instead of applying the detector.
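As an illustration of the constructor arguments above, the following minimal sketch builds a pipe that always applies the detector, even for twtid Instances. It assumes that the bdpar and cld2 packages are installed (the guard skips the example otherwise); the variable name `language_pipe` is only illustrative.

```r
# Sketch only: runs when both bdpar and cld2 are available.
if (requireNamespace("bdpar", quietly = TRUE) &&
    requireNamespace("cld2", quietly = TRUE)) {
  language_pipe <- bdpar::GuessLanguagePipe$new(
    propertyName = "language",
    alwaysBeforeDeps = list("StoreFileExtPipe", "TargetAssigningPipe"),
    notAfterDeps = list(),
    languageTwitter = FALSE  # do not rely on the language reported by Twitter
  )
  # R6 objects list their class and parent classes.
  print(class(language_pipe))
}
```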
pipe()
Preprocesses the Instance to obtain the language of the data.
GuessLanguagePipe$pipe(instance)
instance
An Instance value. The Instance to preprocess.
Returns the Instance with the modifications made in the pipe.
getLanguage()
Guesses the language of the data.
GuessLanguagePipe$getLanguage(data)
data
A character value. The text whose language is to be guessed.
Returns the guessed language. Format: see ISO 639-3:2007.
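A hedged usage sketch of getLanguage() follows. It assumes that bdpar and cld2 are installed and that the default constructor arguments are acceptable; the sample sentence and the variable names are illustrative. No expected output is shown because the result depends on the detector.

```r
# Sketch only: runs when both bdpar and cld2 are available.
if (requireNamespace("bdpar", quietly = TRUE) &&
    requireNamespace("cld2", quietly = TRUE)) {
  guesser <- bdpar::GuessLanguagePipe$new()
  sample_text <- paste(
    "This is a reasonably long sentence written in English,",
    "so that the detector has enough text to work with."
  )
  # getLanguage() takes the text as a character value.
  guessed_language <- guesser$getLanguage(sample_text)
  print(guessed_language)
}
```

Very short or mixed-language texts may not yield a reliable guess, so callers should be prepared for a missing value.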
clone()
The objects of this class are cloneable with this method.
GuessLanguagePipe$clone(deep = FALSE)
deep
Whether to make a deep clone.
To obtain the language of tweets, the pipe verifies that a JSON file with the tweet information is stored in memory. In addition, the "cache.twitter.path" field of the bdpar.Options variable must be defined so that the pipe knows where the tweet information is saved.
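The configuration step described in this note could look like the sketch below. It assumes the `bdpar.Options$set(key, value)` and `bdpar.Options$get(key)` accessors available in recent bdpar releases; the cache directory is a hypothetical location chosen for the example.

```r
# Hypothetical cache location; replace it with the folder that holds the tweet JSON files.
cache_dir <- file.path(tempdir(), "twitter_cache")

# Sketch only: runs when bdpar is available.
if (requireNamespace("bdpar", quietly = TRUE)) {
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  # Tell bdpar where the tweet information is saved.
  bdpar::bdpar.Options$set("cache.twitter.path", cache_dir)
  print(bdpar::bdpar.Options$get("cache.twitter.path"))
}
```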
AbbreviationPipe, bdpar.Options,
ContractionPipe, File2Pipe,
FindEmojiPipe, FindEmoticonPipe,
FindHashtagPipe, FindUrlPipe,
FindUserNamePipe, GuessDatePipe,
Instance, InterjectionPipe,
MeasureLengthPipe, GenericPipe,
SlangPipe, StopWordPipe,
StoreFileExtPipe, TargetAssigningPipe,
TeeCSVPipe, ToLowerCasePipe