This class is responsible for detecting the URLs present in the
data field of each Instance. Identified URLs are
stored in the URLs field of the Instance class.
If required, it can also remove the URLs inline from the data.
FindUrlPipe
FindUrlPipe$new(propertyName = "URLs",
alwaysBeforeDeps = list(),
notAfterDeps = list())
Arguments:
propertyName: (character) name of the property associated with the Pipe.
alwaysBeforeDeps: (list) the alwaysBefore dependencies (Pipes that must be executed before this one).
notAfterDeps: (list) the notAfter dependencies (Pipes that cannot be executed after this one).
This class inherits from PipeGeneric and implements the
pipe abstract function.
pipe:
preprocesses the Instance to obtain/remove the URLs.
Usage:
pipe(instance,
removeUrl = TRUE,
URLPatterns = list(self$URLPattern, self$EmailPattern),
namesURLPatterns = list("UrlPattern","EmailPattern"))
Value:
the Instance with the modifications that have occurred in the Pipe.
Arguments:
instance: (Instance) Instance to preprocess.
removeUrl: (logical) indicates if the URLs are removed.
URLPatterns: (list) the regular expressions used to find URLs.
namesURLPatterns: (list) the names of the regular expressions.
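The flow described for pipe (apply each pattern, store the named results, optionally remove the matches) can be sketched in base R. This is an illustration only, not the package's implementation: pipe_sketch, the plain list standing in for an Instance, and both regular expressions are assumptions made for the example.

```r
# Hypothetical sketch of the pipe step; a plain list stands in for Instance.
pipe_sketch <- function(data, patterns, pattern_names, remove_url = TRUE) {
  # Collect the matches of every pattern against the original text.
  found <- lapply(patterns, function(p) {
    regmatches(data, gregexpr(p, data, perl = TRUE))[[1]]
  })
  names(found) <- unlist(pattern_names)

  if (remove_url) {
    # Replace every match with a space, then normalise the whitespace.
    for (p in patterns) {
      data <- gsub(p, " ", data, perl = TRUE)
    }
    data <- trimws(gsub("[[:space:]]+", " ", data))
  }

  list(data = data, URLs = found)
}

# Illustrative patterns (assumptions, much simpler than real-world ones).
patterns <- list("https?://[^[:space:]]+", "[[:alnum:]._-]+@[[:alnum:].-]+")
pattern_names <- list("UrlPattern", "EmailPattern")

result <- pipe_sketch("Mail me@example.com or see http://example.com/x",
                      patterns, pattern_names)
```

Here result$URLs is a list with one entry per pattern name, and result$data holds the text with the matches removed.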
findUrl: finds the URLs in the data.
Usage:
findUrl(pattern, data)
Value: list with URLs found.
Arguments:
pattern: (character) regex to find URLs.
data: (character) text in which to search for URLs.
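As a rough illustration of what a function with this signature might do, the following base R sketch extracts every match of a pattern and returns the matches as a list. The name find_url_sketch and the pattern are hypothetical; the class may rely on a different regex engine or helper internally.

```r
# Hypothetical stand-in for findUrl(pattern, data); not the package's code.
find_url_sketch <- function(pattern, data) {
  # gregexpr() locates every match; regmatches() extracts the matched text.
  matches <- regmatches(data, gregexpr(pattern, data, perl = TRUE))
  as.list(matches[[1]])
}

# Deliberately simple pattern (an assumption, not the class's URLPattern).
simple_url_pattern <- "https?://[^[:space:]]+"

found <- find_url_sketch(
  simple_url_pattern,
  "Docs at https://example.org/a and http://example.com/b today"
)
none <- find_url_sketch(simple_url_pattern, "no links here")
```

When the text contains no match, the sketch returns an empty list.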
removeUrl: removes the URLs in the data.
Usage:
removeUrl(pattern, data)
Value: the data with URLs removed.
Arguments:
pattern: (character) regex to find URLs.
data: (character) text from which to remove the URLs.
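A minimal base R sketch of the removal step, assuming each match is replaced by a space and the leftover whitespace is then collapsed. The name remove_url_sketch, the replacement strategy, and the pattern are all assumptions; the actual method may treat whitespace differently.

```r
# Hypothetical stand-in for removeUrl(pattern, data); not the package's code.
remove_url_sketch <- function(pattern, data) {
  # Replace each match with a single space so neighbouring words stay apart.
  without_urls <- gsub(pattern, " ", data, perl = TRUE)
  # Collapse runs of whitespace and trim both ends.
  trimws(gsub("[[:space:]]+", " ", without_urls))
}

cleaned <- remove_url_sketch("https?://[^[:space:]]+",
                             "Visit https://example.org/a now")
```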
putNamesURLPattern: assigns the URL pattern names to the result of applying the URL patterns.
Usage:
putNamesURLPattern(resultOfURLPatterns)
Value:
the value of resultOfURLPatterns with the names of the URL patterns assigned.
Arguments:
resultOfURLPatterns: (list) list with URLs found.
getURLPatterns: gets the URL patterns.
Usage:
getURLPatterns()
Value: the URL patterns.
getNamesURLPatterns: gets the names of the URL patterns.
Usage:
getNamesURLPatterns()
Value: the names of the URL patterns.
setNamesURLPatterns: sets the names of the URL patterns.
Usage:
setNamesURLPatterns(namesURLPatterns)
Arguments:
namesURLPatterns: (character) the new names of the URL patterns.
URLPattern: (character) regular expression to detect URLs.
EmailPattern: (character) regular expression to detect emails.
URLPatterns: (list) regular expressions used to detect URLs.
namesURLPatterns: (list) names of regular expressions that are used to identify URLs.
The regular expressions indicated in the URLPatterns
variable are used to identify URLs.
AbbreviationPipe, ContractionPipe,
File2Pipe, FindEmojiPipe,
FindEmoticonPipe, FindHashtagPipe,
FindUserNamePipe, GuessDatePipe,
GuessLanguagePipe, Instance,
InterjectionPipe, MeasureLengthPipe,
PipeGeneric, SlangPipe,
StopWordPipe, StoreFileExtPipe,
TargetAssigningPipe, TeeCSVPipe,
ToLowerCasePipe