This class is responsible for detecting the URLs present in the
data field of each Instance. Identified URLs are
stored inside the URLs field of the Instance class.
If required, it can also remove the URLs from the data.
FindUrlPipe
FindUrlPipe$new(propertyName = "URLs", alwaysBeforeDeps = list(), notAfterDeps = list())
Arguments:
propertyName: (character) name of the property associated with the Pipe.
alwaysBeforeDeps: (list) the alwaysBefore dependencies (Pipes that must be executed before this one).
notAfterDeps: (list) the notAfter dependencies (Pipes that cannot be executed after this one).
This class inherits from PipeGeneric and implements the pipe abstract function.
pipe: preprocesses the Instance to obtain/remove the URLs.
Usage:
pipe(instance, removeUrl = TRUE, URLPatterns = list(self$URLPattern, self$EmailPattern), namesURLPatterns = list("UrlPattern","EmailPattern"))
Value: the Instance with the modifications that have occurred in the Pipe.
Arguments:
instance: (Instance) Instance to preprocess.
removeUrl: (logical) indicates if the URLs are removed.
URLPatterns: (list) the regular expressions used to find URLs.
namesURLPatterns: (list) the names of the regular expressions.
findUrl: finds the URLs in the data.
Usage:
findUrl(pattern, data)
Value: list with URLs found.
Arguments:
pattern: (character) regex to find URLs.
data: (character) text in which to search for URLs.
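The matching step described for findUrl can be illustrated with base R alone. This is a minimal sketch, not the package's implementation: the function name find_url_sketch and the pattern url_pattern are hypothetical and much simpler than the class's own URLPattern.

```r
# Hypothetical, simplified pattern used only for this illustration;
# it is not the URLPattern defined by the class.
url_pattern <- "https?://[^[:space:]]+"

# Stand-in for findUrl(pattern, data): returns every match as a list.
find_url_sketch <- function(pattern, data) {
  # gregexpr locates all matches; regmatches extracts the matched text.
  matches <- regmatches(data, gregexpr(pattern, data, perl = TRUE))
  as.list(matches[[1]])
}

text <- "Docs at https://example.org/guide and http://example.com today"
found <- find_url_sketch(url_pattern, text)
```

When the text contains no match, the sketch returns an empty list, which keeps the return type consistent with the "list with URLs found" described above.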
removeUrl: removes the URLs in the data.
Usage:
removeUrl(pattern, data)
Value: the data with URLs removed.
Arguments:
pattern: (character) regex to find URLs.
data: (character) text from which to remove the URLs.
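A removal step of this kind might look like the following base R sketch. The name remove_url_sketch and the pattern are hypothetical, and collapsing the leftover whitespace is an assumption of this example; the source does not say how the class treats the gap left by a removed URL.

```r
# Hypothetical, simplified pattern used only for this illustration.
url_pattern <- "https?://[^[:space:]]+"

# Stand-in for removeUrl(pattern, data): returns the text without matches.
remove_url_sketch <- function(pattern, data) {
  # Replace every match with a single space.
  without_urls <- gsub(pattern, " ", data, perl = TRUE)
  # Assumption: collapse repeated whitespace and trim both ends.
  trimws(gsub("[[:space:]]+", " ", without_urls))
}

cleaned <- remove_url_sketch(
  url_pattern,
  "Docs at https://example.org/guide are updated"
)
```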
putNamesURLPattern: assigns the URL pattern names to the result obtained for each pattern.
Usage:
putNamesURLPattern(resultOfURLPatterns)
Value: the resultOfURLPatterns variable with the URL pattern names assigned.
Arguments:
resultOfURLPatterns: (list) list with URLs found.
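The naming step can be pictured as attaching one name per pattern to the list of results. The sketch below assumes that the results are ordered like the patterns and uses a hypothetical helper put_names_sketch with made-up sample values; only the two names come from the default namesURLPatterns shown in the pipe usage.

```r
# Made-up results, one element per pattern, in pattern order.
result_of_url_patterns <- list(
  list("https://example.org"),
  list("user@example.org")
)
# Default names taken from the pipe usage above.
names_url_patterns <- list("UrlPattern", "EmailPattern")

# Stand-in for putNamesURLPattern: labels each result with its pattern name.
put_names_sketch <- function(results, pattern_names) {
  # Assumption: there is exactly one name per result.
  stopifnot(length(results) == length(pattern_names))
  names(results) <- unlist(pattern_names)
  results
}

named <- put_names_sketch(result_of_url_patterns, names_url_patterns)
```

With names attached, each group of matches can be read by pattern name, for example named$EmailPattern.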
getURLPatterns: gets the URL patterns.
Usage:
getURLPatterns()
Value: the value of the URL patterns.
getNamesURLPatterns: gets the names of the URL patterns.
Usage:
getNamesURLPatterns()
Value: the value of the names of the URL patterns.
setNamesURLPatterns: sets the names of the URL patterns.
Usage:
setNamesURLPatterns(namesURLPatterns)
Arguments:
namesURLPatterns: (character) the new value of the names of the URL patterns.
URLPattern: (character) regular expression to detect URLs.
EmailPattern: (character) regular expression to detect emails.
URLPatterns: (list) regular expressions used to detect URLs.
namesURLPatterns: (list) names of regular expressions that are used to identify URLs.
The regular expressions indicated in the URLPatterns variable are used to identify URLs.
AbbreviationPipe, ContractionPipe, File2Pipe, FindEmojiPipe,
FindEmoticonPipe, FindHashtagPipe, FindUserNamePipe, GuessDatePipe,
GuessLanguagePipe, Instance, InterjectionPipe, MeasureLengthPipe,
PipeGeneric, SlangPipe, StopWordPipe, StoreFileExtPipe,
TargetAssigningPipe, TeeCSVPipe, ToLowerCasePipe