qdapRegex (version 0.7.8)

rm_endmark: Remove/Replace/Extract Endmarks

Description

Remove/replace/extract endmarks from a string.

Usage

rm_endmark(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_endmark",
  replacement = "",
  extract = FALSE,
  dictionary = getOption("regex.library"),
  ...
)

ex_endmark(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_endmark",
  replacement = "",
  extract = TRUE,
  dictionary = getOption("regex.library"),
  ...
)

Value

Returns a character string with endmarks removed. If extract = TRUE, the endmark strings are instead returned as a list of vectors.

Arguments

text.var

The text variable.

trim

logical. If TRUE, removes leading and trailing white spaces.

clean

logical. If TRUE, extra white spaces and escaped characters will be removed.

pattern

A character string containing a regular expression (or character string for fixed = TRUE) to be matched in the given character vector. The default, @rm_endmark, uses the rm_endmark regex from the regular expression dictionary supplied via the dictionary argument.

replacement

Replacement for matched pattern.

extract

logical. If TRUE the endmark strings are extracted into a list of vectors.

dictionary

A dictionary of canned regular expressions to search within if pattern begins with "@rm_".
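
The lookup described above can be pictured with a small base R sketch. Everything here is hypothetical and for illustration only: toy_dictionary, resolve_pattern, and the regex stored under rm_endmark are assumptions, not the package's internals or its shipped dictionary.

```r
# Hypothetical stand-in for a regex dictionary: a named list of patterns.
toy_dictionary <- list(
  rm_endmark = "[!.?*|]+$"  # assumed approximation, not the packaged regex
)

# Hypothetical helper: resolve "@rm_..." names against a dictionary and
# pass any other pattern through unchanged.
resolve_pattern <- function(pattern, dictionary) {
  if (!grepl("^@rm_", pattern)) {
    return(pattern)
  }
  key <- sub("^@", "", pattern)
  found <- dictionary[[key]]
  if (is.null(found)) {
    stop("No dictionary entry named '", key, "'")
  }
  found
}

from_dictionary <- resolve_pattern("@rm_endmark", toy_dictionary)
passed_through <- resolve_pattern("[0-9]+$", toy_dictionary)
```

In this sketch a pattern that does not begin with "@rm_" is treated as an ordinary regular expression, which mirrors the condition stated for the dictionary argument.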

...

Other arguments passed to gsub.

Details

The default regular expression used by rm_endmark finds the endmark punctuation used in the qdap package: ! . ? * and |. This behavior can be altered (to ; and :, or to use just ! . and ?) by using a secondary regular expression from the regex_usa data (or another dictionary) via pattern = "@rm_endmark2" or pattern = "@rm_endmark3". See Examples for usage.
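
As a rough illustration of the default behavior, the following base R sketch removes and extracts trailing runs of ! . ? * and |. The regex endmark_sketch is an assumed approximation written for this example; the pattern actually stored in the dictionary may differ, and the sketch does not call qdapRegex at all.

```r
# Assumed approximation of the default endmark pattern: one or more of
# ! . ? * | at the end of the string. Not the packaged regex.
endmark_sketch <- "[!.?*|]+$"

x <- c("I like the dog.", "I want it *|", "I;",
    "Who is| that?", "Hello world", "You...")

# Remove trailing endmarks, then trim leading and trailing white space.
removed <- trimws(gsub(endmark_sketch, "", x, perl = TRUE))

# Extract trailing endmarks into a list of vectors; elements with no
# match yield character(0).
extracted <- regmatches(x, gregexpr(endmark_sketch, x, perl = TRUE))
```

Because the pattern is anchored at the end of the string, the | inside "Who is| that?" is left alone and only the final ? is matched.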

See Also

gsub, stri_extract_all_regex

Other rm_ functions: rm_abbreviation(), rm_between(), rm_bracket(), rm_caps_phrase(), rm_caps(), rm_citation_tex(), rm_citation(), rm_city_state_zip(), rm_city_state(), rm_date(), rm_default(), rm_dollar(), rm_email(), rm_emoticon(), rm_hash(), rm_nchar_words(), rm_non_ascii(), rm_non_words(), rm_number(), rm_percent(), rm_phone(), rm_postal_code(), rm_repeated_characters(), rm_repeated_phrases(), rm_repeated_words(), rm_tag(), rm_time(), rm_title_name(), rm_url(), rm_white(), rm_zip()

Examples

x <- c("I like the dog.", "I want it *|", "I;", 
    "Who is| that?", "Hello world", "You...")

rm_endmark(x)
ex_endmark(x)

rm_endmark(x, pattern="@rm_endmark2")
ex_endmark(x, pattern="@rm_endmark2")

rm_endmark(x, pattern="@rm_endmark3")
ex_endmark(x, pattern="@rm_endmark3")