modifiers
From stringr v1.0.0
by Hadley Wickham
Control matching behaviour with modifier functions.
- fixed
- Compare literal bytes in the string. This is very fast, but not usually what you want for non-ASCII character sets.
- coll
- Compare strings respecting standard collation rules.
- regex
- The default. Uses ICU regular expressions.
- boundary
- Match boundaries between characters, lines, sentences, or words.
Usage
fixed(pattern, ignore_case = FALSE)
coll(pattern, ignore_case = FALSE, locale = NULL, ...)
regex(pattern, ignore_case = FALSE, multiline = FALSE, comments = FALSE, dotall = FALSE, ...)
boundary(type = c("character", "line_break", "sentence", "word"), skip_word_none = TRUE, ...)
Arguments
- pattern
- Pattern whose matching behaviour should be modified.
- ignore_case
- Should case differences be ignored in the match?
- locale
- Locale to use for comparisons. See stri_locale_list() for all possible options.
- ...
- Other less frequently used arguments passed on to stri_opts_collator, stri_opts_regex, or stri_opts_brkiter.
- multiline
- If TRUE, ^ and $ match the beginning and end of each line. If FALSE, the default, they only match the start and end of the input.
- comments
- If TRUE, white space and comments beginning with # are ignored. Escape literal spaces with a backslash followed by a space (written "\\ " in an R string).
- dotall
- If TRUE, . will also match line terminators.
- type
- Boundary type to detect.
- skip_word_none
- Ignore "words" that don't contain any letters or numbers, i.e. pieces that consist only of punctuation or white space.
Examples
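library(stringr)
# The "." in the pattern is a wildcard for regex(), but a literal dot for fixed() and coll()
pattern <- "a.b"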
strings <- c("abb", "a.b")
str_detect(strings, pattern)
str_detect(strings, fixed(pattern))
str_detect(strings, coll(pattern))
# coll() is useful for locale-aware case-insensitive matching
i <- c("I", "\u0130", "i")
i
str_detect(i, fixed("i", TRUE))
str_detect(i, coll("i", TRUE))
str_detect(i, coll("i", TRUE, locale = "tr"))
# Word boundaries
words <- c("These are some words.")
str_count(words, boundary("word"))
str_split(words, " ")[[1]]
str_split(words, boundary("word"))[[1]]
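# A minimal sketch of the other boundary() options described under Arguments.
# The input strings are illustrative and do not come from the stringr
# documentation; outputs are not shown because they depend on ICU's rules.
library(stringr)
sentences <- "This is one. This is two."
# Count sentence boundaries instead of word boundaries
str_count(sentences, boundary("sentence"))
# Count user-perceived characters
str_count("These are some words.", boundary("character"))
# With skip_word_none = FALSE, pieces without letters or numbers
# (spaces, punctuation) are kept rather than skipped
str_split("These are some words.", boundary("word", skip_word_none = FALSE))[[1]]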
# Regular expression variations
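# ignore_case = TRUE lets [a-z] match upper case letters as well
str_extract_all("The Cat in the Hat", "[a-z]+")
str_extract_all("The Cat in the Hat", regex("[a-z]+", TRUE))
# multiline = TRUE lets ^ match at the start of every line, not only the input
str_extract_all("a\nb\nc", "^.")
str_extract_all("a\nb\nc", regex("^.", multiline = TRUE))
# dotall = TRUE lets . match the line terminator after "a"
str_extract_all("a\nb\nc", "a.")
str_extract_all("a\nb\nc", regex("a.", dotall = TRUE))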
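# A sketch of comments = TRUE, the one regex() option not exercised above.
# White space and # comments inside the pattern are ignored, so a pattern can
# be spread over several lines. The date string and the name date_pattern are
# illustrative assumptions, not part of the stringr documentation.
library(stringr)
date_pattern <- regex("
  \\d{4}  # year
  -
  \\d{2}  # month
  -
  \\d{2}  # day
", comments = TRUE)
str_detect("2015-04-30", date_pattern)
# An unescaped space in the pattern is dropped, so "a b" is read as "ab"
str_detect("a b", regex("a b", comments = TRUE))
# Escaping the space with a backslash keeps it literal
str_detect("a b", regex("a\\ b", comments = TRUE))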