Vectorised over string and pattern.
str_extract(string, pattern)
str_extract_all(string, pattern, simplify = FALSE)
Input vector. Either a character vector, or something coercible to one.
Pattern to look for.
The default interpretation is a regular expression, as described in stringi::stringi-search-regex. Control options with regex().
Match a fixed string (i.e. by comparing only bytes), using fixed(). This is fast, but approximate. Generally, for matching human text, you'll want coll(), which respects character matching rules for the specified locale.
Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").
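As a rough illustration of how these pattern interpretations differ, the sketch below runs the same pattern through the default regular expression matching, fixed() and coll(). The input vector x is made up for this illustration and is not part of the original examples; the code assumes stringr is installed and attached.

```r
library(stringr)

# Hypothetical input, chosen so that "." behaves differently per engine
x <- c("a.b", "axb")

# Default: the pattern is a regular expression, so "." matches any character
regex_result <- str_extract(x, "a.b")
regex_result
# "a.b" "axb"

# fixed(): the pattern is compared literally, so "." only matches a period
fixed_result <- str_extract(x, fixed("a.b"))
fixed_result
# "a.b" NA

# coll(): also a literal match, but using locale-aware collation rules
coll_result <- str_extract(x, coll("a.b"))
coll_result
# "a.b" NA

# boundary("word"): split a string into words rather than matching a pattern
word_result <- str_extract_all("one two", boundary("word"))
word_result[[1]]
# "one" "two"
```

For plain ASCII input like this, fixed() and coll() agree; they can differ on text where the same character has more than one underlying representation.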
If FALSE, the default, returns a list of character vectors. If TRUE, returns a character matrix.
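A minimal sketch of the two return shapes follows, using a made-up input vector (codes is a hypothetical name, not from the original examples) and assuming stringr is attached. It shows that the list form keeps one element per input string, while the matrix form pads shorter rows with empty strings.

```r
library(stringr)

# Hypothetical input: the first string has two digits, the second has one
codes <- c("a1b2", "c3")

# simplify = FALSE (the default): a list with one character vector per input
as_list <- str_extract_all(codes, "\\d")
length(as_list)   # 2
as_list[[1]]      # "1" "2"
as_list[[2]]      # "3"

# simplify = TRUE: a character matrix with one row per input string;
# rows with fewer matches are padded with ""
as_matrix <- str_extract_all(codes, "\\d", simplify = TRUE)
dim(as_matrix)    # 2 2
as_matrix[2, ]    # "3" ""
```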
A character vector.
str_match() to extract matched groups; stringi::stri_extract() for the underlying implementation.
# NOT RUN {
shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
str_extract(shopping_list, "\\d")
str_extract(shopping_list, "[a-z]+")
str_extract(shopping_list, "[a-z]{1,4}")
str_extract(shopping_list, "\\b[a-z]{1,4}\\b")

# Extract all matches
str_extract_all(shopping_list, "[a-z]+")
str_extract_all(shopping_list, "\\b[a-z]+\\b")
str_extract_all(shopping_list, "\\d")

# Simplify results into character matrix
str_extract_all(shopping_list, "\\b[a-z]+\\b", simplify = TRUE)
str_extract_all(shopping_list, "\\d", simplify = TRUE)

# Extract all words
str_extract_all("This is, surprisingly, a sentence.", boundary("word"))
# }