stringr (version 1.1.0)

str_extract: Extract matching patterns from a string.


Vectorised over string and pattern.
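As a minimal sketch of what vectorisation means here (the input values are made up for illustration), each element of the string vector is paired with the corresponding element of the pattern vector, and a length-one argument is recycled:

```r
library(stringr)

# Illustrative inputs, not taken from the package documentation.
items <- c("apples x4", "milk x2")

# Element-wise pairing: first pattern on first string, second on second.
paired <- str_extract(items, c("[a-z]+", "\\d"))
paired    # "apples" "2"

# A single string is recycled across both patterns.
recycled <- str_extract("milk x2", c("[a-z]+", "\\d"))
recycled  # "milk" "2"
```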


Usage

str_extract(string, pattern)
str_extract_all(string, pattern, simplify = FALSE)


Arguments

string
Input vector. Either a character vector, or something coercible to one.

pattern
Pattern to look for.

The default interpretation is a regular expression, as described in stringi-search-regex. Control options with regex().

Match a fixed string (i.e. by comparing only bytes), using fixed(x). This is fast, but approximate. Generally, for matching human text, you'll want coll(x) which respects character matching rules for the specified locale.

Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").
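The pattern modifiers described above can be compared side by side. This is a rough sketch on a made-up input string; the `coll()` call is left without an expected result because its behaviour depends on the locale in use:

```r
library(stringr)

# Illustrative input, not taken from the package documentation.
x <- "Apples x4 cost $3.50"

# Default: the pattern is a regular expression, so "." matches any character.
re_default <- str_extract(x, ".")
re_default  # "A"

# regex() exposes options such as case-insensitive matching.
re_nocase <- str_extract(x, regex("apples", ignore_case = TRUE))
re_nocase   # "Apples"

# fixed() compares literally, so "." only matches an actual full stop.
literal <- str_extract(x, fixed("."))
literal     # "."

# coll() compares using locale-aware collation rules.
str_extract(x, coll("APPLES", ignore_case = TRUE))

# boundary() extracts the pieces between text boundaries.
words <- str_extract_all("Hi there!", boundary("word"))
words       # list(c("Hi", "there"))
```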

simplify
If FALSE, the default, returns a list of character vectors. If TRUE, returns a character matrix.


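Value

str_extract() returns a character vector. str_extract_all() returns a list of character vectors or, if simplify = TRUE, a character matrix.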
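To make the return shapes concrete, here is a small sketch on made-up input where one element has two matches and the other has none. It shows how a missing match is represented in each case:

```r
library(stringr)

# Illustrative input: "a1b2" has two digits, "c" has none.
items <- c("a1b2", "c")

# First match only: one element per input, NA where nothing matched.
first <- str_extract(items, "\\d")
first     # "1" NA

# All matches: a list with one character vector per input;
# an input without matches yields character(0).
all_list <- str_extract_all(items, "\\d")
all_list  # list(c("1", "2"), character(0))

# simplify = TRUE: a character matrix with one row per input,
# padded with "" where an input has fewer matches.
all_mat <- str_extract_all(items, "\\d", simplify = TRUE)
dim(all_mat)  # 2 2
```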

See Also

str_match to extract matched groups; stri_extract for the underlying implementation.


Examples
shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2")
str_extract(shopping_list, "\\d")
str_extract(shopping_list, "[a-z]+")
str_extract(shopping_list, "[a-z]{1,4}")
str_extract(shopping_list, "\\b[a-z]{1,4}\\b")

# Extract all matches
str_extract_all(shopping_list, "[a-z]+")
str_extract_all(shopping_list, "\\b[a-z]+\\b")
str_extract_all(shopping_list, "\\d")

# Simplify results into character matrix
str_extract_all(shopping_list, "\\b[a-z]+\\b", simplify = TRUE)
str_extract_all(shopping_list, "\\d", simplify = TRUE)

# Extract all words
str_extract_all("This is, surprisingly, a sentence.", boundary("word"))
