stringi (version 0.2-5)

stri_split_regex: Split a String By Regex Pattern Matches

Description

Splits each element of str into substrings. The pattern identifies the delimiters that separate the input into fields; the text between consecutive matches forms the fields.

Usage

stri_split_regex(str, pattern, n_max = -1L, omit_empty = FALSE,
  opts_regex = NULL)

Arguments

str
character vector with strings to search in
pattern
character vector; regular expressions that match the delimiters
n_max
integer vector; maximum number of pieces to return
omit_empty
logical vector; determines whether empty strings should be removed from the result
opts_regex
a named list with ICU Regex settings as generated with stri_opts_regex; NULL for default settings
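
As a rough illustration of opts_regex, the sketch below builds a settings list with stri_opts_regex and passes it to the split. It assumes that stri_opts_regex accepts a case_insensitive argument, as it does in stringi releases we are aware of; the input string is made up for the example.

```r
library(stringi)

# Settings list for the ICU regex engine (case_insensitive is assumed
# to be available in the installed stringi version).
opts <- stri_opts_regex(case_insensitive = TRUE)

# "AND" and "and" are both treated as delimiters.
parts <- stri_split_regex("oneANDtwoandthree", "and", opts_regex = opts)
print(parts[[1]])
```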

Value

  • Returns a list of character vectors.

Details

Vectorized over str, pattern, n_max, and omit_empty.
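
A minimal sketch of the vectorization described above, using made-up inputs: each element of str is paired with the corresponding element of pattern, and the result is a list with one character vector per input string.

```r
library(stringi)

# The first string is split on "-", the second on a single space.
parts <- stri_split_regex(c("a-b", "c d e"), c("-", " "))

print(length(parts))  # one list element per input string
print(parts[[1]])
print(parts[[2]])
```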

If n_max is negative (default), then all pieces are extracted.
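
The following sketch contrasts the default with an explicit limit. The limit is passed positionally as the third argument because its name may differ between stringi versions (n_max on this page); the input string is illustrative only.

```r
library(stringi)

x <- "a,b,c,d"

# Default (negative limit): all pieces are extracted.
all_parts <- stri_split_regex(x, ",")

# Third positional argument: return at most two pieces.
two_parts <- stri_split_regex(x, ",", 2L)

print(all_parts[[1]])
print(two_parts[[1]])
```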

omit_empty is applied during splitting: if set to TRUE, then empty strings will never appear in the resulting vector.
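
To illustrate, here is a small sketch with an input (chosen for the example) that contains adjacent and trailing delimiters, which produce empty fields unless omit_empty is TRUE.

```r
library(stringi)

x <- "a,,b,"

# Empty fields between adjacent delimiters are kept by default.
with_empty <- stri_split_regex(x, ",")

# With omit_empty = TRUE, empty strings do not appear in the result.
without_empty <- stri_split_regex(x, ",", omit_empty = TRUE)

print(with_empty[[1]])
print(without_empty[[1]])
```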

Note that if you want to split a string by characters from a specific class (e.g., whitespace), stri_split_charclass will be somewhat faster.
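
A hedged sketch of the alternative mentioned above. It assumes a stringi version in which stri_split_charclass accepts a character class written as \p{...}; the accepted pattern syntax may differ in other releases. The input uses single spaces so that both calls are expected to agree.

```r
library(stringi)

x <- "Lorem ipsum dolor sit amet"

# Same delimiter class, expressed once as a regex and once as a
# character class (the \p{Z} class syntax is an assumption here).
by_regex <- stri_split_regex(x, "\\p{Z}")
by_class <- stri_split_charclass(x, "\\p{Z}")

print(identical(by_regex, by_class))
```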

See Also

Other search_regex: stri_count_regex; stri_detect_regex; stri_extract_all_regex, stri_extract_first_regex, stri_extract_last_regex; stri_locate_all_regex, stri_locate_first_regex, stri_locate_last_regex; stri_match_all_regex, stri_match_first_regex, stri_match_last_regex; stri_opts_regex; stri_replace_all_regex, stri_replace_first_regex, stri_replace_last_regex; stringi-search-regex; stringi-search

Other search_split: stri_split_boundaries; stri_split_charclass; stri_split_coll; stri_split_fixed; stri_split_lines, stri_split_lines1; stri_split; stringi-search

Examples

library(stringi)
# Split on runs of Unicode separator characters (\p{Z});
# see also stri_split_charclass
if (stri_install_check(silent = TRUE))
  stri_split_regex("Lorem ipsum dolor sit amet", "\\p{Z}+")
