Vectorised over string and pattern.
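As a minimal sketch of what vectorisation means here (the input strings are made up for illustration): a single pattern is recycled across every string, while a pattern vector of the same length as the input is applied element by element.

```r
library(stringr)

# One pattern, many strings: the pattern is reused for each element.
str_split(c("a-b", "c-d-e"), "-")

# One pattern per string: element i of `string` is split on element i
# of `pattern`.
paired <- str_split(c("a-b", "c_d"), c("-", "_"))
paired[[1]]  # "a" "b"
paired[[2]]  # "c" "d"
```

Either way the result has one element per input string.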
str_split(string, pattern, n = Inf, simplify = FALSE)
str_split_fixed(string, pattern, n)
string: Input vector. Either a character vector, or something coercible to one.
pattern: Pattern to look for.
The default interpretation is a regular expression, as described in stringi::stringi-search-regex. Control options with regex().
Match a fixed string (i.e. by comparing only bytes), using fixed(). This is fast, but approximate. Generally, for matching human text, you'll want coll(), which respects character matching rules for the specified locale.
Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").
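The pattern modifiers described above can be compared side by side. This is a small sketch with invented input strings; the `coll()` call assumes the default locale treats "x" and "X" as case variants of the same letter.

```r
library(stringr)

x <- "a.b.c"

# Default: the pattern is a regular expression, so a literal dot
# has to be escaped.
by_regex <- str_split(x, "\\.")[[1]]

# fixed(): the pattern is taken literally, no escaping needed.
by_fixed <- str_split(x, fixed("."))[[1]]

# coll(): locale-aware comparison, here also ignoring case.
str_split("aXbxc", coll("x", ignore_case = TRUE))[[1]]

# boundary(): split at word boundaries rather than on a separator.
by_word <- str_split("This is a sentence.", boundary("word"))[[1]]

# An empty pattern splits into individual characters.
by_char <- str_split("abc", "")[[1]]
by_char  # "a" "b" "c"
```

For plain ASCII separators `regex()` and `fixed()` give the same pieces; the difference shows up with regex metacharacters and with text where byte-wise comparison is too strict.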
n: Number of pieces to return. The default (Inf) uses all possible split positions. For str_split_fixed, if n is greater than the number of pieces, the result will be padded with empty strings.
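To illustrate how n behaves, here is a short sketch on a made-up input. When n is smaller than the number of possible pieces, the last piece keeps the unsplit remainder; when n is larger, only str_split_fixed pads.

```r
library(stringr)

x <- "a-b-c-d"

# n = 2: split once, the second piece holds the rest of the string.
two <- str_split(x, "-", n = 2)[[1]]
two  # "a" "b-c-d"

# n larger than the number of pieces: the list form is not padded.
many <- str_split(x, "-", n = 6)[[1]]
length(many)  # 4

# str_split_fixed always returns n columns, padding with "".
fixed_many <- str_split_fixed(x, "-", n = 6)
dim(fixed_many)  # 1 6
```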
simplify: If FALSE, the default, returns a list of character vectors. If TRUE, returns a character matrix.
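The effect of simplify on the return type can be checked directly. A minimal sketch, using an invented two-element input whose strings split into different numbers of pieces:

```r
library(stringr)

x <- c("a-b-c", "d-e")

# Default: a list with one character vector per input string.
as_list <- str_split(x, "-")
is.list(as_list)   # TRUE
lengths(as_list)   # 3 2

# simplify = TRUE: one row per input string; shorter rows are
# filled with empty strings.
as_matrix <- str_split(x, "-", simplify = TRUE)
is.matrix(as_matrix)  # TRUE
as_matrix[2, ]        # "d" "e" ""
```

The list form keeps the true number of pieces per string; the matrix form is convenient when each row is headed for a data frame column.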
For str_split_fixed, a character matrix with n columns.
For str_split, a list of character vectors (or a character matrix when simplify = TRUE).
See stri_split() for the underlying implementation.
library(stringr)

fruits <- c(
  "apples and oranges and pears and bananas",
  "pineapples and mangos and guavas"
)

str_split(fruits, " and ")
str_split(fruits, " and ", simplify = TRUE)

# Specify n to restrict the number of possible matches
str_split(fruits, " and ", n = 3)
str_split(fruits, " and ", n = 2)
# If n is greater than the number of pieces, no padding occurs
str_split(fruits, " and ", n = 5)

# Use str_split_fixed() to return a character matrix with n columns
str_split_fixed(fruits, " and ", 3)
str_split_fixed(fruits, " and ", 4)