regmatches
Extract or Replace Matched Substrings
Extract or replace matched substrings from match data obtained by regexpr, gregexpr or regexec.
Usage
regmatches(x, m, invert = FALSE)
regmatches(x, m, invert = FALSE) <- value
Arguments
 x
 a character vector
 m
 an object with match data
 invert
 a logical: if TRUE, extract or replace the non-matched substrings.
 value
 an object with suitable replacement values for the matched or non-matched substrings (see Details).
Details
If invert is FALSE (default), regmatches extracts the matched substrings as specified by the match data. For vector match data (as obtained from regexpr), empty matches are dropped; for list match data, empty matches give empty components (zero-length character vectors).
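The difference between vector and list match data can be seen in a small sketch. The input vector and the variable names below are illustrative assumptions, not part of the original page.

```r
# Hypothetical input; all names here are illustrative only.
x <- c("apple", "kiwi", "banana")

# Vector match data from regexpr(): elements without a match are dropped,
# so the result is shorter than x.
m_vec <- regexpr("an", x)
first_matches <- regmatches(x, m_vec)
first_matches            # "an"

# List match data from gregexpr(): one component per element of x,
# zero-length where nothing matched.
m_list <- gregexpr("an", x)
all_matches <- regmatches(x, m_list)
lengths(all_matches)     # 0 0 2
```

Because non-matching elements are dropped in the vector case, the result cannot be aligned with x by position; the list case keeps that alignment.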
If invert is TRUE, regmatches extracts the non-matched substrings, i.e., the strings are split according to the matches similar to strsplit (for vector match data, at most a single split is performed).
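A minimal sketch of the splitting behaviour, using a made-up input string and illustrative variable names:

```r
# Hypothetical input; all names here are illustrative only.
x <- "a, b, c"

# Vector match data (regexpr): only the first match is known,
# so at most one split is performed.
single_split <- regmatches(x, regexpr(", ", x), invert = TRUE)
single_split[[1]]        # "a"    "b, c"

# List match data (gregexpr): the string is split at every match.
full_split <- regmatches(x, gregexpr(", ", x), invert = TRUE)
full_split[[1]]          # "a" "b" "c"
```

In both cases the result is a list with one component per element of x.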
Note that the match data can be obtained from regular expression matching on a modified version of x with the same numbers of characters.
The replacement function can be used for replacing the matched or non-matched substrings. For vector match data, if invert is FALSE, value should be a character vector with length the number of matched elements in m. Otherwise, it should be a list of character vectors with the same length as m, each as long as the number of replacements needed. Replacement coerces values to character or list and generously recycles values as needed. Missing replacement values are not allowed.
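The two shapes of value can be sketched as follows. The inputs, replacement strings and variable names are assumptions chosen for illustration.

```r
# Hypothetical inputs; all names here are illustrative only.

# Vector match data: value has one entry per matched element of x_vec.
x_vec <- c("one 1", "none", "two 22")
m_vec <- regexpr("[0-9]+", x_vec)
regmatches(x_vec, m_vec) <- c("<A>", "<B>")   # "none" has no match
x_vec                    # "one <A>" "none" "two <B>"

# List match data: value is a list as long as the match data, each
# component holding one replacement per match in that element.
y_list <- c("a1b2", "c3")
m_list <- gregexpr("[0-9]", y_list)
regmatches(y_list, m_list) <- list(c("X", "Y"), "Z")
y_list                   # "aXbY" "cZ"
```

Elements without a match are left unchanged by the replacement.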
Value

For regmatches, a character vector with the matched substrings if m is a vector and invert is FALSE. Otherwise, a list with the matched or non-matched substrings.
For regmatches<-, the updated character vector.
Examples
library(base)
x <- c("A and B", "A, B and C", "A, B, C and D", "foobar")
pattern <- "[[:space:]]*(,|and)[[:space:]]"
## Match data from regexpr()
m <- regexpr(pattern, x)
regmatches(x, m)
regmatches(x, m, invert = TRUE)
## Match data from gregexpr()
m <- gregexpr(pattern, x)
regmatches(x, m)
regmatches(x, m, invert = TRUE)
## Consider
x <- "John (fishing, hunting), Paul (hiking, biking)"
## Suppose we want to split at the comma (plus spaces) between the
## persons, but not at the commas in the parenthesized hobby lists.
## One idea is to "blank out" the parenthesized parts to match the
## parts to be used for splitting, and extract the persons as the
## non-matched parts.
## First, match the parenthesized hobby lists.
m <- gregexpr("\\([^)]*\\)", x)
## Write a little utility for creating blank strings with given numbers
## of characters.
blanks <- function(n) {
    vapply(Map(rep.int, rep.int(" ", length(n)), n, USE.NAMES = FALSE),
           paste, "", collapse = "")
}
## Create a copy of x with the parenthesized parts blanked out.
s <- x
regmatches(s, m) <- Map(blanks, lapply(regmatches(s, m), nchar))
s
## Compute the positions of the split matches (note that we cannot call
## strsplit() on x with match data from s).
m <- gregexpr(", *", s)
## And finally extract the non-matched parts.
regmatches(x, m, invert = TRUE)
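The examples above use regexpr and gregexpr only. As a further sketch, match data from regexec can be used to pull out parenthesized subexpressions; the input strings, pattern and variable names below are illustrative assumptions.

```r
# Hypothetical input; all names here are illustrative only.
kv <- c("key=value", "no match", "a=1")

# regexec() returns list match data holding the full match followed by
# one entry per parenthesized subexpression.
m_sub <- regexec("([[:alnum:]]+)=([[:alnum:]]+)", kv)
parts <- regmatches(kv, m_sub)

parts[[1]]               # "key=value" "key" "value"
length(parts[[2]])       # 0, since "no match" does not match
```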