rematch2 (version 2.0.1)

re_match: Extract Regular Expression Matches Into a Data Frame

Description

re_match wraps regexpr and returns the match results in a convenient data frame. The data frame has one column for each capture group if perl=TRUE, and one final column called .match for the matching (sub)string. The columns of the capture groups are named if the groups themselves are named.
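To make the wrapping concrete, the following base R sketch shows the kind of information regexpr reports with perl = TRUE and how it could be assembled into a data frame by hand. It is an illustration of the idea only, not the package's actual implementation; the variable names (starts, lens, groups, result) are chosen for this example.

```r
# Base R only: what regexpr() reports for a pattern with named capture groups.
dates <- c("2016-04-20", "not a date")
pattern <- "(?<year>[0-9]{4})-(?<month>[0-1][0-9])-(?<day>[0-3][0-9])"
m <- regexpr(pattern, dates, perl = TRUE)

# Whole-match substrings; elements without a match (start == -1) become NA.
start <- as.vector(m)
whole <- substring(dates, start, start + attr(m, "match.length") - 1L)
whole[start == -1] <- NA_character_

# Per-group start positions and lengths are stored as matrix attributes,
# with one row per input element and one column per capture group.
starts <- attr(m, "capture.start")
lens <- attr(m, "capture.length")
groups <- matrix(
  substring(dates, starts, starts + lens - 1L),
  nrow = length(dates),
  dimnames = list(NULL, attr(m, "capture.names"))
)
groups[start == -1, ] <- NA_character_

# Hand-built equivalent of a one-row-per-element result (illustrative only).
result <- data.frame(groups, .text = dates, .match = whole,
                     stringsAsFactors = FALSE)
result
```

Doing this by hand for every pattern is tedious, which is the bookkeeping re_match is meant to take care of.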

Usage

re_match(text, pattern, perl = TRUE, ...)

Arguments

text

Character vector.

pattern

A regular expression. See regex for more about regular expressions.

perl

Logical. Should Perl-compatible regular expressions be used? Defaults to TRUE; setting it to FALSE disables capture groups.

...

Additional arguments to pass to regexpr.

Value

A data frame of character vectors: one column per capture group, named if the group was named, and additional columns for the input text (.text) and the first matching (sub)string (.match). Each row corresponds to an element in the text vector; elements without a match have NA in the match and capture group columns.
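Because the result has one row per input element, a common follow-up is to keep only the rows that matched and convert the captured strings to another type. The sketch below assumes the rematch2 package is installed; the names res, matched and years are chosen for this example.

```r
library(rematch2)

dates <- c("2016-04-20", "not a date", "2015-01-21 19:58")
isodaten <- "(?<year>[0-9]{4})-(?<month>[0-1][0-9])-(?<day>[0-3][0-9])"

# One row per element of `dates`, whether or not it matched.
res <- re_match(text = dates, pattern = isodaten)

# Keep only the rows where the pattern matched.
matched <- res[!is.na(res$.match), ]

# Capture groups come back as character vectors, so convert explicitly.
years <- as.integer(matched$year)
years
```

Filtering on .match rather than on a capture group column keeps the check valid for patterns that have no groups.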

See Also

Other tidy regular expression matching: re_exec_all, re_exec, re_match_all

Examples

library(rematch2)

# Mixed input: some elements contain an ISO date, some do not
dates <- c("2016-04-20", "1977-08-08", "not a date", "2016",
  "76-03-02", "2012-06-30", "2015-01-21 19:58")

# Unnamed capture groups
isodate <- "([0-9]{4})-([0-1][0-9])-([0-3][0-9])"
re_match(text = dates, pattern = isodate)

# The same with named groups
isodaten <- "(?<year>[0-9]{4})-(?<month>[0-1][0-9])-(?<day>[0-3][0-9])"
re_match(text = dates, pattern = isodaten)
