ore.search: Search for matches to a regular expression

Description

Search a character vector for one or more matches to an Oniguruma-compatible regular expression. The result is of class "orematch", for which printing and indexing methods are available. The print method uses the crayon package, if it is available.

Usage

ore.search(regex, text, all = FALSE, start = 1L, simplify = TRUE)
is.orematch(x)
## S3 method for class 'orematch':
[(x, i, j, ...)
## S3 method for class 'orematch':
print(x, ...)

Arguments

regex

A single character string or object of class "ore". In the former case, this will first be passed through ore.

text

A vector of strings to match against.

all

If TRUE, then all matches within each element of text will be found. Otherwise, the search will stop at the first match.

start

An optional vector of offsets (in characters) at which to start searching. Will be recycled to the length of text.

simplify

If TRUE, an object of class "orematch" will be returned if text is of length 1. Otherwise, a list of such objects will always be returned.

x

An R object.

i

For indexing into an "orematch" object, the match number.

j

For indexing into an "orematch" object, the group number.

...

Ignored.
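
The following is a minimal sketch of how the all, start and simplify arguments described above might be combined. The pattern, the sample string and the variable names are illustrative assumptions, not part of the original documentation.

```r
library(ore)

# Precompile the pattern; a plain string would be passed through ore() first
regex <- ore("\\d+")
text <- "a1 b22 c333"

# Default (all = FALSE): the search stops at the first match
first <- ore.search(regex, text)

# all = TRUE: every match within the string is found
every <- ore.search(regex, text, all = TRUE)

# start = 3L: begin searching at the third character, skipping "a1"
later <- ore.search(regex, text, start = 3L)

# simplify = FALSE: a list is returned even though text has length 1
as_list <- ore.search(regex, text, simplify = FALSE)

print(every)
```

Precompiling with ore is optional here, but it avoids recompiling the same pattern when it is reused across several calls.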

Value

For ore.search, an "orematch" object, or a list of the same, each with elements

text: A copy of the text element for the current match.
nMatches: The number of matches found.
offsets: The offsets (in characters) of each match.
byteOffsets: The offsets (in bytes) of each match.
lengths: The lengths (in characters) of each match.
byteLengths: The lengths (in bytes) of each match.
matches: The matched substrings.
groups: Equivalent metadata for each parenthesised subgroup in regex, in a series of matrices. If named groups are present in the regex then dimnames will be set appropriately.

For is.orematch, a logical vector indicating whether the specified object has class "orematch".

For extraction with one index, a vector of matched substrings. For extraction with two indices, a vector or matrix of substrings corresponding to captured groups.
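
As a rough illustration of the return value, the sketch below reads a few of the elements listed above from a single "orematch" object and then uses the indexing method. The variable name m is an assumption made for this example, and str is used to inspect groups rather than assuming its exact layout.

```r
library(ore)

# Pairs of consecutive word characters, with each character in its own group
m <- ore.search("(\\w)(\\w)", "This is a test", all = TRUE)

is.orematch(m)   # check the class of the result

m$text           # the string that was searched
m$nMatches       # how many matches were found
m$offsets        # character offset of each match
m$lengths        # character length of each match
m$matches        # the matched substrings themselves

# Per-group metadata, stored as matrices; inspect the structure directly
str(m$groups)

# One index: the second matched substring
m[2]

# Two indices: the second group within the second match
m[2, 2]
```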

Examples

# Pick out pairs of consecutive word characters
match <- ore.search("(\\w)(\\w)", "This is a test", all=TRUE)

# Find the second matched substring ("is", from "This")
match[2]

# Find the content of the second group in the second match ("s")
match[2,2]
