Capture string tokens into a data.frame
Given a character vector and a regular expression containing capture
expressions, strcapture will extract the captured tokens into a
tabular data structure, such as a data.frame, the type and structure of
which is specified by a prototype object. The assumption is that the
same number of tokens are captured from every input string.
strcapture(pattern, x, proto, perl = FALSE, useBytes = FALSE)
pattern: The regular expression with the capture expressions.
x: A character vector in which to capture the tokens.
proto: A data.frame or S4 object that behaves like one. See details.
perl, useBytes: Arguments passed to regexec.
The proto argument is typically a data.frame, with a
column corresponding to each capture expression, in order. The
captured character vector is coerced to the type of the column, and
the column names are carried over to the return value. Any data in the
prototype are ignored. See the examples.
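The following is a minimal sketch of the behavior described above, not part of the original examples. It assumes a prototype that happens to contain one placeholder row (the values "placeholder" and 0L are invented for illustration) and shows that only the column names and types of the prototype are used.

```r
# Hypothetical prototype with one placeholder row. Only its column names
# and column types matter; the values themselves are ignored.
proto <- data.frame(chr = "placeholder", start = 0L, end = 0L,
                    stringsAsFactors = FALSE)

x <- c("chr1:1-1000", "chr2:500-2500")
pattern <- "(.*?):([[:digit:]]+)-([[:digit:]]+)"

res <- strcapture(pattern, x, proto)

# One row per input string; the captured text is coerced to the column types.
str(res)
```

Here the second and third captures are converted from character to integer because the corresponding prototype columns are integer, and the placeholder row does not appear in the result.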
A tabular data structure of the same type as proto, so typically a
data.frame, containing a column for each capture
expression. The column types and names are inherited from
proto. Cases in x that do not match pattern have
NA in every column.
x <- "chr1:1-1000"
pattern <- "(.*?):([[:digit:]]+)-([[:digit:]]+)"
proto <- data.frame(chr=character(), start=integer(), end=integer())
strcapture(pattern, x, proto)
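As a further illustration of how non-matching input is handled, the sketch below adds a second string that does not fit the pattern. The input "not a range" is an invented value chosen only because it contains no colon and so cannot match.

```r
# The second element is a made-up string that does not match the pattern.
x <- c("chr1:1-1000", "not a range")
pattern <- "(.*?):([[:digit:]]+)-([[:digit:]]+)"
proto <- data.frame(chr = character(), start = integer(), end = integer())

res <- strcapture(pattern, x, proto)

# The result still has one row per element of x; the row for the
# non-matching string holds NA in every column.
res
is.na(res[2, ])
```

Because the result keeps one row per input string, it can be bound column-wise to other data derived from x without realigning rows.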