stringi (version 1.0-1)

stringi-search-regex: Regular Expressions in stringi

Description

A regular expression is a pattern describing, possibly in a very abstract way, a part of text. With the many regex functions in stringi, regular expressions are a powerful tool for string searching, substring extraction, string splitting, and similar tasks.
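
The three kinds of task named above can be sketched as follows. This is a minimal illustration only; the input strings are made up for the example.

```r
library(stringi)

x <- c("apple 12", "banana", "cherry 7")  # illustrative data

# Searching: which strings contain at least one digit?
has_digit <- stri_detect_regex(x, "\\d")

# Extraction: the first run of digits in each string (NA if there is none)
first_num <- stri_extract_first_regex(x, "\\d+")

# Splitting: break a string on runs of whitespace
parts <- stri_split_regex("one  two three", "\\s+")
```

Each function is vectorized over its string argument, so the results have one element per input string.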

Regex Functions in stringi

Note that if a given regex pattern is empty, then all regex functions in stringi return NA and generate a warning. On a syntax error in the pattern, an informative error message is shown.
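
A small sketch of both situations, assuming stri_detect_regex as a representative regex function. The variable names are hypothetical, and the exact wording of the messages may differ between stringi and ICU versions, so it is only captured here, not relied upon.

```r
library(stringi)

# An empty pattern: the result is NA and a warning is raised.
# The warning is captured so that its message can be inspected.
empty_warning <- NULL
empty_result <- withCallingHandlers(
  stri_detect_regex("abc", ""),
  warning = function(w) {
    empty_warning <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

# A malformed pattern (unbalanced parenthesis): an error is raised.
# tryCatch() returns the error message instead of stopping the script.
syntax_error <- tryCatch(
  stri_detect_regex("abc", "("),
  error = function(e) conditionMessage(e)
)
```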

If you would like to search for a fixed pattern, refer to stringi-search-coll or stringi-search-fixed. These allow you to perform a locale-aware text lookup or a very fast exact-byte search, respectively.
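
To illustrate the difference, the sketch below runs the same pattern through the regex, fixed, and collator-based variants of stri_detect. The input vector is made up, and the collator call uses default settings, whose behavior depends on the current locale.

```r
library(stringi)

x <- c("a.b", "axb")  # illustrative data

# As a regex, "." matches any character, so both strings match.
regex_hits <- stri_detect_regex(x, "a.b")

# As a fixed pattern, "." is a literal dot, so only the first string matches.
fixed_hits <- stri_detect_fixed(x, "a.b")

# Collator-based search: the pattern is literal text, compared
# according to locale-dependent collation rules (defaults used here).
coll_hits <- stri_detect_coll(x, "a.b")
```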

Details

All stri_*_regex functions in stringi use the ICU regex engine. Its settings may be tuned (for example, to perform a case-insensitive search); see the stri_opts_regex function for more details.
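
As a minimal sketch of tuning the engine, the example below requests case-insensitive matching through the opts_regex argument. The input vector is made up; see stri_opts_regex for the full list of available options.

```r
library(stringi)

x <- c("Regex", "REGEX", "other")  # illustrative data

# Default settings: matching is case-sensitive.
default_hits <- stri_detect_regex(x, "regex")

# Case-insensitive matching, requested through stri_opts_regex().
ci_hits <- stri_detect_regex(
  x, "regex",
  opts_regex = stri_opts_regex(case_insensitive = TRUE)
)
```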

Regular expression patterns in ICU are quite similar in form and behavior to Perl's regexes. Their implementation is loosely inspired by JDK 1.4 java.util.regex. ICU regular expressions conform to the Unicode Technical Standard #18 (see the References section), and their features are summarized in the ICU User Guide (see below). A good general introduction to regexes is (Friedl, 2002). Some general topics are also covered in the R manual, see regex.
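
Two features of this pattern syntax are sketched below: a Unicode property class and capture groups referenced from a replacement string. The input strings are made up; consult the ICU User Guide for the authoritative syntax description.

```r
library(stringi)

# The Unicode property \p{L} matches any letter, not only ASCII ones.
words <- stri_extract_all_regex("na\u00efve caf\u00e9 42", "\\p{L}+")[[1]]

# Capture groups are referenced as $1, $2, ... in the replacement string.
swapped <- stri_replace_all_regex(
  "2014-06-01", "(\\d+)-(\\d+)-(\\d+)", "$3/$2/$1"
)
```

Note that backslashes must be doubled inside R string literals, so the regex \p{L} is written as "\\p{L}".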

References

Regular expressions -- ICU User Guide, http://userguide.icu-project.org/strings/regexp

J.E.F. Friedl, Mastering Regular Expressions, O'Reilly, 2002

Unicode Regular Expressions -- Unicode Technical Standard #18, http://www.unicode.org/reports/tr18/

Unicode Regular Expressions -- Regex tutorial, http://www.regular-expressions.info/unicode.html

See Also

Other search_regex: stri_opts_regex; stringi-search

Other stringi_general_topics: stringi-arguments; stringi-encoding; stringi-locale; stringi-search-boundaries; stringi-search-charclass; stringi-search-coll; stringi-search-fixed; stringi-search; stringi, stringi-package