stringi (version 0.2-5)

stri_locate_boundaries: Locate Specific Text Boundaries

Description

This function locates specific text boundaries (such as character, word, line, or sentence boundaries) in each string and reports the positions of the substrings between them.

Usage

stri_locate_boundaries(str, boundary = "word", locale = NULL)

Arguments

str
character vector or an object coercible to a character vector
boundary
character vector; each string is one of character, line-break, sentence, or word
locale
NULL or "" for text boundary analysis following the conventions of the default locale, or a single string with a locale identifier; see stringi-locale.

Value

  • A list of max(length(str), length(boundary)) integer matrices is returned. The first column gives the start positions of substrings between located boundaries, and the second column gives the end positions. The indices are code point-based, thus they may be passed e.g. to the stri_sub function.

    A row containing two NAs indicates that there was no match or that an argument was NA.
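
The following is a minimal sketch of how a start/end matrix of the shape described above can be passed to stri_sub. The matrix pos is written by hand purely for illustration (it is an assumption, not actual output of stri_locate_boundaries), and the names s, pos, and pieces are hypothetical.

```r
library(stringi)

s <- "stringi rocks"

# Hand-written (hypothetical) two-column integer matrix:
# column 1 holds start positions, column 2 holds end positions,
# both counted in code points.
pos <- cbind(c(1L, 9L), c(7L, 13L))

# stri_sub accepts a two-column from/to matrix, so every row
# of positions yields one substring of s.
pieces <- stri_sub(s, from = pos)
print(pieces)
```

Because the positions are code point-based rather than byte-based, the same approach should also work for strings containing non-ASCII characters.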

Details

Vectorized over str and boundary.

For more information on the text boundary analysis performed by ICU's BreakIterator, see stri_split_boundaries.

For locating words in a text using ICU's word iterator, see stri_locate_words.

See Also

Other indexing: stri_locate_all_charclass, stri_locate_first_charclass, stri_locate_last_charclass; stri_locate_all_coll, stri_locate_first_coll, stri_locate_last_coll; stri_locate_all_fixed, stri_locate_first_fixed, stri_locate_last_fixed; stri_locate_all_regex, stri_locate_first_regex, stri_locate_last_regex; stri_locate_all; stri_locate_first; stri_locate_last; stri_locate_words; stri_locate; stri_sub, stri_sub<-

Other locale_sensitive: %!==%, %!=%, %<=%, %<%, %===%, %==%, %>=%, %>%, %stri!==%, %stri!=%, %stri<=%, %stri<%, %stri===%, %stri==%, %stri>=%, %stri>%; stri_cmp, stri_cmp_eq, stri_cmp_equiv, stri_cmp_ge, stri_cmp_gt, stri_cmp_le, stri_cmp_lt, stri_cmp_neq, stri_cmp_nequiv, stri_compare; stri_count_coll; stri_detect_coll; stri_duplicated, stri_duplicated_any; stri_enc_detect2; stri_extract_all_coll, stri_extract_first_coll, stri_extract_last_coll; stri_extract_words; stri_locate_all_coll, stri_locate_first_coll, stri_locate_last_coll; stri_locate_words; stri_opts_collator; stri_order, stri_sort; stri_replace_all_coll, stri_replace_first_coll, stri_replace_last_coll; stri_split_boundaries; stri_split_coll; stri_trans_tolower, stri_trans_totitle, stri_trans_toupper; stri_unique; stri_wrap; stringi-locale; stringi-search-coll

Other search_locate: stri_locate_all_charclass, stri_locate_first_charclass, stri_locate_last_charclass; stri_locate_all_coll, stri_locate_first_coll, stri_locate_last_coll; stri_locate_all_fixed, stri_locate_first_fixed, stri_locate_last_fixed; stri_locate_all_regex, stri_locate_first_regex, stri_locate_last_regex; stri_locate_all; stri_locate_first; stri_locate_last; stri_locate_words; stri_locate; stringi-search

Other text_boundaries: stri_extract_words; stri_locate_words; stri_split_boundaries; stri_wrap

Examples

library(stringi)
# \u00a0 is a no-break space, which does not create a line-break opportunity
if (stri_install_check(silent=TRUE))
   stri_locate_boundaries("The\u00a0above-mentioned packages are...", boundary='line')
