Text Boundary Analysis in stringi

Text boundary analysis is the process of locating linguistic boundaries while formatting and handling text.
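As a small illustration of what "locating boundaries" means in practice, the sketch below asks stringi for the word boundaries in a short string. The sample text and the variable name `b` are made up for this example; `stri_locate_all_boundaries` is one of the functions listed under See Also.

```r
library(stringi)

# Locate word-level text segments in a short string.
# The result is a list with one two-column matrix (start, end) per input string.
b <- stri_locate_all_boundaries("Hello world", type = "word")[[1]]

# Each row is one segment: "Hello", the space between the words, and "world".
print(b)
```

By default the segment between the two words (the space) is reported as well, so three rows are returned.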


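Examples of the boundary analysis process include locating appropriate points at which to wrap lines, and counting or extracting characters, words, and sentences.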

Generally, text boundary analysis is a locale-dependent operation. For example, Japanese and Chinese do not separate words with spaces, so a line break can occur even in the middle of a word. These languages also have punctuation and diacritical marks that cannot start or end a line, and this must be taken into account as well.

stringi uses ICU's BreakIterator to locate specific text boundaries. The BreakIterator's behavior may be controlled in some cases; see stri_opts_brkiter. The available iterator types are:
  • The character boundary iterator tries to match what a user would think of as a "character" (a basic unit of a writing system for a language), which may be more than just a single Unicode code point.
  • The word boundary iterator locates the boundaries of words, for purposes such as "Find whole words" operations.
  • The line_break iterator locates positions that would be appropriate points to wrap lines when displaying the text.
  • The sentence iterator locates sentence boundaries.
For technical details on the different classes of text boundaries, refer to the ICU User Guide (see below).
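The four iterator types can be tried out with a short sketch like the one below. It assumes the `type` setting can be passed directly to the boundary functions and that `skip_word_none` is accepted by `stri_opts_brkiter`, as in recent stringi releases; the sample strings and variable names are illustrative only.

```r
library(stringi)

x <- "The quick brown fox. It jumped over the lazy dog!"

# character: a base letter followed by a combining accent is two code points
# but a single user-perceived character.
accented <- "a\u0301"
n_code_points <- stri_length(accented)
n_characters <- stri_count_boundaries(accented, type = "character")

# word: skip_word_none = TRUE asks the iterator to skip the pieces that are
# not words (spaces, punctuation), leaving only the words themselves.
words <- stri_extract_all_boundaries(
  "The quick brown fox.",
  opts_brkiter = stri_opts_brkiter(type = "word", skip_word_none = TRUE)
)[[1]]

# line_break: each piece ends at a position where a line could be wrapped.
lines <- stri_split_boundaries("The quick brown fox", type = "line_break")[[1]]

# sentence: one piece per sentence, trailing whitespace included.
sentences <- stri_split_boundaries(x, type = "sentence")[[1]]

print(c(code_points = n_code_points, characters = n_characters))
print(words)
print(lines)
print(sentences)
```

For languages such as Japanese or Chinese, a `locale` can also be supplied through `stri_opts_brkiter`; the resulting segmentation then depends on the ICU data available on the system.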


References

Boundary Analysis -- ICU User Guide

See Also

Other locale_sensitive: %s<%, stri_compare, stri_count_boundaries, stri_duplicated, stri_enc_detect2, stri_extract_all_boundaries, stri_locate_all_boundaries, stri_opts_collator, stri_order, stri_split_boundaries, stri_trans_tolower, stri_unique, stri_wrap, stringi-locale, stringi-search-coll

Other text_boundaries: stri_count_boundaries, stri_extract_all_boundaries, stri_locate_all_boundaries, stri_opts_brkiter, stri_split_boundaries, stri_split_lines, stri_trans_tolower, stri_wrap, stringi-search

Other stringi_general_topics: stringi-arguments, stringi-encoding, stringi-locale, stringi-package, stringi-search-charclass, stringi-search-coll, stringi-search-fixed, stringi-search-regex, stringi-search

  • stringi-search-boundaries
Documentation reproduced from package stringi, version 1.1.5, License: file LICENSE
