stri_locate_boundaries(str, opts_brkiter = NULL)
stri_locate_words(str, locale = NULL)
opts_brkiter: break iterator settings, as generated with stri_opts_brkiter; NULL for the default break iterator, i.e. line_break; stri_locate_boundaries only
locale: NULL or "" for text boundary analysis following the conventions of the default locale, or a single string with a locale identifier, see stringi-locale; stri_locate_words only

A list of length(str) integer matrices is returned. The first column gives the start positions of substrings between located boundaries, and the second column gives the end positions. The indices are code point-based, thus they may be passed e.g. to the stri_sub function. Moreover, you may get two NAs in one row for no match or NA arguments.
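As a rough illustration of the returned value, the sketch below locates words in a single string and passes the start and end columns to stri_sub. The helper `pick` is hypothetical and not part of stringi: it assumes that a stringi release which no longer exports stri_locate_words offers the same functionality as stri_locate_all_words, and falls back to that name.

```r
library(stringi)

# Hypothetical helper (not part of stringi): use the function name documented
# here if this stringi version exports it, otherwise the assumed newer name.
pick <- function(old, new) {
  ns <- asNamespace("stringi")
  get(if (exists(old, envir = ns, inherits = FALSE)) old else new, envir = ns)
}
locate_words <- pick("stri_locate_words", "stri_locate_all_words")

x <- "Warm thanks to their developers."

# The result is a list with one two-column matrix per element of x.
pos <- locate_words(x)[[1]]

# Column 1 holds start positions, column 2 end positions (code point-based),
# so both can be passed directly to stri_sub().
words <- stri_sub(x, pos[, 1], pos[, 2])
print(words)

# A missing input yields a single row holding two NAs.
na_pos <- locate_words(NA_character_)[[1]]
print(na_pos)
```

Because the indices are code point-based rather than byte-based, the same pattern should also hold for non-ASCII text.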
Vectorized over str.

For more information on the text boundary analysis performed by ICU's BreakIterator, see stringi-search-boundaries.
In the case of stri_locate_words, just like in stri_extract_words and stri_count_words, a word BreakIterator is used to locate word boundaries, and all non-word characters (UBRK_WORD_NONE rule status) are ignored. This function is equivalent to a call to
stri_locate_boundaries(str, stri_opts_brkiter(type="word", skip_word_none=TRUE, locale=locale))
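The stated equivalence can be checked with a short sketch such as the one below. The helper `pick` is hypothetical and not part of stringi: it assumes that stringi releases lacking the names documented here provide the same functionality as stri_locate_all_words and stri_locate_all_boundaries. The locale "en_US" is an arbitrary choice for illustration.

```r
library(stringi)

# Hypothetical helper (not part of stringi): use the function name documented
# here if this stringi version exports it, otherwise the assumed newer name.
pick <- function(old, new) {
  ns <- asNamespace("stringi")
  get(if (exists(old, envir = ns, inherits = FALSE)) old else new, envir = ns)
}
locate_words      <- pick("stri_locate_words", "stri_locate_all_words")
locate_boundaries <- pick("stri_locate_boundaries", "stri_locate_all_boundaries")

x <- "The above-mentioned features are very useful."

# Word location via the convenience function ...
by_words <- locate_words(x, locale = "en_US")

# ... and via the general function with explicit break iterator options.
by_opts <- locate_boundaries(
  x,
  opts_brkiter = stri_opts_brkiter(type = "word", skip_word_none = TRUE,
                                   locale = "en_US")
)

# Compare the two results.
same <- isTRUE(all.equal(by_words, by_opts))
print(same)
```

Passing opts_brkiter by name keeps the call independent of the argument order of the underlying function.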
stri_locate, stri_locate_all, stri_locate_all_charclass, stri_locate_all_coll, stri_locate_all_fixed, stri_locate_all_regex, stri_locate_first, stri_locate_first_charclass, stri_locate_first_coll, stri_locate_first_fixed, stri_locate_first_regex, stri_locate_last, stri_locate_last_charclass, stri_locate_last_coll, stri_locate_last_fixed, stri_locate_last_regex; stri_sub, stri_sub<-
Other locale_sensitive: %s!==%, %s!=%, %s<=%, %s<%, %s===%, %s==%, %s>=%, %s>%, %stri!==%, %stri!=%, %stri<=%, %stri<%, %stri===%, %stri==%, %stri>=%, %stri>%; stri_cmp, stri_cmp_eq, stri_cmp_equiv, stri_cmp_ge, stri_cmp_gt, stri_cmp_le, stri_cmp_lt, stri_cmp_neq, stri_cmp_nequiv, stri_compare; stri_count_boundaries, stri_count_words; stri_duplicated, stri_duplicated_any; stri_enc_detect2; stri_extract_words; stri_opts_collator; stri_order, stri_sort; stri_split_boundaries; stri_trans_tolower, stri_trans_totitle, stri_trans_toupper; stri_unique; stri_wrap; stringi-locale; stringi-search-boundaries; stringi-search-coll
Other search_locate: stri_locate, stri_locate_all, stri_locate_all_charclass, stri_locate_all_coll, stri_locate_all_fixed, stri_locate_all_regex, stri_locate_first, stri_locate_first_charclass, stri_locate_first_coll, stri_locate_first_fixed, stri_locate_first_regex, stri_locate_last, stri_locate_last_charclass, stri_locate_last_coll, stri_locate_last_fixed, stri_locate_last_regex; stringi-search
Other text_boundaries: stri_count_boundaries, stri_count_words; stri_extract_words; stri_opts_brkiter; stri_split_boundaries; stri_split_lines, stri_split_lines1; stri_trans_tolower, stri_trans_totitle, stri_trans_toupper; stri_wrap; stringi-search-boundaries; stringi-search
library(stringi)
# \u00a0 is a no-break space
test <- "The\u00a0above-mentioned features are very useful. Warm thanks to their developers."
# one call per break iterator type
stri_locate_boundaries(test, stri_opts_brkiter(type="line"))
stri_locate_boundaries(test, stri_opts_brkiter(type="word"))
stri_locate_boundaries(test, stri_opts_brkiter(type="sentence"))
stri_locate_boundaries(test, stri_opts_brkiter(type="character"))
# word boundaries only, non-word characters ignored
stri_locate_words(test)