stringi (version 1.3.1)

stri_sub: Extract a Substring From or Replace a Substring In a Character Vector

Description

stri_sub extracts substrings at the given code point-based index ranges. Its replacement version substitutes parts of a string with the given strings. stri_sub_replace is a variant of the replacement function that works well with the magrittr pipe operator.

Usage

stri_sub(str, from = 1L, to = -1L, length)

stri_sub(str, from = 1L, to = -1L, length, omit_na=FALSE) <- value

stri_sub_replace(str, from = 1L, to = -1L, length, omit_na=FALSE, value)

Arguments

str

character vector

from

integer vector or two-column matrix

to

integer vector; mutually exclusive with length and from being a matrix

length

integer vector; mutually exclusive with to and from being a matrix

omit_na

single logical value; if TRUE, missing values in any of the arguments provided will result in an unchanged input; replacement function only

value

character vector to be substituted with; replacement function only

Value

stri_sub returns a character vector. stri_sub<- changes the str object.

The extract function stri_sub returns the indicated substrings. The replacement function stri_sub<- is invoked for its side effect: once it is called, str is modified.
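The difference between the two replacement forms can be sketched as follows. This is a minimal illustration, assuming stringi is installed; the variable names x, y and x_after_copy are made up for the example.

```r
library(stringi)

x <- "abcdef"

# stri_sub_replace returns a modified copy; x itself is left alone
y <- stri_sub_replace(x, from = 1, to = 3, value = "X")
x_after_copy <- x          # still "abcdef"

# the replacement function modifies x in place
stri_sub(x, from = 1, to = 3) <- "X"

print(y)                   # "Xdef"
print(x)                   # "Xdef"
```

The copy-returning form is the one that fits naturally into a pipeline, since it does not rely on assignment to an existing variable.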

Details

Vectorized over str, [value], from and (to or length). to and length are mutually exclusive; if both are supplied, to takes priority over length.

If from is a two-column matrix, its first column is used as from and its second column as to. In that case the to and length arguments are ignored.
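As a small sketch of the matrix form of from, assuming stringi is installed: each row of the matrix supplies one (from, to) pair, matched row by row against str. The names strs, idx and res are illustrative only.

```r
library(stringi)

strs <- c("abcdef", "ghijkl")

# column 1 = from, column 2 = to; one row per input string
idx <- cbind(c(1L, 3L), c(3L, 5L))

res <- stri_sub(strs, idx)
print(res)   # "abc" "ijk"
```

Matrices of this shape are what the stri_locate_first_* and stri_locate_last_* functions return, which is why their output can be passed directly as from.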

The indexes are code point-based, not byte-based. Note that for some Unicode strings the extracted substrings may not be well-formed, especially if the input is not NFC-normalized (see stri_trans_nfc) or if it includes byte order marks, bidirectional text marks, and so on. Handle with care.
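The following sketch shows why normalization matters for code point-based indexing, assuming stringi is installed. The sample string (a letter followed by a combining acute accent) and the variable names are chosen for illustration.

```r
library(stringi)

# "a" followed by U+0301 COMBINING ACUTE ACCENT: two code points
decomposed <- "a\u0301"
n_decomposed <- stri_length(decomposed)      # 2
first_cp <- stri_sub(decomposed, 1, 1)       # "a", the accent is cut off

# after NFC normalization the pair becomes the single code point U+00E1
composed <- stri_trans_nfc(decomposed)
n_composed <- stri_length(composed)          # 1
first_nfc <- stri_sub(composed, 1, 1)        # the accented letter, intact
```

Normalizing with stri_trans_nfc before indexing does not remove every such pitfall, but it avoids splitting the sequences that have a precomposed form.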

Indexes are 1-based, i.e., an index equal to 1 denotes the first character in a string, which gives a typical R look-and-feel. Argument to defines the last index of the substring, inclusive.

For negative indexes in from or to, counting starts at the end of the string. For instance, index -1 denotes the last code point in the string. Non-positive length gives an empty string.
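A few calls illustrating negative indexes and a non-positive length, assuming stringi is installed; the string s2 and the result names are illustrative.

```r
library(stringi)

s2 <- "abcdef"

last3  <- stri_sub(s2, -3, -1)          # last three code points: "def"
tail3  <- stri_sub(s2, -3)              # to defaults to -1, so also "def"
inner  <- stri_sub(s2, 2, -2)           # second through second-to-last: "bcde"
empty  <- stri_sub(s2, 2, length = 0)   # non-positive length: ""
```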

In stri_sub, out-of-bound indexes are silently corrected. If from > to, then an empty string is returned.
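The two rules above can be checked with a short sketch, assuming stringi is installed; the result names clipped and reversed are illustrative.

```r
library(stringi)

# to runs past the end of the string and is clipped to it
clipped <- stri_sub("abc", 2, 10)    # "bc"

# from > to yields an empty string rather than an error
reversed <- stri_sub("abc", 3, 2)    # ""
```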

In stri_sub<-, some configurations of indexes may work as string concatenation at the front, back, or middle.

See Also

Other indexing: stri_locate_all_boundaries, stri_locate_all

Examples

# NOT RUN {
library(stringi)
s <- "Lorem ipsum dolor sit amet, consectetur adipisicing elit."
stri_sub(s, from=1:3*6, to=21)
stri_sub(s, from=c(1,7,13), length=5)
stri_sub(s, from=1, length=1:3)
stri_sub(s, -17, -7)
stri_sub(s, -5, length=4)
(stri_sub(s, 1, 5) <- "stringi")
(stri_sub(s, -6, length=5) <- ".")
(stri_sub(s, 1, 1:3) <- 1:2)

x <- c("a;b", "c:d")
(stri_sub(x, stri_locate_first_fixed(x, ";"), omit_na=TRUE) <- "_")

# }
# NOT RUN {
library(magrittr)  # provides the %>% pipe operator
x %>% stri_sub_replace(1, 5, value="new_substring")
# }
