str_sub

Extract substrings from a character vector.

Keywords
character
Usage
str_sub(string, start = 1L, end = -1L)
Arguments
string
input character vector.
start
integer vector giving the position of the first character in the substring; defaults to the first character. If negative, counts backwards from the last character.
end
integer vector giving the position of the last character in the substring; defaults to the last character. If negative, counts backwards from the last character.
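
The negative-position rule described for start and end can be sketched with a small base R helper. `resolve_pos` is a hypothetical name introduced here for illustration; it is not part of stringr and is not necessarily how the package implements the rule. It only shows the arithmetic implied by the descriptions above, where -1 maps to the last character.

```r
# Hypothetical helper (not part of stringr): map a possibly negative
# position onto a 1-based position counted from the front of the string.
resolve_pos <- function(pos, n) {
  ifelse(pos < 0L, n + 1L + pos, pos)
}

hw <- "Hadley Wickham"
n <- nchar(hw)  # 14 characters

# A start of -7 resolves to position 8; an end of -7 also resolves to 8.
last_seven  <- substr(hw, resolve_pos(-7L, n), n)
all_but_six <- substr(hw, 1L, resolve_pos(-7L, n))

last_seven   # "Wickham"
all_but_six  # "Hadley W"
```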
Details

str_sub will recycle all arguments to be the same length as the longest argument. If any arguments are of length 0, the output will be a zero-length character vector.
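
A minimal sketch of the recycling behaviour, assuming stringr is installed. The string `"abcdef"` and the variable names are made up for this illustration.

```r
library(stringr)

x <- "abcdef"

# string has length 1 while start and end have length 3, so the single
# string is reused for each (start, end) pair.
parts <- str_sub(x, 1:3, 4:6)
parts          # "abcd" "bcde" "cdef"

# A zero-length input gives a zero-length character vector.
empty <- str_sub(character(0), 1, 2)
length(empty)  # 0
```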

Substrings are inclusive: they include the characters at both the start and end positions. str_sub(string, 1, -1) will return the complete string, from the first character to the last.
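
To illustrate the inclusive bounds, here is a short sketch, again assuming stringr is installed and using a made-up input string.

```r
library(stringr)

x <- "abcdef"

# Positions 2 and 4 are both included, so the result has 4 - 2 + 1 = 3 characters.
middle <- str_sub(x, 2, 4)
middle               # "bcd"

# start = 1 and end = -1 span the whole string.
whole <- str_sub(x, 1, -1)
identical(whole, x)  # TRUE
```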

Value

  • character vector of substrings from start to end (inclusive). Its length is that of the longest input argument.

See Also

substring, which this function wraps, and str_sub_replace for the replacement version.

Aliases
  • str_sub
Examples
hw <- "Hadley Wickham"

str_sub(hw, 1, 6)
str_sub(hw, end = 6)
str_sub(hw, 8, 14)
str_sub(hw, 8)
str_sub(hw, c(1, 8), c(6, 14))

str_sub(hw, -1)
str_sub(hw, -7)
str_sub(hw, end = -7)

str_sub(hw, seq_len(str_length(hw)))
str_sub(hw, end = seq_len(str_length(hw)))
Documentation reproduced from package stringr, version 0.5, License: GPL-2

Community examples

antoine.fabri@gmail.com at Jun 13, 2018 stringr v1.3.1

Comparison to `base::substr`. We take the examples from the documentation, with slight alterations.

```r
hw <- "Hadley Wickham"
```

## Same basic use

```r
identical(str_sub(hw, 1, 6), substr(hw, 1, 6))
# [1] TRUE
identical(str_sub(hw, 8, 14), substr(hw, 8, 14))
# [1] TRUE
```

## `substr` doesn't have default values

```r
str_sub(hw, end = 6)
# [1] "Hadley"
substr(hw, stop = 6)
# Error in substr(hw, stop = 6) :
#   argument "start" is missing, with no default
identical(str_sub(hw, end = 6), substr(hw, 1, 6))
# [1] TRUE

str_sub(hw, 8)
# [1] "Wickham"
substr(hw, start = 8)
# Error in substr(hw, start = 8) :
#   argument "stop" is missing, with no default
identical(str_sub(hw, 8), substr(hw, 8, 14))
# [1] TRUE
```

## Different ways of dealing with negative indices

For `substr`, a negative value for `start` is equivalent to setting it to `1`, and a negative value for `stop` is equivalent to setting it to `0`. For `str_sub`, it means counting from the end, with the last position being `-1`.

```r
str_sub(hw, -1)
# [1] "m"
substr(hw, -1, 14)
# [1] "Hadley Wickham"
identical(str_sub(hw, -1), substr(hw, 14 + 1 - 1, 14))
# [1] TRUE

str_sub(hw, end = -7)
# [1] "Hadley W"
substr(hw, 1, -7)
# [1] ""
identical(str_sub(hw, end = -7), substr(hw, 1, 14 + 1 - 7))
# [1] TRUE
```

## Vectorisation

`substr` does not vectorise over `start` and `stop` by default: with a single input string, only their first elements are used.

```r
str_sub(hw, c(1, 8), c(6, 14))
# [1] "Hadley"  "Wickham"
substr(hw, c(1, 8), c(6, 14))
# [1] "Hadley"
identical(str_sub(hw, c(1, 8), c(6, 14)),
          Vectorize(substr, USE.NAMES = FALSE)(hw, c(1, 8), c(6, 14)))
# [1] TRUE

str_sub(hw, seq_len(str_length(hw)))
identical(str_sub(hw, seq_len(str_length(hw))),
          Vectorize(substr, USE.NAMES = FALSE)(hw, seq_len(str_length(hw)), 14))
# [1] TRUE
identical(str_sub(hw, end = seq_len(str_length(hw))),
          Vectorize(substr, USE.NAMES = FALSE)(hw, 1, seq_len(str_length(hw))))
# [1] TRUE
```

`substr` doesn't support passing a two-column matrix as the 2nd argument:

```r
pos <- str_locate_all(hw, "[aeio]")[[1]]
str_sub(hw, pos)
str_sub(hw, pos[, 1], pos[, 2])
identical(str_sub(hw, pos),
          Vectorize(substr, USE.NAMES = FALSE)(hw, pos[, 1], pos[, 2]))
# [1] TRUE
```

## Basic replacement form is the same

```r
x <- x2 <- "BBCDEF"
str_sub(x, 1, 1) <- "A"
substr(x2, 1, 1) <- "A"
identical(x, x2)
# [1] TRUE
```

But here again, `substr<-` has no default arguments, and negative indices don't mean the same thing.

## Replacing by empty string not supported by `substr<-`

```r
str_sub(x, 1, 3) <- ""; x
# [1] "DEF"
substr(x2, 1, 3) <- ""; x2
# [1] "ABCDEF"
```

## Dealing with NAs

`substr<-` returns an error when assigning NA. `str_sub<-` has an `omit_na` parameter to ignore problematic assignments.

```r
x1 <- x2 <- x3 <- x4 <- x1b <- x2b <- "AAA"

str_sub(x1, 1, NA) <- "B"; x1
substr(x1b, 1, NA) <- "B"; x1b
identical(x1, x1b)
# [1] TRUE

str_sub(x2, 1, 2) <- NA; x2
# [1] NA
substr(x2b, 1, 2) <- NA; x2b
# Error in `substr<-`(`*tmp*`, 1, 2, value = NA) : invalid value

str_sub(x3, 1, NA, omit_na = TRUE) <- "B"; x3
# [1] "AAA"
str_sub(x4, 1, 2, omit_na = TRUE) <- NA; x4
# [1] "AAA"
```