filesstrings (version 2.2.0)

extract_numbers: Extract numbers (or non-numbers) from a string.

Description

extract_numbers extracts the numbers from a string, with decimals optionally allowed. extract_non_numerics extracts the parts of the string that extract_numbers does not. nth_number is a convenient wrapper for extract_numbers that lets you choose which number you want; nth_non_numeric does the same for extract_non_numerics. Please read the examples at the bottom of this page to make sure you understand how these functions work and what their limitations are. These functions are vectorized over string.

Usage

extract_numbers(string, leave_as_string = FALSE, decimals = FALSE,
  leading_decimals = FALSE, negs = FALSE)

extract_non_numerics(string, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

nth_number(string, n, leave_as_string = FALSE, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

first_number(string, leave_as_string = FALSE, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

last_number(string, leave_as_string = FALSE, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

nth_non_numeric(string, n, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

first_non_numeric(string, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

last_non_numeric(string, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

Arguments

string

A string, or a character vector of strings (these functions are vectorized over string).

leave_as_string

Do you want to return the number as a string (TRUE) or as numeric (FALSE, the default)?

decimals

Do you want to allow decimal numbers (TRUE) or not (FALSE, the default)?

leading_decimals

Do you want to allow a leading decimal point to be the start of a number (TRUE) or not (FALSE, the default)?

negs

Do you want to allow negative numbers (TRUE) or not (FALSE, the default)? Note that double negatives are not handled here (see the examples).

n

The index of the number (or non-numeric) that you seek. Negative indexing is allowed: n = 1 gives the first number (or non-numeric), n = -1 gives the last, n = -2 gives the second-to-last, and so on.
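As a brief illustration of how n selects a number by position, the sketch below uses a made-up input string. It assumes the filesstrings package is installed and attached.

```r
library(filesstrings)

x <- "a1b22c333"  # made-up input containing three integers

nth_number(x, n = 1)   # first number
nth_number(x, n = 2)   # second number
nth_number(x, n = -1)  # last number
nth_number(x, n = -2)  # second-to-last number
```

Counting from the end with a negative n is handy when you do not know in advance how many numbers a string contains.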

Value

For extract_numbers and extract_non_numerics, a list of numeric or character vectors, one list element for each element of string. For nth_number and nth_non_numeric, a vector the same length as string (as in length(string), not nchar(string)).
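The difference between the two return shapes can be checked directly. This is a small sketch with made-up inputs, assuming the filesstrings package is installed and attached.

```r
library(filesstrings)

strings <- c("abc123abc456", "ab1c")  # made-up inputs

# extract_numbers() returns a list with one element per input string.
nums <- extract_numbers(strings)
is.list(nums)
length(nums) == length(strings)

# first_number() (a wrapper around nth_number()) returns a vector
# with one entry per input string.
firsts <- first_number(strings)
length(firsts) == length(strings)
```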

Details

If any part of a string contains an ambiguous number, the value returned for that string will be NA. For example, 1.2.3 is ambiguous if decimals = TRUE, but not otherwise. Note that these functions do not recognize scientific notation (e.g. 1e6 for 1000000).
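The two caveats above can be explored with a short sketch. The inputs are made up for illustration, and the code assumes the filesstrings package is installed and attached.

```r
library(filesstrings)

# With decimals = FALSE (the default), "1.2.3" is simply three integers.
ints <- extract_numbers("abc1.2.3")

# With decimals = TRUE the same text is ambiguous, so NA is expected.
ambiguous <- extract_numbers("abc1.2.3", decimals = TRUE)

# Scientific notation is not recognized: "1e6" is not read as 1000000.
sci <- extract_numbers("1e6")

ints
ambiguous
sci
```

If your data may contain scientific notation, consider converting those values before calling these functions.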

  • first_number(...) is just nth_number(..., n = 1).

  • last_number(...) is just nth_number(..., n = -1).

  • first_non_numeric(...) is just nth_non_numeric(..., n = 1).

  • last_non_numeric(...) is just nth_non_numeric(..., n = -1).
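The equivalences listed above can be confirmed with a quick check. This sketch uses made-up inputs and assumes the filesstrings package is installed and attached; going by the list above, each comparison is expected to be TRUE.

```r
library(filesstrings)

x <- c("abc12def34", "x5y6z7")  # made-up inputs

same_first_num <- identical(first_number(x), nth_number(x, n = 1))
same_last_num  <- identical(last_number(x), nth_number(x, n = -1))
same_first_non <- identical(first_non_numeric(x), nth_non_numeric(x, n = 1))
same_last_non  <- identical(last_non_numeric(x), nth_non_numeric(x, n = -1))

c(same_first_num, same_last_num, same_first_non, same_last_non)
```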

Examples

# NOT RUN {
library(filesstrings)

# Integers only (the default): a decimal point is treated as non-numeric.
extract_numbers(c("abc123abc456", "abc1.23abc456"))

# Decimals allowed; compare the results with and without leading_decimals.
extract_numbers(c("abc1.23abc456", "abc1..23abc456"), decimals = TRUE)
extract_numbers("abc1..23abc456", decimals = TRUE)
extract_numbers("abc1..23abc456", decimals = TRUE, leading_decimals = TRUE)
extract_numbers("abc1..23abc456", decimals = TRUE, leading_decimals = TRUE,
                leave_as_string = TRUE)

# Negative numbers; double negatives are not handled.
extract_numbers("-123abc456")
extract_numbers("-123abc456", negs = TRUE)
extract_numbers("--123abc456", negs = TRUE)

# The non-numeric parts, under the same options.
extract_non_numerics("abc123abc456")
extract_non_numerics("abc1.23abc456")
extract_non_numerics("abc1.23abc456", decimals = TRUE)
extract_non_numerics("abc1..23abc456", decimals = TRUE)
extract_non_numerics("abc1..23abc456", decimals = TRUE,
                     leading_decimals = TRUE)
extract_non_numerics(c("-123abc456", "ab1c"))
extract_non_numerics("-123abc456", negs = TRUE)
extract_non_numerics("--123abc456", negs = TRUE)

# Ambiguous numbers such as 1.2.3 when decimals = TRUE.
extract_numbers(c(rep("abc1.2.3", 2), "a1b2.2.3", "e5r6"), decimals = TRUE)
extract_numbers("ab.1.2", decimals = TRUE, leading_decimals = TRUE)

# Selecting by position, including negative indices.
nth_number("abc1.23abc456", 2)
nth_number("abc1.23abc456", 2, decimals = TRUE)
nth_number("-123abc456", -2, negs = TRUE)
nth_non_numeric("--123abc456", 1)
nth_non_numeric("--123abc456", -2)
# }
