
filesstrings (version 1.1.0)

extract_numbers: Extract numbers (or non-numbers) from a string.

Description

extract_numbers extracts the numbers from a string, optionally allowing decimals. extract_non_numerics extracts the parts of the string that extract_numbers does not. nth_number is a convenience wrapper around extract_numbers that lets you choose which number you want; nth_non_numeric does the same for extract_non_numerics. Please read the examples at the bottom of this page to make sure you understand how these functions work and what their limitations are. These functions are vectorised over string.
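To make the split between numbers and non-numerics concrete, here is a minimal base R sketch of the idea for the integer-only case. It is an illustration under the assumption that a "number" is a run of digits; it is not the package's implementation, and the variable names are hypothetical.

```r
# Base R illustration only; filesstrings is not used here.
x <- "abc123abc456"
m <- gregexpr("[0-9]+", x)            # locate every run of digits

# The digit runs themselves, converted to numeric.
numbers <- as.numeric(regmatches(x, m)[[1]])

# Everything between the digit runs; drop the empty pieces at the edges.
pieces <- regmatches(x, m, invert = TRUE)[[1]]
non_numerics <- pieces[nzchar(pieces)]

numbers       # 123 456
non_numerics  # "abc" "abc"
```

The two results partition the string: every character ends up in exactly one of them.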

Usage

extract_numbers(string, leave_as_string = FALSE, decimals = FALSE,
  leading_decimals = FALSE, negs = FALSE)

extract_non_numerics(string, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

nth_number(string, n = 1, leave_as_string = FALSE, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

nth_non_numeric(string, n = 1, decimals = FALSE, leading_decimals = FALSE, negs = FALSE)

Arguments

string

A character vector. The functions are vectorised over this argument.

leave_as_string

Do you want to return the number as a string (TRUE) or as numeric (FALSE, the default)?

decimals

Do you want to include the possibility of decimal numbers (TRUE) or not (FALSE, the default)?

leading_decimals

Do you want to allow a leading decimal point to be the start of a number?

negs

Do you want to allow negative numbers? Note that double negatives are not handled here (see the examples).

n

The index of the number (or non-numeric) that you seek. Negative indexing is allowed: n = 1 (the default) gives the first number (or non-numeric), n = -1 gives the last, n = -2 gives the second-last, and so on.
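The negative indexing described for n can be sketched in base R as follows. pick_nth is a hypothetical helper written for this illustration, not a filesstrings function, and returning NA for an out-of-range index is an assumption rather than documented package behaviour.

```r
# Hypothetical helper: resolve a possibly negative index n against a vector x.
pick_nth <- function(x, n = 1) {
  stopifnot(length(n) == 1, n != 0)
  # A negative n counts back from the end: -1 is the last element.
  i <- if (n < 0) length(x) + n + 1 else n
  # Assumption: an index outside the vector yields NA.
  if (i < 1 || i > length(x)) return(NA)
  x[[i]]
}

nums <- c(123, 456, 789)
pick_nth(nums, 1)   # 123, the first element
pick_nth(nums, -1)  # 789, the last element
pick_nth(nums, -2)  # 456, the second-last element
```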

Value

For extract_numbers and extract_non_numerics, a list of numeric or character vectors, one list element for each element of string. For nth_number and nth_non_numeric, a vector the same length as string (as in length(string), not nchar(string)).
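The difference between the two return shapes can be illustrated with base R alone. This sketch assumes integer-only extraction and uses hypothetical variable names. It mirrors the shapes described above and does not reproduce the package's code.

```r
strings <- c("abc123abc456", "ab1c", "none")

# List shape: one element per input string, each holding that string's numbers.
digit_runs <- regmatches(strings, gregexpr("[0-9]+", strings))
all_numbers <- lapply(digit_runs, as.numeric)
length(all_numbers) == length(strings)  # TRUE

# Vector shape: one value per input string (here the first number, NA if none).
first_numbers <- vapply(
  all_numbers,
  function(v) if (length(v) > 0) v[[1]] else NA_real_,
  numeric(1)
)
first_numbers  # 123 1 NA
```

A list is needed in the first case because each string can contain a different count of numbers, whereas picking a single item per string always fits in a plain vector.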

Details

If any part of a string contains an ambiguous number, the value returned for that string will be NA. For example, 1.2.3 is ambiguous if decimals = TRUE, but not otherwise. Note that these functions do not recognise scientific notation (e.g. 1e6 for 1000000).
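Both limitations can be seen with a small base R sketch. looks_ambiguous is a hypothetical check written for this illustration; the rule it encodes (a digit run followed by two or more ".digits" groups) is an assumption about what counts as ambiguous, not the package's own test.

```r
# Hypothetical check: "1.2.3" could be read as 1.2 then 3, or as 1 then 2.3,
# so no single decimal reading exists.
looks_ambiguous <- function(x) grepl("[0-9]+(\\.[0-9]+){2,}", x)

looks_ambiguous(c("abc1.2.3", "abc1.23abc456", "e5r6"))  # TRUE FALSE FALSE

# A digits-only pattern has no notion of exponents, so "1e6" splits at the "e".
regmatches("1e6", gregexpr("[0-9]+", "1e6"))[[1]]  # "1" "6"
```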

Examples

# NOT RUN {
extract_numbers(c("abc123abc456", "abc1.23abc456"))
extract_numbers(c("abc1.23abc456", "abc1..23abc456"), decimals = TRUE)
extract_numbers("abc1..23abc456", decimals = TRUE)
extract_numbers("abc1..23abc456", decimals = TRUE, leading_decimals = TRUE)
extract_numbers("abc1..23abc456", decimals = TRUE, leading_decimals = TRUE,
                leave_as_string = TRUE)
extract_numbers("-123abc456")
extract_numbers("-123abc456", negs = TRUE)
extract_numbers("--123abc456", negs = TRUE)
extract_non_numerics("abc123abc456")
extract_non_numerics("abc1.23abc456")
extract_non_numerics("abc1.23abc456", decimals = TRUE)
extract_non_numerics("abc1..23abc456", decimals = TRUE)
extract_non_numerics("abc1..23abc456", decimals = TRUE,
                     leading_decimals = TRUE)
extract_non_numerics(c("-123abc456", "ab1c"))
extract_non_numerics("-123abc456", negs = TRUE)
extract_non_numerics("--123abc456", negs = TRUE)
extract_numbers(c(rep("abc1.2.3", 2), "a1b2.2.3", "e5r6"), decimals = TRUE)
extract_numbers("ab.1.2", decimals = TRUE, leading_decimals = TRUE)
nth_number("abc1.23abc456", 2)
nth_number("abc1.23abc456", 2, decimals = TRUE)
nth_number("-123abc456", -2, negs = TRUE)
extract_non_numerics("--123abc456", negs = TRUE)
nth_non_numeric("--123abc456", 1)
nth_non_numeric("--123abc456", -2)
# }