strex (version 1.4.1)

str_extract_numbers: Extract numbers from a string.

Description

Extract the numbers from a string, where decimals, scientific notation and commas (as separators, not as an alternative to the decimal point) are optionally allowed.

Usage

str_extract_numbers(
  string,
  decimals = FALSE,
  leading_decimals = decimals,
  negs = FALSE,
  sci = FALSE,
  commas = FALSE,
  leave_as_string = FALSE
)

Arguments

string

A character vector.

decimals

Do you want to include the possibility of decimal numbers (TRUE) or not (FALSE, the default)?

leading_decimals

Do you want to allow a leading decimal point to be the start of a number?

negs

Do you want to allow negative numbers? Note that double negatives are not handled here (see the examples).

sci

Make the search aware of scientific notation, e.g. 2e3 is read as 2000.

commas

Allow comma separators in numbers, i.e. interpret 1,100 as a single number (one thousand one hundred) rather than two numbers (one and one hundred).

leave_as_string

Do you want to return the number as a string (TRUE) or as numeric (FALSE, the default)?
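As a rough illustration of the sci, commas and leave_as_string arguments described above, the following sketch calls the function on two short inputs taken from the argument descriptions. The variable names are chosen here for illustration only and are not part of the package.

```r
library(strex)

# Without sci, "2e3" is read as two separate numbers (2 and 3);
# with sci = TRUE it is read as a single number, 2000.
no_sci <- str_extract_numbers("2e3")
with_sci <- str_extract_numbers("2e3", sci = TRUE)

# Without commas, "1,100" is read as 1 and 100;
# with commas = TRUE it is read as the single number 1100.
no_commas <- str_extract_numbers("1,100")
with_commas <- str_extract_numbers("1,100", commas = TRUE)

# leave_as_string = TRUE should give character vectors instead of numeric ones.
as_text <- str_extract_numbers("1,100", commas = TRUE, leave_as_string = TRUE)
is.character(as_text[[1]])
```

Each result is a list with one element per input string, so `[[1]]` picks out the numbers found in the single string passed in.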

Value

For str_extract_numbers and str_extract_non_numerics, a list of numeric or character vectors, one list element for each element of string. For str_nth_number and str_nth_non_numeric, a numeric or character vector the same length as the vector string.

Details

If any part of a string contains an ambiguous number (e.g. 1.2.3 is ambiguous if decimals = TRUE, but not otherwise), the value returned for that string will be NA and a warning will be issued.
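One possible way to deal with the ambiguity warning programmatically is sketched below. The helper extract_with_warnings() is hypothetical (it is not part of strex); it collects any warnings raised during extraction using base R's withCallingHandlers(), so that the ambiguous inputs can be located afterwards.

```r
library(strex)

# Hypothetical helper, not part of strex: runs str_extract_numbers() and
# collects warning messages instead of letting them print.
extract_with_warnings <- function(x, ...) {
  warnings_seen <- character()
  result <- withCallingHandlers(
    str_extract_numbers(x, ...),
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(result = result, warnings = warnings_seen)
}

out <- extract_with_warnings(c("22", "1.2.3"), decimals = TRUE)

# Flag the inputs whose result contains NA (the ambiguous ones).
ambiguous <- vapply(out$result, anyNA, logical(1))
ambiguous
out$warnings
```

Whether to muffle the warning or let it propagate is a design choice; this sketch muffles it only so that the messages can be inspected alongside the result.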

With scientific notation, it is assumed that the exponent is not a decimal number, e.g. 2e2.4 is unacceptable. Commas, however, are acceptable in the exponent, so 2e1,100 is fine and equal to 2e1100, provided the option to allow commas in numbers has been turned on.

Numbers outside the double precision floating point range (i.e. with absolute value greater than 1.797693e+308) are read as Inf (or -Inf if they begin with a minus sign). This is what base::as.numeric() does.
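The base R behaviour referred to above can be checked directly, without strex. This is a small sketch using only base::as.numeric() and .Machine; the variable names are illustrative.

```r
# Largest finite double on this platform (about 1.797693e+308).
max_double <- .Machine$double.xmax

# A value inside the representable range stays finite.
in_range <- as.numeric("1e308")

# Values beyond the range overflow to Inf or -Inf.
too_big <- as.numeric("2e308")
too_small <- as.numeric("-2e308")

is.finite(in_range)
c(too_big, too_small)
```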

See Also

Other numeric extractors: str_nth_number_after_mth(), str_nth_number_before_mth(), str_nth_number()

Examples

# NOT RUN {
library(strex)
strings <- c(
  "abc123def456", "abc-0.12def.345", "abc.12e4def34.5e9",
  "abc1,100def1,230.5", "abc1,100e3,215def4e1,000"
)
str_extract_numbers(strings)
str_extract_numbers(strings, decimals = TRUE)
str_extract_numbers(strings, decimals = TRUE, leading_decimals = TRUE)
str_extract_numbers(strings, commas = TRUE)
str_extract_numbers(strings,
  decimals = TRUE, leading_decimals = TRUE,
  sci = TRUE
)
str_extract_numbers(strings,
  decimals = TRUE, leading_decimals = TRUE,
  sci = TRUE, commas = TRUE, negs = TRUE
)
str_extract_numbers(strings,
  decimals = TRUE, leading_decimals = FALSE,
  sci = FALSE, commas = TRUE, leave_as_string = TRUE
)
str_extract_numbers(c("22", "1.2.3"), decimals = TRUE)
# }
