Ecfun (version 0.1-0)

asNumericDF: Coerce to numeric dropping commas and info after a blank

Description

Delete commas (thousand separators) and drop information after a blank, then coerce to numeric and order the rows by the columns given in orderBy. Some Excel imports include commas as thousand separators; this replaces any commas with the empty string, ''. Also, some character data includes footnote references following the year. Table F-1 from the US Census Bureau needs all three of these features: it needs orderBy, because the most recent year appears first, the opposite of most other data sets, where the most recent year appears last; it has footnote references following the character string indicating the year; and it includes commas as thousand separators.

Usage

asNumericChar(x)
asNumericDF(x, keep=function(x)any(!is.na(x)), orderBy)

Arguments

x
For asNumericChar, this is a character vector to be converted to numeric after gsub(',', '', x). For asNumericDF, this is a data.frame with all character columns to be converted to numeric.
keep
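Something to indicate which columns to keep. The default, function(x) any(!is.na(x)), keeps every column that has at least one non-missing value.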
orderBy
Which columns to order the rows of x[, keep] by. By default, the rows are ordered ascending on the first column.

Value

  • a data.frame with all columns numeric

Details

1. Replace commas by nothing.
2. strsplit on ' ' and take only the first part, thereby eliminating the footnote references.
3. Replace any blanks with NAs.
4. as.numeric.
5. lapply(x, 1-4).
6. Order the rows; by default, ascending on the first column.
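
The steps above can be sketched in base R roughly as follows. This is a minimal illustration, not the Ecfun source code: the helper names clean_char_sketch and as_numeric_df_sketch and the order_by argument are hypothetical, and the sketch assumes that keep is applied to each column after conversion.

```r
# Hypothetical helpers illustrating steps 1-6; NOT the Ecfun implementation.

# Steps 1-4 for a single vector
clean_char_sketch <- function(x) {
  x <- gsub(",", "", as.character(x))            # 1. remove thousand separators
  x <- vapply(strsplit(x, " ", fixed = TRUE),    # 2. keep the part before the first blank
              function(parts) if (length(parts) > 0) parts[1] else "",
              character(1))
  x[!is.na(x) & x == ""] <- NA                   # 3. empty strings become NA
  as.numeric(x)                                  # 4. coerce to numeric
}

# Steps 5-6 for a data.frame
as_numeric_df_sketch <- function(df,
                                 keep = function(col) any(!is.na(col)),
                                 order_by = 1) {
  # 5. apply steps 1-4 to every column
  out <- as.data.frame(lapply(df, clean_char_sketch))
  # assumption: keep is a predicate evaluated on each converted column
  out <- out[, vapply(out, keep, logical(1)), drop = FALSE]
  # 6. order the rows, ascending, by the order_by columns (first column by default)
  ord <- do.call(order, unname(as.list(out[order_by])))
  out[ord, , drop = FALSE]
}

fake <- data.frame(yr  = c("1948", "1947 (1)"),
                   q1  = c("1,234", ""),
                   duh = rep(NA, 2))
res <- as_numeric_df_sketch(fake)
```

With this input, res should contain only the columns yr and q1, with the 1947 row first; the all-NA column duh is dropped by the default keep predicate.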

See Also

scan, gsub, Quotes

Examples

fakeF1 <- data.frame(yr=c('1948', '1947 (1)'),
                     q1=c('1,234', ''), duh=rep(NA, 2) )
nF1 <- asNumericDF(fakeF1)

nF1. <- data.frame(yr=asNumericChar(fakeF1$yr),
                   q1=asNumericChar(fakeF1$q1))[2:1,]

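# asNumericDF orders the rows ascending on the first column (yr),
# so the original second row comes first
row.names(nF1.) <- 2:1

# correct answer
nF1c <- data.frame(yr=1947:1948, q1=c(NA, 1234))
row.names(nF1c) <- 2:1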

stopifnot(
all.equal(nF1, nF1.)
)
stopifnot(
all.equal(nF1, nF1c)
)