keyUpdate: Update a key in light of a new data frame (add variables and

Description

The following chores must be handled. 1. If the data.frame has variables which are not currently listed in the variable key's "name_old" variable, then new variables are added to the key. 2. If the data.frame has new values for the previously existing variables, then those values must be added to the keys. 3. If the old key has "name_new" or "class_new" designated for variables, those MUST be preserved in the new key for all new values of those variables.

Usage

keyUpdate(key, dframe, append = TRUE, safeNumericToInteger = TRUE)

Arguments

key

A variable key

dframe

A data.frame object.

append

If long key, should new rows be added to the end of the updated key? Default is TRUE. If FALSE, new rows will be sorted with the original values.

safeNumericToInteger

Default TRUE: Should we treat variables which appear to be integers as integers? In many csv data sets, the values coded c(1, 2, 3) are really integers, not floats c(1.0, 2.0, 3.0). See safeInteger. ## Need to consider implementing this: ## @param ignoreCase

Value

Updated variable key.

Details

This function will not alter key values for "class_old", "value_old" or "value_new" for variables that have no new information.

This function deduces if the key provided is in the wide or long format from the class of the object.

Examples

Run this code

dat1 <- data.frame("Score" = c(1, 2, 3, 42, 4, 2),
                   "Gender" = c("M", "M", "M", "F", "F", "F"))
## First try with a long key
key1 <- keyTemplate(dat1, long = TRUE)
key1[5, "value_new"] <- 10
key1[6, "value_new"] <- "female"
key1[7, "value_new"] <- "male"
key1[key1$name_old == "Score", "name_new"] <- "NewScore"
dat2 <- data.frame("Score" = 7, "Gender" = "other", "Weight" = rnorm(3))
dat2 <- plyr::rbind.fill(dat1, dat2)
dat2 <- dat2[-1,]
keyUpdate(key1, dat2, append = TRUE)
keyUpdate(key1, dat2, append = FALSE)
key1c <- key1
key1c[key1c$name_old == "Score", "class_new"] <- "character"
keyUpdate(key1c, dat2, append = TRUE)
str(dat3 <- keyApply(dat2, key1c))
## Now try a wide key
key1 <- keyTemplate(dat1)
(key1.u <- keyUpdate(key1, dat2))
str(keyApply(dat2, key1.u))
str(keyApply(dat2, key1c))

mydf.key.path <- system.file("extdata", "mydf.key.csv", package = "kutils")
mydf.key <-  keyImport(mydf.key.path)

set.seed(112233)
N <- 20
## The new Jan data arrived!
mydf2 <- data.frame(x5 = rnorm(N),
                    x4 = rpois(N, lambda = 3),
                    x3 = ordered(sample(c("lo", "med", "hi"),
                                       size = N, replace=TRUE),
                                levels = c("med", "lo", "hi")),
                    x2 = letters[sample(c(1:4,6), N, replace = TRUE)],
                    x1 = factor(sample(c("jan"), N, replace = TRUE)),
                    x7 = ordered(letters[sample(c(1:4,6), N, replace = TRUE)]),
                    x6 = sample(c(1:5), N, replace = TRUE),
                    stringsAsFactors = FALSE)
mydf.key2 <- keyUpdate(mydf.key, mydf2)
mydf.key2
mydf.key2["x1", "value_old"] <- "cindy|bobby|jan|peter|marcia|greg"
mydf.key2["x1", "value_new"] <- "Cindy<Bobby<Jan<Peter<Marcia<Greg"

mydf.key.path <- system.file("extdata", "mydf.key.csv", package = "kutils")
mydf.path <- system.file("extdata", "mydf.csv", package = "kutils")
mydf <- read.csv(mydf.path, stringsAsFactors=FALSE)
mydf3 <- rbind(mydf, mydf2)
## Now recode with revised key
mydf4 <- keyApply(mydf3, mydf.key2)
rockchalk::summarize(mydf4)

Run the code above in your browser using DataLab