keyImport: Import a file and clean up for use as variable key

Description

After the researcher has updated the key by filling in new names and values, we import that key file. This function imports the file by its name, after deducing the file type from the suffix.
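As a rough illustration of deducing the file type from the suffix, the sketch below uses base R only. The helper name `guess_key_format` is hypothetical and is not part of kutils; the actual function may handle the suffix differently.

```r
# Hypothetical helper (not a kutils function): read the suffix of a key
# file name and check that it is one of the three documented types.
guess_key_format <- function(file) {
    ext <- tolower(tools::file_ext(file))
    if (!ext %in% c("csv", "xlsx", "rds")) {
        stop("unsupported key file suffix: ", ext)
    }
    ext
}

guess_key_format("mydf.key.csv")
guess_key_format("mydf.key_long.XLSX")
```

The suffix is lower-cased here so that "XLSX" and "xlsx" are treated alike; that choice is an assumption of this sketch.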

Usage

keyImport(file, ignoreCase = TRUE, sep = c(character = "\\|", logical =
  "\\|", integer = "\\|", factor = "[\\|

Arguments

file

A file name, ending in csv, xlsx or rds.

ignoreCase

In the use of this key, should we ignore differences in capitalization of the "name_old" variable? Sometimes there are inadvertent misspellings due to changes in capitalization. Columns named "var01" and "Var01" and "VAR01" probably should receive the same treatment, even if the key has name_old equal to "Var01".
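The effect described for ignoreCase can be pictured with a small base R sketch. The vector `data_cols` and the matching step are assumptions made for illustration and do not reproduce the kutils internals.

```r
# Illustration only: compare column names to name_old without regard to case.
data_cols <- c("var01", "Var01", "VAR01", "var02")
name_old <- "Var01"

# Lower-case both sides before comparing, as ignoreCase = TRUE suggests.
matched <- data_cols[tolower(data_cols) == tolower(name_old)]
matched
```

In this sketch all three spellings of "var01" are selected, while "var02" is not.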

sep

Defaults are specified; do not change them unless you know what you are doing. In wide keys, what separators are used between values? This should be a named vector which declares the separators that are used in the key. In our defaults, the separator for classes character, logical, integer, and numeric is the pipe, "|", while for factor and ordered variables, the separator may be either the pipe or the less-than sign, "<". Supply separator values as regular expressions.
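To show how regular-expression separators split the values of a wide key, here is a minimal base R sketch. The named vector below is an assumption written to match the description above (pipe for character, pipe or less-than for factor); it is not copied from the package defaults.

```r
# Assumed separators for illustration; the pipe must be escaped in a regex.
sep <- c(character = "\\|", factor = "[|<]")

# A character entry uses the pipe only.
chr_values <- strsplit("yes|no|maybe", sep[["character"]])[[1]]

# A factor entry may mix the pipe and the less-than sign.
fac_values <- strsplit("low<medium|high", sep[["factor"]])[[1]]

chr_values
fac_values
```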

na.strings

Values in the value_new column which will be treated as NA in the key. The defaults are ".", "", "\s" (white space), "NA", and "N/A". These defaults prevent a new value like "" or " " from being created, so if one intends to insert white space as a value, the na.strings vector must be specified explicitly.

...

Additional arguments for read.csv or read.xlsx.

keynames

Don't use this unless you are very careful. In our current scheme, the column names in a key should be c("name_old", "name_new", "class_old", "class_new", "value_old", "value_new", "missings", "recodes"). If your key does not use those column names, it is necessary to provide keynames in the format our_name = "your_name". For example, keynames = c(name_old = "oldvar", name_new = "newname", class_old = "vartype", class_new = "class", value_old = "score", value_new = "val").
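The mapping that keynames describes can be sketched in base R as a column rename. The data frame `raw_key` is made up for this illustration, and the renaming step is an assumption about what such a mapping implies, not the package's own code.

```r
# keynames maps our column names (the names) to the user's names (the values).
keynames <- c(name_old = "oldvar", name_new = "newname", class_old = "vartype",
              class_new = "class", value_old = "score", value_new = "val")

# Hypothetical key that uses the user's own column names.
raw_key <- data.frame(oldvar = "x1", newname = "age", vartype = "integer",
                      class = "numeric", score = "1|2|3", val = "1|2|3",
                      stringsAsFactors = FALSE)

# Rename every column that appears in keynames to the standard name.
idx <- match(colnames(raw_key), keynames)
colnames(raw_key)[!is.na(idx)] <- names(keynames)[idx[!is.na(idx)]]
colnames(raw_key)
```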

Value

key object

Details

The key file can be in either wide or long format.

The function cleans up variables in the following ways:
1) name_old and name_new have leading and trailing spaces removed.
2) value_old and value_new have leading and trailing spaces removed, and if they are empty or blank spaces, then new values are set as NA.
3) If value_old and value_new are identical, the values are removed from the key.
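The cleanup steps above can be sketched on a small long-format key using base R. The data frame `key` is invented for this example, and reading step 3 as "drop the row" is one interpretation; the package may implement it differently.

```r
# Hypothetical long-format key with stray spaces and an empty new value.
key <- data.frame(
    name_old  = c(" x1", "x1 ", "x2", "x2"),
    name_new  = c("x1", "x1", "score ", "score"),
    value_old = c("1", " 2 ", "a", "b"),
    value_new = c("1", "", "A", " B"),
    stringsAsFactors = FALSE
)

# Steps 1 and 2: trim leading and trailing spaces in names and values.
cols <- c("name_old", "name_new", "value_old", "value_new")
key[cols] <- lapply(key[cols], trimws)

# Step 2, continued: an empty new value becomes NA.
key$value_new[key$value_new == ""] <- NA

# Step 3 (assumed reading): drop rows where old and new values are identical.
same <- !is.na(key$value_old) & !is.na(key$value_new) &
    key$value_old == key$value_new
key <- key[!same, ]
key
```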

Examples

mydf.key.path <- system.file("extdata", "mydf.key.csv", package = "kutils")
mydf.key <-  keyImport(mydf.key.path)

mydf.keylong.path <- system.file("extdata", "mydf.key_long.csv", package = "kutils")
mydf.keylong <- keyImport(mydf.keylong.path)
