codelist

codelist is a package that should make it easier to work with code lists. A code list is a set of codes with associated values. For example, one could have the codes ‘E’ = ‘Employed’, ‘U’ = ‘Unemployed’, ‘N’ = ‘Not belonging to the working population’ and ‘X’ = ‘Unknown’. Codes can also have a description, be part of a hierarchy, and indicate specific types of missing values.

Quick Overview

library(codelist)

First we define the code list. Here it is defined manually; in practice a code list will often be read from a file. A code list has, at minimum, codes and corresponding labels. This one also has a column indicating which codes can be interpreted as missing values.

cl <- codelist(
  codes = c("E", "U", "N", "X"), 
  labels = c("Employed", "Unemployed", 
    "Not belonging to working population", "Unknown"),
  missing = c(0, 0, 0, 1)
)
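The paragraph above notes that code lists are often read from a file. The following is a minimal sketch of that, assuming a CSV file with columns named `code`, `label` and `missing`; the file layout and column names are assumptions made for illustration, not something the package prescribes. The file is read with base R and then passed to `codelist()` with the same arguments as in the manual definition. That last call only runs when the package is installed.

```r
# Write a small example file so the sketch is self-contained.
# The column names code/label/missing are hypothetical.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "code,label,missing",
  "E,Employed,0",
  "U,Unemployed,0",
  "N,Not belonging to working population,0",
  "X,Unknown,1"
), path)

# Read codes and labels as character so that codes such as "T" or "01"
# are not converted to logical or numeric values.
cl_df <- read.csv(
  path,
  colClasses = c(code = "character", label = "character", missing = "integer")
)

# Build the code list from the columns, as in the manual definition.
if (requireNamespace("codelist", quietly = TRUE)) {
  cl <- codelist::codelist(
    codes = cl_df$code,
    labels = cl_df$label,
    missing = cl_df$missing
  )
}
```

Fixing the column classes up front is a deliberate choice: automatic type conversion can silently change codes that look like numbers or logical values.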

We can use this code list to define a ‘coded’ vector, which is a vector of codes with an attribute ‘codelist’.

x <- coded(c("N", "E", "E", "U", NA, "E", "N", "X", "N"), cl)
x
## [1] N    E    E    U    <NA> E    N    X    N   
## 4 Codelist: E(=Employed) ...X(=Unknown)

The general idea of the codelist package is that we work with the codes, as these are generally the most accurate. However, for presentation and statistical analyses we will often want to work with the labels. The labels method transforms the vector into a factor for analysis and presentation:

table(x)
## x
## E N U X 
## 3 3 1 1 
table(labels(x, missing = FALSE), useNA = "ifany")
## 
##                            Employed                          Unemployed 
##                                   3                                   1 
## Not belonging to working population                             Unknown 
##                                   3                                   1 
##                                <NA> 
##                                   1 
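Conceptually, the conversion from codes to labels resembles building a factor whose levels are the codes and whose labels come from the code list. The base R sketch below illustrates only that idea. It is not the package's implementation, and the variable names are made up for the example.

```r
codes  <- c("E", "U", "N", "X")
labels <- c("Employed", "Unemployed",
  "Not belonging to working population", "Unknown")

x_raw <- c("N", "E", "E", "U", NA, "E", "N", "X", "N")

# The levels are the codes; `labels` replaces each level by its label.
# Values that are NA stay NA in the resulting factor.
x_lab <- factor(x_raw, levels = codes, labels = labels)

# Count the labels, including the NA value.
counts <- table(x_lab, useNA = "ifany")
counts
```

The factor levels follow the order of the code list rather than alphabetical order, which is why the labelled table above lists the categories in the order in which they were defined.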

The code list can also be used to check whether codes are valid, which makes the code safer:

try( x[1] <- "A" ) 
## Error in `[<-.coded`(`*tmp*`, 1, value = "A") : 
##   Invalid codes used in value.
try( any(x == "B") )
## Error in Ops.coded(x, "B") : Invalid codes used in RHS
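The kind of check behind these errors can be sketched in base R: every non-missing value has to be one of the codes in the code list. `check_codes()` is a hypothetical helper written for this example; it is not part of the package and does not reproduce its error messages.

```r
valid_codes <- c("E", "U", "N", "X")

# Hypothetical helper: stop when a value is not in the code list.
# NA is allowed, as in the coded vector above.
check_codes <- function(values, valid) {
  bad <- unique(values[!is.na(values) & !(values %in% valid)])
  if (length(bad) > 0) {
    stop("Invalid codes used: ", paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

# Valid codes and NA pass silently.
check_codes(c("N", "E", NA), valid_codes)

# An unknown code raises an error, which try() captures here.
res <- try(check_codes("A", valid_codes), silent = TRUE)
inherits(res, "try-error")
```

Failing early on an unknown code catches typos that a plain character comparison would silently turn into `FALSE`.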

In this case the codes are somewhat readable. In general, however, it is difficult to tell from a bare code what lines like the ones above mean. For someone reading the code, labels are easier to work with:

x[1] <- as.label("Employed")
x[is.missing(x)] <- as.label("Unemployed")

Of course, using invalid labels will generate an error.
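Translating a label back to its code amounts to a lookup in the code list. The following is a minimal base R sketch of such a lookup, with `label_to_code()` as a hypothetical helper. The package provides its own functions for this, such as `as.label()` used above, and this sketch is not their implementation.

```r
codes  <- c("E", "U", "N", "X")
labels <- c("Employed", "Unemployed",
  "Not belonging to working population", "Unknown")

# Hypothetical helper: look up the code for each label and
# stop when a label does not occur in the code list.
label_to_code <- function(label, codes, labels) {
  i <- match(label, labels)
  if (anyNA(i)) {
    stop("Invalid labels used: ", paste(label[is.na(i)], collapse = ", "))
  }
  codes[i]
}

x_raw <- c("N", "E", "X")

# Assign by label; the stored value is still the code "E".
x_raw[1] <- label_to_code("Employed", codes, labels)
x_raw

# An unknown label raises an error, which try() captures here.
res <- try(label_to_code("Retired", codes, labels), silent = TRUE)
inherits(res, "try-error")
```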

More information

More information can be found in the vignettes of the package.

Install

install.packages('codelist')

Monthly Downloads

178

Version

0.1.0

License

GPL-3

Maintainer

Jan van der Laan

Last Published

February 20th, 2025

Functions in codelist (0.1.0)

format.code    Format a code object for pretty printing
in_labels      Match codes based on label
codes          Get the codes belonging to given labels
levelcast      Recode codes to a higher level in a hierarchy
objectcodes    Example code list for object types
objectsales    Example data set to demonstrate working with code lists
as.code        Convert object to code
cl_is_valid    Check if the codelist is valid
cl_filter      Filter a code list
cl_levels      Get the hierarchical level for each code in a code list
cl             Get the code list associated with the object
as.label       Label character vector as label to use in comparisons with a code vector
code           Code vector
cl_locale      Get the locale to use with the codelist
cl_nlevels     Get the number of hierarchical levels in a code list
as.codelist    Convert an object to a codelist object
codelist       Create a codelist object
is.codelist    Check if an object is a Code List
is.code        Check if object is a code
labels.code    Convert vector with codes to factor using a code list
is.missing     Find out which elements of a vector have missing values