DDIwR (version 0.12)

getMetadata: Extract metadata information

Description

Extract a list containing the variable labels, value labels and any available information about missing values.

Usage

getMetadata(x, save = FALSE, declared = TRUE, OS = "Windows", encoding = "UTF-8", ...)

Value

An R list roughly equivalent to a DDI codebook, containing all variables with their corresponding variable labels and value labels, plus information about missing values where these are found in the imported data.

Arguments

x

A path to a file, or a data frame object

save

Logical, save an .R file in the same directory

declared

Logical, embed the data as a declared object

OS

The target operating system, which determines the end of line (eol) separator used when saving the file

encoding

The character encoding used to read a file

...

Additional arguments for this function (internal use only)

Author

Adrian Dusa

Details

This function reads an XML file containing a DDI codebook version 2.5, or an SPSS or Stata file and returns a list containing the variable labels, value labels, plus some other useful information.

It additionally attempts to automatically detect a type for each variable:

cat: categorical variable using numeric values
catchar: categorical variable using character values
catnum: categorical variable for which numerical summaries can be calculated (e.g. a 0...10 Likert response scale)
num: numerical variable
numcat: numerical variable with very few values (e.g. number of children), for which a table of frequencies is possible in addition to numerical summaries
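
To make the five categories above more concrete, here is a minimal sketch of how such a classification could be approached in base R. The function guess_type() and its max_few cutoff are hypothetical and are not part of DDIwR. The rules are a simplified assumption for illustration only, and the package's actual detection logic may differ.

```r
# Hypothetical helper: a simplified heuristic, NOT the DDIwR implementation.
# `labels` is a named vector of value labels (possibly NULL);
# `max_few` is an assumed cutoff for "very few values".
guess_type <- function(values, labels = NULL, max_few = 10) {
    values <- values[!is.na(values)]
    has_labels <- length(labels) > 0

    # character values are treated as categorical
    if (is.character(values)) {
        return("catchar")
    }

    if (has_labels) {
        # every observed value has a label: purely categorical
        if (all(values %in% labels)) {
            return("cat")
        }
        # only some values are labelled (e.g. scale endpoints),
        # so numerical summaries are still meaningful
        return("catnum")
    }

    # no labels: numeric, but a frequency table is feasible for few values
    if (length(unique(values)) <= max_few) {
        return("numcat")
    }
    "num"
}

guess_type(c("a", "b", "a"))                            # "catchar"
guess_type(c(1, 2, 2, 1), labels = c(Yes = 1, No = 2))  # "cat"
guess_type(0:10, labels = c(Low = 0, High = 10))        # "catnum"
guess_type(c(0, 1, 2, 2, 3))                            # "numcat"
guess_type(seq(0.5, 50, by = 0.5))                      # "num"
```

The sketch separates fully labelled variables (cat) from partially labelled ones (catnum), which mirrors the Likert scale example where only the endpoints usually carry labels.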

By default, this function returns the metadata as an R list object. When the argument save is set to TRUE, the metadata is also written to an .R file, and the argument OS (case insensitive) selects the end of line separator for that file. It can be one of:
"Windows" (default), or "Win",
"MacOS", "Darwin", "Apple", "Mac",
"Linux".

The end of line separator changes only when the target OS is different from the running OS.
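
As an illustration of the OS argument, the sketch below maps the accepted names to an end of line separator. The helper eol_for() is hypothetical and not part of DDIwR. It assumes the conventional separators (carriage return plus line feed on Windows, line feed on macOS and Linux), and the package's internal handling may differ.

```r
# Hypothetical helper, not part of DDIwR: normalise an OS name
# (case insensitive) and return the matching end of line separator.
eol_for <- function(OS = "Windows") {
    os <- tolower(OS)
    if (os %in% c("windows", "win")) {
        return("\r\n")
    }
    if (os %in% c("macos", "darwin", "apple", "mac", "linux")) {
        return("\n")
    }
    stop("Unknown target OS: ", OS)
}

eol_for("win")     # "\r\n"
eol_for("Darwin")  # "\n"
```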

For the moment, only DDI Codebook (version 2.5) is supported; support for DDI Lifecycle is planned.

Examples

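library(declared)
library(DDIwR)

# A data frame with two declared variables, each carrying value labels
# and declared missing values
x <- data.frame(
    A = declared(
        c(1:5, -92),
        labels = c(Good = 1, Bad = 5, NR = -92),
        na_values = -92
    ),
    C = declared(
        c(1, -91, 3:5, -92),
        labels = c(DK = -91, NR = -92),
        na_values = c(-91, -92)
    )
)

# Extract the metadata and inspect the data description component
getMetadata(x)$dataDscr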
