labelled (version 2.7.0)

look_for: Look for keywords in variable names and descriptions / Create a data dictionary

Description

look_for emulates the lookfor Stata command in R. It searches the variable names of regular R data frames as well as the variable labels (descriptions). The command is meant to help users find variables in large datasets.

Usage

look_for(data, ..., labels = TRUE, ignore.case = TRUE, details = TRUE)

lookfor(data, ..., labels = TRUE, ignore.case = TRUE, details = TRUE)

generate_dictionary(data, ..., labels = TRUE, ignore.case = TRUE, details = TRUE)

# S3 method for look_for
print(x, ...)

convert_list_columns_to_character(x)

lookfor_to_long_format(x)

Arguments

data

a data frame

...

optional list of keywords, a character string (or several character strings), which can be formatted as a regular expression suitable for a base::grep() pattern, or a vector of keywords; displays all variables if not specified

labels

whether or not to search variable labels (descriptions); TRUE by default

ignore.case

whether or not to ignore case when matching the keywords; TRUE by default (matching is case-insensitive)

details

whether or not to add details about each variable; TRUE by default (set to FALSE for a quicker search)

x

a tibble returned by look_for()
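
To illustrate how the keyword arguments behave, the sketch below mimics the matching with base R only. It is not the package's implementation: match_keywords() and var_names are hypothetical names, and it assumes that several keywords are combined with OR, which is consistent with the examples on this page.

```r
# Hypothetical helper: return the names matching any of the keywords.
match_keywords <- function(var_names, ..., ignore.case = TRUE) {
  keywords <- c(...)
  # No keyword supplied: keep every name
  if (length(keywords) == 0) return(var_names)
  # Assumption: several keywords are joined into one regular expression with OR
  pattern <- paste(keywords, collapse = "|")
  grep(pattern, var_names, ignore.case = ignore.case, value = TRUE)
}

var_names <- names(iris)
match_keywords(var_names, "petal")                           # single keyword, case ignored
match_keywords(var_names, "s$")                              # regular expression
match_keywords(var_names, "Pet", "sp", ignore.case = FALSE)  # case-sensitive
```

In the last call, "sp" matches nothing because the only candidate, Species, starts with an upper-case "S".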

Value

a tibble data frame featuring the variable position, name and description (if it exists) in the original data frame
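
A short sketch of how the returned object might be inspected. It assumes the labelled package is installed and that the position and name columns are called pos and variable. The description is expected in a column named label.

```r
library(labelled)

res <- look_for(iris)

# One row per variable of the original data frame
nrow(res)

# Position and name of each variable (column names assumed, see above)
res$pos
res$variable
```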

Details

When no keyword is provided, it will produce a data dictionary of the overall data frame.

The function searches the variable names for matches to the keywords. If available, variable labels are included in the search scope. Variable labels of data frames imported with the foreign or memisc packages are also taken into account (see to_labelled()). If no keyword is provided, all variables of data are returned.
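
The effect of the labels argument can be seen on a small labelled data frame. This is a minimal sketch, assuming the labelled package is installed; the data frame df and its columns are made up for illustration.

```r
library(labelled)

# Hypothetical toy data: q1 carries a variable label, age_group does not
df <- data.frame(id = 1:3, q1 = c(34, 51, 27), age_group = c("b", "c", "a"))
var_label(df$q1) <- "Respondent age"

# Default: names and labels are searched, so q1 can match through its label
with_labels <- look_for(df, "age")
with_labels$variable

# labels = FALSE: only variable names are searched
names_only <- look_for(df, "age", labels = FALSE)
names_only$variable
```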

look_for(), lookfor() and generate_dictionary() are equivalent.

By default, results are summarized when printed. To deactivate the default printing, convert the result with dplyr::as_tibble().

lookfor_to_long_format() can be used to transform the results into a long format, with one row per factor level and per value label.

Use convert_list_columns_to_character() to convert named list columns into character vectors (see examples).

Examples

# NOT RUN {
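library(labelled)
library(dplyr) # provides %>%, used below

# Data dictionary of the whole data frame
look_for(iris)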

# Look for a single keyword.
look_for(iris, "petal")
look_for(iris, "s")

# Search with a regular expression
look_for(iris, "petal|species")
look_for(iris, "s$")

# Search with several keywords
look_for(iris, "pet", "sp")
look_for(iris, "pet", "sp", "width")
look_for(iris, "Pet", "sp", "width", ignore.case = FALSE)

# Quicker search without variable details
look_for(iris, details = FALSE)

# To deactivate default printing, convert to tibble
look_for(iris) %>% dplyr::as_tibble()

# To convert named lists into character vectors
look_for(iris) %>% convert_list_columns_to_character()

# Long format with one row per factor level and per value label
look_for(iris) %>% lookfor_to_long_format()

# Both functions can be combined
look_for(iris) %>%
  lookfor_to_long_format() %>%
  convert_list_columns_to_character()

# }
# NOT RUN {
  # Labelled data (requires the questionr package)
  data(fertility, package = "questionr")
  look_for(children)
  look_for(children, "id")
  look_for(children) %>%
    lookfor_to_long_format() %>%
    convert_list_columns_to_character()
# }
