This function takes a data frame and creates a data dictionary. The data dictionary includes the variable name, a human-readable name, the variable type, and a description. If a model is specified, the function uses OpenAI's API to generate the information based on the characteristics of the data frame.
create_data_dictionary(
  data,
  file_path,
  model = NULL,
  sample_n = 5,
  grouping = NULL,
  force = FALSE
)
A data frame containing the variable name, human-readable name, variable type, and description for each variable in the input data frame.
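To make the shape of the returned data frame concrete, here is a minimal base R sketch of what a scaffold with no model might look like. The helper `scaffold_dictionary()` and the column names `variable_name`, `human_readable_name`, `variable_type`, and `description` are assumptions chosen for illustration. They are not taken from the package, whose actual column names and type labels may differ.

```r
# Hypothetical helper, not the package's implementation: build an empty
# data dictionary scaffold with one row per column of `data`.
scaffold_dictionary <- function(data) {
  stopifnot(is.data.frame(data))
  data.frame(
    variable_name = names(data),
    # Left blank for a person (or a model) to fill in later.
    human_readable_name = NA_character_,
    # Report the first class of each column, e.g. "numeric" or "factor".
    variable_type = unname(vapply(data, function(col) class(col)[1], character(1))),
    description = NA_character_,
    stringsAsFactors = FALSE
  )
}

# One row per column of the built-in iris data set.
dict <- scaffold_dictionary(iris)
print(dict)
```

In this sketch the name and description columns stay `NA` so that they can be completed by hand, which mirrors the scaffolding behaviour described for `model = NULL`.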
A data frame to create a data dictionary for.
The file path to save the data dictionary to.
The ID of the OpenAI chat completion model to use for generating descriptions (see openai::list_models()). If NULL (default), a scaffold for the data dictionary is created instead.
The number of rows to sample from the data frame to use as input for the model. Default 5.
A character vector of column names to group by when sampling rows from the data frame for the model. Default NULL.
If TRUE, overwrite the file at file_path if it already exists. Default FALSE.
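The interaction between the row-sampling and grouping arguments can be illustrated with a short base R sketch. It shows one plausible reading, in which up to `n` rows are drawn from each group. The helper `sample_rows()` is hypothetical, and the function's real sampling logic may differ (for example, it may draw `sample_n` rows in total rather than per group).

```r
# Hypothetical helper, not the package's implementation: sample up to `n`
# rows overall, or up to `n` rows per group when `grouping` is given.
sample_rows <- function(data, n = 5, grouping = NULL) {
  stopifnot(is.data.frame(data))
  take <- function(idx) {
    # Index into `idx` rather than calling sample(idx, ...) directly,
    # because sample() treats a single number k as the range 1:k.
    idx[sample.int(length(idx), min(n, length(idx)))]
  }
  if (is.null(grouping)) {
    rows <- take(seq_len(nrow(data)))
  } else {
    # Split the row indices by the grouping columns, dropping empty groups.
    groups <- split(seq_len(nrow(data)), data[grouping], drop = TRUE)
    rows <- unlist(lapply(groups, take), use.names = FALSE)
  }
  data[rows, , drop = FALSE]
}

set.seed(1)
sampled <- sample_rows(iris, n = 2, grouping = "Species")
table(sampled$Species)
```

Sampling within groups keeps every category represented in the rows passed to the model, which is useful when a rare group would otherwise be missed by a small random sample.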