folda (version 0.1.0)

getDataInShape: Align Data with a Missing Reference

Description

Aligns a dataset (data) with a reference dataset (missingReference) so that its structure, column names, and factor levels match those of the reference. Columns that are absent from data are created and initialized with NA, and factor levels are adjusted to match the reference. The function also imputes missing values from the reference and manages the flag variables of categorical and numerical columns.
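
The alignment steps described above can be illustrated with a rough base-R sketch. This is not folda's implementation: alignToReference is a hypothetical helper written only for illustration, it assumes the reference has at least one row, and it treats flag columns like any other column rather than reproducing the package's flag handling.

```r
# Hypothetical helper (not part of folda): a base-R sketch of the alignment idea.
alignToReference <- function(data, reference) {
  out <- data
  for (col in names(reference)) {
    # Columns absent from the data start out as all-NA.
    if (!col %in% names(out)) out[[col]] <- NA
    refCol <- reference[[col]]
    if (is.factor(refCol)) {
      # Re-level to the reference; values outside its levels become NA.
      out[[col]] <- factor(as.character(out[[col]]), levels = levels(refCol))
    }
    # Fill missing entries with the value in the first row of the reference.
    isMissing <- is.na(out[[col]])
    out[[col]][isMissing] <- refCol[1]
  }
  # Keep only the reference columns, in the reference order.
  out[, names(reference), drop = FALSE]
}

# Illustrative inputs (assumed for this sketch, not taken from the package).
data <- data.frame(
  X1 = factor(c(NA, "C", "B"), levels = c("B", "C")),
  X2 = c(2, NA, 3)
)
reference <- data.frame(
  X1 = factor("A", levels = c("A", "B")),
  X2 = 1,
  X3 = 0
)
aligned <- alignToReference(data, reference)
```

In this sketch the level "C" is dropped because the reference does not define it, so the affected entry is imputed along with the original NA, and the column X3 is created and filled from the reference. The output of getDataInShape itself may differ, particularly for flag variables.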

Usage

getDataInShape(data, missingReference)

Value

A data frame with the structure, column names, and factor levels of data aligned to missingReference. Missing values in data are imputed from the first row of missingReference, and flag variables are updated accordingly.

Arguments

data

A data frame to be aligned and adjusted according to the missingReference.

missingReference

A reference data frame that provides the structure (column names, factor levels, and missing value reference) for aligning data.

Examples

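library(folda)

# Data to align: X1 carries a level ("C") that the reference lacks, and X2 has a missing value
data <- data.frame(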
  X1_FLAG = c(0, 0, 0),
  X1 = factor(c(NA, "C", "B"), levels = LETTERS[2:3]),
  X2_FLAG = c(NA, 0, 1),
  X2 = c(2, NA, 3)
)

missingReference <- data.frame(
  X1_FLAG = 1,
  X1 = factor("A", levels = LETTERS[1:2]),
  X2 = 1,
  X2_FLAG = 1
)

getDataInShape(data, missingReference)
