sdglinkage (version 0.1.0)

add_random_error: Add random error flags to a data frame.

Description

add_random_error adds a column of binary error flags (each either 0 or 1) to a data frame, drawn at random according to the probabilities in prob.

Usage

add_random_error(dataset, error_name, prob = c(0.95, 0.05))

Arguments

dataset

The data frame to which the error flags are added.

error_name

A string giving the name and type of the error, in the form '<error name>_<error type>' (for example, 'age_missing'). The error name must be one of the variable names in the dataset, and the error type must be one of: 'missing', 'insert', 'variant', 'typo', 'pho', 'ocr', 'trans_date' or 'trans_char'.

prob

A vector of two numerical probabilities, where the first is the probability of a flag being 0 and the second is the probability of a flag being 1.

Value

The input data frame with one additional column of binary-encoded error flags.
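
The behaviour described above can be approximated in base R. The following is a minimal sketch under stated assumptions, not the package's implementation: the helper name sketch_add_random_error and the toy data frame are hypothetical, and naming the new column after error_name is an assumption, since this page does not state what the added column is called.

```r
# Hypothetical stand-in for add_random_error(); NOT the sdglinkage source.
sketch_add_random_error <- function(dataset, error_name, prob = c(0.95, 0.05)) {
  stopifnot(is.data.frame(dataset), length(prob) == 2)
  # Draw one flag per row: 0 with probability prob[1], 1 with probability prob[2].
  flags <- sample(c(0, 1), size = nrow(dataset), replace = TRUE, prob = prob)
  # Assumption: the flag column is named after error_name.
  dataset[[error_name]] <- flags
  dataset
}

set.seed(1)  # make the random draw reproducible
toy <- data.frame(age = c(23, 41, 35, 58),
                  education = c("HS", "BSc", "MSc", "PhD"))
flagged <- sketch_add_random_error(toy, "age_missing", prob = c(0.5, 0.5))
table(flagged$age_missing)  # counts of 0 and 1 flags
```

Because the flags are sampled independently for each row, the observed share of 1s only approaches prob[2] as the number of rows grows.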

Examples

# NOT RUN {
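# Flag roughly 3% of the first 100 rows for a 'missing' error on age
adult_with_flag <- add_random_error(adult[1:100, ], error_name = "age_missing", prob = c(0.97, 0.03))
# Flag roughly 35% of the same rows for a 'typo' error on education
adult_with_flag <- add_random_error(adult_with_flag, error_name = "education_typo", prob = c(0.65, 0.35))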

# }
