dataQualityR (version 1.0)

checkDataQuality: checkDataQuality

Description

The function takes a data frame, runs data quality checks on each variable, generates summary statistics, and writes the data quality report to two CSV files: one for numeric variables and one for categorical variables.
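To make "data quality checks on each variable" concrete, here is a minimal base-R sketch that computes two common measures per column: the number of missing values and the number of distinct non-missing values. It is only an illustration and not the package's implementation; the toy data frame `df` and the column names `n_missing` and `n_unique` are assumptions chosen for this example, not the layout of the actual report.

```r
# Illustrative only: basic per-variable quality measures in base R.
# The output column names are hypothetical, not those of the package's report.
df <- data.frame(
  age   = c(23, 35, NA, 41),
  group = c("a", "b", "b", NA),
  stringsAsFactors = FALSE
)

quality <- data.frame(
  variable  = names(df),
  # count of NA entries in each column
  n_missing = vapply(df, function(x) sum(is.na(x)), integer(1)),
  # count of distinct values, ignoring NA
  n_unique  = vapply(df, function(x) length(unique(x[!is.na(x)])), integer(1)),
  row.names = NULL
)
print(quality)
```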

Usage

checkDataQuality(data,
                 out.file.num,
                 out.file.cat,
                 numeric.cutoff = -1)

Arguments

data
An object of class data.frame
out.file.num
Filename for saving data quality report of numeric variables
out.file.cat
Filename for saving data quality report of categorical variables
numeric.cutoff
The minimum number of unique values needed for a numeric variable to be treated as continuous. This feature accounts for binary or multi-category variables that have a small number of unique values but are stored as numeric. The default is -1.
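The idea behind numeric.cutoff can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the package's code: the helper `is_continuous` is hypothetical, it assumes the comparison is "at least `cutoff` unique values", and it assumes missing values are not counted as a unique value.

```r
# Hypothetical helper, not part of dataQualityR: decides whether a column
# would count as continuous under a unique-value cutoff.
# Assumptions: comparison is ">=" and NA is not counted as a value.
is_continuous <- function(x, cutoff = -1) {
  is.numeric(x) && length(unique(x[!is.na(x)])) >= cutoff
}

df <- data.frame(
  income = c(10.5, 22.1, 35.0, 41.7, 58.3),
  flag   = c(0, 1, 1, 0, 1)   # binary variable stored as numeric
)

sapply(df, is_continuous)              # cutoff of -1: every numeric column passes
sapply(df, is_continuous, cutoff = 3)  # flag has only 2 unique values, so it fails
```

In this sketch, a column such as `flag` that fails the cutoff would be a candidate for the categorical report rather than the numeric one.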

Value

  • Writes the two CSV report files directly to disk
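Because the results live in files rather than in an R object, a typical next step is to read a report back into the session. The sketch below shows one way to do that with base R. The helper `read_report` is hypothetical, and the stand-in file with its `variable` and `value` columns is a placeholder so the example runs without the package; it does not reflect the real report layout.

```r
# Hypothetical helper: read a report file back, failing early if it is missing.
read_report <- function(path) {
  if (!file.exists(path)) stop("report not found: ", path)
  read.csv(path, stringsAsFactors = FALSE)
}

# Stand-in file so the sketch runs without the package; the columns are
# placeholders, not the report's real layout.
demo.file <- file.path(tempdir(), "dq_demo.csv")
write.csv(data.frame(variable = c("x", "y"), value = c(1, 2)),
          demo.file, row.names = FALSE)

report <- read_report(demo.file)
str(report)
```

With real output, the same helper would be called on the paths passed as out.file.num and out.file.cat.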

Examples

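library(dataQualityR)
# Load the crx example data set
data(crx)
# Write both reports to the session's temporary directory
num.file <- file.path(tempdir(), "dq_num.csv")
cat.file <- file.path(tempdir(), "dq_cat.csv")
checkDataQuality(data = crx, out.file.num = num.file, out.file.cat = cat.file)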