A newer version (1.4.2) of this package is available.

dataMaid (version 1.1.2)

A Suite of Checks for Identification of Potential Errors in a Data Frame as Part of the Data Screening Process

Description

Data screening is an important first step of any statistical analysis. dataMaid autogenerates a customizable data report that summarizes the checks performed and their results, so that a human reader can identify possible errors. It provides an extendable suite of tests for common potential errors in a dataset.

Install

install.packages('dataMaid')
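
Once the package is installed, the workflow described above (run checks, then generate a report) might look like the following minimal sketch. It uses the bundled `toyData` example dataset and the `check`, `summarize` and `makeDataReport` functions listed further down this page. The argument names passed to `makeDataReport` (`output`, `file`, `render`, `openResult`, `replace`) and the temporary file name are assumptions that should be confirmed against `?makeDataReport` for the installed version.

```r
# Minimal sketch; assumes dataMaid is installed and that the argument
# names below match the installed version (see ?makeDataReport).
library(dataMaid)

# toyData is a small example dataset shipped with the package.
data(toyData)

# Run the default checks and summaries on a single variable.
event_checks <- check(toyData$events)
event_summary <- summarize(toyData$events)

# Write the report source for the whole data frame to a temporary .Rmd
# file, without rendering it or opening it in a viewer.
# The file name is a hypothetical choice for this example.
report_file <- file.path(tempdir(), "toyData_report.Rmd")
makeDataReport(toyData,
               output = "html",
               file = report_file,
               render = FALSE,
               openResult = FALSE,
               replace = TRUE)
```

With `render = TRUE`, the report should also be rendered to the chosen output format, which requires a working R Markdown setup.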

Monthly Downloads

1,211

Version

1.1.2

License

GPL-2

Maintainer

Claus Ekstrom

Last Published

May 3rd, 2018

Functions in dataMaid (1.1.2)

defaultFactorChecks

Default checks for factor variables
basicVisual

identifyMissing

A checkFunction for identifying miscoded missing values.
defaultFactorSummaries

Default summary functions for factor variables
defaultCharacterSummaries

Default summary functions for character variables
description

Extract the contents of the attribute description
identifyCaseIssues

A checkFunction for identifying case issues
defaultNumericSummaries

Default summary functions for numeric variables
isSingular

Check if a variable only contains a single value
defaultDateChecks

Default checks for Date variables
summarize

Summarize a variable/dataset
standardVisual

Produce distribution plots using ggplot from ggplot2.
isSupported

Check if a variable has a class supported by dataMaid
defaultIntegerChecks

Default checks for integer variables
isCPR

Check if a variable consists of Danish CPR numbers
exampleData

Example data with zero-inflated variables
messageGenerator

Produce a message for the output of a checkFunction
minMax

summaryFunction for minimum and maximum
isKey

Check if a variable qualifies as a key
summaryFunction

Create an object of class summaryFunction
presidentData

Semi-artificial data about the US presidents
quartiles

summaryFunction for quartiles
summaryResult

Create object of class summaryResult
defaultLogicalSummaries

Default summary functions for logical variables
setVisuals

Set visual arguments for makeDataReport
setSummaries

Set summary arguments for makeDataReport
visualFunction

Create an object of class visualFunction
visualize

Produce distribution plots
defaultNumericChecks

Default checks for numeric variables
checkResult

Create object of class checkResult
identifyWhitespace

A checkFunction for identifying whitespace
identifyOutliersTBStyle

A checkFunction for identifying outliers Tukey Boxplot style
defaultLabelledSummaries

Default summary functions for labelled variables
setChecks

Set check arguments for makeDataReport
render

Simplified Rmarkdown rendering
uniqueValues

summaryFunction for unique values
identifyOutliers

A checkFunction for identifying outliers
variableType

Summary function for original class
defaultLogicalChecks

Default checks for logical variables
identifyNums

A checkFunction for identifying numeric variables that have been misclassified as categorical
makeCodebook

Produce a data codebook
makeDataReport

Produce a data report
toyData

Small example data to show the features of dataMaid
testData

Extended example data to test the features of dataMaid
allCheckFunctions

Overview of all available checkFunctions
basicVisualCFLB
allClasses

Vector of all variable classes in dataMaid
centralValue

summaryFunction for central values
allVisualFunctions

Overview of all available visualFunctions
bigPresidentData

Semi-artificial data about the US presidents (extended version)
artData

Semi-artificial data about masterpieces of art
classes

Extract the contents of the attribute classes
defaultDateSummaries

Default summary functions for Date variables
clean

Produce a data cleaning overview document (deprecated version)
allSummaryFunctions

Overview of all available summaryFunctions
check

Perform checks of potential errors in variable/dataset
checkFunction

Create an object of class checkFunction
countMissing

Summary function for missing values
defaultIntegerSummaries

Default summary functions for integer variables
defaultLabelledChecks

Default checks for labelled variables
defaultCharacterChecks

Default checks for character variables
identifyLoners

A checkFunction for identifying sparsely represented values (loners)