DriveML (version 0.1.0)

smartEDA: SmartEDA - functions that automate most exploratory analysis tasks in modeling

Description

SmartEDA includes multiple custom functions for initial exploratory analysis of any input data, describing its structure and the relationships present in it. The output is available in both summary and graphical form, and the charts can also be exported as reports.

Usage

smartEDA(data, Template = NULL, Target = NULL, label = NULL,
  theme = "Default", op_file = NULL, op_dir = getwd(), sc = NULL,
  sn = NULL, Rc = NULL)

Arguments

data

a data frame

Template

R markdown template (.rmd file)

Target

dependent variable. If no target variable is defined, leave this as NULL.

label

description of the target variable (optional)

theme

customized ggplot theme (default: the SmartEDA theme). For additional themes, use the ggthemes package.

op_file

output file name (.html)

op_dir

output path

sc

number of sample plots for categorical variables. Sets how many of these plots are shown in the HTML report.

sn

number of sample plots for numerical variables. Sets how many of these plots are shown in the HTML report.

Rc

reference category of the target variable. If Target is categorical, Rc is mandatory and must not be NULL.

Value

An R Markdown report rendered as an .html file

Details

SmartEDA has four major functionalities:

1. Descriptive statistics

  • Numerical variable summary:

  • ExpNumStat - Summary statistics for numerical variables

  • Categorical variable summary:

  • ExpCatStat - Summary statistics for all character or categorical columns in the data frame

  • ExpCTable - Frequency and custom tables

2. Data visualization

  • Numerical variable plot:

  • ExpNumViz - Distributions of numeric variables

  • Categorical variable plot:

  • ExpCatViz - Distributions of categorical variables

  • Normality testing plot:

  • ExpOutQQ - Quantile-quantile plots

  • ExpParcoord - Parallel coordinate plots

3. Custom tables

  • Customized summary statistics:

  • ExpCustomStat - Customized summary statistics

4. EDA report

  • Function to create HTML EDA report:

  • ExpReport - Function to create HTML EDA report
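The functions listed above can also be called one at a time. The following is a minimal sketch, not part of the original documentation: it assumes the SmartEDA package is installed and exports ExpNumStat, ExpCTable and ExpNumViz with the arguments used here, and it uses the built-in iris data purely for illustration. The calls are skipped when the package is unavailable.

```r
# Sketch only: assumes the SmartEDA package is installed; skipped otherwise.
if (requireNamespace("SmartEDA", quietly = TRUE)) {
  # 1. Descriptive statistics: summary of every numeric column in iris
  #    (by = "A" requests overall statistics, without grouping)
  num_summary <- SmartEDA::ExpNumStat(iris, by = "A", round = 2)
  print(num_summary)

  #    Frequency table for the categorical column (Species)
  cat_table <- SmartEDA::ExpCTable(iris, Target = NULL)
  print(cat_table)

  # 2. Data visualization: distribution plots for the numeric columns
  num_plots <- SmartEDA::ExpNumViz(iris, target = NULL)
}
```

Running the pieces separately like this can be useful for checking a single summary or plot before generating the full HTML report.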

Examples

# NOT RUN {
# Generate complete EDA report
smartEDA(iris, op_file = "eda_report.html", op_dir = tempdir(), sc = NULL, sn = 2)
# }
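
The Arguments section states that Rc is required when Target is categorical. As a hedged illustration of that case (not taken from the package documentation), the sketch below recodes the am column of the built-in mtcars data into a labelled binary target and passes one of its levels as the reference category. The data set, the output file name and the label text are assumptions chosen for the example, and the report is only rendered when DriveML, rmarkdown and pandoc are all available.

```r
# Hypothetical example data: recode mtcars$am (0/1) into a labelled binary target
cars <- transform(mtcars, am = ifelse(am == 1, "manual", "automatic"))

# Sketch only: requires DriveML, rmarkdown and pandoc; skipped otherwise.
can_render <- requireNamespace("DriveML", quietly = TRUE) &&
  requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  DriveML::smartEDA(
    cars,
    Target  = "am",                  # categorical (binary) target variable
    label   = "Transmission type",   # optional description of the target
    Rc      = "manual",              # reference category, mandatory here
    op_file = "eda_report_target.html",
    op_dir  = tempdir(),
    sc      = 2,                     # sample of categorical plots in the report
    sn      = 2                      # sample of numerical plots in the report
  )
}
```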