Learn R Programming

dlookr (version 0.3.9)

eda_report: Reporting the information of EDA

Description

The eda_report() report the information of Exploratory data analysis for object inheriting from data.frame.

Usage

eda_report(.data, ...)

# S3 method for data.frame eda_report(.data, target = NULL, output_format = c("pdf", "html"), output_file = NULL, output_dir = tempdir(), font_family = NULL, browse = TRUE, ...)

Arguments

.data

a data.frame or a tbl_df.

...

arguments to be passed to methods.

target

target variable.

output_format

report output type. Choose either "pdf" and "html". "pdf" create pdf file by knitr::knit(). "html" create html file by rmarkdown::render().

output_file

name of generated file. default is NULL.

output_dir

name of directory to generate report file. default is tempdir().

font_family

charcter. font family name for figure in pdf.

browse

logical. choose whether to output the report results to the browser.

Reported information

The EDA process will report the following information:

  • Introduction

    • Information of Dataset

    • Information of Variables

    • About EDA Report

  • Univariate Analysis

    • Descriptive Statistics

    • Normality Test of Numerical Variables

      • Statistics and Visualization of (Sample) Data

  • Relationship Between Variables

    • Correlation Coefficient

      • Correlation Coefficient by Variable Combination

      • Correlation Plot of Numerical Variables

  • Target based Analysis

    • Gruoped Descriptive Statistics

      • Gruoped Numerical Variables

      • Gruoped Categorical Variables

    • Gruoped Relationship Between Variables

      • Grouped Correlation Coefficient

      • Grouped Correlation Plot of Numerical Variables

See vignette("EDA") for an introduction to these concepts.

Details

Generate generalized data EDA reports automatically. You can choose to output to pdf and html files. This is useful for EDA a data frame with a large number of variables than data with a small number of variables. For pdf output, Korean Gothic font must be installed in Korean operating system.

Examples

Run this code
# NOT RUN {
library(dplyr)

# Generate data for the example
carseats <- ISLR::Carseats
carseats[sample(seq(NROW(carseats)), 20), "Income"] <- NA
carseats[sample(seq(NROW(carseats)), 5), "Urban"] <- NA

## target variable is categorical variable
# reporting the EDA information
# create pdf file. file name is EDA_Report.pdf
eda_report(carseats, US)

# create pdf file. file name is EDA.pdf
eda_report(carseats, "US", output_file = "EDA.pdf")

# create pdf file. file name is EDA.pdf and not browse
eda_report(carseats, "US", output_dir = ".", output_file = "EDA.pdf", browse = FALSE)

# create html file. file name is EDA_Report.html
eda_report(carseats, "US", output_format = "html")

# create html file. file name is EDA.html
eda_report(carseats, US, output_format = "html", output_file = "EDA.html")


## target variable is numerical variable
# reporting the EDA information
eda_report(carseats, Sales)

# create pdf file. file name is EDA2.pdf
eda_report(carseats, "Sales", output_file = "EDA2.pdf")

# create html file. file name is EDA_Report.html
eda_report(carseats, "Sales", output_format = "html")

# create html file. file name is EDA2.html
eda_report(carseats, Sales, output_format = "html", output_file = "EDA2.html")

## target variable is null
# reporting the EDA information
eda_report(carseats)


# create pdf file. file name is EDA2.pdf
eda_report(carseats, output_file = "EDA2.pdf")

# create html file. file name is EDA_Report.html
eda_report(carseats, output_format = "html")

# create html file. file name is EDA2.html
eda_report(carseats, output_format = "html", output_file = "EDA2.html")
# }
# NOT RUN {
# }

Run the code above in your browser using DataLab