A newer version (0.6.3) of this package is available.

dlookr (version 0.6.2)

Tools for Data Diagnosis, Exploration, Transformation

Description

A collection of tools that support data diagnosis, exploration, and transformation. Data diagnosis provides information on and visualization of missing values, outliers, and unique and negative values to help you understand the distribution and quality of your data. Data exploration provides information on and visualization of descriptive statistics of univariate variables, normality tests and outliers, the correlation between two variables, and the relationship between the target variable and predictors. Data transformation supports binning to categorize continuous variables, imputing missing values and outliers, and resolving skewness. The package also creates automated reports that support these three tasks.
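To make the diagnosis step concrete, here is a minimal base-R sketch of the kind of per-variable summary the description mentions (missing, unique, negative, and outlying values). It is not dlookr's implementation: the toy data frame, the helper `count_outliers()`, the column names of `diagnosis`, and the 1.5 * IQR outlier rule are all assumptions made for illustration.

```r
# Toy data with one missing age, one implausible age, and one negative income
# (values are made up for illustration).
df <- data.frame(
  age    = c(23, 35, 41, NA, 29, 38, 120),
  income = c(2100, 2500, 2300, 2400, -50, 2600, 2200),
  group  = c("a", "b", "a", "b", "a", NA, "b")
)

# Count values outside the 1.5 * IQR fences (an assumed, boxplot-style rule).
# Non-numeric columns get NA because the rule does not apply to them.
count_outliers <- function(x) {
  if (!is.numeric(x)) return(NA_integer_)
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- 1.5 * (q[2] - q[1])
  sum(x < q[1] - fence | x > q[2] + fence, na.rm = TRUE)
}

# Count negative values in numeric columns only.
count_negative <- function(x) {
  if (!is.numeric(x)) return(NA_integer_)
  sum(x < 0, na.rm = TRUE)
}

# One row per variable, one column per quality indicator.
diagnosis <- data.frame(
  variable       = names(df),
  missing_count  = vapply(df, function(x) sum(is.na(x)), integer(1)),
  unique_count   = vapply(df, function(x) length(unique(x[!is.na(x)])), integer(1)),
  negative_count = vapply(df, count_negative, integer(1)),
  outlier_count  = vapply(df, count_outliers, integer(1)),
  row.names      = NULL
)
print(diagnosis)
```

In the package itself, `diagnose`, `diagnose_numeric`, `diagnose_category`, and `diagnose_outlier` from the function list cover this ground; their definitions and output columns may differ from this sketch.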

Install

install.packages('dlookr')
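Once the package is installed, a first session might look like the sketch below. It uses only names that appear in the function list on this page (`diagnose`, `describe`, and the bundled `heartfailure` data); the variable `has_dlookr` is introduced here for illustration, and the code is skipped when dlookr is not available. Output is not shown because it depends on the installed version.

```r
# Check for the package first so the script also runs where dlookr is absent.
has_dlookr <- requireNamespace("dlookr", quietly = TRUE)

if (has_dlookr) {
  library(dlookr)

  # Load the example dataset shipped with the package.
  data("heartfailure", package = "dlookr")

  # Data quality summary per variable.
  print(diagnose(heartfailure))

  # Descriptive statistics for the numerical variables.
  print(describe(heartfailure))
}
```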

Monthly Downloads

2,298

Version

0.6.2

License

GPL-2 | file LICENSE

Maintainer

Choonghyun Ryu

Last Published

July 1st, 2023

Functions in dlookr (0.6.2)

binning_rgr

Binning by recursive information gain ratio maximization
correlate

Compute the correlation coefficient between two variables
cramer

Cramer's V statistic
compare_numeric

Compare numerical variables
describe

Compute descriptive statistics
binning

Binning the Numeric Data
binning_by

Optimal Binning for Scoring Modeling
describe.tbl_dbi

Compute descriptive statistics
compare_category

Compare categorical variables
diagnose

Diagnose data quality of variables
diagnose.tbl_dbi

Diagnose data quality of variables in the DBMS
diagnose_outlier.tbl_dbi

Diagnose outliers of numerical variables in the DBMS
diagnose_category.tbl_dbi

Diagnose data quality of categorical variables in the DBMS
diagnose_numeric.tbl_dbi

Diagnose data quality of numerical variables in the DBMS
diagnose_outlier

Diagnose outliers of numerical variables
diagnose_paged_report.tbl_dbi

Reporting the information of data diagnosis for table of the DBMS
diagnose_category

Diagnose data quality of categorical variables
diagnose_numeric

Diagnose data quality of numerical variables
diagnose_paged_report

Reporting the information of data diagnosis
diagnose_report

Reporting the information of data diagnosis
eda_paged_report.tbl_dbi

Reporting the information of EDA for table of the DBMS
eda_paged_report

Reporting the information of EDA
dlookr_orange_paged

Generate paged HTML document
dlookr_templ_html

dlookr HTML template (loads additional style and template file)
diagnose_report.tbl_dbi

Reporting the information of data diagnosis for table of the DBMS
plot_correlate

Deprecated functions in package ‘dlookr’
eda_report

Reporting the information of EDA
eda_report.tbl_dbi

Reporting the information of EDA for table of the DBMS
extract

Extract bins from "bins"
entropy

Calculate the entropy
find_skewness

Finding skewed variables
find_outliers

Finding variables including outliers
dlookr-package

dlookr: Tools for Data Diagnosis, Exploration, Transformation
eda_web_report

Reporting the information of EDA with html
diagnose_sparese

Diagnosis of level combinations of categorical variables
eda_web_report.tbl_dbi

Reporting the information of EDA for table of the DBMS with html
import_google_font

Import Google Fonts
get_class

Extracting a class of variables
plot.compare_category

Visualize Information for a "compare_category" Object
imputate_na

Impute Missing Values
plot.compare_numeric

Visualize Information for a "compare_numeric" Object
diagnose_web_report

Reporting the information of data diagnosis with html
plot.bins

Visualize Distribution for a "bins" object
imputate_outlier

Impute Outliers
performance_bin

Diagnose Performance of a Binned Variable
heartfailure

Heart Failure Data
get_column_info

Describe columns of a table in the DBMS
get_os

Finding the OS of the User's Machine
get_percentile

Finding percentile
kld

Kullback-Leibler Divergence
plot.transform

Visualize Information for a "transform" Object
diagnose_web_report.tbl_dbi

Reporting the information of data diagnosis for table of the DBMS with html
find_class

Extract variable names or indices of a specific class
find_na

Finding variables including missing values
get_transform

Transform a numeric vector
plot.univar_category

Visualize Information for a "univar_category" Object
plot_correlate.tbl_dbi

Visualize correlation plot of numerical data
jsd

Jensen-Shannon Divergence
plot_hist_numeric

Plot histogram of numerical variables
plot_na_pareto

Pareto chart for missing values
plot.infogain_bins

Visualize Distribution for an "infogain_bins" Object
plot_na_hclust

Combination chart for missing values
kurtosis

Kurtosis of the data
plot_normality

Plot distribution information of numerical data
plot.correlate

Visualize Information for a "correlate" Object
normality.tbl_dbi

Performs the Shapiro-Wilk test of normality
normality

Performs the Shapiro-Wilk test of normality
plot_outlier.target_df

Plot outlier information of target_df
plot.imputation

Visualize Information for an "imputation" Object
summary.bins

Summarizing Binned Variable
plot.optimal_bins

Visualize Distribution for an "optimal_bins" Object
pps

Compute Predictive Power Score
jobchange

Job Change of Data Scientists
plot.overview

Visualize Information for an "overview" Object
plot.univar_numeric

Visualize Information for a "univar_numeric" Object
plot.performance_bin

Visualize Performance for a "performance_bin" Object
summary.univar_category

Summarizing univar_category information
overview

Describe overview of data
plot_qq_numeric

Plot Q-Q plot of numerical variables
plot.pps

Visualize Information for a "pps" Object
plot.relate

Visualize Information for a "relate" Object
plot_box_numeric

Plot Box-Plot of numerical variables
plot_correlate.data.frame

Visualize correlation plot of numerical data
plot_bar_category

Plot bar chart of categorical variables
print.relate

Summarizing relate information
summary.compare_category

Summarizing compare_category information
plot_na_intersect

Plot the combinations of variables that include missing values
plot_normality.tbl_dbi

Plot distribution information of numerical data
relate

Relationship between target variable and variable of interest
skewness

Skewness of the data
summary.correlate

Summarizing Correlation Coefficient
plot_outlier.tbl_dbi

Plot outlier information of numerical data diagnosis in the DBMS
univar_category

Statistic of univariate categorical variables
plot_outlier

Plot outlier information of numerical data diagnosis
summary.univar_numeric

Summarizing univar_numeric information
summary.optimal_bins

Summarizing Performance for Optimal Bins
univar_numeric

Statistic of univariate numerical variables
summary.overview

Summarizing overview information
summary.transform

Summarizing transformation information
theil

Theil's U statistic
target_by.tbl_dbi

Target by one column in the DBMS
summary.performance_bin

Summarizing Performance for Binned Variable
summary.pps

Summarizing Predictive Power Score
summary.compare_numeric

Summarizing compare_numeric information
transform

Data Transformations
transformation_paged_report

Reporting the information of transformation
transformation_report

Reporting the information of transformation
summary.imputation

Summarizing imputation information
target_by

Target by one variable
transformation_web_report

Reporting the information of transformation with html
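
Several entries in the list above name information-theoretic measures (`entropy`, `kld`, `jsd`). As a rough guide to what those quantities are, the following base-R sketch computes the textbook definitions for discrete distributions. The helper names (`to_prob`, `shannon_entropy`, `kl_divergence`, `js_divergence`) are hypothetical, the natural logarithm is an assumption, and dlookr's own functions may normalize inputs, choose the log base, or handle zeros differently.

```r
# Normalise a vector of non-negative counts or weights to probabilities.
to_prob <- function(x) x / sum(x)

# Shannon entropy: -sum(p * log(p)), treating 0 * log(0) as 0.
shannon_entropy <- function(p) {
  p <- to_prob(p)
  p <- p[p > 0]
  -sum(p * log(p))
}

# Kullback-Leibler divergence D(p || q) = sum(p * log(p / q)).
# Terms with p == 0 contribute nothing; q must be positive wherever p is.
kl_divergence <- function(p, q) {
  p <- to_prob(p)
  q <- to_prob(q)
  keep <- p > 0
  sum(p[keep] * log(p[keep] / q[keep]))
}

# Jensen-Shannon divergence: average KL divergence of p and q
# from their mixture m = (p + q) / 2. Symmetric in p and q.
js_divergence <- function(p, q) {
  p <- to_prob(p)
  q <- to_prob(q)
  m <- (p + q) / 2
  (kl_divergence(p, m) + kl_divergence(q, m)) / 2
}

# Two example distributions over three categories (made up for illustration).
p <- c(0.5, 0.25, 0.25)
q <- c(0.25, 0.25, 0.5)

print(shannon_entropy(p))
print(kl_divergence(p, q))
print(js_divergence(p, q))
```

With natural logarithms the Jensen-Shannon divergence is bounded above by log(2), which makes it easier to compare across variable pairs than the unbounded Kullback-Leibler divergence.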