⚠️ There's a newer version (1.9.5) of this package.

funModeling (version 1.8)

Exploratory Data Analysis and Data Preparation Tool-Box Book

Description

Around 10% of almost any predictive modeling project is spent on the predictive modeling itself. 'funModeling' and the book Data Science Live Book are intended to cover the remaining 90%: data preparation, profiling, selecting the best variables, 'dataViz', assessing model performance, and other functions.

Install

install.packages('funModeling')

Monthly Downloads

2,018

Version

1.8

License

GPL-2

Maintainer

Pablo Casas

Last Published

August 1st, 2019

Functions in funModeling (1.8)

convert_df_to_categoric: Convert every column in a data frame to character
categ_analysis: Profiling analysis of categorical vs. target variable
compare_df: Compare two data frames by keys
range01: Transform a variable into the [0-1] range
cross_plot: Cross-plotting input variable vs. target variable
data_country: People with flu data
entropy_2: Computes the entropy between two variables
coord_plot: Coordinate plot
ROC: ROC Curves
desc_groups_rank: Profiling categorical variable (rank)
equal_freq: Equal frequency binning
mplot_density: Density plot for discrete and continuous values
fibonacci: Fibonacci series
auto_grouping: Reduce cardinality in categorical variable by automatic grouping
discretize_rgr: Variable discretization by gain ratio maximization
dist2d: Distance from specific point to line
correlation_table: Get correlation against target variable
gain_lift: Generates lift and cumulative gain performance table and plot
funModeling-package: funModeling: Exploratory data analysis, data preparation and model performance
discretize_df: Discretize a data frame
discretize_get_bins: Get the data frame thresholds for discretization
v_compare: Compare two vectors
freq: Frequency table for categorical variables
hampel_outlier: Hampel Outlier Threshold
mae: Mean Absolute Error (MAE)
lares_pal: Personal Colours Palette
information_gain: Information gain
plot_palette: Plot Palette Colours
plotar: Correlation plots
rmse: Root Mean Squared Error (RMSE)
profiling_num: Profiling numerical data
prep_outliers: Outliers Data Preparation
df_status: Get a summary for the given data frame (or vector)
mplot_roc: ROC Curve Plot
mplot_full: MPLOTS Score Full Report Plots
gg_text_customs: Custom colours to use in ggplot as scale_color_manual on texts
var_rank_info: Variable importance ranking based on information theory
mplot_splits: Split and compare quantiles plot
gg_fill_customs: Custom colours to use in ggplot as scale_fill_manual
gg_colour_customs: Custom colours to use in ggplot as scale_color_manual
mse: Mean Squared Error (MSE)
data_golf: Play golf
errors: Calculate Errors
mplot_cuts: Cuts by quantiles for score plot
export_plot: Export plot to jpeg file
mplot_cuts_error: Cuts by quantiles on absolute and percentage errors plot
mape: Mean Absolute Percentage Error (MAPE)
desc_groups: Profiling categorical variable
gain_ratio: Gain ratio
get_sample: Sampling training and test data
tukey_outlier: Tukey Outlier Threshold
theme_lares2: lares Theme for ggplot2
plot_num: Plotting numerical data
theme_lares: Theme for ggplot2
infor_magic: Computes several information theory metrics between two vectors
heart_disease: Heart Disease Data
scale_x_comma: Axis scales format
mplot_lineal: Linear Regression Results Plot
rsq: R Squared
mplot_metrics: AUC and LogLoss Plots
rsqa: Adjusted R Squared
concatenate_n_vars: Concatenate 'N' variables
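
Several entries in the list above name standard calculations: scaling to the [0-1] range, Tukey outlier thresholds, MAE and RMSE. As a rough illustration of what those names refer to, here is a minimal base R sketch. The `*_sketch` function names, the sample vectors, and the `k = 1.5` multiplier are assumptions made for this example. They are not the package's own implementations, and the package's functions may differ in arguments, defaults, and NA handling.

```r
# Illustrative base R versions of a few calculations named in the list above.
# All *_sketch names are hypothetical; they are NOT funModeling's functions.

# Rescale a numeric vector linearly so its minimum maps to 0 and its maximum to 1.
range01_sketch <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Tukey-style outlier thresholds: the quartiles widened by k times the IQR.
# k = 1.5 is the common boxplot convention and an assumption here.
tukey_thresholds_sketch <- function(x, k = 1.5) {
  q <- unname(quantile(x, probs = c(0.25, 0.75), na.rm = TRUE))
  iqr <- q[2] - q[1]
  c(bottom = q[1] - k * iqr, top = q[2] + k * iqr)
}

# Mean absolute error and root mean squared error between two numeric vectors.
mae_sketch <- function(actual, predicted) mean(abs(actual - predicted))
rmse_sketch <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Made-up sample data, for demonstration only.
x <- c(2, 4, 6, 8, 10)
scaled <- range01_sketch(x)
thresholds <- tukey_thresholds_sketch(x)

actual <- c(3, 5, 7)
predicted <- c(2, 5, 9)
err_mae <- mae_sketch(actual, predicted)
err_rmse <- rmse_sketch(actual, predicted)
```

For real work, the listed package functions (for example `range01`, `tukey_outlier`, `mae`, `rmse`) are the ones to consult; their help pages describe the exact arguments and return values.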