helda v1.1.3
Preprocess Data and Get Better Insights from Machine Learning Models
The package focuses on data preprocessing and on visualizing the performance of machine learning models.
Some functions fill gaps in time series using linear interpolation on panel data, others draw lift effect
and lift curves to benchmark machine learning models, and others help find the optimal number of clusters
for agglomerative clustering.
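To make the gap-filling idea concrete, here is a minimal base R sketch of linear interpolation applied separately to each unit of a panel. It does not call helda's own gap-filling functions, whose signatures are not shown on this page; the toy data and the helper `fill_linear` are assumptions made purely for illustration.

```r
# Toy panel (hypothetical data): two countries, yearly values with gaps (NA).
panel <- data.frame(
  country = rep(c("A", "B"), each = 5),
  year    = rep(2000:2004, times = 2),
  value   = c(10, NA, NA, 16, 18,
              5, 6, NA, 8, NA)
)

# Hypothetical helper: interpolate one series linearly between known points.
# rule = 1 means no extrapolation, so leading and trailing gaps stay NA.
fill_linear <- function(x, y) {
  known <- !is.na(y)
  if (sum(known) < 2) return(y)  # not enough points to interpolate
  approx(x = x[known], y = y[known], xout = x, rule = 1)$y
}

# Interpolate within each panel unit so values never leak across countries.
panel$value_filled <- panel$value
for (idx in split(seq_len(nrow(panel)), panel$country)) {
  panel$value_filled[idx] <- fill_linear(panel$year[idx], panel$value[idx])
}

panel
```

Interior gaps are filled along the straight line between their neighbours, while the trailing gap for country B is left as NA because this sketch deliberately does not extrapolate.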
R package helda (HELpful functions for Data Analysis in R)
Overview
This package provides helper functions that make data analysis easier and faster.
Installation
You can install helda from CRAN by simply running:
install.packages("helda")
Development version
To get a bug fix, or use a feature from the development version, you can install helda from this GitHub repository.
# install.packages("devtools")
devtools::install_github("Redcart/helda")
Usage
This is a quick introduction to the lift curve function of the package:
library(helda)
data_training <- titanic_training
data_validation <- titanic_validation
# Fit a logistic regression on the training set
model_glm <- glm(formula = Survived ~ Pclass + Sex + Age +
                   SibSp + Fare + Embarked,
                 data = data_training,
                 family = binomial(link = "logit"))
# Predicted survival probabilities on the validation set
predictions <- predict(object = model_glm,
                       newdata = data_validation,
                       type = "response")
# Named lift_plot so that base R's plot() is not masked
lift_plot <- lift_curve(predictions = predictions,
                        true_labels = data_validation$Survived,
                        positive_label = 1)
lift_plot
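
For readers who want to see what a lift curve measures, the following base R sketch computes lift by hand on synthetic data. It uses one common definition (the positive rate among the top-scored share of observations divided by the overall positive rate), which is not necessarily the exact computation helda performs. The function `compute_lift` and the simulated scores are assumptions made for illustration.

```r
# Synthetic scores and labels (hypothetical data); fixed seed for reproducibility.
set.seed(42)
n      <- 1000
scores <- runif(n)
labels <- rbinom(n, size = 1, prob = scores)  # higher score, more positives

# Hypothetical helper: lift at each depth is the positive rate among the
# top-scored share of observations divided by the overall positive rate.
compute_lift <- function(scores, labels, positive_label = 1,
                         depths = seq(0.1, 1, by = 0.1)) {
  is_positive  <- labels[order(scores, decreasing = TRUE)] == positive_label
  overall_rate <- mean(is_positive)
  lift <- vapply(depths, function(d) {
    top_n <- max(1, round(d * length(is_positive)))
    mean(is_positive[seq_len(top_n)]) / overall_rate
  }, numeric(1))
  data.frame(depth = depths, lift = lift)
}

lift_table <- compute_lift(scores, labels)
lift_table
```

A model with no predictive power would show a lift close to 1 at every depth. At depth 1 the lift equals 1 by construction, because the whole sample is selected.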

Getting help
If you encounter a clear bug, please file a minimal reproducible example in the issues section of the repository.
Author
Simon Corde
Functions in helda
| Name | Description |
| titanic_training | Titanic training data set |
| titanic_validation | Titanic validation data set |
| start_end_to_fill | Function for filling start and end gaps in time series |
| titanic_testing | Titanic testing data set |
| world_countries_pop | World countries population from 1960 to 2018 |
| windows_to_linux_path | Convert a Windows path into a Linux path |
| lift_curve | Lift curve graph |
| create_calendar | Complete empty calendar |
| compute_inertia_ahc | Intra-group inertia for choosing the optimal number of clusters in agglomerative clustering |
| create_formula | Create a formula |
| cluster_centroid | Centroid of a cluster |
| proc_freq | SAS proc freq in R |
| lift_effect | Lift effect curve |
| kmeans_procedure | K-means procedure |
| gap_to_fill | Filling intermediate gaps in a time series |
| compute_global_inertia | Inertia of a data frame |
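
The intra-group inertia criterion mentioned for agglomerative clustering can be sketched in base R as follows. This is an illustration built on `hclust` and `cutree` from the stats package, not helda's implementation; the helper `within_inertia`, the Ward linkage, and the use of the built-in `iris` data are all assumptions.

```r
# Example data: the four numeric columns of the built-in iris data set, scaled.
x <- scale(iris[, 1:4])

# Agglomerative clustering with Ward linkage on Euclidean distances.
tree <- hclust(dist(x), method = "ward.D2")

# Hypothetical helper: within-group inertia is the sum of squared distances
# from each observation to the centroid of its cluster.
within_inertia <- function(x, groups) {
  sum(vapply(split(as.data.frame(x), groups), function(g) {
    sum(scale(g, center = TRUE, scale = FALSE)^2)
  }, numeric(1)))
}

# Inertia for 1 to 8 clusters; look for the "elbow" where gains level off.
ks      <- 1:8
inertia <- vapply(ks, function(k) within_inertia(x, cutree(tree, k = k)),
                  numeric(1))
data.frame(k = ks, inertia = inertia)
```

Because the partitions obtained by cutting one tree are nested, the inertia cannot increase as the number of clusters grows. The number of clusters is usually picked where adding one more cluster stops reducing the inertia substantially.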
Details
| License | GPL-3 |
| Encoding | UTF-8 |
| LazyData | true |
| URL | https://www.github.com/Redcart/helda |
| BugReports | https://github.com/Redcart/helda/issues |
| RoxygenNote | 7.1.0 |
| NeedsCompilation | no |
| Packaged | 2020-06-13 09:37:38 UTC; simon |
| Repository | CRAN |
| Date/Publication | 2020-06-13 12:50:03 UTC |
| Depends | R (>= 3.5.0) |
| Suggests | covr (>= 3.4.0), devtools (>= 2.2.1), testthat (>= 2.1.0) |
| Imports | dplyr (>= 0.7.8), ggplot2 (>= 3.2.0), rlang (>= 0.4.2), sqldf (>= 0.4-11), stats (>= 3.5.0), stringr (>= 1.3.1) |