helda v1.1.3

Preprocess Data and Get Better Insights from Machine Learning Models

The package focuses on data preprocessing and on visualizing the performance of machine learning models. It includes functions that fill gaps in time series by linear interpolation on panel data, functions that draw the lift effect and lift curve to benchmark machine learning models, and functions that help find the optimal number of clusters for agglomerative clustering.
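To make the gap-filling idea concrete, here is a minimal base R sketch of linear interpolation applied group by group on panel data. It does not call helda: the data frame, the column names (country, year, population) and the helper fill_gaps are hypothetical, and the package's own functions may differ in interface and behavior.

```r
# Hypothetical panel: two countries observed yearly, with missing values (NA)
panel <- data.frame(
  country = rep(c("A", "B"), each = 4),
  year = rep(2015:2018, times = 2),
  population = c(100, NA, NA, 130, 50, NA, 54, 56)
)

# Illustrative helper (not part of helda): interpolate linearly between
# the observed points of a single group
fill_gaps <- function(df) {
  observed <- !is.na(df$population)
  df$population <- approx(x = df$year[observed],
                          y = df$population[observed],
                          xout = df$year)$y
  df
}

# Apply the helper to each country, then restore the original row order
filled <- unsplit(lapply(split(panel, panel$country), fill_gaps),
                  panel$country)

print(filled)
```

With its default settings, approx() only interpolates between observed points, so gaps at the start or end of a series would remain NA in this sketch.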

Readme

R package helda (HELpful functions for Data Analysis in R)

Overview

This package provides helper functions that simplify common data analysis tasks and save time.

Installation

You can install helda from CRAN by simply running:

install.packages("helda")

Development version

To get a bug fix, or use a feature from the development version, you can install helda from this GitHub repository.

# install.packages("devtools")
devtools::install_github("Redcart/helda")

Usage

This is a quick introduction to the lift curve function of the package:

library(helda)

data_training <- titanic_training
data_validation <- titanic_validation

model_glm <- glm(formula = Survived ~ Pclass + Sex + Age +
                   SibSp + Fare + Embarked,
                 data = data_training,
                 family = binomial(link = "logit"))

predictions <- predict(object = model_glm, 
                       newdata = data_validation, 
                       type = "response")

# Store the graph under a name that does not mask base R's plot()
lift_plot <- lift_curve(predictions = predictions,
                        true_labels = data_validation$Survived,
                        positive_label = 1)

lift_plot
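For readers who want to see what a lift curve summarizes, the sketch below computes one by hand in base R on toy data. It uses one common definition (share of positives captured when targeting the top-scored fraction of the population) and is not necessarily how lift_curve is implemented. The scores and labels vectors are made up for illustration.

```r
# Hypothetical toy data, not taken from the package
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1)
labels <- c(1, 1, 0, 1, 0, 0, 1, 0)

# Rank observations from highest to lowest predicted score
ord <- order(scores, decreasing = TRUE)
sorted_labels <- labels[ord]

# Share of the population targeted at each depth
population_share <- seq_along(sorted_labels) / length(sorted_labels)

# Share of all positives captured within that targeted population
captured_share <- cumsum(sorted_labels == 1) / sum(sorted_labels == 1)

# Lift: improvement over random targeting at the same depth
lift <- captured_share / population_share

gains <- data.frame(population_share, captured_share, lift)
print(gains)
```

A model with no predictive power would give a lift close to 1 at every depth, while a useful model shows lift well above 1 for the top-scored observations.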

Getting help

If you encounter a clear bug, please file an issue with a minimal reproducible example in the issues section of the repository.

Author

Simon Corde

Functions in helda

Name                    Description
titanic_training        Titanic training data set
titanic_validation      Titanic validation data set
start_end_to_fill       Function for filling start and end gaps in time series
titanic_testing         Titanic testing data set
world_countries_pop     World countries population from 1960 to 2018
windows_to_linux_path   Convert a Windows path into a Linux path
lift_curve              Lift curve graph
create_calendar         Complete empty calendar
compute_inertia_ahc     Intra-group inertia for choosing the optimal number of clusters in agglomerative clustering
create_formula          Create a formula
cluster_centroid        Centroid of a cluster
proc_freq               SAS proc freq in R
lift_effect             Lift effect curve
kmeans_procedure        K-means procedure
gap_to_fill             Filling intermediate gaps in a time series
compute_global_inertia  Inertia of a data frame
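The clustering entries above revolve around intra-group inertia. As a rough illustration of that idea, the following base R sketch computes within-cluster inertia for several cuts of a Ward agglomerative tree. It relies only on stats functions (hclust, cutree, dist) and the built-in iris data. The helper within_inertia is hypothetical and is not the package's compute_inertia_ahc, whose arguments are not shown in this document.

```r
# Standardize the four numeric columns of the built-in iris data set
x <- scale(iris[, 1:4])

# Ward agglomerative clustering on Euclidean distances
tree <- hclust(dist(x), method = "ward.D2")

# Illustrative helper (not part of helda): sum of squared distances
# from each observation to the centroid of its cluster
within_inertia <- function(data, groups) {
  sum(sapply(split(as.data.frame(data), groups), function(g) {
    centroid <- colMeans(g)
    sum(sweep(as.matrix(g), 2, centroid)^2)
  }))
}

# Inertia for 1 to 8 clusters obtained by cutting the same tree
ks <- 1:8
inertia <- sapply(ks, function(k) within_inertia(x, cutree(tree, k = k)))

print(data.frame(clusters = ks, inertia = inertia))
```

A usual heuristic is to pick the number of clusters at the "elbow" of this sequence, where adding one more cluster stops reducing the inertia by much.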
Details

License GPL-3
Encoding UTF-8
LazyData true
URL https://www.github.com/Redcart/helda
BugReports https://github.com/Redcart/helda/issues
RoxygenNote 7.1.0
NeedsCompilation no
Packaged 2020-06-13 09:37:38 UTC; simon
Repository CRAN
Date/Publication 2020-06-13 12:50:03 UTC
