⚠️ There's a newer version (5.2.13) of this package.

lares

R Package for Analytics and Machine Learning

lares is a library designed to automate, improve, and speed up everyday Analysis and Machine Learning tasks. With a wide variety of function families covering Machine Learning, Data Wrangling, EDA, and Scrapers, lares helps the analyst or data scientist get quick, reproducible, and robust results without repetitive coding or extensive programming skills.

You are most welcome to install, use, and/or comment on any of the code and functionalities. If you are colour blind as well, I'm glad to share my colour palettes! Feel free to contact me via LinkedIn, and please do let me know where you got my contact from.

Installation

# install.packages('devtools')
devtools::install_github("laresbernardo/lares")

CRAN NOTE: I currently don't plan to submit the library to CRAN, even though it passes all its quality tests (and I'm a huge fan). I think lares is more of an everyday useful package than a "specialized for a specific task" library. It has too many useful and varied kinds of functions, from NLP to querying APIs to plotting Machine Learning results to stock market and portfolio reports. I gladly share my code with the community and encourage you to use/comment/share it, but I strongly think that CRAN is not aiming for this kind of library in its repertoire.

See the library in action!

AutoML Simplified Map from h2o_automl()

Insights While Understanding

To get insights and value out of your dataset, you first need to understand its structure, types of data, empty values, interactions between variables... corr_cross() and freqs() are here to give you just that! They show a wide perspective of your dataset's content, correlations, and frequencies. Additionally, with missingness() to detect all missing values and df_str() to break down your data frame's structure, you will be ready to squeeze valuable insights out of your data.
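As a rough illustration of the kind of summary described above, here is a minimal base R sketch. It does not call lares and is not how the package implements these functions; the toy data frame `df` and the helpers `count_missing()` and `count_freqs()` are hypothetical names made up for this example.

```r
# Toy data frame with some missing values (hypothetical data)
df <- data.frame(
  class = c("a", "b", "a", NA, "a", "b"),
  age   = c(22, NA, 35, 41, NA, 29),
  stringsAsFactors = FALSE
)

# Missing values per column, as counts and percentages
count_missing <- function(data) {
  n_missing <- colSums(is.na(data))
  data.frame(
    variable = names(n_missing),
    missing = as.integer(n_missing),
    pct = round(100 * as.numeric(n_missing) / nrow(data), 2),
    stringsAsFactors = FALSE
  )
}

# Frequencies of one variable, sorted from most to least common
count_freqs <- function(x) {
  counts <- sort(table(x, useNA = "ifany"), decreasing = TRUE)
  data.frame(
    value = names(counts),
    n = as.integer(counts),
    pct = round(100 * as.integer(counts) / length(x), 2),
    stringsAsFactors = FALSE
  )
}

miss <- count_missing(df)
freq <- count_freqs(df$class)
str(df)      # column types and a preview of the values
print(miss)
print(freq)
```

The package functions add plots and more detail on top of this kind of table; the sketch only shows the underlying counting idea.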

Kings of Data Mining

My favourite and most used functions are freqs(), distr(), and corr_var(). In this RMarkdown you can see them in action. Basically, they group and count values within variables, show distributions of one variable vs another one (numerical or categorical), and calculate/plot correlations of one variable vs all others, no matter what type of data you insert.
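To make the "one variable vs all others, no matter the type" idea concrete, here is a small base R sketch. It is an assumption about the general approach (expand categorical columns into 0/1 columns, then correlate), not the code lares uses; the data and the helper `correlate_with()` are hypothetical.

```r
# Toy data (hypothetical): a numeric target plus mixed-type predictors
df <- data.frame(
  price = c(10, 12, 15, 11, 20, 18, 25, 22),
  size  = c(1, 2, 3, 1, 5, 4, 7, 6),
  type  = factor(c("a", "a", "b", "a", "b", "b", "c", "c"))
)

# Correlate `target` with every other column, expanding factors to dummies
correlate_with <- function(data, target) {
  predictors <- data[, setdiff(names(data), target), drop = FALSE]
  # model.matrix() turns factors into 0/1 columns; "- 1" drops the intercept
  # so that the first factor keeps all of its levels
  numeric_view <- model.matrix(~ . - 1, data = predictors)
  cors <- cor(numeric_view, data[[target]])[, 1]
  sort(cors, decreasing = TRUE)
}

cors <- correlate_with(df, "price")
print(round(cors, 3))
```

Each level of `type` gets its own correlation with `price`, which is what lets a single ranked list mix numerical and categorical variables.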

If there is space for one more, I would add ohse() (One Hot Smart Encoding), which has made my life much easier and my work much more valuable. It converts a whole data frame into numerical values by making dummy variables (categoricals turned into new columns with 1s and 0s, ordered by frequencies and grouping the less frequent ones into a single column) and dates into new features (such as month, year, week of the year, minutes if time is present, holidays given a country, currency exchange rates, etc).
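The encoding idea can be sketched in a few lines of base R. This is only an illustration under simplifying assumptions, not the ohse() implementation: the helpers `one_hot()` and `date_features()`, the `top` argument, and the `_other` column name are all hypothetical, and only three calendar features are derived.

```r
# Toy data (hypothetical): one categorical column and one date column
df <- data.frame(
  city = c("x", "x", "x", "y", "y", "z", "w"),
  day  = as.Date(c("2019-01-15", "2019-02-20", "2019-03-05",
                   "2019-03-06", "2019-07-04", "2019-12-25", "2020-01-01")),
  stringsAsFactors = FALSE
)

# One-hot encode a character vector: keep the `top` most frequent values,
# ordered by frequency, and pool every other value into an "other" column
one_hot <- function(x, prefix, top = 2) {
  counts <- sort(table(x), decreasing = TRUE)
  keep <- names(counts)[seq_len(min(top, length(counts)))]
  pooled <- ifelse(x %in% keep, x, "other")
  levels_out <- c(keep, if (any(pooled == "other")) "other")
  out <- sapply(levels_out, function(lv) as.integer(pooled == lv))
  colnames(out) <- paste(prefix, levels_out, sep = "_")
  as.data.frame(out)
}

# Simple calendar features from a Date vector
date_features <- function(d, prefix) {
  out <- data.frame(
    year  = as.integer(format(d, "%Y")),
    month = as.integer(format(d, "%m")),
    week  = as.integer(format(d, "%V"))  # ISO 8601 week of the year
  )
  names(out) <- paste(prefix, names(out), sep = "_")
  out
}

encoded <- cbind(one_hot(df$city, "city"), date_features(df$day, "day"))
print(encoded)
```

Pooling rare values keeps the number of new columns bounded, which matters when a categorical variable has many infrequent levels.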

What else is there?

You can type lares:: in RStudio and you will see a pop-up with all the functions that are currently available within the package. If in doubt, you can use the ? function (e.g. ?lares::function) and the Help tab will display a short explanation of each function and its parameters. If you want to check all the documentation, simply run help(package = lares). You can also find functions from the same family in the See Also section of each documentation page.

Getting further help

If you encounter a clear bug, please share a reproducible example on GitHub and I'll take care of it. For inquiries and other matters, you can email me directly.

Install

install.packages('lares')

Monthly Downloads

3,658

Version

4.7

License

AGPL-3

Maintainer

Bernardo Lares

Last Published

February 19th, 2025

Functions in lares (4.7)

clusterKmeans

K-Means Clustering Automated
conf_mat

Confusion Matrix
balance_data

Balance Binary Data by Resampling: Under-Over Sampling
bindfiles

Bind Files into Dataframe
categ_reducer

Reduce categorical values
cleanText

Clean text
calibrate

Calibrate Sampling Scores
bring_api

Get API (JSON) and Transform into data.frame
daily_stocks

Daily Stocks Dataframe
ROC

ROC Curves
dalex_explainer

DALEX Explainer
autoline

New Line Feed for Long Character Strings
corr

Correlation table
dalex_local

DALEX Local
dateformat

Transform any date input into Date
db_download

Download Dropbox File by File's Name
df_str

Dataset columns and rows structure
corr_plot

Correlation plot
crosstab

Weighted Cross Tabulation
daily_portfolio

Daily Portfolio Dataframe
db_upload

Upload Dropbox File
corr_var

Correlation between variable and dataframe
dalex_variable

DALEX Partial Dependency Plots (PDP)
gain_lift

Cumulative Gain, Lift and Response
deg2num

Convert from degrees to numeric coordinates
freqs_df

All Frequencies on Data Frame
date_feats

One Hot Encoding for Date/Time Variables (Dummy Variables)
corr_cross

Correlation Cross-Table
export_results

Export h2o_automl's Results
dalex_residuals

DALEX Residuals
dft

Dataset: Titanic Sub-dataset for Examples
dist2d

Distance from specific point to line
export_plot

Install latest version of H2O
freqs

Frequencies Calculations and Plot
etf_sector

ETF's Sectors Breakdown
gg_colour_customs

Custom colours to use in ggplot as scale_color_manual
formatTime

Auto Detect Time-Date Format
gg_fill_customs

Custom colours to use in ggplot as scale_fill_manual
forecast_ml

Machine Learning Forecast
formatNum

Nicely Format Numerical Values
f1_contacts

Hubspot contacts (Somos F1)
distr

Compare Variables with their Distributions
errors

Calculate Errors
fb_insights

Facebook Insights API
geoAddress

Get Google's Geodata given the Addresses
geoGrid

Check, Cross, and Plot Coordinates with Polygons
dfl

Dataset: Random Data for Examples
fb_post

Get Facebook's Post Comments (API Graph)
gg_pie

Density plot for discrete and continuous values
gg_bars

Quick Nice Bar Plot
get_tweets

Get Tweets
h2o_predict_model

H2O Predict using H2O Model Object
gg_text_customs

Custom colours to use in ggplot as scale_color_manual on texts
h2o_predict_MOJO

H2O Predict using MOJO file
haveInternet

Internet Connection Check
h2o_update

Install latest version of H2O
lares-exports

Pipe operator
importxlsx

Import Excel File with All Its Tabs
holidays

Holidays in your Country
li_auth

OAuth Linkedin
fb_ads

Facebook Ads API
fb_accounts

Facebook Ad Accounts
fb_posts

Get Facebook's Page Posts (API Graph)
lares

Analytics, Visualization & Machine Learning Tasks Library
h2o_selectmodel

Select Model from h2o_automl's Leaderboard
geoMap

Plot Map or Shapefile
h2o_predict_binary

H2O Predict using Binary file
json2vector

Convert JSON string to vector (data.frame with 1 row)
iter_seeds

Iterate and Search for Best Seed
forecast_arima

ARIMA Forecast
get_credentials

Load personal parameters and credentials
li_profile

Get My Personal LinkedIn Data
mplot_cuts

Cuts by quantiles for score plot
impute

Impute Missing Values (using MICE)
mplot_conf

Confusion Matrix Plot
mape

Mean Absolute Percentage Error (MAPE)
geoStratum

Get Colombia's Stratum given the Coordinates
missingness

Calculate and Visualize Missingness
model_metrics

Model Metrics and Performance
ip_country

Find country from a given IP
listfiles

List files in a directory
get_currency

Download Historical Currency Exchange Rate
mplot_importance

Variables Importances Plot
mplot_roc

ROC Curve Plot
rbind_full

Smart rbind
mplot_lineal

Linear Regression Results Plot
quiet

Quiet prints and verbose noise
stocks_file

Get Personal Portfolio's Data
stocks_hist

Download Stocks Historical Data
mse

Mean Squared Error (MSE)
mplot_splits

Split and compare quantiles plot
msplit

Split a dataframe for training and testing sets
plot_survey

Visualize Survey Results
loglossBinary

Logarithmic Loss Function for Binary Models
h2o_automl

Automated H2O's AutoML
one_hot_encoding_commas

One Hot Encoding for a Vector with Comma Separated Values
mplot_response

Cumulative Response Plot
ohse

One Hot Smart Encoding (Dummy Variables)
mplot_metrics

AUC and LogLoss Plots
mplot_density

Density plot for discrete and continuous values
matrixwd

MatrixDS Auto Working Directory for Shiny
mplot_cuts_error

Cuts by quantiles on absolute and percentage errors plot
mplot_full

MPLOTS Score Full Report Plots
plot_palette

Plot Palette Colours
plot_nums

Plot All Numerical Features (Boxplots)
removenacols

Remove/Drop Columns in which ALL or SOME values are NAs
removenarows

Remove/Drop Rows in which ALL or SOME values are NAs
myip

What's my IP
queryDW

PostgreSQL Queries on Redshift Database (read-write)
noPlot

Plot Result with Nothing to Plot
plot_timeline

Density plot for discrete and continuous values
splot_etf

Portfolio's Sector Distribution (ETFs)
rsq

R Squared
rmse

Root Mean Squared Error (RMSE)
mplot_gain

Cumulative Gain Plot
pass

Pass Through a dplyr's Pipeline
splot_types

Portfolio Plots: Types of Stocks
lares_pal

Personal Colours Palette
h2o_predict_API

H2O Predict using API Service
left

Left: First n characters
splot_growth

Portfolio Plots: Growth (Cash + Invested)
plot_cats

Plot All Categorical Features (Frequencies)
prophesize

Facebook's Prophet Forecast
read.file

Read Files Quickly (Auto-detected)
quants

Calculate cuts by quantiles
readGS

Google Sheets Reading
splot_summary

Portfolio Plots: Total Summary
splot_roi

Portfolio Plots: Daily ROI
textCloud

Wordcloud Plot
textFeats

Create features out of text
year_month

Convert Date into Year-Month (YYYY-MM)
writeGS

Google Sheets Writing
stocks_obj

Portfolio's Calculations and Plots
try_require

Check if Specific Package is Installed
statusbar

Progressive Status Bar (Domino)
numericalonly

Filter only Numerical Values and
normalize

Normalize Vector
mae

Mean Absolute Error (MAE)
plot_df

Plot Summary of Numerical and Categorical Features
plot_chord

Chords Plot
mailSend

Send Emails with Attachments (POST)
stocks_report

Portfolio's Full Report and Email
replaceall

Replace Values With
vector2text

Convert a vector into a comma separated text
updateLares

Update the library
typeform_download

Download typeform data
right

Right: Last n characters
queryDummy

PostgreSQL Queries on Dummy Database (read only)
queryGA

Queries on Google Analytics
queryProduc

PostgreSQL Queries on Production Database (read-write)
rsqa

Adjusted R Squared
scale_x_comma

Axis scales format
trendsTime

Google Trends: Timelines Plot
zerovar

Zero Variance Columns
trendsRelated

Google Trends: Related Plot
year_week

Convert Date into Year-Week (YYYY-WW)
splot_change

Portfolio Plots: Daily Change
sentimentBreakdown

Sentiment Breakdown on Text
textTokenizer

Tokenize Vectors into Words
theme_lares2

lares Theme for ggplot2
theme_lares

Theme for ggplot2
tree_var

Recursive Partitioning and Regression Trees