⚠️ There's a newer version (5.2.13) of this package.

lares

R Package for Analytics and Machine Learning

R library designed to automate, improve, and speed up everyday Analysis and Machine Learning tasks. With a wide variety of function families such as Machine Learning, EDA, Investment, Queries, Scrapers, and APIs, lares helps the analyst or data scientist get quick, reproducible, and robust results without repetitive coding or extensive programming skills. Feel free to install, use, and/or comment on any of the code and functionalities. Oh, and if you are also colourblind, be sure to check the colour palettes!

Don't hesitate to contact me, and please do let me know where you first heard about the library and which family of functions you are most interested in.

Installation

# install.packages('devtools')
devtools::install_github("laresbernardo/lares")

# Full installation with recommended dependencies (takes some time)
devtools::install_github("laresbernardo/lares", dependencies = TRUE)

# User friendly update
lares::updateLares()

CRAN NOTE: I currently don't plan to submit the library to CRAN, even though I'm a huge fan and it passes all its quality tests. I see lares as more of an everyday, useful, shareable package than a "specialized for a specific task" library. It has too many different kinds of functions, from NLP to querying APIs to plotting Machine Learning results to stock market and portfolio reports. I gladly share my code with the community and encourage you to use/comment/share it, but I do think that CRAN is not aiming for this kind of library in its repertoire.

See the library in action!

AutoML Simplified Map from h2o_automl()

Insights While Understanding

To get insights and value out of your dataset, you first need to understand its structure, types of data, empty values, interactions between variables... corr_cross() and freqs() are here to give you just that! They show a wide perspective of your dataset's content, correlations, and frequencies. Additionally, with the missingness() function to detect all missing values and df_str() to break down your data frame's structure, you will be ready to squeeze valuable insights out of your data.
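To make the idea of a missingness summary concrete, here is a minimal base R sketch of the kind of per-column count that a function like missingness() reports. It is not the package's implementation: the toy data frame and the column names of the summary (variable, missing, missing_pct) are assumptions made for illustration.

# Toy data frame (hypothetical values, for illustration only)
df <- data.frame(
  age  = c(22, NA, 35, 41, NA),
  sex  = c("f", "m", NA, "m", "f"),
  fare = c(7.25, 71.28, 8.05, 53.10, 8.46)
)

# Count missing values per column and express them as a share of rows
missing_count <- colSums(is.na(df))
missing_summary <- data.frame(
  variable    = names(missing_count),
  missing     = as.integer(missing_count),
  missing_pct = round(100 * as.numeric(missing_count) / nrow(df), 1)
)

# Most incomplete columns first
missing_summary <- missing_summary[order(-missing_summary$missing), ]
print(missing_summary)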

Kings of Data Mining

My favourite and most used functions are freqs(), distr(), and corr_var(). In this RMarkdown you can see them in action. Basically, they group and count values within variables, show distributions of one variable vs another one (numerical or categorical), and calculate/plot correlations of one variable vs all others, no matter what type of data you insert.

If there is space for one more, I would add ohse() (One Hot Smart Encoding), which has made my life much easier and my work much more valuable. It converts a whole data frame into numerical values by making dummy variables (categoricals turned into new columns with 1s and 0s, ordered by frequencies and grouping less frequent values into a single column) and turning dates into new features (such as month, year, week of the year, minutes if time is present, holidays given a country, currency exchange rates, etc).
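As a rough illustration of the "group and count" idea behind freqs(), the following base R sketch builds a frequency table with counts, percentages, and cumulative percentages. It only approximates the concept; the toy vector and the column names (value, n, p, pcum) are assumptions, not the package's guaranteed output.

# Toy categorical variable (hypothetical values)
pclass <- c(3, 1, 3, 3, 2, 1, 3, 2)

# Count each distinct value, most frequent first
counts <- sort(table(pclass), decreasing = TRUE)

# Assemble counts, percentages, and cumulative percentages
freq_table <- data.frame(
  value = names(counts),
  n     = as.integer(counts),
  p     = round(100 * as.integer(counts) / length(pclass), 2)
)
freq_table$pcum <- cumsum(freq_table$p)
print(freq_table)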
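The dummy-variable part of that description can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not how ohse() is implemented: the toy vector, the limit variable (how many frequent levels to keep), and the "other" label for the grouped column are all hypothetical.

# Toy categorical variable (hypothetical values)
embarked <- c("S", "C", "S", "Q", "S", "C", "S", "X")
limit <- 2  # assumption: keep only the 2 most frequent levels

# Rank levels by frequency and keep the top ones
counts <- sort(table(embarked), decreasing = TRUE)
keep <- names(counts)[seq_len(min(limit, length(counts)))]

# Group every less frequent level into a single "other" bucket
grouped <- ifelse(embarked %in% keep, embarked, "other")
levels_ordered <- c(keep, if (any(grouped == "other")) "other")

# One 0/1 column per level, ordered by frequency
dummies <- sapply(levels_ordered, function(lv) as.integer(grouped == lv))
colnames(dummies) <- paste0("embarked_", levels_ordered)
print(dummies)

Each row has exactly one 1, so the encoded columns can be fed to models that only accept numerical inputs.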

What else is there?

You can check all active functions and their documentation here, or type lares:: in RStudio and you will get a pop-up with all the functions that are currently available within the package. You might also want to check the whole documentation by running help(package = "lares") in RStudio or by visiting the Online Official Documentation. Remember to check the families and similar functions in the See Also sections as well.

Getting further help

If you need help with any of the functions when using RStudio, use the ? function (for example, ?lares::function) and the Help tab will display a short explanation of each function and its parameters. You might also be interested in the Online Official Documentation to check all functions and parameters.

If you encounter a bug, please share a reproducible example with me on GitHub issues and I'll take care of it. For inquiries and other matters, you can LinkedIn me anytime!

Install

install.packages('lares')

Monthly Downloads: 3,658
Version: 4.8.4
License: AGPL-3
Maintainer: Bernardo Lares
Last Published: February 19th, 2025

Functions in lares (4.8.4)

cleanText: Clean text
bindfiles: Bind Files into Dataframe
check_opts: Validate options within vector
ROC: AUC and ROC Curves Data
check_attr: Attribute checker
autoline: New Line Feed for Long Character Strings
balance_data: Balance Binary Data by Resampling: Under-Over Sampling
bring_api: Get API (JSON) and Transform into data.frame
clusterKmeans: Automated K-Means Clustering + PCA
categ_reducer: Reduce categorical values
corr_cross: Ranked Cross-Correlation
corr: Correlation table
conf_mat: Confusion Matrix
corr_var: Correlation between variable and dataframe
daily_stocks: Daily Stocks Dataframe
dalex_explainer: DALEX Explainer for H2O
db_download: Download Dropbox File by File's Name
db_upload: Upload Dropbox File
dfl: Dataset: Random Transactional Data
dfr: Dataset: Results for AutoML Predictions
date_feats: One Hot Encoding for Date/Time Variables (Dummy Variables)
dateformat: Transform any date input into Date
dalex_residuals: DALEX Residuals
dalex_variable: DALEX Partial Dependency Plots (PDP)
dalex_local: DALEX Local
daily_portfolio: Daily Portfolio Dataframe
crosstab: Weighted Cross Tabulation
deg2num: Convert from degrees to numeric coordinates [Deprecated]
date_cuts: Convert Date into Year + Cut
f1_contacts: Hubspot contacts (Somos F1)
export_results: Export h2o_automl's Results
df_str: Dataset columns and rows structure
dft: Dataset: Titanic Sub-dataset for Examples
export_plot: Install latest version of H2O
etf_sector: ETF's Sectors Breakdown
dist2d: Distance from specific point to line
errors: Calculate Continuous Values Errors
distr: Compare Variables with their Distributions
fb_insights: Facebook Insights API
fb_creatives: Facebook Creatives API
fb_accounts: Facebook Ad Accounts
fb_ads: Facebook Ads API
formatNum: Nicely Format Numerical Values
forecast_arima: ARIMA Forecast
freqs: Frequencies Calculations and Plot
formatTime: Auto Detect Time-Date Format
fb_post: Get Facebook's Post Comments (API Graph)
fb_posts: Get Facebook's Page Posts (API Graph)
freqs_list: Frequencies on Lists and UpSet Plot
freqs_df: Plot for All Frequencies on Dataframe
geoAddress: Get Google's Geodata given the Addresses
geoGrid: Check, Cross, and Plot Coordinates with Polygons
geoMap: Plot Map or Shapefile
geoStratum: Get Colombia's Stratum given the Coordinates
flatten_list: Flatten lists into data.frame
font_exists: Check if Font is Installed
freqs_plot: Combined Frequencies Plot for Categorical Features
get_mp3: Download MP3 from URL
gain_lift: Cumulative Gain, Lift and Response
get_tweets: Get Tweets
h2o_automl: Automated H2O's AutoML
image_metadata: Get Meta Data from Image Files
importxlsx: Import Excel File with All Its Tabs
h2o_predict_API: H2O Predict using API Service
gg_colour_customs: Custom colours for scale_color_manual [Deprecated]
h2o_results: Automated H2O's AutoML Results
gg_bars: Quick Nice Bar Plot
h2o_predict_model: H2O Predict using H2O Model Object
gg_fill_customs: Custom colours for scale_fill_manual [Deprecated]
gg_pie: Density plot for discrete and continuous values
h2o_selectmodel: Select Model from h2o_automl's Leaderboard
grepl_letters: Pattern Matching for Letters considering Blanks
gg_text_customs: Custom colours for scale_color_manual on texts [Deprecated]
ip_country: Find country from a given IP
get_currency: Download Historical Currency Exchange Rate
holidays: Holidays in your Country
haveInternet: Internet Connection Check
get_credentials: Load Credentials from a YML File
impute: Impute Missing Values (using MICE)
h2o_update: Install latest version of H2O
h2o_predict_MOJO: H2O Predict using MOJO file
h2o_predict_binary: H2O Predict using Binary file
model_metrics: Model Metrics and Performance
lares: Analytics, Visualization & Machine Learning Tasks Library
lares_pal: Personal Colours Palette
mplot_conf: Confusion Matrix Plot
list_cats: List categorical values for data.frame
iter_seeds: Iterate and Search for Best Seed
list_fun_file: List all functions used in an R script file by package
json2vector: Convert Python JSON string to R vector (data.frame with 1 row)
listfiles: List files in a directory
lares-exports: Pipe operator
install_recommended: Install/Update Additional Recommended Libraries
loglossBinary: Logarithmic Loss Function for Binary Models
lasso_vars: Most Relevant Features Using Lasso Regression
li_auth: OAuth Linkedin
mplot_importance: Variables Importances Plot
li_profile: Get My Personal LinkedIn Data
left: Left: First n characters
mplot_roc: ROC Curve Plot
mplot_splits: Split and compare quantiles plot
plot_timeline: Plot timeline as Gantt Plot
mplot_full: MPLOTS Score Full Report Plots
plot_survey: Visualize Survey Results
mplot_gain: Cumulative Gain Plot
mailSend: Send Emails with Attachments (POST)
mplot_cuts: Cuts by quantiles for score plot
noPlot: Plot Result with Nothing to Plot
msplit: Split a dataframe for training and testing sets
move_files: Move files from A to B
queryDB: PostgreSQL Queries on Database (Read)
pass: Pass Through a dplyr's Pipeline
plot_cats: Plot All Categorical Features (Frequencies)
queryGA: Queries on Google Analytics
normalize: Normalize Vector
mplot_metrics: Model Metrics and Performance Plots
mplot_response: Cumulative Response Plot
right: Right: Last n characters
mplot_lineal: Linear Regression Results Plot
missingness: Calculate and Visualize Missingness
scrabble_dictionary: Scrabble: Dictionaries
splot_roi: Portfolio Plots: Daily ROI
plot_df: Plot Summary of Numerical and Categorical Features
slackSend: Send Slack Message (Webhook)
rbind_full: Smart rbind
splot_growth: Portfolio Plots: Growth (Cash + Invested)
quiet: Quiet prints and verbose noise
plot_chord: Chords Plot
sentimentBreakdown: Sentiment Breakdown on Text
summer: Sum Calculations and Plot
mplot_cuts_error: Cuts by quantiles on absolute and percentage errors plot
scale_x_comma: Axis scales format
stocks_quote: Download Stocks Historical Data
topics_rake: Keyword/Topic identification using RAKE
try_require: Check if Specific Package is Installed
stocks_report: Portfolio's Full Report and Email
tree_var: Recursive Partitioning and Regression Trees
scrabble_points: Scrabble: Tiles Points
updateLares: Update the library
target_set: Set Target Value in Target Variable
ohse: One Hot Smart Encoding (Dummy Variables)
ohe_commas: One Hot Encoding for a Vector with Comma Separated Values
mplot_density: Density plot for discrete and continuous values
num_abbr: Abbreviate numbers
stocks_hist: Download Stocks Historical Data
splot_summary: Portfolio Plots: Total Summary
read.file: Read Files Quickly (Auto-detected)
prophesize: Facebook's Prophet Forecast
textTokenizer: Tokenize Vectors into Words
splot_types: Portfolio Plots: Types of Stocks
readGS: Google Sheets Reading (API v4)
myip: What's my IP
quants: Calculate cuts by quantiles
theme_lares: Old theme for ggplot2 [Deprecated]
stocks_obj: Portfolio's Calculations and Plots
textCloud: Wordcloud Plot
plot_nums: Plot All Numerical Features (Boxplots)
numericalonly: Filter only Numerical Values and
plot_palette: Plot Palette Colours
removenacols: Remove/Drop Columns in which ALL or SOME values are NAs
splot_etf: Portfolio's Sector Distribution (ETFs)
removenarows: Remove/Drop Rows in which ALL or SOME values are NAs
splot_change: Portfolio Plots: Daily Change
year_week: Convert Date into Year-Week (YYYY-WW)
replaceall: Replace Values With
replacefactor: Replace Factor Values
textFeats: Create features out of text
zerovar: Zero Variance Columns
trendsRelated: Google Trends: Related Plot
trendsTime: Google Trends: Timelines Plot
scrabble_score: Scrabble: Word Scores
statusbar: Progressive Status Bar (Loading)
scrabble_words: Scrabble: Highest score words finder
stocks_file: Get Personal Portfolio's Data
theme_lares2: lares Theme for ggplot2
tic: Stopwatch to measure R Timings
vector2text: Convert a vector into a comma separated text
year_month: Convert Date into Year-Month (YYYY-MM)