Javier Luraschi

14 packages on CRAN

2 packages on GitHub

nomnoml

cran
99.99th

Percentile

A tool for drawing sassy 'UML' diagrams based on a simple syntax, see <http://www.nomnoml.com>. Supports styling, R Markdown and exporting diagrams in the PNG format.

pins

cran
99.99th

Percentile

Pin remote resources into a local cache to work offline, improve speed and avoid recomputing; discover and share resources in local folders, 'GitHub', 'Kaggle' or 'RStudio Connect'. Resources can be anything from 'CSV', 'JSON', or image files to arbitrary R objects.

sparklyr

cran
99.99th

Percentile

R interface to Apache Spark, a fast and general engine for big data processing, see <http://spark.apache.org>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

sparkwarc

cran
99.99th

Percentile

Load WARC (Web ARChive) files into Apache Spark using 'sparklyr'. This allows reading files from the Common Crawl project <http://commoncrawl.org/>.

sparkapi

github
99.99th

Percentile

Low-level socket-based interface for calling the Spark API via the RBackend server included in Spark.

swagger

cran
99.99th

Percentile

A collection of 'HTML', 'JavaScript', and 'CSS' assets that dynamically generate beautiful documentation from a 'Swagger' compliant API: <https://swagger.io/specification/>.

arrow

cran
99.99th

Percentile

'Apache' 'Arrow' <https://arrow.apache.org/> is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. This package provides an interface to the 'Arrow C++' library.

cloudml

cran
99.99th

Percentile

Interface to the Google Cloud Machine Learning Platform <https://cloud.google.com/ml-engine>, which provides cloud tools for training machine learning models.

geospark

cran
99.99th

Percentile

R bindings for 'GeoSpark' <http://geospark.datasyslab.org/> that extend the 'sparklyr' <https://spark.rstudio.com/> R package to make distributed 'geocomputing' easier. 'sf' is a package that provides [simple features] <https://en.wikipedia.org/wiki/Simple_Features> access for R and is a leading 'geospatial' data processing tool. The 'geospark' R package brings the same simple features access as 'sf', but runs on the Spark distributed system.

knitr

cran
99.99th

Percentile

Provides a general-purpose tool for dynamic report generation in R using Literate Programming techniques.

mlflow

cran
99.99th

Percentile

R interface to 'MLflow', an open source platform for the complete machine learning life cycle, see <https://mlflow.org/>. This package supports installing 'MLflow', tracking experiments, creating and running projects, and saving and serving models.

profvis

cran
99.99th

Percentile

Interactive visualizations for profiling R code.

sparkhail

cran
99.99th

Percentile

'Hail' is an open-source, general-purpose, 'python' based data analysis tool with additional data types and methods for working with genomic data, see <https://hail.is/>. 'Hail' is built to scale and has first-class support for multi-dimensional structured data, like the genomic data in a genome-wide association study (GWAS). 'Hail' is exposed as a 'python' library, using primitives for distributed queries and linear algebra implemented in 'scala', 'spark', and increasingly 'C++'. 'sparkhail' is an R extension built on the 'sparklyr' package. The idea is to help R users use 'hail' functionality with the well-known 'tidyverse' syntax, see <https://www.tidyverse.org/>.

rmarkdown

github
99.99th

Percentile

Convert R Markdown documents into a variety of formats.

tfdeploy

cran
99.99th

Percentile

Tools to deploy 'TensorFlow' <https://www.tensorflow.org/> models across multiple services. Currently, it provides a local server for testing 'cloudml' compatible services.

99.99th

Percentile

This is a 'sparklyr' extension integrating 'VariantSpark' and R. 'VariantSpark' is a framework based on 'scala' and 'spark' for analyzing genome datasets, see <https://bioinformatics.csiro.au/>. It was tested on datasets with 3000 samples, each containing 80 million features, in both unsupervised clustering approaches and supervised applications such as classification and regression. Genome datasets are usually written in VCF, a text file format used in bioinformatics for storing gene sequence variations. 'VariantSpark' is therefore well suited to genome research: it can read VCF files, run analyses and return the output in a 'spark' data frame.