# Hadley Wickham

#### 92 packages on CRAN

#### 14 packages on GitHub

A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

An alternative approach to non-standard evaluation using formulas. Provides a full implementation of LISP-style 'quasiquotation', making it easier to generate code with other code.

A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package. All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another.

A set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The development of 'plyr' has been generously supported by 'Becton Dickinson'.
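
The split, apply, combine pattern described above can be sketched independently of 'plyr'. The following is a minimal Python illustration of the idea, not 'plyr' code; the `split_apply_combine` helper and the sample records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def split_apply_combine(rows, key, apply):
    """Group rows by `key`, run `apply` on each group, and collect the results."""
    groups = defaultdict(list)
    for row in rows:  # split: one bucket per distinct key value
        groups[row[key]].append(row)
    # apply the function to each piece, then combine into one mapping
    return {k: apply(g) for k, g in groups.items()}


# Hypothetical measurements at two locations.
rows = [
    {"site": "a", "value": 1.0},
    {"site": "a", "value": 3.0},
    {"site": "b", "value": 10.0},
]

site_means = split_apply_combine(
    rows, key="site", apply=lambda g: mean(r["value"] for r in g)
)
```

Here the per-piece operation is a mean, but any function of a group (for instance, a model fit) could be passed as `apply`.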

Graphical scales map data to aesthetics, and provide methods for automatically determining breaks and labels for axes and legends.

A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

Flexibly restructure and aggregate data using just two functions: 'melt' and 'dcast' (or 'acast').

Useful tools for working with HTTP organised by HTTP verbs (GET(), POST(), etc). Configuration functions make it easy to control additional request components (authenticate(), add_headers() and so on).

An evolution of 'reshape2'. It's designed specifically for data tidying (not general reshaping or aggregating) and works well with 'dplyr' data pipelines.

Import Excel files into R. Supports '.xls' via the embedded 'libxls' C library (http://sourceforge.net/projects/libxls/) and '.xlsx' via the embedded 'RapidXML' C++ library (http://rapidxml.sourceforge.net). Works on Windows, Mac and Linux without external dependencies.

Make your pure functions purr with the 'purrr' package. This package completes R's functional programming tools with missing features present in other programming languages.

An object oriented system using object-based, also called prototype-based, rather than class-based object oriented ideas.

The 'tidyverse' is a set of packages that work in harmony because they share common data representations and 'API' design. This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step. Learn more about the 'tidyverse' at <https://github.com/hadley/tidyverse>.

Import foreign statistical formats into R via the embedded 'ReadStat' C library (https://github.com/WizardMac/ReadStat).

Wrappers around the 'xml2' and 'httr' packages to make it easy to download, then manipulate, HTML and XML.

Generate your Rd documentation, 'NAMESPACE' file, and collation field using specially formatted comments. Writing documentation in-line with code makes it easier to keep your documentation up-to-date as your requirements change. 'Roxygen2' is inspired by the 'Doxygen' system for C++.

Helpers for reordering factor levels (including moving specified levels to front, ordering by first appearance, reversing, and randomly shuffling), and tools for modifying factor levels (including collapsing rare levels into other, 'anonymising', and manually 'recoding').

Functions for modelling that help you seamlessly integrate modelling into a pipeline of data manipulation and visualisation.

Airline on-time data for all flights departing NYC in 2013. Also includes useful 'metadata' on airlines, airports, weather, and planes.

A dataset about movies. This was previously contained in ggplot2, but has been moved to its own package to reduce the download size of ggplot2.

An easy way to determine which directories on the user's computer you should use to save data, caches and logs. A port of Python's 'Appdirs' (<https://github.com/ActiveState/appdirs>) to R.

This implements the data table back-end for 'dplyr' so that you can seamlessly use data table and 'dplyr' together.

Read and write feather files, a lightweight binary columnar data store designed for maximum speed.

A command-line interface to 'GGobi', an interactive and dynamic graphics package. 'Rggobi' complements the graphical user interface of 'GGobi' providing a way to fluidly transition between analysis and exploration, as well as automating common tasks.

A data only package containing commercial domestic flights that departed Houston (IAH and HOU) in 2011.

US baby names provided by the SSA. This package contains all names used for at least 5 children of either sex.

Implements the letter value 'boxplot', which extends the standard 'boxplot' to deal with both larger and smaller numbers of data points by dynamically selecting the appropriate number of letter values to display.

R's raw vector is useful for storing a single binary object. What if you want to put a vector of them in a data frame? The blob package provides the blob object, a list of raw vectors, suitable for use as a column in a data frame.

Framework for visualising tables of counts, proportions and probabilities. The framework is called product plots, alluding to the computation of area as a product of height and width, and the statistical concept of generating a joint distribution from the product of conditional and marginal distributions. The framework, with extensions, is sufficient to encompass over 20 visualisations previously described in fields of statistical graphics and 'infovis', including bar charts, mosaic plots, 'treemaps', equal area plots and fluctuation diagrams.

profr provides an alternative data structure and visual rendering for the profiling information generated by Rprof.

Visualise clustering algorithms with GGobi. Contains both general code for visualising clustering results and specific visualisations for model-based, hierarchical and SOM clustering.

Implements geodesic interpolation and basis generation functions that allow you to create new tour methods from R.

Given $p$-dimensional training data containing $d$ groups (the design space), a classification algorithm (classifier) predicts which group new data belongs to. Generally the input to these algorithms is high dimensional, and the boundaries between groups will be high dimensional and perhaps curvilinear or multi-faceted. This package implements methods for understanding the division of space between the groups.

Exploratory model analysis. Fit and graphically explore ensembles of linear models.

Tools for monads.

A dplyr backend that partitions a data frame across multiple nodes in a cluster (e.g. cores on your computer) to make common operations faster.

A toolbox for working with calls, unevaluated code, and anything related to evaluation in R.

A Doxygen-like in-source documentation system for Rd, collation, and NAMESPACE. (This is the third rewrite)

Convert package Rd files to static HTML pages, suitable for serving on a website.

Provides a 'tbl_df' class that offers better checking and printing capabilities than traditional data frames.

Provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. There is flexible support for the type of right-hand side expressions. For more information, see the package vignette. To quote Rene Magritte, "Ceci n'est pas un pipe."
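
The forwarding behaviour can be illustrated outside R. The sketch below is a Python analogy, not the %>% operator itself; the `pipe` helper is a hypothetical name introduced only for this example.

```python
from functools import reduce


def pipe(value, *steps):
    """Feed `value` through each function in `steps`, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Reads in the order the operations happen, rather than inside-out
# as in list(reversed(sorted([3, 1, 2]))).
result = pipe([3, 1, 2], sorted, reversed, list)
```

Each step receives the result of the previous one as its only argument, which is the simplest case of what a forward pipe does.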

The curl() and curl_download() functions provide highly configurable drop-in replacements for base url() and download.file() with better performance, support for encryption (https, ftps), gzip compression, authentication, and other 'libcurl' goodies. The core of the package implements a framework for performing fully customized requests where data can be processed either in memory, on disk, or streaming via the callback or connection interfaces. Some knowledge of 'libcurl' is recommended; for a more user-friendly web client, see the 'httr' package, which builds on this package with HTTP-specific tools and logic.

A database interface definition for communication between R and relational database management systems. All classes in this package are virtual and need to be extended by the various R/DBMS implementations.

Provides a general-purpose tool for dynamic report generation in R using Literate Programming techniques.

Parsing and evaluation tools that make it easy to recreate the command line behaviour of R.

Cache the results of a function so that when you call it again with the same arguments it returns the pre-computed value.
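
As a rough illustration of the caching idea, here is a minimal Python sketch. It is not the R package's implementation; the `memoise` decorator below is a hypothetical stand-in that handles only hashable positional arguments.

```python
import functools


def memoise(fn):
    """Return a wrapper that caches results of `fn`, keyed by positional arguments."""
    cache = {}

    @functools.wraps(fn)
    def wrapper(*args):
        if args not in cache:  # compute only on the first call with these arguments
            cache[args] = fn(*args)
        return cache[args]

    return wrapper


calls = []  # records every real (uncached) computation


@memoise
def slow_square(x):
    calls.append(x)
    return x * x


first = slow_square(4)   # computed
second = slow_square(4)  # served from the cache
```

After both calls, `calls` holds a single entry, showing that the second call did not recompute the value.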

Functions to work with date-times and time-spans: fast and user friendly parsing of date-time data, extraction and updating of components of a date-time (years, months, days, hours, minutes, and seconds), algebraic manipulation on date-time and time-span objects. The 'lubridate' package has a consistent and memorable syntax that makes working with dates easy and fun.

Work with XML files using a simple, consistent interface. Built on top of the 'libxml2' C library.

Access the RStudio API (if available) and provide informative error messages when it's not.

A set of functions to run code 'with' safely and temporarily modified global state. Many of these functions were originally part of the 'devtools' package; this package provides access to them with limited dependencies.
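
The pattern of temporarily changing global state and then restoring it can be sketched as follows. This is a Python analogy using a context manager, not the package's API; `with_envvar` and the variable name `DEMO_SETTING` are hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def with_envvar(name, value):
    """Temporarily set an environment variable, restoring the old state on exit."""
    missing = object()  # sentinel: distinguishes "unset" from any real value
    old = os.environ.get(name, missing)
    os.environ[name] = value
    try:
        yield
    finally:
        # Restore even if the wrapped code raised an exception.
        if old is missing:
            os.environ.pop(name, None)
        else:
            os.environ[name] = old


with with_envvar("DEMO_SETTING", "on"):
    inside = os.environ["DEMO_SETTING"]
```

The `finally` clause is what makes the change safe: the previous state comes back whether the wrapped code succeeds or fails.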

Convert statistical analysis objects from R into tidy data frames, so that they can more easily be combined, reshaped and otherwise processed with tools like 'dplyr', 'tidyr' and 'ggplot2'. The package provides three S3 generics: tidy, which summarizes a model's statistical findings such as coefficients of a regression; augment, which adds columns to the original data such as predictions, residuals and cluster assignments; and glance, which provides a one-row summary of model-level statistics.

Embeds the 'SQLite' database engine in R and provides an interface compliant with the 'DBI' package. The source for the 'SQLite' engine (version 3.8.8.2) is included.

A collection of functions to visualize spatial data and models on top of static maps from various online sources (e.g. Google Maps and Stamen Maps). It includes tools common to those tasks, including functions for geolocation and routing.

Some extra themes, geoms, and scales for 'ggplot2'. Provides 'ggplot2' themes and scales that replicate the look of plots by Edward Tufte, Stephen Few, 'Fivethirtyeight', 'The Economist', 'Stata', 'Excel', and 'The Wall Street Journal', among others. Provides 'geoms' for Tufte's box plot and range frame.

The R package 'ggplot2' (<http://docs.ggplot2.org/current/>) is a plotting system based on the grammar of graphics. 'GGally' (<https://ggobi.github.io/ggally>) extends 'ggplot2' by adding several functions to reduce the complexity of combining geometric objects with transformed data. Some of these functions include a pairwise plot matrix, a two group pairwise plot matrix, a parallel coordinates plot, a survival plot, and several functions to plot networks.

Create and customize interactive maps using the 'Leaflet' JavaScript library and the 'htmlwidgets' package. These maps can be used directly from the R console, from 'RStudio', in Shiny apps and R Markdown documents.

A graphics device for R that produces 'Scalable Vector Graphics'. 'svglite' is a fork of the older 'RSvgDevice' package.

Various tools for creating iterators, many patterned after functions in the Python itertools module, and others patterned after functions in the 'snow' package.

Download and install R packages stored in 'GitHub', 'BitBucket', or plain 'subversion' or 'git' repositories. This package is a lightweight replacement of the 'install_*' functions in 'devtools'. Indeed most of the code was copied over from 'devtools'.

User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.

An implementation of an interactive grammar of graphics, taking the best parts of 'ggplot2', combining them with the reactive framework of 'shiny' and drawing web graphics using 'vega'.

A collection of miscellaneous basic statistical functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis: calculating descriptive statistics, drawing graphical summaries and reporting the results. The package also contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages that are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'camel style' was consequently applied to functions borrowed from contributed R packages as well.

Some helpful extensions and modifications to the 'ggplot2' package. In particular, this package makes it easy to combine multiple 'ggplot2' plots into one and label them with letters, e.g. A, B, C, etc., as is often required for scientific publications. The package also provides a streamlined and clean theme that is used in the Wilke lab, hence the package name, which stands for Claus O. Wilke's plot package.

The 'HistData' package provides a collection of small data sets that are interesting and important in the history of statistics and data visualization. The goal of the package is to make these available, both for instructional use and for historical research. Some of these present interesting challenges for graphics or analysis in R.

These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer. They were ported from earlier versions in Matlab and S-PLUS. An introduction appears in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009) Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files working many examples including all but one of the 76 figures in this latter book. Matlab versions of the code and sample analyses are no longer distributed through CRAN, as they were when the book was published. For those, ftp from http://www.psych.mcgill.ca/misc/fda/downloads/FDAfuns/ There you find a set of .zip files containing the functions and sample analyses, as well as two .txt files giving instructions for installation and some additional information. The changes from Version 2.4.1 are fixes of bugs in density.fd and removal of functions create.polynomial.basis, polynompen, and polynomial. These were deleted because the monomial basis does the same thing and because there were errors in the code.

Support for simple features, a standardized way to encode spatial data, with bindings to GDAL, GEOS and Proj.4.

Helper functions to work with spreadsheets and the "A1:D10" style of cell range specification.

Build a package documentation and function reference site and use it as the package vignette.

A suite of custom R Markdown formats and templates for authoring journal articles and conference submissions.

Mosaic plots in the 'ggplot2' framework. Mosaic plot functionality is provided in a single 'ggplot2' layer by calling the geom 'mosaic'.

A 'ggplot2' extension that provides flipped components: horizontal versions of 'Stats' and 'Geoms', and vertical versions of 'Positions'.

Interface with 'Google BigQuery', see <https://cloud.google.com/bigquery/> for more information. This package uses 'googleAuthR' so is compatible with similar packages, including 'Google Cloud Storage' (<https://cloud.google.com/storage/>) for result extracts.

Creating tiny yet beautiful documents and vignettes from R Markdown. The package provides the 'html_pretty' output format as an alternative to the 'html_document' and 'html_vignette' engines that convert R Markdown into HTML pages. Various themes and syntax highlight styles are supported.

Imports non-tabular data from Excel files into R. Exposes cell content, position and formatting in a tidy structure for further manipulation. Provides functions for selecting cells by position and relative position, and for associating data cells with header cells by proximity in given directions. Supports '.xlsx' and '.xlsm' via the embedded 'RapidXML' C++ library <http://rapidxml.sourceforge.net>. Does not support '.xlsb' or '.xls'.

Work with labelled data imported from 'SPSS' or 'Stata' with 'haven' or 'foreign'.

This package provides user-level functions to manage namespaces not (yet) available in base R: 'registerNamespace', 'unregisterNamespace', 'makeNamespace', and 'getRegisteredNamespace' ('makeNamespace' is extracted from the R 'base' package source code: src/library/base/R/namespace.R).

Provides a set of functions for interacting with the 'Digital Ocean' API at <https://developers.digitalocean.com/documentation/v2>, including creating images, destroying them, rebooting, getting details on regions, and available images.

Geometric objects defined in 'geozoo' can be simulated or displayed in the R package 'tourr'.

Produce publication quality graphics from output of GGobi's describe display plugin.

The GUI allows the user to control the tour with checkboxes for variable selection, a slider for the speed, and toggle boxes for pausing.

A 'ggplot2' extension to visualize two variables through one color aesthetic via mapping to a color space projection. With this technique for 2-D color mapping, one can create a bivariate choropleth in R as well as other visualizations with multivariate color scales. Includes two new scales and a new guide for 'ggplot2'.

Functions to convert Rd to roxygen documentation. It can parse an Rd file to a list, create the roxygen documentation and update the original R script (e.g. the one containing the definition of the function) accordingly. This package also provides utilities which can help developers build packages using roxygen more easily. The formatR package can be used to reformat the R code in the examples sections so that the code will be more readable.

The base R data.frame, like any vector, is copied upon modification. This behavior is at odds with that of GUIs and interactive graphics. To rectify this, plumbr provides a mutable, dynamic tabular data model. Models may be chained together to form the complex plumbing necessary for sophisticated graphical interfaces. Also included is a general framework for linking datasets; a typical use case would be a linked brush.

Tools for visual inference. Generate null data sets and null plots using permutation and simulation. Calculate distance metrics for a lineup, and examine the distributions of metrics.

Asks a custom Yes-No question with variable responses. The order and phrasing of the possible responses varies randomly to ensure the user consciously chooses (as opposed to automatically types their response).

ggsubplot makes it easy to embed customized subplots within larger graphics. Subplots may be used as a geom to explore interaction effects, spatial data, and hierarchical data. Subplots can also be used to explore big data without overplotting.

Better HTML documentation for R.

An R package intended as the next generation of GGobi, a software package for interactive and dynamic statistical graphics. It includes most of the features in GGobi, such as brushing, zooming, panning, identifying and linking, as well as common types of statistical graphics, e.g. bar plot, scatter plot, boxplot, histogram, density plot, spine plot, parallel coordinates plot, mosaic plot, maps, missing value plot, time series plot, tour, scatter plot matrix, hexagons and tiles (color images), etc. Building on several other packages, cranvas aims for speed (from Qt) and flexibility (from R), with the style and design borrowed from ggplot2.

Minimal client to access the 'GitHub' 'API'.

Interface to local and remote Git operations.

Provides a simple interface to lookup and print R function definitions, including C and C++ compiled code from .Call, .C, .Internal and .External calls. Also lookup of S3 and S4 generics, including a simple dialog to print any or all of the loaded methods for the generic.