tidyr v0.6.0

by Hadley Wickham

Easily Tidy Data with `spread()` and `gather()` Functions

An evolution of 'reshape2'. It's designed specifically for data tidying (not general reshaping or aggregating) and works well with 'dplyr' data pipelines.

Readme

tidyr

tidyr is a reframing of reshape2 designed to accompany the tidy data framework, and to work hand-in-hand with magrittr and dplyr to build a solid pipeline for data analysis.

Just as reshape2 did less than reshape, tidyr does less than reshape2. It's designed specifically for tidying data, not the general reshaping that reshape2 does, or the general aggregation that reshape did. In particular, built-in methods only work for data frames, and tidyr provides no margins or aggregation.

There are two fundamental verbs of data tidying:

  • gather() takes multiple columns, and gathers them into key-value pairs: it makes "wide" data longer.

  • spread() takes two columns (key and value) and spreads them into multiple columns: it makes "long" data wider.
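As a rough illustration of the two verbs, the sketch below reshapes a small made-up data frame from wide to long and back. The data frame and its column names (name, a, b, treatment, score) are invented for this example and do not come from the package.

```r
library(tidyr)

# Hypothetical "wide" data: one row per person, one column per treatment.
wide <- data.frame(
  name = c("Ann", "Bob"),
  a = c(1, 3),
  b = c(2, 4),
  stringsAsFactors = FALSE
)

# gather(): collapse columns a and b into key-value pairs (wide -> long).
long <- gather(wide, key = treatment, value = score, a, b)

# spread(): turn each value of the key column back into its own column
# (long -> wide).
wide_again <- spread(long, key = treatment, value = score)
```

Here long has one row per person and treatment, and wide_again has the same columns as the starting data.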

These verbs have a number of synonyms:

tidyr         gather    spread
reshape(2)    melt      cast
spreadsheets  unpivot   pivot
databases     fold      unfold

tidyr also provides the separate() and extract() functions, which make it easier to pull apart a column that represents multiple variables. The complement to separate() is unite().
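A minimal sketch of how these three functions relate, using an invented data frame whose period column packs a year and a quarter into one string. The column names and the regular expression are assumptions made for the example.

```r
library(tidyr)

# Hypothetical data: "period" holds two variables in a single column.
df <- data.frame(
  id = 1:2,
  period = c("2015-Q1", "2016-Q3"),
  stringsAsFactors = FALSE
)

# separate(): split one column into several at a separator.
parts <- separate(df, period, into = c("year", "quarter"), sep = "-")

# unite(): the complement, pasting several columns back into one.
joined <- unite(parts, period, year, quarter, sep = "-")

# extract(): pull out pieces with regular-expression capture groups.
pulled <- extract(df, period, into = c("year", "quarter"),
                  regex = "([0-9]{4})-Q([1-4])")
```

separate() fits when the parts are divided by a fixed delimiter, while extract() is the better choice when only some of the text should be kept.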

Installation

tidyr is available from CRAN. Install it with:

install.packages("tidyr")

The development version can be installed using:

# install.packages("devtools")
devtools::install_github("hadley/tidyr")

Getting started

To get started, read the tidy data vignette (vignette("tidy-data")) and check out the demos (demo(package = "tidyr")).

Note that tidyr is designed for use in conjunction with dplyr, so you should always load both:

library(tidyr)
library(dplyr)
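
With both packages loaded, a tidying step can feed straight into a dplyr summary. The following is only a sketch on an invented data frame; the names scores, treatment, score and mean_score are made up for the example.

```r
library(tidyr)
library(dplyr)

# Hypothetical wide data: one column per treatment.
scores <- data.frame(
  name = c("Ann", "Bob"),
  a = c(1, 3),
  b = c(2, 4),
  stringsAsFactors = FALSE
)

# Tidy with gather(), then summarise per treatment with dplyr.
by_treatment <- scores %>%
  gather(key = treatment, value = score, a, b) %>%
  group_by(treatment) %>%
  summarise(mean_score = mean(score))
```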

References

If you'd like to learn more about these data reshaping operators, I'd recommend the following papers:

Functions in tidyr

Name             Description
complete_        Standard-evaluation version of complete.
expand           Expand data frame to include all combinations of values.
complete         Complete a data frame with missing combinations of data.
expand_          Expand (standard evaluation).
fill             Fill in missing values.
separate_rows_   Standard-evaluation version of separate_rows.
replace_na       Replace missing values.
full_seq         Create the full sequence of values in a vector.
nest_            Standard-evaluation version of nest.
%>%              Pipe operator.
separate_        Standard-evaluation version of separate.
gather_          Gather (standard-evaluation).
gather           Gather columns into key-value pairs.
nest             Nest repeated values in a list-variable.
table1           Example tabular representations.
unite_           Standard-evaluation version of unite.
unnest           Unnest a list column.
smiths           Some data about the Smith family.
spread_          Standard-evaluation version of spread.
separate_rows    Separate a collapsed column into multiple rows.
unite            Unite multiple columns into one.
unnest_          Standard-evaluation version of unnest.
separate         Separate one column into multiple columns.
spread           Spread a key-value pair across multiple columns.
who              World Health Organization TB data.
extract_         Standard-evaluation version of extract.
extract_numeric  Extract numeric component of variable.
fill_            Standard-evaluation version of fill.
drop_na_         Standard-evaluation version of drop_na.
drop_na          Drop rows containing missing values.
extract          Extract one column into multiple columns.
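
Several of the functions listed above deal with missing values and missing combinations. The sketch below tries a few of them on an invented sales table; the data and column names are assumptions for illustration only.

```r
library(tidyr)

# Hypothetical data: one missing value, and no row for 2012 quarter 2.
df <- data.frame(
  year = c(2010, 2010, 2012),
  qtr = c(1, 2, 1),
  sales = c(10, NA, 30)
)

# drop_na(): remove rows that contain missing values.
dropped <- drop_na(df)

# replace_na(): substitute a chosen value for NA, column by column.
zeroed <- replace_na(df, list(sales = 0))

# fill(): carry the last non-missing value downwards.
filled <- fill(df, sales)

# complete(): add rows for combinations of year and qtr that are absent.
completed <- complete(df, year, qtr)

# full_seq(): build the full sequence spanned by a vector, here in steps of 1.
years <- full_seq(c(2010, 2012), period = 1)
```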

Details

License MIT + file LICENSE
LazyData true
URL https://github.com/hadley/tidyr
BugReports https://github.com/hadley/tidyr/issues
VignetteBuilder knitr
LinkingTo Rcpp
RoxygenNote 5.0.1
NeedsCompilation yes
Packaged 2016-08-11 22:23:45 UTC; hadley
Repository CRAN
Date/Publication 2016-08-12 10:20:52
