Note: a newer version (1.8.9) of this package is available.

plyr

plyr is a set of tools for a common set of problems: you need to split up a big data structure into homogeneous pieces, apply a function to each piece and then combine all the results back together. For example, you might want to:

  • fit the same model to each patient subset of a data frame
  • quickly calculate summary statistics for each group
  • perform group-wise transformations like scaling or standardising
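
As a rough illustration of the three tasks above, here is a minimal sketch using `dlply`, `ldply` and `ddply`. The `patients` data frame and its columns are invented for the example and do not come from the package.

```r
library(plyr)

# Hypothetical toy data: three measurements for each of two patients.
patients <- data.frame(
  patient = rep(c("a", "b"), each = 3),
  dose    = c(1, 2, 3, 1, 2, 3),
  resp    = c(2, 4, 6, 1, 2, 3)
)

# Split by patient, fit the same linear model to each piece,
# and keep the fitted models in a list.
fits <- dlply(patients, .(patient), function(d) lm(resp ~ dose, data = d))

# Combine one number per model (the dose slope) into a data frame.
slopes <- ldply(fits, function(m) data.frame(slope = unname(coef(m)["dose"])))

# Summary statistics for each group.
stats <- ddply(patients, .(patient), summarise,
               mean_resp = mean(resp), n = length(resp))

# Group-wise transformation: centre the response within each patient.
centred <- ddply(patients, .(patient), transform,
                 resp_c = resp - mean(resp))
```

Each call names the splitting variable with `.(patient)`, applies a function to every piece, and reassembles the results in the shape implied by the function name.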

It's already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with:

  • totally consistent names, arguments and outputs
  • convenient parallelisation through the foreach package
  • input from and output to data.frames, matrices and lists
  • progress bars to keep track of long running operations
  • built-in error recovery, and informative error messages
  • labels that are maintained across all transformations
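
The consistent naming can be seen in a short sketch: the first letter of each function gives the input type (`a` array, `d` data frame, `l` list) and the second gives the output type (or `_` to discard results). The list `x` and the wrapper `safe_log` below are invented for illustration.

```r
library(plyr)

x <- list(a = 1:3, b = 4:6)

as_list  <- llply(x, mean)   # list in, list out
as_array <- laply(x, mean)   # list in, array (here a vector) out
as_df    <- ldply(x, function(v) data.frame(mean = mean(v)))  # list in, data frame out

# The same call with a text progress bar for long-running work.
with_bar <- llply(x, mean, .progress = "text")

# Error recovery: failwith() wraps a function so that an error
# returns a default value instead of stopping the whole loop.
safe_log <- failwith(NA, function(v) log(v), quiet = TRUE)
recovered <- safe_log("oops")   # log() of a string would normally error
```

Because the arguments are the same across the family, switching the output format usually means changing one letter in the function name.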

Considerable effort has been put into making plyr fast and memory efficient, and in many cases plyr is as fast as, or faster than, the built-in equivalents.

A detailed introduction to plyr has been published in JSS: "The Split-Apply-Combine Strategy for Data Analysis", http://www.jstatsoft.org/v40/i01/. You can find out more at http://had.co.nz/plyr/, or track development at http://github.com/hadley/plyr. You can ask questions about plyr (and data manipulation in general) on the plyr mailing list. Sign up at http://groups.google.com/group/manipulatr.

Status

plyr is retired: this means only changes necessary to keep it on CRAN will be made. We recommend using dplyr (for data frames) or purrr (for lists) instead.
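
For readers moving off plyr, the following is a minimal sketch of how two common plyr calls might be written with the recommended packages. It assumes dplyr and purrr are installed; the data frame `df` is invented for the example, and the plyr calls appear only as comments for comparison.

```r
library(dplyr)
library(purrr)

df <- data.frame(g = c("a", "a", "b"), x = c(1, 3, 10))

# plyr:  ddply(df, .(g), summarise, mean_x = mean(x))
by_group <- df %>%
  group_by(g) %>%
  summarise(mean_x = mean(x))

# plyr:  llply(list(1:3, 4:6), sum)
sums <- map(list(1:3, 4:6), sum)
```

If plyr and dplyr are both needed in one session, note that they share function names such as `summarise`, `mutate` and `arrange`, so calling them with an explicit prefix (`plyr::` or `dplyr::`) avoids ambiguity.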

Install

install.packages('plyr')

Monthly Downloads

358,508

Version

1.8.5

License

MIT + file LICENSE

Maintainer

Hadley Wickham

Last Published

December 10th, 2019

Functions in plyr (1.8.5)

daply

Split data frame, apply function, and return results in an array.
amv_dimnames

Dimension names.
d_ply

Split data frame, apply function, and discard results.
create_progress_bar

Create progress bar.
eval.quoted

Evaluate a quoted list of variables.
colwise

Column-wise function.
id_var

Numeric id for a vector.
id

Compute a unique numeric id for each unique row in a data frame.
baseball

Yearly batting records for all major league baseball players.
defaults

Set defaults.
ddply

Split data frame, apply function, and return results in a data frame.
[.split

Subset splits.
desc

Descending order.
each

Aggregate multiple functions into a single function.
empty

Check if a data frame is empty.
here

Capture current evaluation context.
compact

Compact list.
count

Count the number of occurrences.
join

Join two data frames together.
failwith

Fail with specified value.
dims

Number of dimensions.
idata.frame

Construct an immutable data frame.
indexed_df

An indexed data frame.
indexed_array

An indexed array.
llply

Split list, apply function, and return results in a list.
list_to_dataframe

List to data frame.
join_all

Recursively join a list of data frames.
loop_apply

Loop apply.
is.discrete

Determine if a vector is discrete.
dlply

Split data frame, apply function, and return results in a list.
is.formula

Is a formula? Checks if argument is a formula.
l_ply

Split list, apply function, and discard results.
rename

Modify names by name, not position.
mapvalues

Replace specified values with new values, in a vector or factor.
revalue

Replace specified values with new values, in a factor or character vector.
progress_tk

Graphical progress bar, powered by Tk.
split_labels

Generate labels for split data frame.
match_df

Extract matching rows of a data frame.
list_to_vector

List to vector.
progress_win

Graphical progress bar, powered by Windows.
liply

Experimental iterator based version of llply.
splitter_a

Split an array by .margins.
mutate

Mutate a data frame by adding new or replacing existing columns.
ozone

Monthly ozone measurements over Central America.
list_to_array

List to array.
plyr-deprecated

Deprecated Functions in Package plyr
mlply

Call function with arguments in array or data frame, returning a list.
isplit2

Split iterator that returns values, not indices.
m_ply

Call function with arguments in array or data frame, discarding results.
maply

Call function with arguments in array or data frame, returning an array.
progress_text

Text progress bar.
mdply

Call function with arguments in array or data frame, returning a data frame.
join.keys

Join keys. Given two data frames, create a unique key for each row.
vaggregate

Vector aggregate.
progress_time

Text progress bar with time.
name_rows

Toggle row names between explicit and implicit.
r_ply

Replicate expression and discard results.
tryapply

Apply with built in try. Uses compact, lapply and tryNULL
plyr

plyr: the split-apply-combine paradigm for R.
take

Take a subset along an arbitrary dimension.
summarise

Summarise a data frame.
print.quoted

Print quoted variables.
round_any

Round to multiple of any number.
laply

Split list, apply function, and return results in an array.
unrowname

Un-rowname.
rlply

Replicate expression and return results in a list.
raply

Replicate expression and return results in an array.
ldply

Split list, apply function, and return results in a data frame.
nunique

Number of unique values.
.

Quote variables to create a list of unevaluated expressions for later evaluation.
quickdf

Quick data frame.
names.quoted

Compute names of quoted variables.
splat

'Splat' arguments to a function.
rbind.fill.matrix

Bind matrices by row, and fill missing columns with NA.
rbind.fill

Combine data.frames by row, filling in missing columns.
split_indices

Split indices.
splitter_d

Split a data frame by variables.
rdply

Replicate expression and return results in a data frame.
print.split

Print split.
progress_none

Null progress bar.
reduce_dim

Reduce dimensions.
true

Function that always returns true.
strip_splits

Remove splitting variables from a data frame.
try_default

Try, with default in case of error.
a_ply

Split array, apply function, and discard results.
as.list.split

Convert split list to regular list.
adply

Split array, apply function, and return results in a data frame.
arrange

Order a data frame by its columns.
amv_dim

Dimensions.
alply

Split array, apply function, and return results in a list.
as.quoted

Convert input to quoted variables.
as.data.frame.function

Make a function return a data frame.
aaply

Split array, apply function, and return results in an array.
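
A few of the smaller helpers listed above can be tried with a short sketch like the one below. All input values are made up for illustration; only the function names come from the list.

```r
library(plyr)

# Recode values in a vector.
recoded  <- mapvalues(c("a", "b", "c"), from = c("a", "b"), to = c("A", "B"))
revalued <- revalue(c("a", "b", "c"), c(a = "A"))

# Rename columns by name rather than by position.
df      <- data.frame(g = c("a", "a", "b"), x = c(1, 3, 10))
renamed <- rename(df, c(x = "value"))

# Round to a multiple of any number, optionally with a different rounding function.
up   <- round_any(137, 10)          # nearest multiple of 10
down <- round_any(137, 10, floor)   # multiple of 10 at or below 137

# Count occurrences of each value of g.
counts <- count(df, "g")

# Bind data frames by row, filling columns that are missing with NA.
filled <- rbind.fill(data.frame(a = 1), data.frame(b = 2))

# Left join on a shared key.
left   <- data.frame(id = 1:3, v = c("x", "y", "z"))
right  <- data.frame(id = c(1L, 3L), w = c(10, 30))
joined <- join(left, right, by = "id", type = "left")

# Order rows by a column, descending.
sorted <- arrange(df, desc(x))
```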