plyr v1.8.4

by Hadley Wickham

Tools for Splitting, Applying and Combining Data

A set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The development of 'plyr' has been generously supported by 'Becton Dickinson'.

Readme

plyr

plyr is a set of tools for a common set of problems: you need to split up a big data structure into homogeneous pieces, apply a function to each piece and then combine all the results back together. For example, you might want to:

  • fit the same model to each patient subset of a data frame
  • quickly calculate summary statistics for each group
  • perform group-wise transformations like scaling or standardising
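
The three tasks above can be sketched roughly as follows. This is a minimal illustration rather than an excerpt from the package documentation: the data frame `df` and its columns `group`, `x` and `y` are made up for the example.

```r
library(plyr)

# Toy data (hypothetical): two groups with three measurements each.
df <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  x     = c(1, 2, 3, 4, 5, 6),
  y     = c(2.1, 3.9, 6.2, 3.8, 5.1, 6.0)
)

# Summary statistics: split by group, summarise each piece,
# and combine the results into one data frame.
group_means <- ddply(df, "group", summarise,
                     mean_x = mean(x), mean_y = mean(y))

# Same model per subset: split by group, fit a linear model to each
# piece, and keep the fitted models in a list.
models <- dlply(df, "group", function(piece) lm(y ~ x, data = piece))

# Collect the coefficients of every model into a data frame.
coefs <- ldply(models, coef)

# Group-wise transformation: centre x within each group.
centred <- ddply(df, "group", transform, x_centred = x - mean(x))
```

In each call the first letter of the function name gives the input type and the second the output type (`d` for data frame, `l` for list), so `dlply` takes a data frame and returns a list.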

It's already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with:

  • totally consistent names, arguments and outputs
  • convenient parallelisation through the foreach package
  • input from and output to data.frames, matrices and lists
  • progress bars to keep track of long running operations
  • built-in error recovery, and informative error messages
  • labels that are maintained across all transformations
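
A short sketch of a few of these points, assuming a small made-up list `inputs` and a hypothetical helper `safe_log`: the same call shape produces different output types, `failwith` supplies a fallback value when a function errors, and the `.progress` argument turns on a progress bar.

```r
library(plyr)

# Hypothetical input: a named list of integer vectors.
inputs <- list(a = 1:3, b = 4:6, c = 7:9)

# Same input and same function; only the second letter of the
# function name changes to select the output type.
as_list  <- llply(inputs, mean)   # list
as_array <- laply(inputs, mean)   # array (here a plain vector)
as_df    <- ldply(inputs, function(v) data.frame(mean = mean(v)))  # data frame

# Error recovery: wrap a function so that an error yields a default
# value (NA here) instead of stopping the whole run.
safe_log <- failwith(NA, function(x) {
  if (x < 0) stop("negative input")
  log(x)
}, quiet = TRUE)

# The text progress bar is requested through the .progress argument.
logs <- llply(list(1, -1, exp(1)), safe_log, .progress = "text")
```

Parallel execution is not shown here because it additionally needs a registered foreach backend.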

Considerable effort has been put into making plyr fast and memory efficient, and in many cases plyr is as fast as, or faster than, the built-in equivalents.

A detailed introduction to plyr has been published in JSS: "The Split-Apply-Combine Strategy for Data Analysis", http://www.jstatsoft.org/v40/i01/. You can find out more at http://had.co.nz/plyr/, or track development at http://github.com/hadley/plyr. You can ask questions about plyr (and data manipulation in general) on the plyr mailing list. Sign up at http://groups.google.com/group/manipulatr.

Functions in plyr

Name Description
as.data.frame.function Make a function return a data frame.
arrange Order a data frame by its columns.
a_ply Split array, apply function, and discard results.
amv_dimnames Dimension names.
as.quoted Convert input to quoted variables.
alply Split array, apply function, and return results in a list.
amv_dim Dimensions.
aaply Split array, apply function, and return results in an array.
adply Split array, apply function, and return results in a data frame.
as.list.split Convert split list to regular list.
ddply Split data frame, apply function, and return results in a data frame.
defaults Set defaults.
d_ply Split data frame, apply function, and discard results.
create_progress_bar Create progress bar.
eval.quoted Evaluate a quoted list of variables.
here Capture current evaluation context.
empty Check if a data frame is empty.
dlply Split data frame, apply function, and return results in a list.
dims Number of dimensions.
id Compute a unique numeric id for each unique row in a data frame.
is.formula Is a formula? Checks if argument is a formula
idata.frame Construct an immutable data frame.
count Count the number of occurrences.
join_all Recursively join a list of data frames.
l_ply Split list, apply function, and discard results.
join.keys Join keys. Given two data frames, create a unique key for each row.
desc Descending order.
baseball Yearly batting records for all major league baseball players
indexed_df An indexed data frame.
indexed_array An indexed array.
colwise Column-wise function.
compact Compact list.
join Join two data frames together.
daply Split data frame, apply function, and return results in an array.
liply Experimental iterator based version of llply.
laply Split list, apply function, and return results in an array.
m_ply Call function with arguments in array or data frame, discarding results.
list_to_vector List to vector.
list_to_array List to array.
[.split Subset splits.
mapvalues Replace specified values with new values, in a vector or factor.
mutate Mutate a data frame by adding new or replacing existing columns.
name_rows Toggle row names between explicit and implicit.
match_df Extract matching rows of a data frame.
names.quoted Compute names of quoted variables.
plyr-deprecated Deprecated Functions in Package plyr
mdply Call function with arguments in array or data frame, returning a data frame.
ozone Monthly ozone measurements over Central America.
nunique Number of unique values.
id_var Numeric id for a vector.
. Quote variables to create a list of unevaluated expressions for later evaluation.
progress_win Graphical progress bar, powered by Windows.
each Aggregate multiple functions into a single function.
quickdf Quick data frame.
failwith Fail with specified value.
progress_none Null progress bar
print.split Print split.
rbind.fill Combine data.frames by row, filling in missing columns.
raply Replicate expression and return results in an array.
plyr plyr: the split-apply-combine paradigm for R.
print.quoted Print quoted variables.
rename Modify names by name, not position.
revalue Replace specified values with new values, in a factor or character vector.
round_any Round to multiple of any number.
splitter_a Split an array by .margins.
take Take a subset along an arbitrary dimension
vaggregate Vector aggregate.
summarise Summarise a data frame.
try_default Try, with default in case of error.
rdply Replicate expression and return results in a data frame.
rbind.fill.matrix Bind matrices by row, and fill missing columns with NA.
reduce_dim Reduce dimensions.
tryapply Apply with built in try. Uses compact, lapply and tryNULL
split_labels Generate labels for split data frame.
unrowname Un-rowname.
is.discrete Determine if a vector is discrete.
loop_apply Loop apply
isplit2 Split iterator that returns values, not indices.
maply Call function with arguments in array or data frame, returning an array.
llply Split list, apply function, and return results in a list.
ldply Split list, apply function, and return results in a data frame.
list_to_dataframe List to data frame.
mlply Call function with arguments in array or data frame, returning a list.
progress_tk Graphical progress bar, powered by Tk.
progress_text Text progress bar.
progress_time Text progress bar with time.
rlply Replicate expression and return results in a list.
r_ply Replicate expression and discard results.
true Function that always returns true.
splat 'Splat' arguments to a function.
split_indices Split indices.
splitter_d Split a data frame by variables.
strip_splits Remove splitting variables from a data frame.
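
Several of the helper functions listed above are easier to grasp from a quick example. The following is a minimal sketch using a made-up data frame `scores`; the data and variable names are illustrative assumptions only.

```r
library(plyr)

# Hypothetical data: points scored by three players.
scores <- data.frame(
  player = c("ann", "bob", "ann", "cy"),
  pts    = c(12, 7, 30, 18)
)

# count: number of rows per value of a variable (the `freq` column).
per_player <- count(scores, "player")

# arrange + desc: order rows by a column, largest first.
ranked <- arrange(scores, desc(pts))

# rename: change column names by name, not position.
renamed <- rename(scores, c("pts" = "points"))

# mapvalues: replace selected values in a vector, leaving the rest alone.
full_names <- mapvalues(scores$player,
                        from = c("ann", "bob"), to = c("Ann", "Bob"))

# round_any: round to a multiple of any number, optionally with
# a different rounding function such as floor.
rounded <- round_any(137, 10)
floored <- round_any(137, 25, floor)

# rbind.fill: stack data frames with different columns,
# filling the missing columns with NA.
stacked <- rbind.fill(data.frame(a = 1, b = 2), data.frame(b = 3, c = 4))
```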

Details

URL http://had.co.nz/plyr, https://github.com/hadley/plyr
BugReports https://github.com/hadley/plyr/issues
LinkingTo Rcpp
License MIT + file LICENSE
LazyData true
RoxygenNote 5.0.1
NeedsCompilation yes
Packaged 2016-06-07 19:58:36 UTC; hadley
Repository CRAN
Date/Publication 2016-06-08 10:40:15
