plyr v1.8.4
Tools for Splitting, Applying and Combining Data
A set of tools that solves a common set of problems: you
need to break a big problem down into manageable pieces, operate on each
piece and then put all the pieces back together. For example, you might
want to fit a model to each spatial location or time point in your study,
summarise data by panels or collapse high-dimensional arrays to simpler
summary statistics. The development of 'plyr' has been generously supported
by 'Becton Dickinson'.
Readme
plyr
plyr is a set of tools for a common set of problems: you need to split up a big data structure into homogeneous pieces, apply a function to each piece and then combine all the results back together. For example, you might want to:
- fit the same model to each patient subset of a data frame
- quickly calculate summary statistics for each group
- perform group-wise transformations like scaling or standardising
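As a rough illustration of the tasks listed above, the sketch below uses `ddply` and `dlply` on a small made-up data frame. The data, the column names (`group`, `value`) and the result names are hypothetical and chosen only for this example.

```r
library(plyr)

# Hypothetical toy data: two groups with one numeric measurement each
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Summary statistics per group: split by `group`, summarise each piece,
# and combine the pieces into one data frame
group_stats <- ddply(df, .(group), summarise,
                     mean_value = mean(value),
                     n = length(value))

# Group-wise transformation: centre each value on its own group mean
centred <- ddply(df, .(group), transform,
                 centred_value = value - mean(value))

# Fit the same (intercept-only) model to each subset, keeping the fits in a list
fits <- dlply(df, .(group), function(piece) lm(value ~ 1, data = piece))
```

The first two letters of each function name indicate the input and output types, so `ddply` goes from data frame to data frame and `dlply` from data frame to list.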
It's already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with:
- totally consistent names, arguments and outputs
- convenient parallelisation through the foreach package
- input from and output to data.frames, matrices and lists
- progress bars to keep track of long running operations
- built-in error recovery, and informative error messages
- labels that are maintained across all transformations
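A minimal sketch of a few of these features, assuming a small hypothetical list `x` as input: the same call shape works for list, array and data frame outputs, and `failwith` wraps a function so that an error yields a default value instead of stopping the run.

```r
library(plyr)

# Hypothetical input: a named list of integer vectors
x <- list(a = 1:3, b = 4:6)

# Same arguments, different output types
as_list  <- llply(x, mean)                                    # list in, list out
as_array <- laply(x, mean)                                    # list in, array out
as_df    <- ldply(x, function(v) data.frame(mean = mean(v)))  # list in, data frame out

# Error recovery: return NA instead of failing on bad input
safe_log <- failwith(NA, function(v) log(v), quiet = TRUE)
results <- llply(list(1, "not a number", 10), safe_log)
```

Passing `.progress = "text"` to any of these calls adds a text progress bar, which is mainly useful for longer-running operations.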
Considerable effort has been put into making plyr fast and memory efficient, and in many cases plyr is as fast as, or faster than, the built-in equivalents.
A detailed introduction to plyr has been published in JSS: "The Split-Apply-Combine Strategy for Data Analysis", http://www.jstatsoft.org/v40/i01/. You can find out more at http://had.co.nz/plyr/, or track development at http://github.com/hadley/plyr. You can ask questions about plyr (and data manipulation in general) on the plyr mailing list. Sign up at http://groups.google.com/group/manipulatr.
Functions in plyr
Name | Description | |
as.data.frame.function | Make a function return a data frame. | |
arrange | Order a data frame by its columns. | |
a_ply | Split array, apply function, and discard results. | |
amv_dimnames | Dimension names. | |
as.quoted | Convert input to quoted variables. | |
alply | Split array, apply function, and return results in a list. | |
amv_dim | Dimensions. | |
aaply | Split array, apply function, and return results in an array. | |
adply | Split array, apply function, and return results in a data frame. | |
as.list.split | Convert split list to regular list. | |
ddply | Split data frame, apply function, and return results in a data frame. | |
defaults | Set defaults. | |
d_ply | Split data frame, apply function, and discard results. | |
create_progress_bar | Create progress bar. | |
eval.quoted | Evaluate a quoted list of variables. | |
here | Capture current evaluation context. | |
empty | Check if a data frame is empty. | |
dlply | Split data frame, apply function, and return results in a list. | |
dims | Number of dimensions. | |
id | Compute a unique numeric id for each unique row in a data frame. | |
is.formula | Is a formula? Checks if argument is a formula | |
idata.frame | Construct an immutable data frame. | |
count | Count the number of occurrences. | |
join_all | Recursively join a list of data frames. | |
l_ply | Split list, apply function, and discard results. | |
join.keys | Join keys. Given two data frames, create a unique key for each row. | |
desc | Descending order. | |
baseball | Yearly batting records for all major league baseball players | |
indexed_df | An indexed data frame. | |
indexed_array | An indexed array. | |
colwise | Column-wise function. | |
compact | Compact list. | |
join | Join two data frames together. | |
daply | Split data frame, apply function, and return results in an array. | |
liply | Experimental iterator based version of llply. | |
laply | Split list, apply function, and return results in an array. | |
m_ply | Call function with arguments in array or data frame, discarding results. | |
list_to_vector | List to vector. | |
list_to_array | List to array. | |
[.split | Subset splits. | |
mapvalues | Replace specified values with new values, in a vector or factor. | |
mutate | Mutate a data frame by adding new or replacing existing columns. | |
name_rows | Toggle row names between explicit and implicit. | |
match_df | Extract matching rows of a data frame. | |
names.quoted | Compute names of quoted variables. | |
plyr-deprecated | Deprecated Functions in Package plyr | |
mdply | Call function with arguments in array or data frame, returning a data frame. | |
ozone | Monthly ozone measurements over Central America. | |
nunique | Number of unique values. | |
id_var | Numeric id for a vector. | |
. | Quote variables to create a list of unevaluated expressions for later evaluation. | |
progress_win | Graphical progress bar, powered by Windows. | |
each | Aggregate multiple functions into a single function. | |
quickdf | Quick data frame. | |
failwith | Fail with specified value. | |
progress_none | Null progress bar | |
print.split | Print split. | |
rbind.fill | Combine data.frames by row, filling in missing columns. | |
raply | Replicate expression and return results in an array. | |
plyr | plyr: the split-apply-combine paradigm for R. | |
print.quoted | Print quoted variables. | |
rename | Modify names by name, not position. | |
revalue | Replace specified values with new values, in a factor or character vector. | |
round_any | Round to multiple of any number. | |
splitter_a | Split an array by .margins. | |
take | Take a subset along an arbitrary dimension | |
vaggregate | Vector aggregate. | |
summarise | Summarise a data frame. | |
try_default | Try, with default in case of error. | |
rdply | Replicate expression and return results in a data frame. | |
rbind.fill.matrix | Bind matrices by row, and fill missing columns with NA. | |
reduce_dim | Reduce dimensions. | |
tryapply | Apply with built in try. Uses compact, lapply and tryNULL | |
split_labels | Generate labels for split data frame. | |
unrowname | Un-rowname. | |
is.discrete | Determine if a vector is discrete. | |
loop_apply | Loop apply | |
isplit2 | Split iterator that returns values, not indices. | |
maply | Call function with arguments in array or data frame, returning an array. | |
llply | Split list, apply function, and return results in a list. | |
ldply | Split list, apply function, and return results in a data frame. | |
list_to_dataframe | List to data frame. | |
mlply | Call function with arguments in array or data frame, returning a list. | |
progress_tk | Graphical progress bar, powered by Tk. | |
progress_text | Text progress bar. | |
progress_time | Text progress bar with time. | |
rlply | Replicate expression and return results in a list. | |
r_ply | Replicate expression and discard results. | |
true | Function that always returns true. | |
splat | 'Splat' arguments to a function. | |
split_indices | Split indices. | |
splitter_d | Split a data frame by variables. | |
strip_splits | Remove splitting variables from a data frame. | |
Details
URL | http://had.co.nz/plyr, https://github.com/hadley/plyr |
BugReports | https://github.com/hadley/plyr/issues |
LinkingTo | Rcpp |
License | MIT + file LICENSE |
LazyData | true |
RoxygenNote | 5.0.1 |
NeedsCompilation | yes |
Packaged | 2016-06-07 19:58:36 UTC; hadley |
Repository | CRAN |
Date/Publication | 2016-06-08 10:40:15 |
suggests | abind , covr , doParallel , foreach , iterators , itertools , tcltk , testthat |
depends | R (>= 3.1.0) |
imports | Rcpp (>= 0.11.0) |