groupBy: Group a data frame by a factor and apply aggregate functions.

Description

The R equivalent of a SQL 'group by' call.
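
For orientation, the kind of grouping a SQL 'group by' performs can be sketched in base R with aggregate(). This is only an illustration of the concept on made-up data (the sales data frame and its columns are hypothetical); it does not show how groupBy is implemented.

```r
# Toy data with hypothetical values, chosen so the totals are easy to check
sales <- data.frame(region = c("east", "west", "east", "west"),
                    amount = c(10, 20, 30, 40))

# Rough analogue of:
#   SELECT region, SUM(amount) FROM sales GROUP BY region
totals <- aggregate(amount ~ region, data = sales, FUN = sum)
print(totals)  # one row per region: east 40, west 60
```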

Usage

groupBy(df, by, aggregation, clmns=names(df), collapse=',',
        distinct=FALSE, sql=FALSE, full.names=FALSE, ...)

Value

A summary (aggregate) data frame.

Arguments

df: a data frame.
by: the factor (or name of a factor in df) used to determine the grouping.
aggregation: the functions to perform on the output (default is to sum). Suggested functions are: 'sum', 'mean', 'var', 'sd', 'max', 'min', 'length', 'paste', NULL.
clmns: the columns to include in the output.
collapse: string delimiter for columns aggregated via 'paste' concatenation.
distinct: used in conjunction with paste and collapse to return only the unique elements in a delimited, concatenated string.
sql: whether or not to use SQLite to perform the grouping (not yet implemented).
full.names: whether the names of the aggregation functions should be appended to the output column names.
...: additional parameters (such as na.rm) passed to the underlying aggregate functions.
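
The interplay of 'paste', collapse and distinct described above can be pictured with a small base R sketch. It assumes that distinct simply removes duplicates before concatenation; the tags data frame is hypothetical, and tapply() stands in for groupBy here rather than reproducing its internals.

```r
# Hypothetical data: group "x" contains the value "m" twice
tags <- data.frame(grp = c("x", "x", "x", "y"),
                   val = c("m", "n", "m", "o"))

collapse <- ","

# Analogue of distinct=FALSE: concatenate every element, duplicates included
all_vals <- tapply(tags$val, tags$grp, paste, collapse = collapse)

# Analogue of distinct=TRUE: drop duplicates before concatenating
uniq_vals <- tapply(tags$val, tags$grp,
                    function(v) paste(unique(v), collapse = collapse))

all_vals[["x"]]   # "m,n,m"
uniq_vals[["x"]]  # "m,n"
```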

Examples

df <- data.frame(a=runif(12),b=c(runif(11),NA), 
                 z=rep(letters[13:18],2),w=rep(letters[20:23],3))

groupBy(df=df, by='w', clmns=c(rep(c('a','b'),3),'z','w'),
        aggregation=c('sum','mean','var','sd','min','max','paste','length'),
        full.names=TRUE, na.rm=TRUE)
# or using SQLite (the sql option is documented above as not yet implemented)
groupBy(df=df, by='w', clmns=c(rep(c('a','b'),2),'z','w'), 
        aggregation=c('sum','mean','min','max','paste','length'), 
        full.names=TRUE, sql=TRUE)


## passing a custom function
meantop <- function(x,n=2, ...)
  mean(x[order(x, decreasing=TRUE)][1:n], ...)
  
groupBy(df, by='w', aggregation=rep(c('mean','max','meantop'),2), 
                    clmns=rep(c('a','b'),3), na.rm=TRUE)
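
To see what the custom function contributes on its own, meantop can be called directly on a plain vector. The block below repeats the definition from the example so that it runs standalone; the input vectors are made up for illustration.

```r
# Same definition as in the example above: mean of the n largest values
meantop <- function(x, n=2, ...)
  mean(x[order(x, decreasing=TRUE)][1:n], ...)

top2 <- meantop(c(5, 1, 9, 3))        # two largest are 9 and 5, so 7
top3 <- meantop(c(5, 1, 9, 3), n=3)   # 9, 5 and 3, so 17/3

# order() places NA last by default, so the top two here are 10 and 4
with_na <- meantop(c(4, NA, 10, 2), na.rm=TRUE)   # 7

# With fewer than n values, 1:n indexes past the end and yields NA;
# na.rm=TRUE (passed through ...) is then needed for a non-NA result
short_plain <- meantop(8)               # NA
short_narm  <- meantop(8, na.rm=TRUE)   # 8
```

This is also why the groupBy call above passes na.rm=TRUE: the argument travels through ... to each aggregation function, including the custom one.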
