groupBy: Group a data frame by a factor and perform aggregate functions.

Description

The R equivalent of a SQL 'group by' call.
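
For readers less familiar with SQL, the idea can be sketched with base R's aggregate(). This is only an analogue of a 'group by' call, not the implementation of groupBy; the data frame sales and its columns are hypothetical.

```r
# Hypothetical toy data; the names are illustrative only.
sales <- data.frame(
  region = c("east", "west", "east", "west"),
  amount = c(10, 20, 30, 40)
)

# Base R analogue of: SELECT region, SUM(amount) FROM sales GROUP BY region
totals <- aggregate(amount ~ region, data = sales, FUN = sum)
print(totals)
```

Each distinct value of the grouping column yields one row in the result, with the aggregate computed over the rows that share that value.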

Usage

groupBy(df, by, aggregation, clmns=names(df), collapse=',', distinct=FALSE, sql=FALSE, full.names=FALSE, ...)

Arguments

df

a data frame.

by

the factor (or name of a factor in df) used to determine the grouping.

aggregation

the functions to perform on the output (default is to sum). Suggested functions are: 'sum','mean','var','sd','max','min','length','paste',NULL.

clmns

the columns to include in the output.

collapse

string delimiter for columns aggregated via 'paste' concatenation.

distinct

used in conjunction with paste and collapse to return only the unique elements in the delimited, concatenated string.

sql

whether or not to use SQLite to perform the grouping (not yet implemented).

full.names

whether the names of the aggregation functions should be appended to the output column names.

...

additional parameters (such as na.rm) passed to the underlying aggregate functions.
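
The effect described for collapse and distinct, and for passing na.rm through '...', can be sketched in base R. This assumes groupBy applies something like paste(), unique() and the named aggregate function to each group; the vectors below are illustrative values for a single group, not output from groupBy.

```r
# Illustrative values for one group; not produced by groupBy itself.
x <- c("m", "n", "m", "o")

# Join every element with the delimiter (analogous to distinct=FALSE).
all_joined <- paste(x, collapse = ",")

# Join only the unique elements (analogous to distinct=TRUE).
unique_joined <- paste(unique(x), collapse = ",")

# An extra argument such as na.rm changes how an aggregate treats NA.
y <- c(1, 3, NA)
mean_with_na <- mean(y)                  # NA, because one value is missing
mean_without_na <- mean(y, na.rm = TRUE) # missing value dropped first
```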

Value

a summary/aggregate data frame.

Examples

df <- data.frame(a=runif(12),b=c(runif(11),NA), z=rep(letters[13:18],2),w=rep(letters[20:23],3))

groupBy(df=df, by='w', clmns=c(rep(c('a','b'),3),'z','w'), aggregation=c('sum','mean','var','sd','min','max','paste','length'), full.names=TRUE, na.rm=TRUE)
# or using SQLite (the sql argument is documented above as not yet implemented)
groupBy(df=df, by='w', clmns=c(rep(c('a','b'),2),'z','w'), aggregation=c('sum','mean','min','max','paste','length'), full.names=TRUE, sql=TRUE)


## passing a custom function
meantop <- function(x,n=2, ...)
  mean(x[order(x, decreasing=TRUE)][1:n], ...)
  
groupBy(df, by='w', aggregation=rep(c('mean','max','meantop'),2), clmns=rep(c('a','b'),3), na.rm=TRUE)
