caroline (version 0.9.9)

groupBy: Group a data frame by a factor and perform aggregate functions.

Description

The R equivalent of a SQL 'group by' call.
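To make the SQL analogy concrete, here is a minimal base R sketch using aggregate() rather than groupBy itself. The data frame `sales` and its values are made up for illustration; the point is only what a 'group by' with a sum produces.

```r
# Base R sketch (not caroline): sum column 'a' within each level of 'w'.
# Roughly what SQL expresses as: SELECT w, SUM(a) FROM sales GROUP BY w
sales <- data.frame(w = c("t", "u", "t", "u"),  # hypothetical grouping column
                    a = c(1, 2, 3, 4))          # hypothetical values

by_w <- aggregate(a ~ w, data = sales, FUN = sum)
print(by_w)
#   w a
# 1 t 4
# 2 u 6
```

groupBy extends this idea by letting each output column use its own aggregation function, as described under Arguments.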

Usage

groupBy(df, by, aggregation,  clmns=names(df), collapse=',',
                distinct=FALSE, sql=FALSE, full.names=FALSE, ...)

Value

a summary/aggregate data frame.

Arguments

df

a data frame.

by

the factor (or name of a factor in df) used to determine the grouping.

aggregation

the functions to perform on the output (default is to sum). Suggested functions are: 'sum','mean','var','sd','max','min','length','paste',NULL.

clmns

the columns to include in the output.

collapse

string delimiter for columns aggregated via 'paste' concatenation.

distinct

used in conjunction with paste and collapse to return only the unique elements in the delimited, concatenated string.

sql

whether or not to use SQLite to perform the grouping (not yet implemented).

full.names

whether the names of the aggregation functions should be appended to the output column names.

...

additional parameters (such as na.rm) passed to the underlying aggregate functions.
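The effect of collapse, distinct and na.rm can be illustrated with plain base R. This is a sketch of the concepts using tapply(), not of how groupBy is implemented internally; the vectors `x`, `b` and `g` are hypothetical.

```r
# Hypothetical data: a character column, a numeric column with an NA,
# and a grouping vector.
x <- c("m", "n", "m", "o")
b <- c(1, NA, 3, 5)
g <- c("t", "t", "u", "u")

# 'paste' aggregation with collapse=',' keeps every element per group.
all_vals <- tapply(x, g, paste, collapse = ",")

# The idea behind distinct=TRUE: drop duplicates before concatenating.
x2 <- c("m", "n", "m", "o")
g2 <- c("t", "t", "t", "u")
dup_vals  <- tapply(x2, g2, paste, collapse = ",")
uniq_vals <- tapply(x2, g2, function(v) paste(unique(v), collapse = ","))

# Extra arguments such as na.rm are handed on to the aggregate function.
mean_raw  <- tapply(b, g, mean)                # group 't' is NA
mean_narm <- tapply(b, g, mean, na.rm = TRUE)  # group 't' is 1, 'u' is 4
```

Here `dup_vals[["t"]]` is "m,n,m" while `uniq_vals[["t"]]` is "m,n", which is the difference the distinct argument describes.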

See Also

aggregate, bestBy

Examples

df <- data.frame(a=runif(12),b=c(runif(11),NA), 
                 z=rep(letters[13:18],2),w=rep(letters[20:23],3))

groupBy(df=df, by='w', clmns=c(rep(c('a','b'),3),'z','w'), 
 aggregation=c('sum','mean','var','sd','min','max','paste','length'), 
 full.names=TRUE, na.rm=TRUE)
# or using SQLite (the sql argument is documented above as not yet implemented)
groupBy(df=df, by='w', clmns=c(rep(c('a','b'),2),'z','w'), 
        aggregation=c('sum','mean','min','max','paste','length'), 
        full.names=TRUE, sql=TRUE)


## passing a custom function
meantop <- function(x,n=2, ...)
  mean(x[order(x, decreasing=TRUE)][1:n], ...)
  
groupBy(df, by='w', aggregation=rep(c('mean','max','meantop'),2), 
                    clmns=rep(c('a','b'),3), na.rm=TRUE)
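To see what the custom meantop function computes on its own, here is a small standalone check on a hypothetical vector. It relies on the base R behaviour that order() places NA values last by default, so the top n values exclude NA whenever at least n non-missing values exist.

```r
# Same definition as in the example above: mean of the n largest values.
meantop <- function(x, n = 2, ...)
  mean(x[order(x, decreasing = TRUE)][1:n], ...)

v <- c(0.2, NA, 0.9, 0.5)   # hypothetical input with one missing value

top2 <- meantop(v)          # mean of 0.9 and 0.5
top3 <- meantop(v, n = 3)   # mean of 0.9, 0.5 and 0.2
top4 <- meantop(v, n = 4, na.rm = TRUE)  # NA is selected, then dropped by na.rm
```

With n = 4 the NA enters the selection, which is why passing na.rm=TRUE through `...` matters for groups with few non-missing values.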