Learn R Programming

caroline (version 0.9.9)

bestBy: Find the "best" record within subgroups of a dataframe.

Description

Finding the an extreme record for each group within a dataset is a more challenging routine task in R and SQL. This function provides a easy interface to that functionality either using R (fast for small data frames) or SQL (fastest for large data)

Usage

bestBy(df, by, best, clmns=names(df), inverse=FALSE, sql=FALSE)

Value

A data frame of 'best' records from each factor level

Author

David Schruth

Arguments

df

a data frame.

by

the factor (or name of a factor in df) used to determine the grouping.

clmns

the colums to include in the output.

best

the column to sort on (both globally and for each sub/group)

inverse

the sorting order of the sort column as specified by 'best'

sql

whether or not to use SQLite to perform the operation.

See Also

groupBy

Examples

Run this code

blast.results <- data.frame(score=c(1,2,34,4,5,3,23), 
                            query=c('z','x','y','z','x','y','z'), 
                            target=c('a','b','c','d','e','f','g')
                            )
best.hits.R <- bestBy(blast.results, by='query', best='score', inverse=TRUE)
best.hits.R
## or using SQLite
best.hits.sql <- bestBy(blast.results, by='query', best='score', inverse=TRUE, sql=TRUE)
best.hits.sql

Run the code above in your browser using DataLab