fprod: Fast (Grouped, Weighted) Product for Matrix-Like Objects

Description

fprod is a generic function that computes the (column-wise) product of all values in x, (optionally) grouped by g and/or weighted by w. The TRA argument can further be used to transform x using its (grouped) product.

Usage

fprod(x, ...)
# S3 method for default
fprod(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = TRUE, ...)
# S3 method for matrix
fprod(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = TRUE, drop = TRUE, ...)
# S3 method for data.frame
fprod(x, g = NULL, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = TRUE, drop = TRUE, ...)
# S3 method for grouped_df
fprod(x, w = NULL, TRA = NULL, na.rm = TRUE,
      use.g.names = FALSE, keep.group_vars = TRUE, keep.w = TRUE, ...)

Arguments

a numeric vector, matrix, data.frame or grouped tibble (dplyr::grouped_df).

a factor, GRP object, atomic vector (internally converted to factor) or a list of vectors / factors (internally converted to a GRP object) used to group x.

a numeric vector of (non-negative) weights, may contain missing values.

TRA

an integer or quoted operator indicating the transformation to perform: 1 - "replace_fill" | 2 - "replace" | 3 - "-" | 4 - "-+" | 5 - "/" | 6 - "%" | 7 - "+" | 8 - "*" | 9 - "%%" | 10 - "-%%". See TRA.

na.rm

logical. Skip missing values in x. Defaults to TRUE and implemented at very little computational cost. If na.rm = FALSE a NA is returned when encountered.

use.g.names

make group-names and add to the result as names (vector method) or row-names (matrix and data.frame method). No row-names are generated for data.tables and grouped tibbles.

drop

matrix and data.frame method: drop dimensions and return an atomic vector if g = NULL and TRA = NULL.

keep.group_vars

grouped_df method: Logical. FALSE removes grouping variables after computation.

keep.w

grouped_df method: Logical. Retain product of weighting variable after computation (if contained in grouped_df).

...

arguments to be passed to or from other methods.

Value

The product of x, grouped by g, or (if TRA is used) x transformed by its product, grouped by g.

Details

Non-grouped product computations internally utilize long-doubles in C++, for additional numeric precision.

Missing-value removal as controlled by the na.rm argument is done very efficiently by simply skipping them in the computation (thus setting na.rm = FALSE on data with no missing values doesn't give extra speed). Large performance gains can nevertheless be achieved in the presence of missing values if na.rm = FALSE, since then the corresponding computation is terminated once a NA is encountered and NA is returned (unlike base::prod which just runs through without any checks).

This all seamlessly generalizes to grouped computations, which are performed in a single pass (without splitting the data) and therefore extremely fast.

The weighted product is computed as prod(x * w). If na.rm = TRUE, missing values will be removed from both x and w i.e. utilizing only x[complete.cases(x,w)] and w[complete.cases(x,w)].

When applied to data frame's with groups or drop = FALSE, fprod preserves all column attributes (such as variable labels) but does not distinguish between classed and unclassed objects. The attributes of the data.frame itself are also preserved.

Examples

Run this code

# NOT RUN {
## default vector method
mpg <- mtcars$mpg
fprod(mpg)                         # Simple product
fprod(mpg, w = mtcars$hp)          # Weighted product
fprod(mpg, TRA = "/")              # Simple transformation: Divide by product
fprod(mpg, mtcars$cyl)             # Grouped product
fprod(mpg, mtcars$cyl, mtcars$hp)  # Weighted grouped product
fprod(mpg, mtcars[c(2,8:9)])       # More groups...
g <- GRP(mtcars, ~ cyl + vs + am)  # Precomputing groups gives more speed !!
fprod(mpg, g)
fprod(mpg, g, TRA = "/")           # Groupwise divide by product

## data.frame method
fprod(mtcars)
fprod(mtcars, TRA = "/")
fprod(mtcars, g)
fprod(mtcars, g, use.g.names = FALSE) # No row-names generated

## matrix method
m <- qM(mtcars)
fprod(m)
fprod(m, TRA = "/")
fprod(m, g) # etc...

## method for grouped tibbles - for use with dplyr
library(dplyr)
mtcars %>% group_by(cyl,vs,am) %>% fprod(hp)   # Weighted grouped product
mtcars %>% group_by(cyl,vs,am) %>% fprod(TRA = "/")
mtcars %>% group_by(cyl,vs,am) %>% select(mpg) %>% fprod
# }

Run the code above in your browser using DataLab