
cheese (version 0.0.3)

dish: Dish out a function to combinations of variables

Description

Evaluates a two-argument function on a data frame by applying it to each combination of left and right columns, or to whole subsets of columns.
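To make "each combination of columns" concrete, here is a minimal base R sketch of the idea. The helper pairwise_apply is hypothetical and is not part of cheese; the nested list it returns is an assumption chosen for illustration, and dish may organize its output differently.

```r
# Hypothetical helper (not part of cheese): apply a two-argument function
# to every (left, right) pair of columns, using base R only.
pairwise_apply <- function(data, f, left, right, ...) {
  out <- lapply(left, function(l) {
    # For one left column, evaluate f against each right column in turn
    res <- lapply(right, function(r) f(data[[l]], data[[r]], ...))
    names(res) <- right
    res
  })
  names(out) <- left
  out
}

# Correlate two left columns with two right columns of a built-in dataset
result <- pairwise_apply(
  mtcars,
  cor,
  left = c("mpg", "hp"),
  right = c("wt", "qsec")
)
str(result)
```

Each element of result is indexed first by the left column and then by the right column, so result$mpg$wt holds cor(mtcars$mpg, mtcars$wt).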

Usage

dish(
    data,
    f,
    left = NULL,
    right = NULL,
    each_left = TRUE,
    each_right = TRUE,
    bind = FALSE,
    ...
)

Arguments

data

A data.frame.

f

Any function whose first two arguments each accept a vector or a data.frame.

left

Variables to be used in the first argument of f. If NULL (default), all variables except those in right are used. Supports tidyselect::vars_select selection syntax.

right

Variables to be used in the second argument of f. If NULL (default), all variables except those in left are used. Supports tidyselect::vars_select selection syntax.
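The defaulting rule for left and right can be sketched in base R as follows. The function resolve_sides is hypothetical and only mirrors the behavior described here (plus the first example's note that the default uses every variable on both sides); it takes plain character vectors and does not reproduce the tidyselect handling.

```r
# Hypothetical helper (not part of cheese): work out which columns land on
# each side when left and/or right are left as NULL.
resolve_sides <- function(data, left = NULL, right = NULL) {
  vars <- names(data)
  if (is.null(left) && is.null(right)) {
    # Neither side given: every variable is used on both sides
    left <- vars
    right <- vars
  } else if (is.null(left)) {
    # Only right given: left is everything else
    left <- setdiff(vars, right)
  } else if (is.null(right)) {
    # Only left given: right is everything else
    right <- setdiff(vars, left)
  }
  list(left = left, right = right)
}

sides <- resolve_sides(mtcars[, c("mpg", "hp", "wt")], left = "mpg")
sides
```

In this call only left is supplied, so right resolves to the remaining columns, hp and wt.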

each_left

Should each left variable be evaluated separately in f? Defaults to TRUE. If FALSE, the left variables are passed to f together as a single data.frame.

each_right

Should each right variable be evaluated separately in f? Defaults to TRUE. If FALSE, the right variables are passed to f together as a single data.frame.
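The difference between the two modes can be illustrated with a base R analogue. This is a sketch under assumptions, not the package's implementation: fit_coefs is a hypothetical function, and the lapply calls stand in for what each_right = TRUE and each_right = FALSE would hand to f.

```r
# Hypothetical f: regress y on whatever x contains and return the coefficients.
# x may be a single vector (wrapped into a data.frame) or a data.frame.
fit_coefs <- function(y, x) {
  x <- as.data.frame(x)
  coef(lm(y ~ ., data = x))
}

left <- c("mpg", "hp")
right <- c("wt", "qsec")

# Analogue of each_right = TRUE: one call per (left, right) pair,
# with x passed as a single vector
separate <- lapply(left, function(l) {
  res <- lapply(right, function(r) fit_coefs(mtcars[[l]], mtcars[[r]]))
  names(res) <- right
  res
})
names(separate) <- left

# Analogue of each_right = FALSE: one call per left column,
# with x passed as a data.frame holding all right columns
joint <- lapply(left, function(l) fit_coefs(mtcars[[l]], mtcars[right]))
names(joint) <- left

length(separate$mpg$wt)  # intercept + one slope
length(joint$mpg)        # intercept + one slope per right column
```

Passing the right side as a data.frame is what lets a formula such as y ~ . pick up all right-hand columns in a single model.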

bind

Should the results be bound into a single data.frame? Defaults to FALSE.

...

Additional arguments to be passed to f.

Value

A list (if bind = FALSE) or a tibble (if bind = TRUE) with the results of f evaluated on data subsets.
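As a rough picture of what binding means, the base R sketch below stacks a nested list of per-pair results into one data frame. The key column names left and right are hypothetical, chosen for illustration; the columns dish actually produces with bind = TRUE may be named differently, and it returns a tibble rather than a base data.frame.

```r
# Build a small nested list of results, one one-row data.frame per pair
left <- c("mpg", "hp")
right <- c("wt", "qsec")
results <- lapply(left, function(l) {
  res <- lapply(right, function(r) {
    data.frame(estimate = cor(mtcars[[l]], mtcars[[r]]))
  })
  names(res) <- right
  res
})
names(results) <- left

# Stack the results row-wise, recording which pair each row came from.
# The key columns "left" and "right" are illustrative names only.
bound <- do.call(rbind, lapply(names(results), function(l) {
  do.call(rbind, lapply(names(results[[l]]), function(r) {
    cbind(left = l, right = r, results[[l]][[r]])
  }))
}))
bound
```

The unbound form keeps each result addressable by name, while the bound form is easier to filter, sort, or plot.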

Examples

# NOT RUN {
library(tidyverse)
library(cheese) # provides dish() and the heart_disease data

#1) Default uses every variable on both sides
heart_disease %>%
    select_if(
        is.numeric
    ) %>%
    dish(
        f = cor
    )

#2) Simple regression of Age and BP on each variable
heart_disease %>%
    dish(
        f =
            function(y, x) {
                mod <- lm(y ~ x)
                tibble(
                    Parameter = names(coef(mod)),
                    Estimate = coef(mod)
                )
            },
        left = c("Age", "BP"),
        bind = TRUE
    )

#3) Multiple regression of Age and BP on all other variables simultaneously
heart_disease %>%
    dish(
        f =
            function(y, x) {
                mod <- lm(y ~ ., data = x)
                tibble(
                    Parameter = names(coef(mod)),
                    Estimate = coef(mod)
                )
            },
        left = c("Age", "BP"),
        each_right = FALSE,
        bind = TRUE
    )

# }
