distrr (version 0.0.6)

dcc: Data cube creation (dcc)

Description

Data cube creation (dcc)

Usage

dcc(.data, .variables, .fun = jointfun_, ...)

dcc2(.data, .variables, .fun = jointfun_, order_type = extract_unique2, ...)

dcc5(
  .data,
  .variables,
  .fun = jointfun_,
  .total = "Totale",
  order_type = extract_unique4,
  .all = TRUE,
  ...
)

Arguments

.data

data frame to be processed

.variables

variables to split data frame by, as a character vector (c("var1", "var2")).

.fun

function to apply to each piece (default: jointfun_)

...

additional functions passed to .fun.

order_type

a function like extract_unique or extract_unique2.

.total

character string with the name to give to the subset of data that includes all the observations of a variable (default: "Totale").

.all

logical, indicating whether the functions have to be evaluated on the complete dataset.

Value

a data cube, with a column for each categorical variable used and a row for each combination of the categorical variables' modalities. In addition to its own modalities, each variable also has a "Total" modality, which includes all the others. The data cube will contain marginal, conditional and joint empirical distributions...
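
To make the structure of the returned value more concrete, here is a minimal base R sketch of the idea. It is not distrr's implementation: the helper toy_cube, the data frame wages and its columns are hypothetical names introduced only for illustration, and the summary statistics (count, sum and mean of wage) are hard-coded assumptions rather than what jointfun_ computes.

```r
# Toy data; the column names are made up for this illustration.
wages <- data.frame(
  gender = c("F", "F", "M", "M", "M"),
  sector = c("public", "private", "public", "private", "private"),
  wage   = c(1500, 1800, 1600, 2000, 2200),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of distrr). For every subset of the
# grouping variables, the chosen variables are collapsed into a single
# "total" modality, and the data are then aggregated by all variables.
toy_cube <- function(data, variables, total = "Totale") {
  # One row per way of collapsing (TRUE) or keeping (FALSE) each variable.
  masks <- expand.grid(rep(list(c(FALSE, TRUE)), length(variables)))

  pieces <- lapply(seq_len(nrow(masks)), function(i) {
    d <- data
    for (j in seq_along(variables)) {
      if (masks[i, j]) d[[variables[j]]] <- total
    }
    # Count and wage sum for each observed combination of modalities.
    stats <- data.frame(n = rep(1, nrow(d)), wage_sum = d$wage)
    aggregate(stats, by = d[variables], FUN = sum)
  })

  out <- do.call(rbind, pieces)
  out$wage_mean <- out$wage_sum / out$n
  rownames(out) <- NULL
  out
}

cube <- toy_cube(wages, c("gender", "sector"))
print(cube)
```

With two variables of two modalities each, the sketch yields (2 + 1) * (2 + 1) = 9 rows: four joint cells, four marginal rows in which one variable equals "Totale", and one grand-total row. Combinations that never occur in the data are simply absent here; whether dcc fills them in is not covered by this sketch.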

Examples

# NOT RUN {
data("invented_wages")
str(invented_wages)
tmp <- dcc(.data = invented_wages, 
           .variables = c("gender", "sector"), .fun = jointfun_)
tmp
str(tmp)
tmp2 <- dcc2(.data = invented_wages,
             .variables = c("gender", "education"),
             .fun = jointfun_,
             order_type = extract_unique2)
tmp2
str(tmp2)

# dcc5 works like dcc2, but has an additional optional argument, .total,
# that can be added to give a name to the groups that include all the 
# observations of a variable.
tmp5 <- dcc5(.data = invented_wages,
             .variables = c("gender", "education"),
             .fun = jointfun_,
             .total = "TOTAL",
             order_type = extract_unique2)
tmp5

# }
