select_.list: emil and dplyr integration

Description

Modeling results can be converted to tabular format and manipulated using dplyr and other Hadleyverse packages. This is done by a class-specific select_ method whose syntax differs somewhat from that of the default select_.

Usage

## S3 method for class 'list':
select_(.data, ..., .dots)
## S3 method for class 'modeling_result':
select_(.data, ..., .dots)

Arguments

.data

Modeling results, as returned by evaluate.

...

Not used, kept for consistency with dplyr.

.dots

Indices to select on each level of .data, i.e. the first index specifies which top-level elements of .data to select, the second specifies which second-level elements, and so on. The last index must select elements that can be converted

Value

A data.frame in long format.
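
To illustrate what "long format" means for nested results, here is a minimal base R sketch. It does not use emil and is not how select_ is implemented: the nested list `toy_result`, its error values, and the helper `to_long` are hypothetical and exist only for this illustration. The idea is that each level of the list (fold, then method) becomes a column, and the selected bottom-level element becomes the value column.

```r
# Hypothetical nested results: fold -> method -> metrics.
# This is made-up data, not output from emil::evaluate.
toy_result <- list(
  fold1 = list(nsc = list(error = 0.10), rf = list(error = 0.05)),
  fold2 = list(nsc = list(error = 0.20), rf = list(error = 0.15))
)

# Illustrative helper (not part of emil): walk the fold and method levels
# and collect one numeric element per fold/method combination.
to_long <- function(res, element = "error") {
  rows <- lapply(names(res), function(fold) {
    methods <- res[[fold]]
    data.frame(
      fold = fold,
      method = names(methods),
      value = vapply(methods, function(m) m[[element]], numeric(1)),
      stringsAsFactors = FALSE,
      row.names = NULL
    )
  })
  out <- do.call(rbind, rows)
  # Name the value column after the selected element
  names(out)[names(out) == "value"] <- element
  out
}

long <- to_long(toy_result)
print(long)
```

The result has one row per fold and method, which is the shape that downstream verbs such as group_by and summarize expect.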

Examples

# Load the packages used below (assumes emil and dplyr are installed)
library(emil)
library(dplyr)

# Produce some results
x <- iris[-5]
y <- iris$Species
names(y) <- sprintf("orchid%03i", seq_along(y))
cv <- resample("crossvalidation", y, nfold=3, nreplicate=2)
procedures <- list(nsc = modeling_procedure("pamr"),
                   rf = modeling_procedure("randomForest"))
result <- evaluate(procedures, x, y, resample=cv)

# Get the foldwise error for the NSC method
result %>% select(fold = TRUE, "nsc", error = "error")

# Compare both methods
library(tidyr)  # provides spread()
result %>%
    select(fold = TRUE, method = TRUE, error = "error") %>%
    spread(method, error)
result %>%
    select(fold = TRUE, method = TRUE, error = "error") %>%
    group_by(method) %>% summarize(mean_error = mean(error))

# Investigate the variability in estimated class 2 probability across folds
result %>%
    select(fold = cv, "nsc", "prediction", probability = function(x) x$probability[,2]) %>%
    spread(fold, probability)
