replyr (version 1.0.5)

replyr_summary: Compute usable summary of columns of tbl.

Description

Compute per-column summaries and return as a data.frame. Warning: can be an expensive operation.

Usage

replyr_summary(x, ..., countUniqueNum = FALSE,
  countUniqueNonNum = FALSE, cols = NULL, compute = TRUE)

Arguments

x

tbl or item that can be coerced into such.

...

force additional arguments to be bound by name.

countUniqueNum

logical; if TRUE, include unique non-NA counts for numeric columns.

countUniqueNonNum

logical; if TRUE, include unique non-NA counts for non-numeric columns.

cols

if not NULL, the set of columns to restrict the summary to.

compute

logical; if TRUE, call compute before working.
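
As a rough illustration of how the optional arguments might be combined, the sketch below restricts the summary to two columns and requests unique counts for numeric columns. It assumes that cols accepts a character vector of column names (the page only says "set of columns"), and the column names w, x, z are made up for this example. The call is guarded so nothing runs if replyr is not installed.

```r
# Small local data.frame; column names are illustrative only.
d <- data.frame(w = 1:3,
                x = c(NA, 2, 3),
                z = c("a", NA, "z"),
                stringsAsFactors = FALSE)

if (requireNamespace("replyr", quietly = TRUE)) {
  # Assumption: `cols` takes a character vector of column names.
  # Arguments after `...` must be passed by name.
  s <- replyr::replyr_summary(d,
                              cols = c("w", "x"),
                              countUniqueNum = TRUE)
  print(s)
}
```

Because everything after ... has to be bound by name, a positional call such as replyr_summary(d, TRUE) is not expected to set countUniqueNum.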

Value

data.frame of per-column summaries.

Details

Can be slow compared to dplyr::summarize_all(), but serves a different purpose. For numeric columns, NaN values are included in the nna count, as is typical for R (is.na(NaN) is TRUE). Note that replyr_summary() currently skips "raw" columns.
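
The NaN remark reflects base R behavior rather than anything specific to replyr. A minimal base-R sketch (the vector v is invented for illustration) shows why a count based on is.na() also picks up NaN:

```r
# Illustrative numeric vector with one NA and one NaN.
v <- c(1, NA, NaN, 4)

# is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
na_flags  <- is.na(v)
nan_flags <- is.nan(v)

# A count built on is.na() therefore includes the NaN entry.
n_na  <- sum(na_flags)
n_nan <- sum(nan_flags)

print(n_na)   # 2
print(n_nan)  # 1
```

If NaN needs to be told apart from NA in a column, is.nan() can be applied to the data separately.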

See Also

rsummary

Examples

# NOT RUN {
d <- data.frame(p= c(TRUE, FALSE, NA),
                r= I(list(1,2,3)),
                s= NA,
                t= as.raw(3:5),
                w= 1:3,
                x= c(NA,2,3),
                y= factor(c(3,5,NA)),
                z= c('a',NA,'z'),
                stringsAsFactors=FALSE)
# sc <- sparklyr::spark_connect(version='2.2.0',
#                                  master = "local")
# dS <- replyr_copy_to(sc, dplyr::select(d, -r, -t), 'dS',
#                      temporary=TRUE, overwrite=TRUE)
# replyr_summary(dS)
# sparklyr::spark_disconnect(sc)
if (requireNamespace("RSQLite", quietly = TRUE)) {
  my_db <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  RSQLite::initExtension(my_db)
  dM <- replyr_copy_to(my_db, dplyr::select(d, -r, -t), 'dM',
                       temporary=TRUE, overwrite=TRUE)
  print(replyr_summary(dM))
  DBI::dbDisconnect(my_db)
}
d$q <- list(1,2,3)
replyr_summary(d)

# }
