crplyr v0.3.8

A 'dplyr' Interface for Crunch

In order to facilitate analysis of datasets hosted on the Crunch data platform <https://crunch.io/>, the 'crplyr' package implements 'dplyr' methods on top of the Crunch backend. The usual methods 'select', 'filter', 'group_by', 'summarize', and 'collect' are implemented in such a way as to perform as much computation on the server and pull as little data locally as possible.

Readme

crplyr: A 'dplyr' Interface for Crunch

dplyr defines "a grammar of data manipulation" popular among R users. In order to facilitate analysis of datasets hosted by Crunch, this package implements 'dplyr' methods on top of the Crunch backend. The usual methods "select", "filter", "group_by", "summarize", and "collect" are implemented in such a way as to perform as much computation on the server and pull as little data locally as possible.

With a local data.frame, you might chain together a series of manipulations and create a table, such as:

> library(dplyr)
> data(mtcars)
> mtcars %>%
    filter(vs == 1) %>%
    group_by(gear) %>%
    summarize(horses=mean(hp), sd_horses=sd(hp), count=n())

## # A tibble: 3 × 4
##    gear horses sd_horses count
##   <dbl>  <dbl>     <dbl> <int>
## 1     3  104.0  6.557439     3
## 2     4   85.4 26.596575    10
## 3     5  113.0        NA     1

With crplyr, you can do the same operations, except that the dataset you're working with sits in the Crunch platform, and Crunch is doing the aggregations in the cloud:

> library(crplyr)
> login()
[crunch] > mtcars <- loadDataset("mtcars from R")
[crunch] > mtcars %>%
    filter(vs == 1) %>%
    group_by(gear) %>%
    summarize(horses=mean(hp), sd_horses=sd(hp), count=n())

## # A tibble: 3 × 4
##    gear horses sd_horses count
##  <fctr>  <dbl>     <dbl> <dbl>
## 1     3  104.0  6.557439     3
## 2     4   85.4 26.596575    10
## 3     5  113.0        NA     1

Running the calculations remotely doesn't matter much for a tiny dataset like "mtcars", but Crunch lets you work with datasets too large to fit in memory on your machine, and it lets you collaborate naturally with others on the same dataset.
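
When a later step needs data in local memory, the usual pattern is to narrow the rows and columns first and call "collect" last, so that as little data as possible is pulled down. The following is a minimal sketch, not a definitive recipe: the helper name pull_subset is hypothetical, and it is demonstrated on the local mtcars data.frame so that it runs without a Crunch account. It assumes that, with crplyr attached, the same "filter", "select", and "collect" calls dispatch to a Crunch dataset such as one returned by loadDataset().

```r
library(dplyr)

# Hypothetical helper (not part of crplyr): reduce the data first,
# then bring only the reduced result into local memory.
pull_subset <- function(ds) {
    ds %>%
        filter(vs == 1) %>%     # keep only the rows of interest
        select(gear, hp) %>%    # keep only the columns needed locally
        collect()               # materialize the result as local data
}

# Demonstrated here on the local data.frame; the assumption is that a
# Crunch dataset could be passed in the same way once crplyr is loaded.
local_copy <- pull_subset(mtcars)
dim(local_copy)
```

Putting "collect" at the end keeps the filtering and column selection ahead of the transfer, which matters more as the dataset grows.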

Installing

Install the CRAN release of crplyr with

install.packages("crplyr")

The pre-release version of the package can be pulled from GitHub using the remotes package:

# install.packages("remotes")
remotes::install_github("Crunch-io/crplyr")

For developers

The repository includes a Makefile to facilitate some common tasks, if you're into that sort of thing.

Running tests

$ make test. Requires the httptest package. To run only certain test files, add a "file=" argument, like $ make test file=select. test_package matches that value as a regular expression against the test file names; see its documentation in the testthat package.
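
To get a feel for which files a given "file=" value would pick up, the matching can be imitated in base R. This is only a rough illustration: the file names are hypothetical, and it assumes the pattern is compared against each file name with its "test-" prefix and ".R" suffix removed, which is how testthat describes its filter behavior.

```r
# Hypothetical test file names, for illustration only
test_files <- c("test-select.R", "test-filter.R",
                "test-group-by.R", "test-summarize.R")

# Assumption: the pattern is matched against the name without the
# "test-" prefix and the ".R" extension
stems <- sub("[.][Rr]$", "", sub("^test[-_]?", "", test_files))

# Keep the files whose stem matches the regular expression "select"
matched <- test_files[grepl("select", stems)]
print(matched)
```

Because the value is a regular expression, something like file="select|filter" would match more than one file under the same assumption.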

Updating documentation

$ make doc. Requires the roxygen2 package.

Functions in crplyr

Name                          Description
summarize                     Aggregate a Crunch dataset
mutate                        Mutate Crunch datasets (not implemented)
theme_crunch                  Crunch ggplot theme
unweighted_n                  Return the unweighted counts from summarize
as_cr_tibble                  Flatten a Crunch Cube
collect                       Collect a Crunch dataset from the server
filter_.CrunchDataset         Filter a Crunch dataset (deprecated)
filter                        Filter a Crunch dataset
group_by                      Group-by for Crunch datasets
autoplot                      Autoplot methods for Crunch Objects
select                        Select columns from a Crunch dataset
GroupedCrunchDataset-class    A Crunch Dataset "Grouped By" Something

Vignettes of crplyr

Name
plotting.Rmd

Details

Type Package
URL https://crunch.io/r/crplyr/, https://github.com/Crunch-io/crplyr
BugReports https://github.com/Crunch-io/crplyr/issues
License LGPL (>= 3)
RoxygenNote 7.1.0
Language en-US
VignetteBuilder knitr
NeedsCompilation no
Packaged 2021-02-01 23:37:34 UTC; gregfreedmanellis
Repository CRAN
Date/Publication 2021-02-02 02:40:03 UTC
