wk v0.3.0

Lightweight Well-Known Geometry Parsing

Provides a minimal R and C++ API for parsing well-known binary and well-known text representations of geometries to and from R-native formats. Well-known binary is compact and fast to parse; well-known text is human-readable and is useful for writing tests. These formats are only useful in R if the information they contain can be accessed from R, so the package provides high-performance functions for doing so.

Readme

wk

Lifecycle: experimental

The goal of wk is to provide lightweight R and C++ infrastructure for packages to use well-known formats (well-known binary and well-known text) as input and/or output without requiring external software. Well-known binary is very fast to read and write, whereas well-known text is human-readable and human-writable. Together, these formats allow for efficient interchange between software packages (WKB), and highly readable tests and examples (WKT).

Installation

You can install the released version of wk from CRAN with:

install.packages("wk")

You can install the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("paleolimbot/wk")

If you can load the package, you’re good to go!

library(wk)

Basic vector classes for WKT and WKB

Use wkt() to mark a character vector as containing well-known text, or wkb() to mark a vector as well-known binary. These have some basic vector features built in, which means you can subset, repeat, concatenate, and put these objects in a data frame or tibble. These come with built-in format() and print() methods.

wkt("POINT (30 10)")
#> <wk_wkt[1]>
#> [1] POINT (30 10)
as_wkb(wkt("POINT (30 10)"))
#> <wk_wkb[1]>
#> [1] <POINT (30 10)>

Extract coordinates and meta information

One of the main drawbacks to passing around geometries in WKB is that the format is opaque to R users, who need coordinates as R objects rather than binary vectors. In addition to print() methods for wkb() vectors, the wk*_meta() and wk*_coords() functions provide usable coordinates and feature meta.

wkt_coords("POINT ZM (1 2 3 4)")
#>   feature_id part_id ring_id x y z m
#> 1          1       1       0 1 2 3 4
wkt_meta("POINT ZM (1 2 3 4)")
#>   feature_id part_id type_id size srid has_z has_m n_coords
#> 1          1       1       1    1   NA  TRUE  TRUE        1
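
To make the "opaque" part concrete, the following is a minimal, self-contained C++ sketch of how a 2D point is laid out in little-endian well-known binary: one byte-order byte, a 4-byte geometry type, and two 8-byte doubles (21 bytes in total). It does not use the wk headers; encodePointWKB(), decodePointWKB(), appendBytes(), hostIsLittleEndian() and the Point struct are hypothetical helpers written for illustration only, and the sketch assumes a little-endian host and ignores Z, M and SRID.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helpers for illustration; none of this is part of wk.

// Returns true if the machine stores the least significant byte first.
bool hostIsLittleEndian() {
  const std::uint16_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Append the raw in-memory bytes of `value` to `out`.
template <typename T>
void appendBytes(std::vector<unsigned char>& out, T value) {
  unsigned char buf[sizeof(T)];
  std::memcpy(buf, &value, sizeof(T));
  out.insert(out.end(), buf, buf + sizeof(T));
}

// Encode a 2D point as little-endian WKB (21 bytes).
std::vector<unsigned char> encodePointWKB(double x, double y) {
  assert(hostIsLittleEndian());  // the sketch copies host bytes verbatim
  std::vector<unsigned char> out;
  out.push_back(0x01);                 // byte order: 1 = little endian
  appendBytes<std::uint32_t>(out, 1);  // geometry type: 1 = Point
  appendBytes<double>(out, x);         // bytes 5..12
  appendBytes<double>(out, y);         // bytes 13..20
  return out;
}

struct Point {
  double x;
  double y;
};

// Decode a little-endian 2D point; returns false for anything else.
bool decodePointWKB(const std::vector<unsigned char>& wkb, Point& point) {
  if (wkb.size() != 21 || wkb[0] != 0x01) return false;
  std::uint32_t type = 0;
  std::memcpy(&type, &wkb[1], sizeof(type));
  if (type != 1) return false;
  std::memcpy(&point.x, &wkb[5], sizeof(double));
  std::memcpy(&point.y, &wkb[13], sizeof(double));
  return true;
}
```

Calling encodePointWKB(30.0, 10.0) yields the bytes 01 01000000 000000000000003E40 0000000000002440 in hex, which is why a raw vector alone tells an R user very little until it is decoded.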

Well-known R objects

The wk package experimentally generates (and parses) a plain R object format, which is needed because well-known binary can’t natively represent the empty point and reading/writing well-known text is too slow. The format of the wksxp() object is designed to be as close as possible to well-known text and well-known binary to make the translation code as clean as possible.

wkt_translate_wksxp("POINT (30 10)")
#> [[1]]
#>      [,1] [,2]
#> [1,]   30   10
#> attr(,"class")
#> [1] "wk_point"

Dependencies

The wk package imports Rcpp.

Using the C++ headers

The wk package takes an event-based approach to parsing inspired by the event-based SAX XML parser. This makes the readers and writers highly re-usable! This system is class-based, so you will have to make your own subclass of WKGeometryHandler and wire it up to a WKReader to do anything useful.

// If you're writing code in a package, you'll also
// have to put 'wk' in your `LinkingTo:` description field
// [[Rcpp::depends(wk)]]

#include <Rcpp.h>
#include "wk/rcpp-io.hpp"
#include "wk/wkt-reader.hpp"
using namespace Rcpp;

class CustomHandler: public WKGeometryHandler {
public:

  void nextFeatureStart(size_t featureId) {
    Rcout << "Do something before feature " << featureId << "\n";
  }

  void nextFeatureEnd(size_t featureId) {
    Rcout << "Do something after feature " << featureId << "\n";
  }
};

// [[Rcpp::export]]
void wkt_read_custom(CharacterVector wkt) {
  WKCharacterVectorProvider provider(wkt);
  WKTReader reader(provider);

  CustomHandler handler;
  reader.setHandler(&handler);

  while (reader.hasNextFeature()) {
    reader.iterateFeature();
  }
}

On our example point, this prints the following:

wkt_read_custom("POINT (30 10)")
#> Do something before feature 0
#> Do something after feature 0

The full handler interface includes methods for the start and end of features, geometries (which may be nested), linear rings, coordinates, and parse errors. You can preview which methods will be called for a given geometry using the wkb_debug() and wkt_debug() functions.

wkt_debug("POINT (30 10)")
#> nextFeatureStart(0)
#>     nextGeometryStart(POINT [1], WKReader::PART_ID_NONE)
#>         nextCoordinate(POINT [1], WKCoord(x = 30, y = 10), 0)
#>     nextGeometryEnd(POINT [1], WKReader::PART_ID_NONE)
#> nextFeatureEnd(0)
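
The event-based pattern itself can be illustrated without the wk headers. The following is a minimal, self-contained sketch in which a reader walks over features and calls back into a handler, so that new behaviour (here, a bounding box) only needs a new handler subclass. Handler, Reader, Coord, Feature and BoundsHandler are hypothetical stand-ins and are not wk's classes; the real WKGeometryHandler interface has more events and different signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT wk's classes.
struct Coord {
  double x;
  double y;
};
using Feature = std::vector<Coord>;

// Base handler: every event defaults to doing nothing.
class Handler {
public:
  virtual ~Handler() = default;
  virtual void featureStart(std::size_t /*featureId*/) {}
  virtual void coordinate(const Coord& /*coord*/, std::size_t /*coordId*/) {}
  virtual void featureEnd(std::size_t /*featureId*/) {}
};

// Reader: owns the input and emits events to whichever handler is attached.
class Reader {
public:
  explicit Reader(std::vector<Feature> features)
    : features_(std::move(features)) {}

  void setHandler(Handler* handler) { handler_ = handler; }
  bool hasNextFeature() const { return next_ < features_.size(); }

  void iterateFeature() {
    assert(handler_ != nullptr && hasNextFeature());
    const Feature& feature = features_[next_];
    handler_->featureStart(next_);
    for (std::size_t i = 0; i < feature.size(); ++i) {
      handler_->coordinate(feature[i], i);
    }
    handler_->featureEnd(next_);
    ++next_;
  }

private:
  std::vector<Feature> features_;
  Handler* handler_ = nullptr;
  std::size_t next_ = 0;
};

// A handler that accumulates a bounding box and counts features.
class BoundsHandler : public Handler {
public:
  double xmin = std::numeric_limits<double>::infinity();
  double ymin = std::numeric_limits<double>::infinity();
  double xmax = -std::numeric_limits<double>::infinity();
  double ymax = -std::numeric_limits<double>::infinity();
  std::size_t nFeatures = 0;

  void coordinate(const Coord& coord, std::size_t) override {
    if (coord.x < xmin) xmin = coord.x;
    if (coord.x > xmax) xmax = coord.x;
    if (coord.y < ymin) ymin = coord.y;
    if (coord.y > ymax) ymax = coord.y;
  }

  void featureEnd(std::size_t) override { ++nFeatures; }
};
```

As in the wkt_read_custom() example, the caller attaches a handler with setHandler() and loops while hasNextFeature() is true. The reader never needs to know what the handler computes, which is what makes the same reader reusable for translation, validation, or coordinate extraction.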

Performance

This package was designed to stand alone and be flexible, but also happens to be really fast for some common operations.

Read WKB + Write WKB:

bench::mark(
  wk = wk:::wksxp_translate_wkb(wk:::wkb_translate_wksxp(nc_wkb)),
  sf = sf:::CPL_read_wkb(sf:::CPL_write_wkb(nc_sfc, EWKB = TRUE), EWKB = TRUE),
  check = FALSE
)
#> # A tibble: 2 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 wk            316µs    369µs     2620.   114.2KB     13.6
#> 2 sf            412µs    453µs     2106.    99.8KB     13.6

Read WKB + Write WKT:

bench::mark(
  wk = wk:::wkb_translate_wkt(nc_wkb),
  sf = sf:::st_as_text.sfc(sf:::st_as_sfc.WKB(nc_WKB, EWKB = TRUE)),
  check = FALSE
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 wk           3.03ms   3.52ms    282.      3.32KB      0  
#> 2 sf         205.77ms 208.71ms      4.81  566.66KB     14.4

Read WKT + Write WKB:

bench::mark(
  wk = wk:::wkt_translate_wkb(nc_wkt),
  sf = sf:::CPL_write_wkb(sf:::st_as_sfc.character(nc_wkt), EWKB = TRUE),
  check = FALSE
)
#> # A tibble: 2 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 wk           1.91ms   2.11ms      464.    53.6KB     0   
#> 2 sf           3.44ms   3.95ms      250.   185.7KB     4.20

Read WKT + Write WKT:

bench::mark(
  wk = wk::wksxp_translate_wkt(wk::wkt_translate_wksxp(nc_wkt)),
  sf = sf:::st_as_text.sfc(sf:::st_as_sfc.character(nc_wkt)),
  check = FALSE
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 wk           5.08ms   5.86ms    166.      63.8KB     1.98
#> 2 sf         209.88ms 211.35ms      4.68   226.6KB    14.0

Generate coordinates:

bench::mark(
  wk_wkb = wk::wksxp_coords(nc_sxp),
  sfheaders = sfheaders::sfc_to_df(nc_sfc),
  sf = sf::st_coordinates(nc_sfc),
  check = FALSE
)
#> # A tibble: 3 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 wk_wkb      180.8µs 204.21µs     4643.     131KB     19.8
#> 2 sfheaders   573.5µs 680.57µs     1431.     627KB     35.9
#> 3 sf           2.54ms   2.76ms      359.     507KB     24.1

Send polygons to a graphics device (note that the graphics device is the main holdup in real life):

devoid::void_dev()
wksxp_plot_new(nc_sxp)

bench::mark(
  wk_wkb = wk::wksxp_draw_polypath(nc_sxp),
  sf = sf:::plot.sfc_MULTIPOLYGON(nc_sfc, add = TRUE),
  check = FALSE
)
#> # A tibble: 2 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 wk_wkb     327.76µs 360.79µs     2577.     358KB     15.9
#> 2 sf           3.48ms   3.85ms      254.     243KB     15.9
dev.off()
#> quartz_off_screen 
#>                 2

Functions in wk

Name Description
wkb_format Format well-known geometry for printing
wkt Mark character vectors as well-known text
wkb_draw_points Draw well-known geometries
wkb_translate_wkt Translate between WKB and WKT
wksxp Mark lists as well-known "S" expressions
wkb_meta Extract meta information
wkb_problems Validate well-known binary and well-known text
wkb_ranges Extract ranges information
coords_point_translate_wkt Parse coordinates into well-known formats
wkb Mark lists of raw vectors as well-known binary
wkb_coords Extract coordinates from well-known geometries
new_wk_wkb S3 Details for wk_wkb
wkb_debug Debug well-known geometry
new_wk_wksxp S3 Details for wk_wksxp
new_wk_wkt S3 Details for wk_wkt
vctrs-methods Vctrs methods
wk-package wk: Lightweight Well-Known Geometry Parsing

Details

License LGPL (>= 2.1)
Copyright file COPYRIGHTS
Encoding UTF-8
LazyData true
RoxygenNote 7.1.0.9000
LinkingTo Rcpp
URL https://paleolimbot.github.io/wk, https://github.com/paleolimbot/wk
BugReports https://github.com/paleolimbot/wk/issues
NeedsCompilation yes
Packaged 2020-06-21 13:33:28 UTC; dewey
Repository CRAN
Date/Publication 2020-06-21 14:10:02 UTC
