# Edwin Jonge

#### 12 packages on CRAN

#### 1 package on GitHub

Extends the out-of-memory vectors of 'ff' with statistical functions and other utilities to ease their usage.

Facilitates reading and manipulating (multivariate) data restrictions (edit rules) on numerical and categorical data. Rules can be defined with common R syntax and parsed to an internal (matrix-like) format. Rules can be manipulated with variable elimination and value substitution methods, allowing for feasibility checks and more. Data can be tested against the rules, and erroneous fields can be located using Fellegi and Holt's generalized principle. Rule dependencies can be visualized using the igraph package.

Errors in data can be located and removed using validation rules from package 'validate'.

Diff, patch and merge for data frames. Document changes in data sets and use them to apply patches. Changes to data can be made visible by using render_diff. The V8 package is used to wrap the 'daff.js' JavaScript library which is included in the package.
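
The diff-then-patch idea can be illustrated with a minimal, language-agnostic sketch in Python. This is not the package's R interface nor the format used by 'daff.js'; the helper names `diff_rows` and `patch_rows` are hypothetical, and the sketch assumes every row has a unique key column and that both tables share the same columns.

```python
def diff_rows(old, new, key):
    """Return a list of (operation, key value, payload) changes from old to new."""
    old_by_key = {row[key]: row for row in old}
    new_by_key = {row[key]: row for row in new}
    changes = []
    for k, row in old_by_key.items():
        if k not in new_by_key:
            changes.append(("delete", k, None))
        elif new_by_key[k] != row:
            # Record only the cells whose value differs.
            changed = {c: v for c, v in new_by_key[k].items() if row.get(c) != v}
            changes.append(("update", k, changed))
    for k, row in new_by_key.items():
        if k not in old_by_key:
            changes.append(("insert", k, row))
    return changes


def patch_rows(old, changes, key):
    """Apply a change list produced by diff_rows to a copy of old."""
    by_key = {row[key]: dict(row) for row in old}
    for op, k, payload in changes:
        if op == "delete":
            del by_key[k]
        elif op == "update":
            by_key[k].update(payload)
        else:  # "insert"
            by_key[k] = dict(payload)
    return list(by_key.values())


old = [
    {"id": 1, "name": "a", "x": 1},
    {"id": 2, "name": "b", "x": 2},
    {"id": 3, "name": "c", "x": 3},
]
new = [
    {"id": 1, "name": "a", "x": 1},
    {"id": 2, "name": "b", "x": 20},
    {"id": 4, "name": "d", "x": 4},
]

changes = diff_rows(old, new, "id")
patched = patch_rows(old, changes, "id")
```

The change list doubles as documentation of what happened to the data: here it holds one update, one delete and one insert, and applying it to `old` reproduces `new`.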

Text data can be processed chunkwise using 'dplyr' commands. These are recorded and executed per data chunk, so large files can be processed with limited memory using the 'LaF' package.
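
The "record, then execute per chunk" idea can be sketched as follows. This is a conceptual Python illustration, not the package's 'dplyr' interface; the class `ChunkedPipeline` and its methods are hypothetical names chosen for the example.

```python
import csv
import io


class ChunkedPipeline:
    """Records row-wise operations and replays them on each chunk (hypothetical helper)."""

    def __init__(self, rows, chunk_size):
        self.rows = rows              # any iterable of dict rows, e.g. csv.DictReader
        self.chunk_size = chunk_size
        self.steps = []               # recorded operations, not yet executed

    def filter(self, predicate):
        self.steps.append(lambda chunk: [r for r in chunk if predicate(r)])
        return self

    def mutate(self, name, func):
        self.steps.append(lambda chunk: [{**r, name: func(r)} for r in chunk])
        return self

    def _chunks(self):
        # Only one chunk of rows is held in memory at a time.
        chunk = []
        for row in self.rows:
            chunk.append(row)
            if len(chunk) == self.chunk_size:
                yield chunk
                chunk = []
        if chunk:
            yield chunk

    def collect(self):
        # Replay every recorded step on each chunk in turn.
        out = []
        for chunk in self._chunks():
            for step in self.steps:
                chunk = step(chunk)
            out.extend(chunk)
        return out


text = "id,value\n1,10\n2,25\n3,40\n4,5\n5,30\n"
reader = csv.DictReader(io.StringIO(text))

result = (
    ChunkedPipeline(reader, chunk_size=2)
    .filter(lambda r: int(r["value"]) >= 25)
    .mutate("double", lambda r: 2 * int(r["value"]))
    .collect()
)
```

Nothing is read until `collect()` is called. In a real setting the output of each chunk would be written to disk rather than accumulated in a list, so that memory use stays bounded by the chunk size.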

Rule sets with validation rules may contain redundancies or contradictions. Functions for finding redundancies and problematic rules are provided, given a set of rules formulated with 'validate'.

The data and metadata from Statistics Netherlands (www.cbs.nl) can be browsed and downloaded. The client uses the open data API of Statistics Netherlands.

A tableplot is a visualisation of a (large) dataset with a dozen variables, both numeric and categorical. This package contains an interactive version of tableplot working in your browser.

A tableplot is a visualisation of a (large) dataset with a dozen variables, both numeric and categorical. Each column represents a variable and each row bin is an aggregate of a certain number of records. Numeric variables are visualized as bar charts, and categorical variables as stacked bar charts. Missing values are taken into account. Also supports large 'ffdf' datasets from the 'ff' package.
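
The aggregation behind a tableplot can be sketched in a few lines of Python. This is a conceptual illustration, not the package's implementation: the function name `tableplot_bins` is hypothetical, and summarising numeric variables by their mean and sorting in descending order are assumptions made for the example. The sort variable is assumed to have no missing values.

```python
from collections import Counter


def tableplot_bins(records, sort_key, numeric, categorical, n_bins):
    """Sort records by one variable and aggregate them into n_bins row bins."""
    ordered = sorted(records, key=lambda r: r[sort_key], reverse=True)
    n = len(ordered)
    bins = []
    for b in range(n_bins):
        rows = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        summary = {"n": len(rows)}
        for var in numeric:
            # Bar length: mean of the non-missing values (an assumption).
            values = [r[var] for r in rows if r[var] is not None]
            summary[var] = {
                "mean": sum(values) / len(values) if values else None,
                "missing": len(rows) - len(values),
            }
        for var in categorical:
            # Stacked bar: share of each category, with missing values as
            # their own segment (labelled "missing" here).
            counts = Counter("missing" if r[var] is None else r[var] for r in rows)
            summary[var] = {k: c / len(rows) for k, c in counts.items()}
        bins.append(summary)
    return bins


records = [
    {"income": 50, "age": 30, "sex": "f"},
    {"income": 20, "age": None, "sex": "m"},
    {"income": 40, "age": 50, "sex": None},
    {"income": 10, "age": 20, "sex": "m"},
]

bins = tableplot_bins(records, "income", ["income", "age"], ["sex"], n_bins=2)
```

Each element of `bins` corresponds to one row of the plot: the numeric summaries give the bar lengths and the category shares give the segments of the stacked bars.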