schemate
A small, checkmate-first schema DSL for R data.
schemate provides a small,
checkmate-first schema DSL for R
data. It can infer schemas from example objects, edit schema documents,
save them as JSON, read them back, and validate new inputs against the
schema.
The package is meant for package authors and pipeline authors who want a compact R-native schema format without adopting the full JSON Schema vocabulary. A typical workflow is:
- infer a conservative schema with schema_infer();
- edit it with schema_*() authoring verbs;
- save it with schema_write();
- read it back with schema_read();
- validate inputs with schema_validate().
Installation
install.packages("schemate")
Development Version
To pick up a bug fix or a feature that has not been released yet, install the development version of schemate from GitHub.
# install.packages("pak")
pak::pak("hongyuanjia/schemate")
Quick Start
The public API uses a single schema_ prefix and works well in
pipelines. Start from an example object, infer a conservative schema,
then compact it into something easier to edit and review.
library(schemate)
payload <- list(
  items = list(
    list(id = 1L, name = "alpha", label = "Alpha", slug = "alpha"),
    list(id = 2L, name = "beta", label = "Beta", slug = "beta")
  )
)
schema <- payload |>
  schema_infer(keys = "named", arrays = "rest") |>
  schema_compact() |>
  schema_set_desc("$items", "Repository-like result items")
schema
## {
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "named"
## },
## "fields": {
## "items": {
## "description": "Repository-like result items",
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "unnamed"
## },
## "rest": {
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "named"
## },
## "fields": {
## "id": {
## "check": {
## "kind": "int"
## }
## }
## },
## "groups": [
## {
## "names": ["name", "label", "slug"],
## "check": {
## "kind": "string"
## }
## }
## ]
## }
## }
## }
## }
schema |>
  schema_validate(payload, mode = "test")
## [1] TRUE
schema_validate() defaults to assert mode: invalid input raises an
error and valid input is returned invisibly. Other modes are available
when you need a message or a boolean result.
bad_payload <- payload
bad_payload$items[[1L]]$id <- "bad"
schema |>
  schema_validate(bad_payload, mode = "check", name = "payload")
## [1] "payload$items[[1]]$id: Must be of type 'single integerish value', not 'character'"
schema |>
  schema_validate(bad_payload, mode = "test", name = "payload")
## [1] FALSE
For a data frame example, see the Get started article.
JSON Workflow
Schemas are stored as a compact JSON DSL. The DSL is not JSON Schema; it
is a thin representation of checkmate checks, field schemas, local
definitions, and combinators. See the Schema DSL article for the
complete format reference. schema_read() and schema_write() require the
suggested package jsonlite.
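Because jsonlite is only a suggested dependency, a script that saves and reloads schemas may want to guard the round trip. The sketch below is one way to do that. It uses only the schemate functions shown in this README plus base R's requireNamespace(); the payload is a shortened copy of the Quick Start example.

```r
library(schemate)

# Shortened copy of the Quick Start payload, used only for illustration.
payload <- list(
  items = list(
    list(id = 1L, name = "alpha"),
    list(id = 2L, name = "beta")
  )
)
schema <- payload |>
  schema_infer(keys = "named", arrays = "rest") |>
  schema_compact()

path <- tempfile(fileext = ".json")

if (requireNamespace("jsonlite", quietly = TRUE)) {
  # jsonlite is available, so the JSON round trip can run.
  schema_write(schema, path)
  restored <- schema_read(path)
} else {
  # Fall back to the in-memory schema and tell the user why.
  message("Install jsonlite to use schema_read() and schema_write().")
  restored <- schema
}
```

Falling back to the in-memory schema keeps the rest of the script usable when jsonlite is missing; raising an error instead would be an equally reasonable choice.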
path <- tempfile(fileext = ".json")
schema_write(schema, path)
restored <- schema_read(path)
restored
## {
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "named"
## },
## "fields": {
## "items": {
## "description": "Repository-like result items",
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "unnamed"
## },
## "rest": {
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "named"
## },
## "fields": {
## "id": {
## "check": {
## "kind": "int"
## }
## }
## },
## "groups": [
## {
## "names": ["name", "label", "slug"],
## "check": {
## "kind": "string"
## }
## }
## ]
## }
## }
## }
## }
restored |>
  schema_validate(payload)
Example schema files are installed under inst/extdata:
system.file("extdata", "person-schema.json", package = "schemate")
Validation Modes
schema_validate() supports four modes:
| Mode | Return value on success | Return value on failure |
|---|---|---|
| assert | invisibly returns the input | throws an error |
| check | TRUE | diagnostic string |
| test | TRUE | FALSE |
| expect | testthat-style expectation object | expectation failure object |
Use assert inside application code, check when displaying
diagnostics, test for control flow, and expect in tests.
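As a rough illustration of how the first three modes might be handled in calling code, the sketch below reuses a shortened Quick Start payload. The tryCatch() wrapper and the variable names (assert_msg, check_res, is_valid) are illustrative choices, not part of schemate.

```r
library(schemate)

payload <- list(
  items = list(
    list(id = 1L, name = "alpha"),
    list(id = 2L, name = "beta")
  )
)
schema <- payload |>
  schema_infer(keys = "named", arrays = "rest") |>
  schema_compact()

bad_payload <- payload
bad_payload$items[[1L]]$id <- "bad"

# assert (the default): invalid input raises an error, captured here as text.
assert_msg <- tryCatch(
  {
    schema_validate(schema, bad_payload, name = "payload")
    NULL
  },
  error = function(e) conditionMessage(e)
)

# check: TRUE on success, otherwise a diagnostic string to display.
check_res <- schema_validate(schema, bad_payload, mode = "check", name = "payload")
if (!isTRUE(check_res)) {
  message("Validation failed: ", check_res)
}

# test: a plain TRUE/FALSE suited to control flow.
is_valid <- schema_validate(schema, payload, mode = "test")
if (is_valid) {
  message("payload matches the schema")
}
```

Testing the check result with isTRUE() rather than comparing it to a string keeps the branch correct whatever the diagnostic text says.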
Standalone Use
schemate also publishes a generated standalone bundle for packages
that want the schema features without depending on schemate at
runtime.
usethis::use_standalone("hongyuanjia/schemate", "schema", ref = "standalone")
Relation to Other Tools
schemate is closest in spirit to checkmate: schemas ultimately
validate R objects by calling checkmate checks. It adds a schema
lifecycle around those checks: infer, edit, serialize, read, and
validate.
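To make the comparison concrete, here is roughly what hand-written checkmate code for one Quick Start item could look like without a schema. This is only a sketch: check_item() is a hypothetical helper that is not part of schemate, and it does not claim to mirror how schemate calls checkmate internally.

```r
library(checkmate)

payload <- list(
  items = list(
    list(id = 1L, name = "alpha"),
    list(id = "bad", name = "beta")
  )
)

# Hypothetical helper: returns TRUE, or a diagnostic string naming the field.
check_item <- function(item) {
  res <- check_list(item, names = "named")
  if (!isTRUE(res)) return(res)

  res <- check_int(item$id)
  if (!isTRUE(res)) return(paste0("id: ", res))

  res <- check_string(item$name)
  if (!isTRUE(res)) return(paste0("name: ", res))

  TRUE
}

# One result per item: TRUE for the first, a message for the second.
results <- lapply(payload$items, check_item)
```

Every new field adds another hand-written branch here, which is the bookkeeping a stored schema is meant to replace.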
pointblank is a better fit
for tabular data quality workflows, reporting, and column-oriented
validation plans. schemate is deliberately narrower and more
structural: it describes R values, R object names, nested lists,
JSON-like payloads, and package-facing input contracts. It is not a
replacement for JSON Schema or
jsonvalidate, which are
better choices when you need standards-compliant JSON document
validation.
The R validation ecosystem is broad:
- validate captures data validation rules that can be documented, stored, and applied to data sets.
- assertr is designed for assertive data checks inside analysis pipelines.
- data.validator focuses on dataset validation with reporting.
- vetr provides template-based structural checks for R objects.
- testthat is the right home for unit-test expectations; schema_validate(..., mode = "expect") is intended to fit into that style.
License
The project is released under the terms of the MIT License.