tidyjson (version 0.2.4)

json_complexity: Compute the complexity (recursively unlisted length) of JSON data

Description

When investigating complex JSON data it can be helpful to quantify how deeply nested each document is. The json_complexity function adds a column (default name "complexity") that contains the 'complexity' of the JSON associated with each row. Essentially, every non-null scalar value is found by recursively stripping away all objects and arrays, and the complexity is the count of these scalar values. Note that 'null' has complexity 0, as do empty objects and arrays.
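The counting rule above can be sketched in a few lines of base R. This is an illustrative re-implementation, not tidyjson's internal code: it assumes the JSON value has already been parsed into nested R lists, with JSON null mapped to NULL.

```r
# A minimal sketch of the complexity rule (not the tidyjson implementation):
# recursively unlist the value and count the non-null scalar leaves.
complexity <- function(x) {
  if (is.null(x)) return(0L)              # 'null' has complexity 0
  if (!is.list(x)) return(length(x))      # a scalar leaf counts as 1
  sum(vapply(x, complexity, integer(1)))  # recurse into objects / arrays
}

complexity(list(1, 2, list(3, 4)))  # the array [1, 2, [3, 4]] -> 4
complexity(list())                  # empty object or array -> 0
```

Because an empty list contributes `sum(integer(0))`, empty objects and arrays fall out as 0 without a special case.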

Usage

json_complexity(.x, column.name = "complexity")

Arguments

.x

a json string or tbl_json object

column.name

the name to specify for the complexity column

Value

a tbl_json object

See Also

json_lengths to compute the length of each value

Examples

# Load tidyjson (which provides the %>% pipe)
library(tidyjson)

# A simple example
json <- c('[1, 2, [3, 4]]', '{"k1": 1, "k2": [2, [3, 4]]}', '1', 'null')

# Complexity is larger than length for nested objects
json %>% json_lengths %>% json_complexity

# Worldbank has complexity ranging from 8 to 17
library(magrittr)  # for the %$% exposition pipe
worldbank %>% json_complexity %$% table(complexity)

# Commits are much more regular
commits %>% gather_array %>% json_complexity %$% table(complexity)
