export_abnFit: Export abnFit object to structured JSON format

Description

Exports a fitted Additive Bayesian Network (ABN) model to a structured JSON format suitable for storage, sharing, and interoperability with other analysis tools. The export includes network structure (variables and arcs) and model parameters (coefficients, variances, and their associated metadata).

Usage

export_abnFit(
  object,
  format = "json",
  include_network = TRUE,
  file = NULL,
  pretty = TRUE,
  scenario_id = NULL,
  label = NULL,
  ...
)

Value

If file is NULL, returns a character string containing the JSON representation of the model. If file is provided, writes the JSON to the specified file and invisibly returns the file path.

Arguments

object: An object of class abnFit, typically created by fitAbn.
format: Character string specifying the export format. Currently, only "json" is supported.
include_network: Logical, whether to include network structure (variables and arcs). Default is TRUE.
file: Optional character string specifying a file path to save the JSON output. If NULL (default), the JSON string is returned.
pretty: Logical, whether to format the JSON output with indentation for readability. Default is TRUE. Set to FALSE for more compact output.
scenario_id: Optional character string or numeric identifier for the model run or scenario. Useful for tracking multiple model versions or experiments. Default is NULL.
label: Optional character string providing a descriptive name or label for the scenario. Default is NULL.
...: Additional export options (currently unused, reserved for future extensions).

JSON Schema

Top-Level Fields

scenario_id: Optional string or numeric identifier for the model run. Can be null.
label: Optional descriptive name for the model. Can be null.
variables: Array of variable objects (see Variables Array).
parameters: Array of parameter objects (see Parameters Array).
arcs: Array of arc objects (see Arcs Array).
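
As a rough illustration of this top-level layout, the sketch below parses a hand-written JSON document containing the five fields. It is not output from export_abnFit: the identifiers and values ("v1", "g1", and so on) are made up, and jsonlite is assumed to be installed.

```r
library(jsonlite)

# Hand-written stand-in for an exported model; all values are hypothetical.
json_txt <- '{
  "scenario_id": "example_model_v1",
  "label": null,
  "variables": [
    {"variable_id": "v1", "attribute_name": "g1",
     "model_type": "gaussian", "states": null}
  ],
  "parameters": [],
  "arcs": []
}'

# simplifyVector = FALSE keeps the nested list structure instead of
# collapsing arrays of objects into data frames.
model <- fromJSON(json_txt, simplifyVector = FALSE)

names(model)
is.null(model$label)       # a JSON null is read back as NULL
length(model$variables)    # number of variable objects
```

Parsing with simplifyVector = FALSE is a choice made here so that every array stays a plain list; the default simplification may be more convenient for quick inspection.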

Variables Array

Each variable object contains:

variable_id: Unique identifier for the variable (string). This ID is used throughout the JSON to reference this variable in parameters' source fields and in arcs' source_variable_id/target_variable_id fields.
attribute_name: Original attribute name from the data (string).
model_type: Distribution type: "gaussian", "binomial", "poisson", or "multinomial".
states: Array of state objects for multinomial variables only. Each state has state_id (used to reference specific categories in parameters), value_name (the category label), and is_baseline (whether this is the reference category). NULL for continuous variables.
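
A minimal sketch of how the states array might be used once parsed: it finds the reference category of a multinomial variable and builds a lookup from state_id to category label. The variable is written directly as the nested list a JSON parser would return, and its identifiers and labels ("v3", "colour", "s1", ...) are hypothetical.

```r
# Hypothetical multinomial variable; all identifiers and labels are made up.
variable <- list(
  variable_id = "v3",
  attribute_name = "colour",
  model_type = "multinomial",
  states = list(
    list(state_id = "s1", value_name = "red",   is_baseline = TRUE),
    list(state_id = "s2", value_name = "green", is_baseline = FALSE),
    list(state_id = "s3", value_name = "blue",  is_baseline = FALSE)
  )
)

# Pick out the reference category via the is_baseline flag.
is_base <- vapply(variable$states,
                  function(s) isTRUE(s$is_baseline), logical(1))
baseline <- variable$states[[which(is_base)]]
baseline$value_name

# Named vector mapping state_id to category label, for resolving the
# state_id that parameters carry in their source field.
state_labels <- vapply(variable$states,
                       function(s) s$value_name, character(1))
names(state_labels) <- vapply(variable$states,
                              function(s) s$state_id, character(1))
state_labels[["s2"]]
```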

Parameters Array

Each parameter object contains:

parameter_id: Unique identifier for the parameter (string).
name: Parameter name (e.g., "intercept", "prob_2", coefficient name, "sigma", "sigma_alpha").
link_function_name: Link function: "identity" (Gaussian), "logit" (Binomial, Multinomial), or "log" (Poisson).
source: Object identifying which variable and state this parameter belongs to. Contains variable_id (required, references a variable from the variables array) and optional state_id (references a specific state for category-specific parameters in multinomial models).
coefficients: Array of coefficient objects (typically length 1), each with value, stderr (or NULL for mixed models), condition_type, and conditions array.
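
To show how link_function_name might be used when interpreting a parameter, the sketch below maps each listed link to its inverse and applies it to a binomial intercept. The parameter object and its values are hypothetical, and conditions is left empty because its internal layout is not described here. For multinomial nodes a single coefficient does not translate into a probability this way, so treat this as an illustration for the binomial case only.

```r
# Inverse link functions keyed by the documented link_function_name values.
inverse_link <- list(
  identity = function(eta) eta,
  logit    = plogis,  # 1 / (1 + exp(-eta))
  log      = exp
)

# Hypothetical parameter object for a binomial node; values are made up.
param <- list(
  parameter_id = "p1",
  name = "intercept",
  link_function_name = "logit",
  source = list(variable_id = "v2"),
  coefficients = list(
    list(value = 0, stderr = 0.1,
         condition_type = "intercept", conditions = list())
  )
)

# Back-transform the intercept from the link scale to the response scale.
eta <- param$coefficients[[1]]$value
prob <- inverse_link[[param$link_function_name]](eta)
prob
```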

Coefficient Condition Types

"intercept": Baseline parameter with no parent dependencies
"linear_term": Effect of a parent variable
"CPT_combination": Conditional probability table entry (future use)
"variance": Residual variance (Gaussian/Poisson only)
"random_variance": Random effect variance (mixed models)
"random_covariance": Random effect covariance (multinomial mixed models)

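The condition types above can be used to separate mean-structure parameters from variance components. The following is a sketch on hypothetical parameters for a single Gaussian node; it assumes each parameter carries exactly one coefficient (the typical case noted above), and all names and values are made up.

```r
# Hypothetical parameters; conditions is left empty for brevity.
parameters <- list(
  list(parameter_id = "p1", name = "intercept",
       coefficients = list(list(value = 1.2, stderr = 0.05,
                                condition_type = "intercept",
                                conditions = list()))),
  list(parameter_id = "p2", name = "b1",
       coefficients = list(list(value = -0.4, stderr = 0.10,
                                condition_type = "linear_term",
                                conditions = list()))),
  list(parameter_id = "p3", name = "sigma",
       coefficients = list(list(value = 0.9, stderr = 0.02,
                                condition_type = "variance",
                                conditions = list())))
)

# Condition type of the first (assumed only) coefficient of each parameter.
types <- vapply(parameters,
                function(p) p$coefficients[[1]]$condition_type,
                character(1))
table(types)

# Names of the intercepts and parent effects.
is_fixed <- types %in% c("intercept", "linear_term")
fixed_names <- vapply(parameters[is_fixed], function(p) p$name, character(1))
fixed_names

# An export from a mixed model would contain random-effect entries.
has_random_effects <- any(types %in% c("random_variance", "random_covariance"))
has_random_effects
```
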
Arcs Array

Each arc object contains:

source_variable_id: Identifier of the parent/source node.
target_variable_id: Identifier of the child/target node.
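
As a sketch of how the arcs array could be turned back into a matrix representation, the code below fills an adjacency matrix from two hypothetical arcs. The variable identifiers are made up, and placing children in rows and parents in columns is an assumption chosen for this illustration.

```r
# Hypothetical arcs: v1 -> v3 and v2 -> v3.
arcs <- list(
  list(source_variable_id = "v1", target_variable_id = "v3"),
  list(source_variable_id = "v2", target_variable_id = "v3")
)
ids <- c("v1", "v2", "v3")

# Rows are children (targets), columns are parents (sources).
adj <- matrix(0L, nrow = length(ids), ncol = length(ids),
              dimnames = list(ids, ids))
for (a in arcs) {
  adj[a$target_variable_id, a$source_variable_id] <- 1L
}
adj
```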

Design Rationale

The JSON structure uses a flat architecture with three parallel arrays rather than deeply nested objects. This design offers several advantages:

Database compatibility: Easy to store in relational or document databases with minimal transformation.
Extensibility: New parameter types or metadata can be added without restructuring existing fields.
Parsability: Simpler to query and transform programmatically.
Flexibility: Supports both CPT-style and GLM(M)-style models through the polymorphic source and conditions structure.

Parameters are linked to variables through the source.variable_id field, with optional source.state_id for category-specific parameters in multinomial models. Parent dependencies are encoded in the conditions array within each coefficient.
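
The linkage through source.variable_id lends itself to a relational-style join. The sketch below flattens hypothetical parameters to one row per coefficient and merges them with a variables table; all identifiers and values are made up, and only a subset of the documented fields is included to keep the example short.

```r
# Hypothetical variables table.
variables <- data.frame(
  variable_id = c("v1", "v2"),
  attribute_name = c("g1", "b1"),
  model_type = c("gaussian", "binomial"),
  stringsAsFactors = FALSE
)

# Hypothetical parameters as nested lists (as a JSON parser would return).
parameters <- list(
  list(parameter_id = "p1", name = "intercept",
       source = list(variable_id = "v1"),
       coefficients = list(list(value = 1.2, condition_type = "intercept"))),
  list(parameter_id = "p2", name = "sigma",
       source = list(variable_id = "v1"),
       coefficients = list(list(value = 0.9, condition_type = "variance"))),
  list(parameter_id = "p3", name = "intercept",
       source = list(variable_id = "v2"),
       coefficients = list(list(value = -0.3, condition_type = "intercept")))
)

# One row per coefficient, carrying the id of the owning variable.
param_rows <- do.call(rbind, lapply(parameters, function(p) {
  do.call(rbind, lapply(p$coefficients, function(cf) {
    data.frame(parameter_id = p$parameter_id,
               variable_id = p$source$variable_id,
               name = p$name,
               condition_type = cf$condition_type,
               value = cf$value,
               stringsAsFactors = FALSE)
  }))
}))

# Join parameters to their variables on variable_id.
joined <- merge(param_rows, variables, by = "variable_id")
joined[, c("attribute_name", "model_type", "name", "value")]
```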

Details

This function provides a standardized way to export fitted ABN models to JSON, facilitating model sharing, archiving, and integration with external tools or databases. The JSON structure is designed to be both human-readable and machine-parseable, following a flat architecture to avoid deep nesting.

Supported Model Types

The function handles different model fitting methods:

MLE without grouping: Standard maximum likelihood estimation for all supported distributions (Gaussian, Binomial, Poisson, Multinomial). Exports fixed-effect parameters with standard errors.
MLE with grouping: Generalized Linear Mixed Models (GLMM) with group-level random effects. Exports both fixed effects (mu, betas) and random effects (sigma, sigma_alpha).
Bayesian: Not yet implemented. Reserved for a future export of Bayesian model fits, including posterior distributions.

JSON Structure Overview

The exported JSON follows a three-component structure:

variables: An array where each element represents a node/variable in the network with metadata including identifier, attribute name, distribution type, and states (for categorical variables).
parameters: An array where each element represents a model parameter (intercepts, coefficients, variances) with associated values, standard errors, link functions, and parent variable conditions.
arcs: An array where each element represents a directed edge in the network, specifying source and target variable identifiers.

Additionally, optional top-level fields scenario_id and label can be used to identify and describe the model.

Examples

if (FALSE) {
# Load example data and fit a model
library(abn)
data(ex1.dag.data)

# Define distributions
mydists <- list(b1 = "binomial", p1 = "poisson", g1 = "gaussian",
                b2 = "binomial", p2 = "poisson", g2 = "gaussian",
                b3 = "binomial", g3 = "gaussian")

# Build score cache
mycache <- buildScoreCache(data.df = ex1.dag.data,
                            data.dists = mydists,
                            method = "mle",
                            max.parents = 2)

# Find most probable DAG
mp_dag <- mostProbable(score.cache = mycache)

# Fit the model
myfit <- fitAbn(object = mp_dag, method = "mle")

# Export to JSON string with metadata
json_export <- export_abnFit(myfit,
                             scenario_id = "example_model_v1",
                             label = "Example ABN Model")

# View the structure
library(jsonlite)
parsed <- fromJSON(json_export)
str(parsed, max.level = 2)

# Export to file
export_abnFit(myfit,
              file = "my_abn_model.json",
              scenario_id = "example_model_v1",
              label = "Example ABN Model",
              pretty = TRUE)

# Export with compact formatting
compact_json <- export_abnFit(myfit, pretty = FALSE)

# ---
# Mixed-effects model example
# (Requires data with grouping structure)

# Add grouping variable (group.var is expected to be a factor)
ex1.dag.data$group <- factor(rep(1:5, length.out = nrow(ex1.dag.data)))

# Build cache with grouping
mycache_grouped <- buildScoreCache(data.df = ex1.dag.data,
                                    data.dists = mydists,
                                    method = "mle",
                                    group.var = "group",
                                    max.parents = 2)

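# Find the most probable DAG for the grouped cache
mp_dag_grouped <- mostProbable(score.cache = mycache_grouped)

# Fit grouped model
myfit_grouped <- fitAbn(object = mp_dag_grouped,
                        method = "mle",
                        group.var = "group")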

# Export grouped model (includes random effects)
json_grouped <- export_abnFit(myfit_grouped,
                              scenario_id = "grouped_model_v1",
                              label = "Mixed Effects ABN")
}
