The bgm function estimates the pseudoposterior distribution of
category thresholds (main effects) and pairwise interaction parameters of a
Markov Random Field (MRF) model for binary and/or ordinal variables.
Optionally, it performs Bayesian edge selection using spike-and-slab
priors to infer the network structure.
bgm(
x,
variable_type = "ordinal",
baseline_category,
iter = 1000,
warmup = 1000,
pairwise_scale = 2.5,
main_alpha = 0.5,
main_beta = 0.5,
edge_selection = TRUE,
edge_prior = c("Bernoulli", "Beta-Bernoulli", "Stochastic-Block"),
inclusion_probability = 0.5,
beta_bernoulli_alpha = 1,
beta_bernoulli_beta = 1,
beta_bernoulli_alpha_between = 1,
beta_bernoulli_beta_between = 1,
dirichlet_alpha = 1,
lambda = 1,
na_action = c("listwise", "impute"),
update_method = c("nuts", "adaptive-metropolis", "hamiltonian-mc"),
target_accept,
hmc_num_leapfrogs = 100,
nuts_max_depth = 10,
learn_mass_matrix = TRUE,
chains = 4,
cores = parallel::detectCores(),
display_progress = c("per-chain", "total", "none"),
seed = NULL,
standardize = FALSE,
verbose = getOption("bgms.verbose", TRUE),
interaction_scale,
burnin,
save,
threshold_alpha,
threshold_beta
)A list of class "bgms" with posterior summaries, posterior mean
matrices, and access to raw MCMC draws. The object can be passed to
print(), summary(), and coef().
Main components include:
posterior_summary_main: Data frame with posterior summaries
(mean, sd, MCSE, ESS, Rhat) for category threshold parameters.
posterior_summary_pairwise: Data frame with posterior
summaries for pairwise interaction parameters.
posterior_summary_indicator: Data frame with posterior
summaries for edge inclusion indicators (if edge_selection = TRUE).
posterior_mean_main: Matrix of posterior mean thresholds
(rows = variables, cols = categories or parameters).
posterior_mean_pairwise: Symmetric matrix of posterior mean
pairwise interaction strengths.
posterior_mean_indicator: Symmetric matrix of posterior mean
inclusion probabilities (if edge selection was enabled).
Additional summaries returned when
edge_prior = "Stochastic-Block". For more details about this prior
see SekulovskiEtAl_2025;textualbgms.
posterior_summary_pairwise_allocations: Data frame with
posterior summaries (mean, sd, MCSE, ESS, Rhat) for the pairwise
cluster co-occurrence of the nodes. This serves to indicate
whether the estimated posterior allocations,co-clustering matrix
and posterior cluster probabilities (see blow) have converged.
posterior_coclustering_matrix: a symmetric matrix of
pairwise proportions of occurrence of every variable. This matrix
can be plotted to visually inspect the estimated number of clusters
and visually inspect nodes that tend to switch clusters.
posterior_mean_allocations: A vector with the posterior mean
of the cluster allocations of the nodes. This is calculated using the method
proposed in Dahl2009;textualbgms.
posterior_mode_allocations: A vector with the posterior
mode of the cluster allocations of the nodes.
posterior_num_blocks: A data frame with the estimated
posterior inclusion probabilities for all the possible number of clusters.
raw_samples: A list of raw MCMC draws per chain:
mainList of main effect samples.
pairwiseList of pairwise effect samples.
indicatorList of indicator samples (if edge selection enabled).
allocationsList of cluster allocations (if SBM prior used).
nchainsNumber of chains.
niterNumber of post–warmup iterations per chain.
parameter_namesNamed lists of parameter labels.
arguments: A list of function call arguments and metadata
(e.g., number of variables, warmup, sampler settings, package version).
The summary() method prints formatted posterior summaries, and
coef() extracts posterior mean matrices.
NUTS diagnostics (tree depth, divergences, energy, E-BFMI) are included
in fit$nuts_diag if update_method = "nuts".
A data frame or matrix with n rows and p columns
containing binary and ordinal responses. Variables are automatically
recoded to non-negative integers (0, 1, ..., m). For regular
ordinal variables, unobserved categories are collapsed; for
Blume–Capel variables, all categories are retained.
Character or character vector. Specifies the type of
each variable in x. Allowed values: "ordinal" or
"blume-capel". Binary variables are automatically treated as
"ordinal". Default: "ordinal".
Integer or vector. Baseline category used in
Blume–Capel variables. Can be a single integer (applied to all) or a
vector of length p. Required if at least one variable is of type
"blume-capel".
Integer. Number of post–burn-in iterations (per chain).
Default: 1e3.
Integer. Number of warmup iterations before collecting
samples. A minimum of 1000 iterations is enforced, with a warning if a
smaller value is requested. Default: 1e3.
Double. Scale of the Cauchy prior for pairwise
interaction parameters. Default: 2.5.
Double. Shape parameters of the
beta-prime prior for threshold parameters. Must be positive. If equal,
the prior is symmetric. Defaults: main_alpha = 0.5 and
main_beta = 0.5.
Logical. Whether to perform Bayesian edge selection.
If FALSE, the model estimates all edges. Default: TRUE.
Character. Specifies the prior for edge inclusion.
Options: "Bernoulli", "Beta-Bernoulli", or
"Stochastic-Block". Default: "Bernoulli".
Numeric scalar. Prior inclusion probability
of each edge (used with the Bernoulli prior). Default: 0.5.
Double. Shape parameters
for the beta distribution in the Beta–Bernoulli and the Stochastic-Block
priors. Must be positive. For the Stochastic-Block prior these are the shape
parameters for the within-cluster edge inclusion probabilities.
Defaults: beta_bernoulli_alpha = 1 and beta_bernoulli_beta = 1.
Double.
Shape parameters for the between-cluster edge inclusion probabilities in the
Stochastic-Block prior. Must be positive.
Default: beta_bernoulli_alpha_between = 1 and beta_bernoulli_beta_between = 1
Double. Concentration parameter of the Dirichlet
prior on block assignments (used with the Stochastic Block model).
Default: 1.
Double. Rate of the zero-truncated Poisson prior on the
number of clusters in the Stochastic Block Model. Default: 1.
Character. Specifies missing data handling. Either
"listwise" (drop rows with missing values) or "impute"
(perform single imputation during sampling). Default: "listwise".
Character. Specifies how the MCMC sampler updates the model parameters:
Componentwise adaptive Metropolis–Hastings with Robbins–Monro proposal adaptation.
Hamiltonian Monte Carlo with fixed path length
(number of leapfrog steps set by hmc_num_leapfrogs).
The No-U-Turn Sampler, an adaptive form of HMC with dynamically chosen trajectory lengths.
Default: "nuts".
Numeric between 0 and 1. Target acceptance rate for
the sampler. Defaults are set automatically if not supplied:
0.44 for adaptive Metropolis, 0.65 for HMC,
and 0.80 for NUTS.
Integer. Number of leapfrog steps for Hamiltonian
Monte Carlo. Must be positive. Default: 100.
Integer. Maximum tree depth in NUTS. Must be positive.
Default: 10.
Logical. If TRUE, adapt a diagonal mass
matrix during warmup (HMC/NUTS only). If FALSE, use the identity
matrix. Default: TRUE.
Integer. Number of parallel chains to run. Default: 4.
Integer. Number of CPU cores for parallel execution.
Default: parallel::detectCores().
Character. Controls progress reporting during
sampling. Options: "per-chain" (separate bar per chain),
"total" (single combined bar), or "none" (no progress).
Default: "per-chain".
Optional integer. Random seed for reproducibility. Must be a single non-negative integer.
Logical. If TRUE, the Cauchy prior scale for each
pairwise interaction is adjusted based on the range of response scores.
Variables with more response categories have larger score products
\(x_i \cdot x_j\), which typically correspond to smaller interaction
effects \(\sigma_{ij}\). Without standardization, a fixed prior scale
is relatively wide for these smaller effects, resulting in less shrinkage
for high-category pairs and more shrinkage for low-category pairs.
Standardization scales the prior proportionally to the maximum score
product, ensuring equivalent relative shrinkage across all pairs.
After internal recoding, regular ordinal variables have scores
\(0, 1, \ldots, m\). The adjusted scale for the interaction between
variables \(i\) and \(j\) is pairwise_scale * m_i * m_j,
so that pairwise_scale itself applies to the unit interval case
(binary variables where \(m_i = m_j = 1\)). For Blume-Capel variables
with reference category \(b\), scores are centered as
\(-b, \ldots, m-b\), and the adjustment uses the maximum absolute
product of the score endpoints. For mixed pairs, ordinal variables use
raw score endpoints \((0, m)\) and Blume-Capel variables use centered
score endpoints \((-b, m-b)\).
Default: FALSE.
Logical. If TRUE, prints informational messages
during data processing (e.g., missing data handling, variable recoding).
Defaults to getOption("bgms.verbose", TRUE). Set
options(bgms.verbose = FALSE) to suppress messages globally.
`r lifecycle::badge("deprecated")` Deprecated arguments as of **bgms 0.1.6.0**. Use `pairwise_scale`, `warmup`, `main_alpha`, and `main_beta` instead.
The function supports two types of ordinal variables:
Regular ordinal variables: Assigns a category threshold parameter to each response category except the lowest. The model imposes no additional constraints on the distribution of category responses.
Blume-Capel ordinal variables: Assume a baseline category (e.g., a “neutral” response) and score responses by distance from this baseline. Category thresholds are modeled as:
$$\mu_{c} = \alpha \cdot (c-b) + \beta \cdot (c - b)^2$$
where:
\(\mu_{c}\): category threshold for category \(c\)
\(\alpha\): linear trend across categories
\(\beta\): preference toward or away from the baseline
If \(\beta < 0\), the model favors responses near the baseline category;
if \(\beta > 0\), it favors responses farther away (i.e., extremes).
\(b\): baseline category
Accordingly, pairwise interactions between Blume-Capel variables are modeled in terms of \(c-b\) scores.
When edge_selection = TRUE, the function performs Bayesian variable
selection on the pairwise interactions (edges) in the MRF using
spike-and-slab priors.
Supported priors for edge inclusion:
Bernoulli: Fixed inclusion probability across edges.
Beta-Bernoulli: Inclusion probability is assigned a Beta prior distribution.
Stochastic-Block: Cluster-based edge priors with Beta, Dirichlet, and Poisson hyperpriors.
All priors operate via binary indicator variables controlling the inclusion or exclusion of each edge in the MRF.
Pairwise effects: Modeled with a Cauchy (slab) prior.
Main effects: Modeled using a beta-prime distribution.
Edge indicators: Use either a Bernoulli, Beta-Bernoulli, or Stochastic-Block prior (as above).
Parameters are updated within a Gibbs framework, but the conditional updates can be carried out using different algorithms:
Adaptive Metropolis–Hastings: Componentwise random–walk updates for main effects and pairwise effects. Proposal standard deviations are adapted during burn–in via Robbins–Monro updates toward a target acceptance rate.
Hamiltonian Monte Carlo (HMC): Joint updates of all
parameters using fixed–length leapfrog trajectories. Step size is
tuned during warmup via dual–averaging; the diagonal mass matrix can
also be adapted if learn_mass_matrix = TRUE.
No–U–Turn Sampler (NUTS): An adaptive extension of HMC that dynamically chooses trajectory lengths. Warmup uses a staged adaptation schedule (fast–slow–fast) to stabilize step size and, if enabled, the mass matrix.
When edge_selection = TRUE, updates of edge–inclusion indicators
are carried out with Metropolis–Hastings steps. These are switched on
after the core warmup phase, ensuring that graph updates occur only once
the samplers’ tuning parameters (step size, mass matrix, proposal SDs)
have stabilized.
After warmup, adaptation is disabled. Step size and mass matrix are fixed at their learned values, and proposal SDs remain constant.
The warmup procedure in bgm uses a multi-stage adaptation
schedule stan-manualbgms. Warmup iterations are
split into several phases:
Stage 1 (fast adaptation): A short initial interval where only step size (for HMC/NUTS) is adapted, allowing the chain to move quickly toward the typical set.
Stage 2 (slow windows): A sequence of expanding,
memoryless windows where both step size and, if
learn_mass_matrix = TRUE, the diagonal mass matrix are
adapted. Each window ends with a reset of the dual–averaging scheme
for improved stability.
Stage 3a (final fast interval): A short interval at the end of the core warmup where the step size is adapted one final time.
Stage 3b (proposal–SD tuning): Only active when
edge_selection = TRUE under HMC/NUTS. In this phase,
Robbins–Monro adaptation of proposal standard deviations is
performed for the Metropolis steps used in edge–selection moves.
Stage 3c (graph selection warmup): Also only relevant
when edge_selection = TRUE. At the start of this phase, a
random graph structure is initialized, and Metropolis–Hastings
updates for edge inclusion indicators are switched on.
When edge_selection = FALSE, the total number of warmup iterations
equals the user–specified burnin. When edge_selection = TRUE
and update_method is "nuts" or "hamiltonian-mc",
the schedule automatically appends additional Stage-3b and Stage-3c
intervals, so the total warmup is strictly greater than the requested
burnin.
After all warmup phases, the sampler transitions to the sampling phase with adaptation disabled. Step size and mass matrix (for HMC/NUTS) are fixed at their learned values, and proposal SDs remain constant.
This staged design improves stability of proposals and ensures that both local parameters (step size) and global parameters (mass matrix, proposal SDs) are tuned before collecting posterior samples.
For adaptive Metropolis–Hastings runs, step size and mass matrix adaptation are not relevant. Proposal SDs are tuned continuously during burn–in using Robbins–Monro updates, without staged fast/slow intervals.
If na_action = "listwise", observations with missing values are
removed.
If na_action = "impute", missing values are imputed during Gibbs
sampling.
This function models the joint distribution of binary and ordinal variables using a Markov Random Field, with support for edge selection through Bayesian variable selection. The statistical foundation of the model is described in MarsmanVandenBerghHaslbeck_2025;textualbgms, where the ordinal MRF model and its Bayesian estimation procedure were first introduced. While the implementation in bgms has since been extended and updated (e.g., alternative priors, parallel chains, HMC/NUTS warmup), it builds on that original framework.
Key components of the model are described in the sections below.
vignette("intro", package = "bgms") for a worked example.
# \donttest{
# Run bgm on subset of the Wenchuan dataset
fit = bgm(x = Wenchuan[, 1:5], chains = 2)
# Posterior inclusion probabilities
summary(fit)$indicator
# Posterior pairwise effects
summary(fit)$pairwise
# }
Run the code above in your browser using DataLab