The function hergm estimates and simulates three classes of hierarchical exponential-family random graph models:

1. The p_1 model of Holland and Leinhardt (1981) in exponential-family form and extensions by Vu, Hunter, and Schweinberger (2013) and Schweinberger, Petrescu-Prahova, and Vu (2014) to both directed and undirected random graphs with additional model terms, with and without covariates, and with parametric and nonparametric priors (see arcs_i, arcs_j, edges_i, edges_ij, mutual_i, mutual_ij).

2. The stochastic block model of Snijders and Nowicki (1997) and Nowicki and Snijders (2001) in exponential-family form and extensions by Vu, Hunter, and Schweinberger (2013) and Schweinberger, Petrescu-Prahova, and Vu (2014) with additional model terms, with and without covariates, and with parametric and nonparametric priors (see arcs_i, arcs_j, edges_i, edges_ij, mutual_i, mutual_ij).

3. The exponential-family random graph models with local dependence of Schweinberger and Handcock (2015), with and without covariates, and with parametric and nonparametric priors (see arcs_i, arcs_j, edges_i, edges_ij, mutual_i, mutual_ij, twostar_ijk, triangle_ijk, ttriple_ijk, ctriple_ijk).
The exponential-family random graph models with local dependence replace the long-range dependence of conventional exponential-family random graph models with short-range dependence, and hence replace strong dependence with weak dependence, reducing the problem of model degeneracy (Handcock, 2003; Schweinberger, 2011) and improving goodness-of-fit (Schweinberger and Handcock, 2015). In addition, exponential-family random graph models with local dependence satisfy a weak form of self-consistency in the sense that they are self-consistent under neighborhood sampling (Schweinberger and Handcock, 2015), which enables consistent estimation of neighborhood-dependent parameters (Schweinberger and Stewart, 2020; Schweinberger, 2020).
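For illustration, a model with local dependence can be specified by a formula that combines a network with hergm-terms. The following is a minimal sketch: the network net, its size, and the chosen terms edges_ij and triangle_ijk are illustrative assumptions, not prescriptions of the package.

library(hergm)

# Construct a small undirected network from a random symmetric adjacency
# matrix (illustration only).
set.seed(0)
adj <- matrix(rbinom(20 * 20, 1, 0.1), 20, 20)
adj[lower.tri(adj)] <- t(adj)[lower.tri(adj)]   # symmetrize
diag(adj) <- 0                                  # no self-loops
net <- network::network(adj, directed = FALSE)

# edges_ij and triangle_ijk are hergm-terms whose parameters depend on the
# block memberships of nodes, inducing local dependence within blocks;
# see hergm.terms for the full list of terms.
model <- net ~ edges_ij + triangle_ijk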
hergm(formula,
max_number = 2,
hierarchical = TRUE,
parametric = FALSE,
parameterization = "offset",
initialize = FALSE,
initialization_method = 1,
estimate_parameters = TRUE,
initial_estimate = NULL,
n_em_step_max = 100,
max_iter = 4,
perturb = FALSE,
scaling = NULL,
alpha = NULL,
alpha_shape = NULL,
alpha_rate = NULL,
eta = NULL,
eta_mean = NULL,
eta_sd = NULL,
eta_mean_mean = NULL,
eta_mean_sd = NULL,
eta_precision_shape = NULL,
eta_precision_rate = NULL,
mean_between = NULL,
indicator = NULL,
parallel = 1,
simulate = FALSE,
method = "ml",
seeds = NULL,
sample_size = NULL,
sample_size_multiplier_blocks = 20,
NR_max_iter = 200,
NR_step_len = NULL,
NR_step_len_multiplier = 0.2,
interval = 1024,
burnin = 16*interval,
mh.scale = 0.25,
variational = FALSE,
temperature = c(1,100),
predictions = FALSE,
posterior.burnin = 2000,
posterior.thinning = 1,
relabel = 1,
number_runs = 1,
verbose = 0,
...)
The function hergm returns an object of class hergm with the following components:

- network is an object of class network and can be created by calling the function network.
- formula of the form network ~ terms; network is an object of class network and can be created by calling the function network; possible terms can be found in ergm.terms and hergm.terms.
- number of nodes.
- indicator of whether a hyper prior has been specified, i.e., whether the parameters alpha, eta_mean, and eta_precision are estimated.
- concentration parameter of the truncated Dirichlet process prior of the parameters of hergm-terms.
- parameters of ergm-terms.
- mean parameters of the Gaussian base distribution of the parameters of hergm-terms.
- precision parameters of the Gaussian base distribution of the parameters of hergm-terms.
- total number of parameters of ergm terms.
- total number of parameters of hergm terms.
- parameters of hergm-terms.
- relabeled parameters of hergm-terms, obtained by using relabel = 1 or relabel = 2.
- number of fixed indicators of block memberships of nodes.
- indicators of block memberships of nodes.
- if relabel > 0, the MCMC sample is relabeled by minimizing the posterior expected loss of Schweinberger and Handcock (2015) (relabel = 1) or Peng and Carvalho (2016) (relabel = 2).
- relabeled indicators of block memberships of nodes, obtained by using relabel = 1 or relabel = 2.
- sizes of the blocks, i.e., the numbers of nodes of the blocks.
- number of computing nodes; if parallel > 1, hergm is run on parallel computing nodes.
- posterior probabilities of block memberships of nodes.
- probabilities of block memberships of nodes.
- if predictions = TRUE and simulate = FALSE, posterior predictions of the statistics in the model.
- if simulate = TRUE, simulation of networks, otherwise Bayesian inference.
- posterior predictions of statistics.
- edge list of the simulated network.
- if simulate = TRUE, number of network draws, otherwise number of posterior draws minus the number of burn-in iterations; if parallel > 1, number of draws on each computing node.
- indicator of whether the function hergm.postprocess has postprocessed the object of class hergm generated by the function hergm, and thus whether the MCMC sample has been extracted from the object of class hergm.
- if verbose = -1, no console output; if verbose = 0, short console output; if verbose = +1, long console output.
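The returned object can be inspected with the generic methods listed in the See Also section. A minimal sketch, assuming fit is an object of class hergm returned by an earlier call to hergm (see the sketches and the Examples further below):

# `fit` is assumed to be an object of class hergm (illustration only).
summary(fit)   # summary of the estimation results
print(fit)     # short printout of the fitted model
plot(fit)      # plots of the estimation output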
formula: formula of the form network ~ terms; network is an object of class network and can be created by calling the function network; possible terms can be found in ergm.terms and hergm.terms.

max_number: maximum number of blocks.

hierarchical: hierarchical prior; if hierarchical = TRUE, the prior is hierarchical (i.e., the means and variances of block parameters are governed by a hyper-prior), otherwise non-hierarchical (i.e., the means and variances of block parameters are fixed).

parametric: parametric prior; if parametric = FALSE, the prior is a truncated Dirichlet process prior, otherwise a parametric Dirichlet prior.
parameterization: there are three possible parameterizations of within-block terms when method == "ml" is used (see the sketch following this argument list). Note that between-block terms do not use these parameterizations, and that method == "bayes" allows the parameters of all within-block terms to vary across blocks and hence does not use them either.
- standard: the parameters of all within-block terms are constant across blocks.
- offset: the offset log(n[k]) is subtracted from the parameters of the within-block edge terms and added to the parameters of the within-block mutual edge terms along the lines of Krivitsky, Handcock, and Morris (2011), Krivitsky and Kolaczyk (2015), and Stewart, Schweinberger, Bojanowski, and Morris (2019), where n[k] is the number of nodes in block k; the parameters of all other within-block terms are constant across blocks.
- size: the parameters of all within-block terms are multiplied by log(n[k]) along the lines of Babkin et al. (2020), where n[k] is the number of nodes in block k.
initialize: if initialize = TRUE, initialize the block memberships of nodes.

initialization_method: if initialization_method = 1, block memberships of nodes are initialized by the Walktrap algorithm; if initialization_method = 2, block memberships of nodes are initialized by spectral clustering.

estimate_parameters: if method = "ml" and estimate_parameters = TRUE, estimate parameters.

initial_estimate: if method = "ml" and estimate_parameters = TRUE, specifies the starting point.

n_em_step_max: if method = "ml", maximum number of iterations of the generalized expectation-maximization algorithm estimating the block structure.

max_iter: if method = "ml", maximum number of iterations of the Monte Carlo maximization algorithm estimating parameters given the block structure.

perturb: if initialize = TRUE and perturb = TRUE, initialize the block memberships of nodes by spectral clustering and perturb them.

scaling: if scaling = TRUE, use size-dependent parameterizations which ensure that the scaling of between- and within-block terms is consistent with sparse edge terms.

alpha: concentration parameter of the truncated Dirichlet process prior of the natural parameters of the exponential-family model.

alpha_shape, alpha_rate: shape and rate parameters of the Gamma prior of the concentration parameter.

eta: the parameters of ergm.terms and hergm.terms; the parameters of hergm.terms must consist of max_number within-block parameters and one between-block parameter.

eta_mean, eta_sd: means and standard deviations of the Gaussian base distribution of the Dirichlet process prior of the natural parameters.

eta_mean_mean, eta_mean_sd: means and standard deviations of the Gaussian prior of the mean of the Gaussian base distribution of the Dirichlet process prior.

eta_precision_shape, eta_precision_rate: shape and rate (inverse scale) parameters of the Gamma prior of the precision parameter of the Gaussian base distribution of the Dirichlet process prior.

mean_between: if simulate = TRUE and eta = NULL, then mean_between specifies the mean-value parameter of edges between blocks.

indicator: if the indicators of block memberships of nodes are specified as integers between 1 and max_number, the specified indicators are fixed, which is useful when indicators of block memberships are observed (e.g., in multilevel networks).
parallel: number of computing nodes; if parallel > 1, hergm is run on parallel computing nodes.

simulate: if simulate = TRUE, simulate networks from the model, otherwise estimate the model given the observed network.

method: if method = "bayes", Bayesian methods along the lines of Schweinberger and Handcock (2015) and Schweinberger and Luna (2018) are used; if method = "ml", approximate maximum likelihood methods along the lines of Babkin et al. (2020) are used. The Bayesian methods are the gold standard but are too time-consuming to be applied to networks with more than 100 nodes, whereas the approximate maximum likelihood methods can be applied to networks with thousands of nodes.

seeds: seed(s) of the pseudo-random number generator; if parallel > 1, the number of seeds must equal the number of computing nodes.

sample_size: if simulate = TRUE, number of network draws, otherwise number of posterior draws; if parallel > 1, number of draws on each computing node.

sample_size_multiplier_blocks: if method = "ml", multiplier of the number of network draws from within-block subgraphs; the total number of network draws from within-block subgraphs is sample_size_multiplier_blocks * the number of possible edges of the largest within-block subgraph (see the sketch following this argument list); if sample_size_multiplier_blocks = NULL, the total number of network draws from within-block subgraphs is sample_size.

NR_max_iter: if method = "ml", maximum number of iterations to be used in the estimation of parameters.

NR_step_len: if method = "ml", step length to be used for increments in the estimation of parameters; if NULL (default), an adaptive step-length procedure is used.

NR_step_len_multiplier: if method = "ml", multiplier for adjusting the step length in the estimation procedure after a divergent increment.

interval: if simulate = TRUE, number of proposals between sampled networks.

burnin: if simulate = TRUE, number of burn-in iterations.

mh.scale: if simulate = FALSE, scale factor of the candidate-generating distribution of the Metropolis-Hastings algorithm.

variational: if simulate = FALSE and variational = TRUE, variational methods are used to construct the proposal distributions of the block memberships of nodes; limited to selected models.

temperature: if simulate = FALSE and variational = TRUE, minimum and maximum temperature. The temperature is used to melt down the proposal distributions of indicators, which are based on the full conditional distributions of indicators but can have low entropy, resulting in slow mixing of the Markov chain. The temperature is a function of the entropy of the full conditional distributions and is designed to increase the entropy of the proposal distributions; the minimum and maximum temperature are user-defined lower and upper bounds on the temperature.
predictions: if predictions = TRUE and simulate = FALSE, return posterior predictions of the statistics in the model.

posterior.burnin: number of posterior burn-in iterations; if computing is parallel, posterior.burnin is applied to the sample generated by each processor. Please note that hergm returns min(sample_size, 10000) sample points and the burn-in is applied to the sample of size min(sample_size, 10000); therefore posterior.burnin should be smaller than min(sample_size, 10000).

posterior.thinning: if posterior.thinning > 1, every posterior.thinning-th sample point is used while all others are discarded; if computing is parallel, posterior.thinning is applied to the sample generated by each processor. Please note that hergm returns min(sample_size, 10000) sample points and the thinning is applied to the sample of size min(sample_size, 10000) - posterior.burnin; therefore posterior.thinning should be smaller than min(sample_size, 10000) - posterior.burnin.

relabel: if relabel > 0, relabel the MCMC sample by minimizing the posterior expected loss of Schweinberger and Handcock (2015) (relabel = 1) or Peng and Carvalho (2016) (relabel = 2).

number_runs: if relabel = 1, number of runs of the relabeling algorithm.

verbose: if verbose = -1, no console output; if verbose = 0, short console output; if verbose = +1, long console output.
If, e.g., simulate = FALSE and verbose = 1, then hergm reports console output of the following form:
Progress: 50.00% of 1000000
...
means of block parameters: -0.2838 1.3323
precisions of block parameters: 0.9234 1.4682
block parameters:
-0.2544 -0.2560 -0.1176 -0.0310 -0.1915 -1.9626
0.4022 1.8887 1.9719 0.6499 1.7265 0.0000
block indicators: 1 3 1 1 1 1 3 1 1 2 2 2 2 2 1 1 1
block sizes: 10 5 2 0 0
block probabilities: 0.5396 0.2742 0.1419 0.0423 0.0020
block probabilities prior parameter: 0.4256
posterior prediction of statistics: 66 123
where ... indicates additional information about the Markov chain Monte Carlo algorithm that is omitted here. The console output corresponds to:
- "means of block parameters" correspond to the mean parameters of the Gaussian base distribution of parameters of hergm-terms
.
- "precisions of block parameters" correspond to the precision parameters of the Gaussian base distribution of parameters of hergm-terms
.
- "block parameters" correspond to the parameters of hergm-terms
.
- "block indicators" correspond to the indicators of block memberships of nodes.
- "block sizes" correspond to the block sizes.
- "block probabilities" correspond to the prior probabilities of block memberships of nodes.
- "block probabilities prior parameter" corresponds to the concentration parameter of truncated Dirichlet process prior of parameters of hergm-terms
.
- if predictions = TRUE
, "posterior prediction of statistics" correspond to posterior predictions of sufficient statistics.
...: additional arguments, to be passed to lower-level functions in the future.
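As referenced in the descriptions of parameterization and sample_size_multiplier_blocks above, the following sketch illustrates how some of these arguments interact. It is hypothetical: net is the illustrative network constructed in the sketch near the top of this page, and all argument values are illustrative choices, not recommendations.

# Hypothetical estimation call: approximate maximum likelihood with the
# (default) offset parameterization and at most 5 blocks.
fit <- hergm(net ~ edges_ij + triangle_ijk,
             max_number = 5,
             method = "ml",
             parameterization = "offset",
             sample_size_multiplier_blocks = 20)

# Worked example of the sample_size_multiplier_blocks rule stated above:
# if the largest within-block subgraph of an undirected network has 10 nodes,
# it has choose(10, 2) = 45 possible edges, so the total number of network
# draws from within-block subgraphs is 20 * 45 = 900.

# Hypothetical simulation call: with simulate = TRUE, a single hergm-term
# (edges_ij), and max_number = 3, eta must contain 3 within-block parameters
# and 1 between-block parameter (see the eta argument above).
sim <- hergm(net ~ edges_ij, max_number = 3, simulate = TRUE,
             eta = c(-1, -1.5, -2, -4), sample_size = 100)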
Babkin, S., Stewart, J., Long, X., and M. Schweinberger (2020). Large-scale estimation of random graph models with local dependence. Computational Statistics and Data Analysis, 152, 1--19.
Cao, M., Chen, Y., Fujimoto, K., and M. Schweinberger (2018). A two-stage working model strategy for network analysis under hierarchical exponential random graph models. Proceedings of the 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 290--298.
Handcock, M. S. (2003). Assessing degeneracy in statistical models of social networks. Technical report, Center for Statistics and the Social Sciences, University of Washington, Seattle. http://www.csss.washington.edu/Papers.
Holland, P. W. and S. Leinhardt (1981). An exponential family of probability distributions for directed graphs. Journal of the American Statistical Association, Theory & Methods, 76, 33--65.
Krivitsky, P. N., Handcock, M. S., and M. Morris (2011). Adjusting for network size and composition effects in exponential-family random graph models. Statistical Methodology, 8, 319--339.
Krivitsky, P. N. and E. D. Kolaczyk (2015). On the question of effective sample size in network modeling: An asymptotic inquiry. Statistical Science, 30(2), 184.
Nowicki, K. and T. A. B. Snijders (2001). Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, Theory & Methods, 96, 1077--1087.
Peng, L. and L. Carvalho (2016). Bayesian degree-corrected stochastic block models for community detection. Electronic Journal of Statistics 10, 2746--2779.
Schweinberger, M. (2011). Instability, sensitivity, and degeneracy of discrete exponential families. Journal of the American Statistical Association, Theory & Methods, 106, 1361--1370.
Schweinberger, M. (2020). Consistent structure estimation of exponential-family random graph models with block structure. Bernoulli, 26, 1205--1233.
Schweinberger, M. and M. S. Handcock (2015). Local dependence in random graph models: characterization, properties, and statistical inference. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 77, 647--676.
Schweinberger, M., Krivitsky, P. N., Butts, C. T. and J. Stewart (2020). Exponential-family models of random graphs: Inference in finite, super, and infinite population scenarios. Statistical Science, 35, 627--662.
Schweinberger, M. and P. Luna (2018). HERGM: Hierarchical exponential-family random graph models. Journal of Statistical Software, 85, 1--39.
Schweinberger, M., Petrescu-Prahova, M. and D. Q. Vu (2014). Disaster response on September 11, 2001 through the lens of statistical network analysis. Social Networks, 37, 42--55.
Schweinberger, M. and J. Stewart (2020). Concentration and consistency results for canonical and curved exponential-family random graphs. The Annals of Statistics, 48, 374--396.
Snijders, T. A. B. and K. Nowicki (1997). Estimation and prediction for stochastic blockmodels for graphs with latent block structure. Journal of Classification, 14, 75--100.
Stewart, J., Schweinberger, M., Bojanowski, M., and M. Morris (2019). Multilevel network data facilitate statistical inference for curved ERGMs with geometrically weighted terms. Social Networks, 59, 98--119.
Vu, D. Q., Hunter, D. R. and M. Schweinberger (2013). Model-based clustering of large networks. Annals of Applied Statistics, 7, 1010--1039.
network, ergm.terms, hergm.terms, hergm.postprocess, summary, print, plot, gof, simulate
# \donttest{
data(example)              # load the example data, providing the network d
m <- summary(d ~ edges)    # observed number of edges of the network d
# }
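A hedged extension of the example above (not part of the package documentation): the hergm-term edges_ij and max_number = 3 are illustrative choices, and the gof call assumes the gof method for objects of class hergm listed under See Also.

# \donttest{
# Illustrative fit of a stochastic block model in exponential-family form
# to the example network d.
fit <- hergm(d ~ edges_ij, max_number = 3)
summary(fit)
plot(fit)
gof(fit)   # goodness-of-fit diagnostics
# }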