mlergm (version 0.1)

mlergm: Multilevel Exponential-Family Random Graph Models

Description

This function estimates an exponential-family random graph model (ERGM) for multilevel network data. At present, mlergm covers network data where the set of nodes is nested within known blocks (see, e.g., Schweinberger and Handcock, 2015). An example is groups of students nested within classrooms, as in the classes data set. The node membership, that is, the block to which each node belongs, is assumed to be known (or to have been previously estimated).

Usage

mlergm(form, node_memb, parameterization = "standard",
  options = set_options(), theta_init = NULL, verbose = 0,
  eval_loglik = TRUE, seed = NULL)

# S3 method for mlergm
print(x, ...)

# S3 method for mlergm
summary(object, ...)

Arguments

form

Formula of the form network ~ term1 + term2 + ...; allowable model terms are a subset of those in the R package ergm (see ergm.terms).

node_memb

Vector (of length equal to the number of nodes in the network) indicating the block or group to which each node belongs. If the network provided in form is an object of class mlnet, then node_memb can be extracted directly from the network and need not be provided.

parameterization

Parameterization options include 'standard' and 'offset'. The offset parameterization uses edge and mutual offsets along the lines of Krivitsky, Handcock, and Morris (2011) and Krivitsky and Kolaczyk (2015).

options

See set_options for details.

theta_init

Parameter vector of initial estimates for theta to be used.

verbose

Controls the level of output. A value of 0 corresponds to no output, except for warnings; a value of 1 corresponds to minimal output, and a value of 2 corresponds to full output.

eval_loglik

(Logical TRUE or FALSE) If set to TRUE, the bridge estimation procedure of Hunter and Handcock (2006) is used to estimate the loglikelihood for BIC calculations. If set to FALSE, the loglikelihood, and therefore the BIC, is not estimated.

seed

For reproducibility, an integer-valued seed may be specified.

x

An object of class mlergm, typically produced by mlergm.

...

Additional arguments to be passed if necessary.

object

An object of class mlergm, typically produced by mlergm.
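
The node_memb requirement described above (one entry per node) can be illustrated with a minimal base-R sketch. The block sizes and labels here are hypothetical and chosen only for illustration; in practice the vector must follow the node order of the network supplied in form.

```r
# Hypothetical block sizes: three classrooms with 4, 5, and 6 students.
block_sizes <- c(4, 5, 6)

# One block label per node, ordered to match the node order of the network.
node_memb <- rep(seq_along(block_sizes), times = block_sizes)

# Sanity check: node_memb must have one entry per node.
n_nodes <- sum(block_sizes)  # node count of the hypothetical network
stopifnot(length(node_memb) == n_nodes)

# Number of nodes in each block.
table(node_memb)
```

A vector built this way would be passed as the node_memb argument, unless the network is already an mlnet object that carries its own memberships.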

Value

mlergm returns an object of class mlergm which is a list containing:

theta

Estimated parameter vector of the exponential-family random graph model.

between_theta

Estimated parameter vector of the between group model.

se

Standard error vector for theta.

between_se

Standard error vector for between_theta.

pvalue

A vector of p-values for the estimated parameter vector.

between_pvalue

A vector of p-values for the estimated between group parameter vector (between_theta).

logLikval

The loglikelihood evaluated at the estimated MLE.

bic

The BIC for the estimated model.

mcmc_chain

The MCMC sample used in the final estimation step, which can be used to diagnose non-convergence.

estimation_status

Indicator of whether the estimation procedure succeeded or failed.

parameterization

The model parameterization (either standard or offset).

formula

The model formula.

network

The network for which the model is estimated.

node_memb

Vector indicating to which group or block the nodes belong.

size_quantiles

The quantiles of the block sizes.
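
How theta, se, pvalue, and size_quantiles relate to one another can be sketched in base R. This page does not state how the p-values or quantiles are computed, so the sketch assumes two-sided Wald tests and the default behaviour of quantile(); all numbers are hypothetical and do not come from a fitted model.

```r
# Hypothetical estimates and standard errors (not output from a real fit).
theta <- c(edges = -2.0, mutual = 1.5)
se    <- c(edges =  0.5, mutual = 0.5)

# Assumption: p-values are two-sided Wald (z) tests of theta = 0.
z      <- theta / se
pvalue <- 2 * pnorm(-abs(z))

# Assumption: size_quantiles summarises block sizes via quantile().
node_memb      <- rep(1:3, times = c(4, 5, 6))  # hypothetical memberships
block_sizes    <- as.vector(table(node_memb))
size_quantiles <- quantile(block_sizes)

print(pvalue)
print(size_quantiles)
```

For a fitted object, the corresponding components would be read directly from the returned list, for example model_est$theta or model_est$bic.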

Methods (by generic)

  • print: Print method for objects of class mlergm. Indicates whether the model was successfully estimated and shows the model formula provided.

  • summary: Prints a summary of the estimated mlergm model.

Details

The estimation procedure performs Monte-Carlo maximum likelihood estimation for the specified ERGM using a version of the Fisher scoring method detailed by Hunter and Handcock (2006). Settings governing the MCMC procedure (such as burnin, interval, and sample_size), as well as more general settings for the estimation procedure, can be adjusted through set_options. The estimation procedure uses the stepping algorithm of Hummel et al. (2012) for added stability.
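
As a rough sketch of adjusting the MCMC settings named above, the following collects burnin, interval, and sample_size in a list and passes them to set_options. The argument names are taken from the text and should be checked against the set_options help page; the values are placeholders, not recommendations.

```r
# Illustrative MCMC settings; the values are placeholders, not recommendations.
mcmc_settings <- list(
  burnin      = 5000,  # iterations discarded before sampling starts
  interval    = 500,   # iterations between retained samples
  sample_size = 2000   # number of retained samples
)

# Build the options object only if the mlergm package is installed.
if (requireNamespace("mlergm", quietly = TRUE)) {
  opts <- do.call(mlergm::set_options, mcmc_settings)
  # opts would then be supplied as: mlergm(form, options = opts)
}
```

Keeping the settings in a plain list makes it easy to record them alongside the seed when reporting results.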

References

Schweinberger, M. and Handcock, M. S. (2015). Local dependence in random graph models: characterization, properties and statistical inference. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 77(3), 647-676.

Hunter, D. R., and Handcock, M. S. (2006). Inference in curved exponential family models for networks. Journal of Computational and Graphical Statistics, 15(3), 565-583.

Hummel, R. M., Hunter, D. R., and Handcock, M. S. (2012). Improving simulation-based algorithms for fitting ERGMs. Journal of Computational and Graphical Statistics, 21(4), 920-939.

Krivitsky, P. N., Handcock, M. S., and Morris, M. (2011). Adjusting for network size and composition effects in exponential-family random graph models. Statistical Methodology, 8(4), 319-339.

Krivitsky, P. N., and Kolaczyk, E. D. (2015). On the question of effective sample size in network modeling: An asymptotic inquiry. Statistical Science, 30(2), 184.

Hunter, D., Handcock, M., Butts, C., Goodreau, S., and Morris, M. (2008). ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Journal of Statistical Software, 24(3), 1-29.

Butts, C. (2016). sna: Tools for Social Network Analysis. R package version 2.4. https://CRAN.R-project.org/package=sna.

Butts, C. (2008). network: a Package for Managing Relational Data in R. Journal of Statistical Software, 24(2). http://www.jstatsoft.org/v24/i02/paper.

See Also

gof.mlergm, mlnet

Examples

# NOT RUN {
### Load the package and the school classes data set
library(mlergm)
data(classes)

# Estimate a curved multilevel ERGM (default "standard" parameterization)
# Approximate run time (2 cores): 1.2m, Run time (3 cores): 55s
model_est <- mlergm(classes ~ edges + mutual + nodematch("sex") + gwesp(fixed = FALSE),
                    seed = 123,
                    options = set_options(number_cores = 2))

# To access a summary of the fitted model, call the 'summary' function 
summary(model_est)

# Goodness-of-fit can be run by calling the 'gof.mlergm' method 
# Approximate run time (2 cores): 48s, Run time (3 cores): 34s  
gof_res <- gof(model_est, options = set_options(number_cores = 2))
plot(gof_res, cutoff = 15)
# }
