mmsbm: Dynamic mixed-membership stochastic blockmodel with covariates

Description

The function estimates a dynamic mixed-membership stochastic blockmodel that incorporates covariates.

Usage

mmsbm(
  formula.dyad,
  formula.monad = ~1,
  senderID,
  receiverID,
  nodeID = NULL,
  timeID = NULL,
  data.dyad,
  data.monad = NULL,
  n.blocks,
  n.hmmstates = 1,
  directed = TRUE,
  mmsbm.control = list()
)

Arguments

formula.dyad

A formula object. The variable in data.dyad that contains binary edges goes on the LHS, and any dyadic predictors go on the RHS (when no dyadic covariates are available, use y ~ 1). The syntax is the same as for a glm formula.

formula.monad

An optional formula object. The LHS is ignored. The RHS contains names of nodal attributes found in data.monad.

senderID

Character string. Quoted name of the variable in data.dyad identifying the sender node. For undirected networks, the variable simply contains the name of the first node in the dyad. Cannot contain the special character "`@`".

receiverID

Character string. Quoted name of the variable in data.dyad identifying the receiver node. For undirected networks, the variable simply contains the name of the second node in the dyad. Cannot contain the special character "`@`".

nodeID

Character string. Quoted name of the variable in data.monad identifying a node in either data.dyad[,senderID] or data.dyad[,receiverID]. If not NULL, every node in data.dyad[,senderID] or data.dyad[,receiverID] must be present in data.monad[,nodeID]. Cannot contain the special character "`@`".

timeID

Character string. Quoted name of the variable in both data.dyad and data.monad indicating the time at which the network (and corresponding nodal attributes) were observed. The variable itself must be composed of integers. Cannot contain the special character "`@`".

data.dyad

Data frame. Sociomatrix in "long" (i.e. dyadic) format. Must contain at least three variables: the sender identifier (or identifier of the first node in an undirected network dyad), the receiver identifier (or identifier of the second node in an undirected network dyad), and the value of the edge between them. Currently, only edge values between zero and one (inclusive) are supported.

data.monad

Data frame. Nodal attributes. Must contain a node identifier matching the names of nodes used in the data.dyad data frame.

n.blocks

Integer value. How many latent groups should be used to estimate the model?

n.hmmstates

Integer value. How many hidden Markov states should be used in the HMM? Defaults to 1 (i.e. no HMM).

directed

Boolean. Is the network directed? Defaults to TRUE.

mmsbm.control

A named list of optional algorithm control parameters.

seed: Integer value. Seed the RNG. By default, a random seed is generated and returned for reproducibility purposes.
spectral: Boolean. Type of initialization algorithm for mixed-membership vectors in static case. If TRUE (default), use spectral clustering with degree correction; otherwise, use kmeans algorithm.
init_gibbs: Boolean. Should a collapsed Gibbs sampler of non-regression mmsbm be used to initialize mixed-membership vectors, instead of a spectral or simple kmeans initialization? Setting to TRUE will result in slower initialization and faster model estimation. When TRUE, results are typically very sensitive to choice of alpha (see below).
alpha: Numeric positive value. Concentration parameter for collapsed Gibbs sampler to find initial mixed-membership values when init_gibbs=TRUE. Defaults to 1.0.
missing: Means of handling missing data. One of "indicator method" (default) or "listwise deletion".
em_iter: Maximum number of iterations in variational EM. Defaults to 5e3.
opt_iter: Maximum number of BFGS iterations in the M-step. Defaults to 10e3.
hessian: Boolean indicating whether the Hessian matrix of regression coefficients should be returned. Defaults to TRUE.
mu_b: Numeric vector with two elements: prior mean of the blockmodel's main diagonal elements, and prior mean of the blockmodel's off-diagonal elements. Defaults to c(5.0, -5.0).
var_b: Numeric vector with two positive elements: prior variance of the blockmodel's main diagonal elements, and prior variance of the blockmodel's off-diagonal elements. Defaults to c(1.0, 1.0).
mu_beta: Either a single numeric value, in which case the same prior mean is applied to all monadic coefficients, or an array that is npredictors by n.blocks by n.hmmstates, where npredictors is the number of monadic predictors for which a prior mean is being set (prior means need not be set for all predictors). The rows of the array should be named to identify the variables for which a prior mean is being set. Defaults to a common prior mean of 0.0 for all monadic coefficients.
var_beta: See mu_beta. Defaults to a single common prior variance of 1.0 for all monadic coefficients.
mu_gamma: Either a single numeric value, in which case the same prior mean is applied to all dyadic coefficients, or a named vector of numeric values (with names corresponding to the names of the variables for which a prior mean is being set). Defaults to a common prior mean of 0.0 for all dyadic coefficients.
var_gamma: See mu_gamma. Defaults to a single common prior variance of 1.0 for all dyadic coefficients.
eta: Numeric positive value. Concentration hyper-parameter for HMM. Defaults to 10.3.
se_sim: Number of samples from variational posterior of latent variables on which approximation to variance-covariance matrices are based. Defaults to 10.
dyad_vcov_samp: Number of dyads to sample in computation of variance-covariance of dyadic and blockmodel parameters. Defaults to 1000.
phi_init_t: Matrix, n.blocks by total number of nodes across years. Optional initial values for variational parameters for mixed-membership vectors. Column names must be of the form nodeid@year.
kappa_init_t: Matrix, n.hmmstates by number of years. Optional initial values for variational parameters for state probabilities.
b_init_t: Matrix, n.blocks by n.blocks. Optional initial values for blockmodel.
beta_init: Array, predictors by n.blocks by n.hmmstates. Optional initial values for monadic coefficients. If
gamma_init: Vector. Optional initial values for dyadic coefficients.
permute: Boolean. Should all permutations be tested to realign initial block models in the dynamic case? If FALSE, realignment is done via a faster graph matching algorithm, which may not be exact. Defaults to TRUE.
conv_tol: Numeric value. Absolute tolerance for VI convergence. Defaults to 1e-3.
verbose: Boolean. Should extra information be printed as model iterates? Defaults to FALSE.
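
As a rough illustration of how the options above fit together, the following sketch builds a control list in base R. The option names are taken from the list above; all values, the predictor names "Coworkers" and "Status", and the object name ctrl are assumptions made for the example. In particular, the exact row label expected for a given monadic predictor is not specified here, so the row name in mu_beta is only a placeholder.

```r
# Illustrative values only; option names follow the list above.
n_blocks <- 2

ctrl <- list(
  seed      = 123,                 # fix the RNG seed for reproducibility
  spectral  = TRUE,                # spectral initialization (documented default)
  hessian   = TRUE,                # keep the Hessian so variances can be computed
  em_iter   = 500,                 # cap on variational EM iterations
  conv_tol  = 1e-3,                # absolute convergence tolerance
  mu_b      = c(5.0, -5.0),        # prior means: diagonal, off-diagonal
  var_b     = c(1.0, 1.0),         # prior variances: diagonal, off-diagonal
  mu_gamma  = c(Coworkers = 0.5),  # named prior mean for one dyadic coefficient
  var_gamma = c(Coworkers = 2.0),  # named prior variance, same naming scheme
  # Prior mean array: npredictors x n.blocks x n.hmmstates, with named rows.
  # "Status" is a placeholder row name (assumption).
  mu_beta   = array(0, dim = c(1, n_blocks, 1),
                    dimnames = list("Status", NULL, NULL))
)

# Sanity checks mirroring the documented constraints.
stopifnot(length(ctrl$mu_b) == 2, all(ctrl$var_b > 0))

# The list would then be supplied as: mmsbm(..., mmsbm.control = ctrl)
```

Options left out of the list keep their documented defaults, so only the settings that differ need to be named.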

Value

Object of class mmsbm. List with named components:

MixedMembership: Matrix of variational posterior of mean of mixed-membership vectors. nodes by n.groups.
BlockModel: n.groups by n.groups matrix of estimated tie log-odds between members of corresponding latent groups. The blockmodel.
vcov_blockmodel: If hessian=TRUE, variance-covariance matrix of parameters in blockmodel, ordered in column-major order.
MonadCoef: Array of estimated coefficient values for monadic covariates. Has n.groups columns, and n.hmmstates slices.
vcov_monad: If hessian=TRUE, variance-covariance matrix of monadic coefficients.
DyadCoef: Vector of estimated coefficient values for dyadic covariates.
vcov_dyad: If hessian=TRUE, variance-covariance matrix of dyadic coefficients.
TransitionKernel: Matrix of estimated HMM transition probabilities.
Kappa: Matrix of marginal probabilities of being in an HMM state at any given point in time. n.hmmstates by years (or whatever time interval networks are observed at).
niter: Final number of VI iterations.
converged: Convergence indicator; zero indicates failure to converge.
NodeIndex: Order in which nodes are stored in all return objects.
monadic.data, dyadic.data: Model frames used during estimation (stripped of attributes).
forms: Values of selected formal arguments used by other methods.
seed: The value of RNG seed used during estimation.
call: Original (unevaluated) function call.
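
The components listed above can be combined in the usual way. The sketch below computes standard errors and z-scores for the dyadic coefficients from DyadCoef and vcov_dyad. It runs on a mock list with made-up numbers rather than on real mmsbm() output, and it assumes that the order and length of DyadCoef match the rows of vcov_dyad; the helper name dyad_summary is hypothetical and not part of the package.

```r
# Mock object standing in for a fitted model. The numbers are invented
# purely to show the arithmetic; they are not output from mmsbm().
fit <- list(
  DyadCoef  = c(Coworkers = 1.2),
  vcov_dyad = matrix(0.09, nrow = 1, ncol = 1),
  converged = 1
)

# Hypothetical helper: standard errors and z-scores for dyadic coefficients.
# Assumes DyadCoef and vcov_dyad are aligned (an assumption, see lead-in).
dyad_summary <- function(fit) {
  if (is.null(fit$vcov_dyad)) {
    stop("vcov_dyad is missing; was the model fitted with hessian = TRUE?")
  }
  if (fit$converged == 0) {
    warning("converged is zero: the estimation did not converge")
  }
  est <- fit$DyadCoef
  se  <- sqrt(diag(fit$vcov_dyad))   # square roots of the diagonal variances
  data.frame(estimate = unname(est),
             se       = se,
             z        = unname(est) / se,
             row.names = names(est))
}

res <- dyad_summary(fit)
print(res)
```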

Examples

# NOT RUN {
library(NetMix)
## Load datasets
data("lazega_dyadic")
data("lazega_monadic")
## Estimate model with 2 groups
## Setting to `hessian=TRUE` increases computation time
## but is needed if standard errors are to be computed. 
lazega_mmsbm <- mmsbm(SocializeWith ~ Coworkers,
                      ~  School + Practice + Status,
                      senderID = "Lawyer1",
                      receiverID = "Lawyer2",
                      nodeID = "Lawyer",
                      data.dyad = lazega_dyadic,
                      data.monad = lazega_monadic,
                      n.blocks = 2,
                      mmsbm.control = list(seed = 123,
                                           hessian = FALSE))

# }
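
For data that start out as an adjacency matrix, the long (dyadic) format described under data.dyad can be produced in base R. The sketch below is a minimal illustration under stated assumptions: the matrix, node names, and column names (sender, receiver, tie) are made up, and dropping self-ties is a choice made for the example rather than a documented requirement.

```r
# Toy directed adjacency matrix; node names and ties are invented.
adj <- matrix(c(0, 1, 0,
                1, 0, 1,
                0, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# All ordered pairs of nodes, then drop self-ties (an assumption).
pairs <- expand.grid(sender = rownames(adj), receiver = colnames(adj),
                     stringsAsFactors = FALSE)
pairs <- pairs[pairs$sender != pairs$receiver, ]

# Look up each edge value with a two-column character index.
pairs$tie <- adj[cbind(pairs$sender, pairs$receiver)]
toy_dyadic <- pairs

# With no dyadic covariates, the call would look like:
# mmsbm(tie ~ 1, senderID = "sender", receiverID = "receiver",
#       data.dyad = toy_dyadic, n.blocks = 2)
```

Each row of toy_dyadic is one dyad, which gives the three required variables: a sender identifier, a receiver identifier, and an edge value between zero and one.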
