The function estimates a dynamic mixed-membership stochastic blockmodel that incorporates covariates.
mmsbm(
formula.dyad,
formula.monad = ~1,
senderID,
receiverID,
nodeID = NULL,
timeID = NULL,
data.dyad,
data.monad = NULL,
n.blocks,
n.hmmstates = 1,
directed = TRUE,
mmsbm.control = list()
)
Object of class mmsbm. List with named components:
Matrix of variational posterior of mean of mixed-membership vectors, nodes by n.blocks.
n.blocks by n.blocks matrix of estimated tie log-odds between members of corresponding latent groups. The blockmodel.
If hessian=TRUE, variance-covariance matrix of parameters in blockmodel, ordered in column-major order.
Array of estimated coefficient values for monadic covariates. Has n.blocks columns and n.hmmstates slices.
If hessian=TRUE, variance-covariance matrix of monadic coefficients.
Vector of estimated coefficient values for dyadic covariates.
If hessian=TRUE, variance-covariance matrix of dyadic coefficients.
Matrix of estimated HMM transition probabilities.
Matrix of marginal probabilities of being in an HMM state at any given point in time. n.hmmstates by years (or whatever time interval networks are observed at).
Final value of the lower bound (LB).
Vector of LB values across all iterations; useful for spotting early convergence issues.
Final number of VI iterations.
Convergence indicator; zero indicates failure to converge.
Order in which nodes are stored in all return objects.
Model frames used during estimation (stripped of attributes).
Values of selected formal arguments used by other methods.
The value of RNG seed used during estimation.
Original (unevaluated) function call.
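Because the blockmodel is reported on the log-odds scale, it can be easier to read after mapping it to probabilities. The following is a minimal base R sketch; `block_logodds` is a made-up 2-by-2 matrix standing in for the blockmodel component of a fitted object, and its values are illustrative rather than estimates.

```r
# Hypothetical 2-by-2 blockmodel on the log-odds scale (illustrative values only)
block_logodds <- matrix(c( 2, -3,
                          -3,  1),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(c("Group 1", "Group 2"),
                                        c("Group 1", "Group 2")))

# Inverse-logit transform: log-odds -> tie probabilities
block_prob <- plogis(block_logodds)
round(block_prob, 3)
```

Diagonal entries then read as within-group tie probabilities and off-diagonal entries as between-group tie probabilities.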
A formula object. The variable in data.dyad that contains binary edges should be used as the LHS, and any dyadic predictors can be included on the RHS (when no dyadic covariates are available, use y ~ 1). Same syntax as a glm formula.
An optional formula object. LHS is ignored. RHS contains names of nodal attributes found in data.monad.
Character string. Quoted name of the variable in data.dyad identifying the sender node. For undirected networks, the variable simply contains the name of the first node in the dyad. Cannot contain special character "`@`".
Character string. Quoted name of the variable in data.dyad identifying the receiver node. For undirected networks, the variable simply contains the name of the second node in the dyad. Cannot contain special character "`@`".
Character string. Quoted name of the variable in data.monad identifying a node in either data.dyad[,senderID] or data.dyad[,receiverID]. If not NULL, every node in data.dyad[,senderID] or data.dyad[,receiverID] must be present in data.monad[,nodeID]. Cannot contain special character "`@`".
Character string. Quoted name of the variable in both data.dyad and data.monad indicating the time at which the network (and corresponding nodal attributes) were observed. The variable itself must be composed of integers. Cannot contain special character "`@`".
Data frame. Sociomatrix in "long" (i.e. dyadic) format. Must contain at least three variables: the sender identifier (or identifier of the first node in an undirected network's dyad), the receiver identifier (or identifier of the second node in an undirected network's dyad), and the value of the edge between them. Currently, only edge values between zero and one (inclusive) are supported.
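If the network is stored as a square sociomatrix, it has to be reshaped into this long format first. The following base R sketch shows one way to do that; the matrix `adj` and the column names `sender`, `receiver`, and `tie` are invented for illustration, and dropping self-ties is an assumption rather than a documented requirement.

```r
# Hypothetical 3-node directed sociomatrix (rows = senders, columns = receivers)
adj <- matrix(c(0, 1, 0,
                0, 0, 1,
                1, 1, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# One row per ordered pair. expand.grid() varies its first factor fastest and
# as.vector() reads adj column by column, so the two orderings line up.
dyads <- expand.grid(sender = rownames(adj),
                     receiver = colnames(adj),
                     stringsAsFactors = FALSE)
dyads$tie <- as.vector(adj)

# Drop self-ties (assumption: the diagonal carries no information here)
dyads <- dyads[dyads$sender != dyads$receiver, ]
rownames(dyads) <- NULL
dyads
```

The resulting data frame could then be passed as data.dyad, with "sender" and "receiver" supplied as senderID and receiverID.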
Data frame. Nodal attributes. Must contain a node identifier matching the names of nodes used in the data.dyad data frame.
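A quick consistency check between the two data frames can catch identifier mismatches before estimation. The sketch below uses made-up data frames `dyads` and `nodes` with invented column names. It reads the "`@`" restriction as applying to node names, which is an assumption.

```r
# Hypothetical dyadic and monadic data frames (all names are illustrative)
dyads <- data.frame(sender   = c("a", "a", "b"),
                    receiver = c("b", "c", "c"),
                    tie      = c(1, 0, 1),
                    stringsAsFactors = FALSE)
nodes <- data.frame(node = c("a", "b", "c"),
                    age  = c(30, 41, 27),
                    stringsAsFactors = FALSE)

# Every node appearing in either dyad column must be listed in the monadic data
dyad_nodes <- unique(c(dyads$sender, dyads$receiver))
missing_nodes <- setdiff(dyad_nodes, nodes$node)
stopifnot(length(missing_nodes) == 0)

# Assumption: the "@" restriction is about node names
has_at <- grepl("@", c(dyad_nodes, nodes$node), fixed = TRUE)
stopifnot(!any(has_at))
```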
Integer value. How many latent groups should be used to estimate the model?
Integer value. How many hidden Markov states should be used in the HMM? Defaults to 1 (i.e. no HMM).
Boolean. Is the network directed? Defaults to TRUE.
A named list of optional algorithm control parameters.
Integer. Seed the RNG. By default, a random seed is generated and returned for reproducibility purposes.
Integer. Number of random initialization trials. Defaults to 5.
Boolean. Type of initialization algorithm for mixed-membership vectors in static case. If TRUE (default), use spectral clustering with degree correction; otherwise, use kmeans algorithm.
Boolean. Should a collapsed Gibbs sampler of non-regression mmsbm be used to initialize mixed-membership vectors, instead of a spectral or simple kmeans initialization? Setting to TRUE will result in slower initialization and faster model estimation. When TRUE, results are typically very sensitive to choice of alpha (see below).
Numeric positive value. Concentration parameter for collapsed Gibbs sampler to find initial mixed-membership values when init_gibbs=TRUE. Defaults to 1.0.
Means of handling missing data. One of "indicator method" (default) or "listwise deletion".
Boolean; should stochastic variational inference be used? Defaults to TRUE.
Maximum number of iterations in stochastic variational updates. Defaults to 5e2.
When svi=TRUE, proportion of nodes sampled in each local step. Defaults to 0.05 when svi=TRUE, and to 1.0 otherwise.
When svi=TRUE, value in (0.5, 1] controlling speed of decay of weight of prior parameter values in global steps. Defaults to 0.75 when svi=TRUE, and to 0.0 otherwise.
When svi=TRUE, non-negative value controlling weight of past iterations in global steps. Defaults to 1.0 when svi=TRUE, and ignored otherwise.
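To get a feel for how a decay rate in (0.5, 1] and a non-negative delay interact, the sketch below evaluates a step-size schedule that is common in the stochastic variational inference literature, (t + delay)^(-decay). Whether this package uses exactly this formula is an assumption, and the names `decay`, `delay`, and `step_size` are hypothetical.

```r
# Hypothetical settings mirroring the defaults quoted above
decay <- 0.75   # must lie in (0.5, 1]
delay <- 1.0    # must be non-negative

# Common SVI step-size schedule (assumed here, not confirmed for this package)
step_size <- function(t, decay, delay) (t + delay)^(-decay)

iters <- 1:5
steps <- step_size(iters, decay, delay)
round(steps, 3)
```

Under this schedule, a larger decay value shrinks the step size faster, and a larger delay down-weights the earliest iterations.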
Maximum number of BFGS iterations in global step. Defaults to 10e3.
Boolean indicating whether the Hessian matrix of regression coefficients should be returned. Defaults to TRUE.
Boolean indicating whether blockmodel should be assortative (i.e. stronger connections within groups) or disassortative (i.e. stronger connections between groups). Defaults to TRUE.
Numeric vector with two elements: prior mean of blockmodel's main diagonal elements, and prior mean of blockmodel's off-diagonal elements. Defaults to c(5.0, -5.0) if assortative=TRUE (default) and to c(-5.0, 5.0) otherwise.
Numeric vector with two positive elements: prior variance of blockmodel's main diagonal elements, and prior variance of blockmodel's off-diagonal elements. Defaults to c(5.0, 5.0).
Either a single numeric value, in which case the same prior mean is applied to all monadic coefficients, or an array that is npredictors by n.blocks by n.hmmstates, where npredictors is the number of monadic predictors for which a prior mean is being set (prior means need not be set for all predictors). The rows in the array should be named to identify which variables a prior mean is being set for. Defaults to a common prior mean of 0.0 for all monadic coefficients.
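The array form of the monadic prior mean can be built with base R's array(). The following is a sketch only: `n_blocks`, `n_states`, `prior_vars`, and `prior_mean` are illustrative names, and the assumption that row names match plain variable names (rather than, say, expanded factor-level names) is not confirmed here.

```r
n_blocks <- 2                           # assumed number of latent groups
n_states <- 1                           # assumed number of HMM states
prior_vars <- c("Status", "Practice")   # predictors receiving a prior mean

# npredictors x n.blocks x n.hmmstates array; rows are named after predictors
prior_mean <- array(0,
                    dim = c(length(prior_vars), n_blocks, n_states),
                    dimnames = list(prior_vars, NULL, NULL))

# Example: a positive prior mean for "Status" in the second group, first state
prior_mean["Status", 2, 1] <- 1
dim(prior_mean)
```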
See mu_beta. Defaults to a single common prior variance of 5.0 for all (standardized) monadic coefficients.
Either a single numeric value, in which case the same prior mean is applied to all dyadic coefficients, or a named vector of numeric values (with names corresponding to the name of the variable for which a prior mean is being set). Defaults to a common prior mean of 0.0 for all dyadic coefficients.
See mu_gamma. Defaults to a single common prior variance of 5.0 for all (standardized) dyadic coefficients.
Numeric positive value. Concentration hyper-parameter for HMM. Defaults to 1.0.
Number of samples from variational posterior of latent variables on which approximation to variance-covariance matrices are based. Defaults to 10.
Maximum number of dyads to sample in computation of variance-covariance of dyadic and blockmodel parameters, when compared to ten percent of the observed dyads. Defaults to 1000.
Optional character vector, with "nodeID@timeID" as elements, indicating which mixed-membership vectors should remain constant at their initial values throughout estimation. When only one year is observed, elements should be "nodeID@1". Typically used with mm_init_t.
Matrix, n.blocks by nodes across years. Optional initial values for mixed-membership vectors. Although initial values need not be provided for all nodes, column names must have a nodeID@timeID format to avoid ambiguity. When only one year is observed, names should be "nodeID@1".
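The nodeID@timeID column labels can be generated rather than typed by hand. The following base R sketch builds such a matrix for made-up nodes and time periods. The names `node_ids`, `time_ids`, and `mm_init` are illustrative, and starting every node at equal membership (so that each column sums to one) is an assumption about sensible initial values.

```r
node_ids <- c("a", "b", "c")   # hypothetical node names
time_ids <- c(1, 2)            # hypothetical integer time periods
n_blocks <- 2                  # assumed number of latent groups

# All node-by-time combinations, labelled "nodeID@timeID"
combos <- expand.grid(node = node_ids, time = time_ids,
                      stringsAsFactors = FALSE)
col_labels <- paste(combos$node, combos$time, sep = "@")

# Equal membership in every group for every node-period; columns sum to one
mm_init <- matrix(1 / n_blocks,
                  nrow = n_blocks, ncol = length(col_labels),
                  dimnames = list(NULL, col_labels))
colnames(mm_init)
```

For a single observed period, the same code with `time_ids <- 1` yields labels of the form "nodeID@1".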
Matrix, n.hmmstates by number of years. Optional initial values for variational parameters for state probabilities. Columns must be named according to unique year values.
Matrix, n.blocks by n.blocks. Optional initial values for blockmodel.
Array, predictors by n.blocks by n.hmmstates. Optional initial values for monadic coefficients. If
Vector. Optional initial values for dyadic coefficients.
Boolean. Should all permutations be tested to realign initial block models in dynamic case? If FALSE, realignment is done via faster graph matching algorithm, but may not be exact. Defaults to TRUE.
Numeric value. Absolute tolerance for VI convergence. Defaults to 1e-3.
Boolean. Should extra information be printed as model iterates? Defaults to FALSE.
Santiago Olivella (olivella@unc.edu), Adeline Lo (aylo@wisc.edu), Tyler Pratt (tyler.pratt@yale.edu), Kosuke Imai (imai@harvard.edu)
library(NetMix)

## Load datasets
data("lazega_dyadic")
data("lazega_monadic")

## Estimate model with 2 groups
## Setting `hessian = TRUE` increases computation time
## but is needed if standard errors are to be computed.
lazega_mmsbm <- mmsbm(SocializeWith ~ Coworkers,
                      ~ School + Practice + Status,
                      senderID = "Lawyer1",
                      receiverID = "Lawyer2",
                      nodeID = "Lawyer",
                      data.dyad = lazega_dyadic,
                      data.monad = lazega_monadic,
                      n.blocks = 2,
                      mmsbm.control = list(seed = 123,
                                           conv_tol = 1e-2,
                                           hessian = FALSE))