ergm-terms: Terms used in Exponential Family Random Graph Models

Description

The function ergm is used to fit linear exponential random graph models, in which the probability of a given network, $y$, on a set of nodes is $\exp\{\theta{\cdot}g(y)\}/c(\theta)$, where $g(y)$ is a vector of network statistics for $y$, $\theta$ is a parameter vector of the same length and $c(\theta)$ is the normalizing constant for the distribution.
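
As a rough illustration of the probability formula above, the following plain-Python sketch (not part of ergm; the function names `edge_count` and `ergm_probabilities` are hypothetical) enumerates every undirected graph on three nodes, uses the edge count as the single statistic $g(y)$, and computes $c(\theta)$ by brute force. Enumeration like this is only feasible for very small node sets.

```python
import itertools
import math

def edge_count(edges):
    """g(y): a one-element statistic vector holding the number of edges."""
    return [len(edges)]

def ergm_probabilities(n, theta, stat=edge_count):
    """Return {graph: probability} over all undirected graphs on n nodes."""
    dyads = list(itertools.combinations(range(n), 2))
    graphs = []
    for r in range(len(dyads) + 1):
        graphs.extend(itertools.combinations(dyads, r))
    # Unnormalized weight exp(theta . g(y)) for every graph y.
    weights = {
        g: math.exp(sum(t * s for t, s in zip(theta, stat(g))))
        for g in graphs
    }
    c = sum(weights.values())  # normalizing constant c(theta)
    return {g: w / c for g, w in weights.items()}

# With theta = 0 every one of the 8 graphs on 3 nodes is equally likely.
probs = ergm_probabilities(3, theta=[0.0])
```

Each graph is represented as a tuple of dyads, so the empty graph is the key `()`.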

The network statistics $g(y)$ are entered as terms in the function call to ergm.

This page describes the possible terms (and hence network statistics).

Arguments

Specifying models

Terms to ergm are specified by a formula that represents the network and the network statistics. This is an R formula object of the form y ~ <term 1> + <term 2> ..., where y is a network object or a matrix that can be coerced to a network object, and <term 1>, <term 2>, etc., are each terms chosen from the list given below. To create a network object in R, use the network function, then add nodal attributes to it using the %v% operator if necessary.

Possible terms to represent network statistics

The ergm function allows the user to explore a large number of potential models for their network data. What follows is a list of the model terms currently available in the program, with a brief description of each. In the formula for the model, the model terms are various function-like calls, some of which require arguments, separated by + signs.

Additional terms can be coded up by users via the statnetuserterms package.

The terms currently available are:

absdiff(attrname){Absolute difference: The attrname argument is a character string giving the name of a quantitative attribute in the network's vertex attribute list. This term adds one network statistic to the model equaling the sum of abs(attrname[i]-attrname[j]) for all edges (i,j) in the network. }
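
A minimal sketch of the absdiff statistic in plain Python (not the ergm implementation; the function name, the edge list, and the `grade` values are made up for illustration):

```python
def absdiff(edges, attr):
    """Sum of |attr[i] - attr[j]| over all edges (i, j)."""
    return sum(abs(attr[i] - attr[j]) for i, j in edges)

grade = {1: 9, 2: 10, 3: 12}   # hypothetical quantitative vertex attribute
edges = [(1, 2), (2, 3)]
stat = absdiff(edges, grade)   # |9 - 10| + |10 - 12| = 3
```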

absdiffcat(attrname, base=NULL){Categorical absolute difference: The attrname argument is a character string giving the name of a quantitative attribute in the network's vertex attribute list. This term adds one statistic for every possible nonzero distinct value of abs(attrname[i]-attrname[j]) in the network; the value of each such statistic is the number of edges in the network with the corresponding absolute difference. The optional base argument is a vector indicating which nonzero differences, in order from smallest to largest, should be omitted from the model (i.e., treated like the zero-difference category). The base argument, if used, should contain indices, not differences themselves. For instance, if the possible values of abs(attrname[i]-attrname[j]) are 0, 0.5, 3, 3.5, and 10, then to omit 0.5 and 10 one should set base=c(1, 4). Note that this term should generally be used only when the quantitative attribute has a limited number of possible values; an example is the "Grade" attribute of the faux.mesa.high or faux.magnolia.high datasets.}
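
The way base indexes into the sorted nonzero differences can be sketched as follows. This is an illustrative plain-Python reading of the description above, not ergm code; the function name and data are hypothetical, and integer attribute values are used to avoid floating-point equality issues.

```python
from itertools import combinations

def absdiffcat(edges, attr, base=()):
    """Edge counts per nonzero absolute difference, minus the base categories.

    base holds 1-based positions in the sorted list of nonzero differences,
    not the differences themselves.
    """
    nodes = sorted(attr)
    diffs = sorted(
        {abs(attr[i] - attr[j]) for i, j in combinations(nodes, 2)} - {0}
    )
    kept = [d for idx, d in enumerate(diffs, start=1) if idx not in base]
    return {
        d: sum(1 for i, j in edges if abs(attr[i] - attr[j]) == d)
        for d in kept
    }

attr = {1: 7, 2: 8, 3: 10, 4: 10}           # possible nonzero differences: 1, 2, 3
edges = [(1, 2), (2, 3), (2, 4), (3, 4)]
all_stats = absdiffcat(edges, attr)          # {1: 1, 2: 2, 3: 0}
reduced = absdiffcat(edges, attr, base=(1,)) # drops the smallest difference
```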

altkstar(lambda, fixed=FALSE){Alternating k-star: This term adds one network statistic to the model equal to a weighted alternating sequence of k-star statistics with weight parameter lambda. This is the version given in Snijders et al. (2006). The gwdegree and altkstar terms produce mathematically equivalent models, as long as they are used together with the edges (or kstar(1)) term, yet the interpretation of the gwdegree parameters is slightly more straightforward than the interpretation of the altkstar parameters. For this reason, we recommend using gwdegree instead of altkstar. See Section 3 and especially equation (13) of http://www.sna.unimelb.edu.au/publications/cef4.pdf for details. The optional argument fixed indicates whether the scale parameter lambda is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE, which means the scale parameter is not fixed and thus the model is a CEF model. This term can only be used with undirected networks.}

asymmetric{Asymmetric dyads: This term adds one network statistic to the model, equaling the number of pairs of actors for which exactly one of $(i{\rightarrow}j)$ or $(j{\rightarrow}i)$ exists. This term can only be used with directed networks. }
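
A small plain-Python sketch of this count, assuming the directed network is given as a list of (sender, receiver) pairs (the function name and data are illustrative only):

```python
def asymmetric(arcs):
    """Number of node pairs with exactly one of i->j or j->i present."""
    arcs = set(arcs)
    # An asymmetric dyad contributes exactly one arc whose reverse is absent.
    return sum(1 for (i, j) in arcs if (j, i) not in arcs)

arcs = [(1, 2), (2, 1), (1, 3), (3, 2)]
stat = asymmetric(arcs)  # the pair {1, 2} is mutual; {1, 3} and {2, 3} are asymmetric
```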

b1concurrent(attrname){Concurrent node count for the first mode in a bipartite (aka two-mode) network: This term adds one network statistic to the model, equal to the number of nodes in the first mode of the network with degree 2 or higher. The first mode of a bipartite network object is sometimes known as the "actor" mode. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is the number of nodes in the first mode with ties to at least 2 other nodes with the same value for that attribute as the index node. This term can only be used with undirected bipartite networks.}

b1degree(d, attrname){Degree for the first mode in a bipartite (aka two-mode) network: The d argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d; the $i$th such statistic equals the number of nodes of degree d[i] in the first mode of a bipartite network, i.e. with exactly d[i] edges. The first mode of a bipartite network object is sometimes known as the "actor" mode. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the degree count is the number of nodes with the same value of the attribute as the ego node. This term can only be used with undirected bipartite networks.}

b1factor(attrname, base=1){Factor attribute effect for the first mode in a bipartite (aka two-mode) network: The attrname argument is a character string giving the name of a categorical attribute in the network's vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname attribute. Each of these statistics gives the number of times a node with that attribute in the first mode of the network appears in an edge. The first mode of a bipartite network object is sometimes known as the "actor" mode. To include all attribute values is usually not a good idea, because the sum of all such statistics equals the number of edges and hence a linear dependency would arise in any model also including edges. Thus, the base argument tells which value(s), numbered in order according to the sort function, should be omitted. The default value, one, means that the smallest (i.e., first in sorted order) attribute value is omitted, making this value the reference category to which all other values are compared. For example, if the fruit factor has levels orange, apple, banana, and pear, then to add just two terms, one for apple and one for pear, set banana and orange to the base (remember to sort the values first) by using b1factor("fruit", base=2:3). This term can only be used with undirected bipartite networks.}

b1star(k, attrname){k-Stars for the first mode in a bipartite (aka two-mode) network: The k argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k. The $i$th such statistic counts the number of distinct k[i]-stars whose center node is in the first mode of the network. The first mode of a bipartite network object is sometimes known as the "actor" mode. A $k$-star is defined to be a center node $N$ and a set of $k$ different nodes $\{O_1, \dots, O_k\}$ such that the ties $\{N, O_i\}$ exist for $i=1, \dots, k$. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is over the number of $k$-stars (with center node in the first mode) where all nodes have the same value of the attribute. This term can only be used for undirected bipartite networks. Note that b1star(1) is equal to b2star(1) and to edges.}

b2concurrent(attrname){Concurrent node count for the second mode in a bipartite (aka two-mode) network: This term adds one network statistic to the model, equal to the number of nodes in the second mode of the network with degree 2 or higher. The second mode of a bipartite network object is sometimes known as the "event" mode. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is the number of nodes in the second mode with ties to at least 2 other nodes with the same value for that attribute as the index node. This term can only be used with undirected bipartite networks.}

b2degree(d, attrname){Degree for the second mode in a bipartite (aka two-mode) network: The d argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d; the $i$th such statistic equals the number of nodes of degree d[i] in the second mode of a bipartite network, i.e. with exactly d[i] edges. The second mode of a bipartite network object is sometimes known as the "event" mode. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the degree count is the number of nodes with the same value of the attribute as the ego node. This term can only be used with undirected bipartite networks.}

b2factor(attrname, base=1){Factor attribute effect for the second mode in a bipartite (aka two-mode) network: The attrname argument is a character string giving the name of a categorical attribute in the network's vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname attribute. Each of these statistics gives the number of times a node with that attribute in the second mode of the network appears in an edge. The second mode of a bipartite network object is sometimes known as the "event" mode. To include all attribute values is usually not a good idea, because the sum of all such statistics equals the number of edges and hence a linear dependency would arise in any model also including edges. Thus, the base argument tells which value(s), numbered in order according to the sort function, should be omitted. The default value, one, means that the smallest (i.e., first in sorted order) attribute value is omitted, making this value the reference category to which all other values are compared. For example, if the fruit factor has levels orange, apple, banana, and pear, then to add just two terms, one for apple and one for pear, set banana and orange to the base (remember to sort the values first) by using b2factor("fruit", base=2:3). This term can only be used with undirected bipartite networks.}

b2star(k, attrname){k-Stars for the second mode in a bipartite (aka two-mode) network: The k argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k. The $i$th such statistic counts the number of distinct k[i]-stars whose center node is in the second mode of the network. The second mode of a bipartite network object is sometimes known as the "event" mode. A $k$-star is defined to be a center node $N$ and a set of $k$ different nodes $\{O_1, \dots, O_k\}$ such that the ties $\{N, O_i\}$ exist for $i=1, \dots, k$. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is over the number of $k$-stars (with center node in the second mode) where all nodes have the same value of the attribute. This term can only be used for undirected bipartite networks. Note that b2star(1) is equal to b1star(1) and to edges.}

balance{Balanced triads: This term adds one network statistic to the model equaling the number of triads in the network that are balanced. Every unoriented directed triad may occupy one of 16 distinct states. These states were used by Davis and Leinhardt (1972) as a basis for classifying triads within a larger structure. The balanced triads are those of type 102 and 300. For details about triad types, see triad.classify in the sna package. For an undirected graph the balanced triads are those with an even number of ties (i.e., 0 and 2).}

concurrent(attrname){Concurrent node count: This term adds one network statistic to the model, equal to the number of nodes in the network with degree 2 or higher. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is the number of nodes with ties to at least 2 other nodes with the same value for that attribute as the index node. This term can only be used with undirected networks.}

ctriple(attrname){Cyclic triples: This term adds one statistic to the model, equal to the number of cyclic triples in the network, defined as a set of edges of the form $\{(i{\rightarrow}j), (j{\rightarrow}k), (k{\rightarrow}i)\}$. Note that for all directed networks, triangle is equal to ttriple+ctriple, so at most two of these three terms can be in a model. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is over the number of cyclic triples where all three nodes have the same value of the attribute. This term can only be used with directed networks.}

cycle(k){Cycles: The k argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k; the $i$th such statistic equals the number of cycles in the network with length exactly k[i]. The cycle statistic applies to both directed and undirected graphs. For directed networks, it counts directed cycles of length $k$, as opposed to undirected cycles in the undirected case. The directed cycle terms of lengths 2 and 3 are equivalent to mutual and ctriple (respectively). The undirected cycle term of length 3 is equivalent to triangle, and there is no undirected cycle term of length 2. }

degree(d, attrname){Degree: The d argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d; the $i$th such statistic equals the number of nodes in the network of degree d[i], i.e. with exactly d[i] edges. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the degree count is the number of nodes with the same value of the attribute as the ego node. This term can only be used with undirected networks; for directed networks see idegree and odegree.}
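
The degree statistics without attrname can be sketched in plain Python as below. This is only an illustration of the definition, not ergm code; nodes are assumed to be labeled 1 to n_nodes, and the function name is hypothetical.

```python
from collections import Counter

def degree_stats(n_nodes, edges, d):
    """For each value in d, count the nodes with exactly that many edges."""
    deg = Counter()
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    # Histogram of degrees over all nodes, including isolates (degree 0).
    hist = Counter(deg[v] for v in range(1, n_nodes + 1))
    return [hist[k] for k in d]

edges = [(1, 2), (1, 3), (1, 4)]          # node 5 is an isolate
stats = degree_stats(5, edges, [0, 1, 3])
```

Here `stats` holds one entry per element of d, in the same order as d.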

density{Density: This term adds one network statistic equal to the density of the network. For undirected networks, density equals kstar(1) or edges divided by $n(n-1)/2$; for directed networks, density equals edges or istar(1) or ostar(1) divided by $n(n-1)$.}

dsp(d){Dyadwise shared partners: The d argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d; the $i$th such statistic equals the number of dyads in the network with exactly d[i] shared partners. This term can be used with directed and undirected networks. For directed networks the count is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the dyad and in the same direction).}

dyadcov(x, attrname){Dyadic covariate: If the network is directed, x is either a (symmetric) matrix of dyadic covariates, or an undirected network; if the latter, optional argument attrname provides the name of the edge attribute to use for edge values. This term adds three statistics to the model, representing the (polytomous) effect of the given covariate on the four possible dyad states (i.e., null, out-tie, in-tie, mutual). The statistics are the appearance of mutual, upper-triangular asymmetric, and lower-triangular asymmetric dyads (with the null state serving as a reference category). If the network is undirected, x is either a matrix of edgewise covariates, or a network; if the latter, optional argument attrname provides the name of the edge attribute to use for edge values. This term adds one statistic to the model, representing the effect of the given covariate on the appearance of edges. The edgecov and dyadcov terms are equivalent for undirected networks. dyadcov can be called more than once, to model the effects of multiple covariates.}

edgecov(x, attrname=NULL){Edge covariate: The x argument is either a matrix of edgewise covariates, or a network; if the latter, optional argument attrname provides the name of the edge attribute to use for edge values. This term adds one statistic to the model, representing the effect of the given covariate on the appearance of edges. The edgecov term applies to both directed and undirected networks. For undirected networks the covariates are also assumed to be undirected. The edgecov and dyadcov terms are equivalent for undirected networks. edgecov can be called more than once, to model the effects of multiple covariates.}

edges{Edges: This term adds one network statistic equal to the number of edges in the network. For undirected networks, edges is equal to kstar(1); for directed networks, edges is equal to both ostar(1) and istar(1).}

esp(d){Edgewise shared partners: The d argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d; the $i$th such statistic equals the number of edges in the network with exactly d[i] shared partners. This term can be used with directed and undirected networks. For directed networks the count is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the edge in the same direction as the edge itself).}
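
The difference between the dsp and esp counts can be sketched for the undirected case as follows. This is a plain-Python illustration under the assumption that nodes are labeled 1 to n_nodes; the function names are hypothetical and the directed ("homogeneous shared partner") case is not covered.

```python
from itertools import combinations

def shared_partners(n_nodes, edges):
    """Map every dyad (i, j), i < j, to its number of common neighbors."""
    nbrs = {v: set() for v in range(1, n_nodes + 1)}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    return {
        (i, j): len(nbrs[i] & nbrs[j])
        for i, j in combinations(range(1, n_nodes + 1), 2)
    }

def dsp(n_nodes, edges, d):
    """Count all dyads with exactly d[i] shared partners."""
    counts = list(shared_partners(n_nodes, edges).values())
    return [counts.count(k) for k in d]

def esp(n_nodes, edges, d):
    """Count only the dyads that are edges with exactly d[i] shared partners."""
    sp = shared_partners(n_nodes, edges)
    counts = [sp[tuple(sorted(e))] for e in edges]
    return [counts.count(k) for k in d]

# A triangle on nodes 1, 2, 3 with a pendant node 4 attached to node 3.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
dsp_stats = dsp(4, edges, [0, 1, 2])
esp_stats = esp(4, edges, [0, 1, 2])
```

In this example the non-adjacent dyads (1, 4) and (2, 4) each share partner 3, so they contribute to dsp but not to esp.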

gwb1degree(decay, fixed=FALSE){Geometrically weighted degree distribution for the first mode in a bipartite (aka two-mode) network: This term adds one network statistic to the model equal to the weighted degree distribution with weight parameter decay, for nodes in the first mode of a bipartite network. The first mode of a bipartite network object is sometimes known as the "actor" mode. This statistic is based on the version given as equation (14) in http://www.sna.unimelb.edu.au/publications/cef4.pdf. See the "Remark" in section 3 of that paper to see why it is used rather than the version given in Snijders et al. (2006). The optional argument fixed indicates whether the scale parameter lambda is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE, which means the scale parameter is not fixed and thus the model is a CEF model. This term can only be used with undirected bipartite networks.}

gwb2degree(decay, fixed=FALSE){Geometrically weighted degree distribution for the second mode in a bipartite (aka two-mode) network: This term adds one network statistic to the model equal to the weighted degree distribution with weight parameter decay, for nodes in the second mode of a bipartite network. The second mode of a bipartite network object is sometimes known as the "event" mode. This statistic is based on the version given as equation (14) in http://www.sna.unimelb.edu.au/publications/cef4.pdf. See the "Remark" in section 3 of that paper to see why it is used rather than the version given in Snijders et al. (2006). The optional argument fixed indicates whether the scale parameter lambda is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE, which means the scale parameter is not fixed and thus the model is a CEF model. This term can only be used with undirected bipartite networks.}

gwdegree(decay, fixed=FALSE){Geometrically weighted degree distribution: This term adds one network statistic to the model equal to the weighted degree distribution with weight parameter decay. This is the version given as equation (14) in http://www.sna.unimelb.edu.au/publications/cef4.pdf. See the "Remark" in section 3 of that paper to see why it is used rather than the version given in Snijders et al. (2006). The optional argument fixed indicates whether the scale parameter lambda is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE, which means the scale parameter is not fixed and thus the model is a CEF model. This term can only be used with undirected networks.}
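
As a hedged illustration of the weighting, the sketch below assumes the statistic takes the form $e^{decay}\sum_{i}\{1-(1-e^{-decay})^{i}\}D_i(y)$, where $D_i(y)$ is the number of nodes of degree $i$. That form is an assumption here and should be checked against equation (14) of the cited paper before relying on it; the function name and data are illustrative, not ergm code.

```python
import math
from collections import Counter

def gwdegree(degrees, decay):
    """Geometrically weighted degree statistic for a list of node degrees.

    Assumed form: exp(decay) * sum_i (1 - (1 - exp(-decay))**i) * D_i,
    with D_i the number of nodes of degree i (i >= 1).
    """
    hist = Counter(degrees)
    ratio = 1.0 - math.exp(-decay)
    return math.exp(decay) * sum(
        (1.0 - ratio ** i) * count for i, count in hist.items() if i >= 1
    )

degrees = [3, 1, 1, 1, 0]                 # a 3-star plus one isolate
stat = gwdegree(degrees, math.log(2))     # weights 1.75 for degree 3, 1.0 for degree 1
```

Under this assumed form, decay = 0 reduces the statistic to the number of non-isolated nodes.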

gwdsp(alpha, fixed=FALSE){Geometrically weighted dyadwise shared partner distribution: This term adds one network statistic to the model equal to the geometrically weighted dyadwise shared partner distribution with weight parameter alpha $> 0$. The optional argument fixed indicates whether the scale parameter lambda is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE, which means the scale parameter is not fixed and thus the model is a CEF model. This term can be used with directed and undirected networks. For directed networks the count is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the dyad and in the same direction).}

gwesp(alpha, fixed=FALSE){Geometrically weighted edgewise shared partner distribution: This term adds one network statistic to the model equal to the geometrically weighted edgewise shared partner distribution with weight parameter alpha $> 0$. The optional argument fixed indicates whether the scale parameter lambda is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE, which means the scale parameter is not fixed and thus the model is a CEF model. This term can be used with directed and undirected networks. For directed networks the geometric weighting is over homogeneous shared partners only (i.e., only partners on a directed two-path connecting the nodes in the edge in the same direction as the edge itself).}

gwidegree(decay, fixed=FALSE){Geometrically weighted in-degree distribution: This term adds one network statistic to the model equal to the weighted in-degree distribution with weight parameter decay. The optional argument fixed indicates whether the scale parameter lambda is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE, which means the scale parameter is not fixed and thus the model is a CEF model. This term can only be used with directed networks.}

gwodegree(decay, fixed=FALSE){Geometrically weighted out-degree distribution: This term adds one network statistic to the model equal to the weighted out-degree distribution with weight parameter decay. The optional argument fixed indicates whether the scale parameter lambda is to be fit as a curved exponential family model (see Hunter and Handcock, 2006). The default is FALSE, which means the scale parameter is not fixed and thus the model is a CEF model. This term can only be used with directed networks.}

hamming(x, cov, attrname){Hamming distance: This term adds one statistic to the model equal to the weighted or unweighted Hamming distance of the network from the network specified by x. (If no argument is given, x is taken to be the observed network.) Unweighted Hamming distance is defined as the total number of pairs $(i,j)$ (ordered or unordered, depending on whether the network is directed or undirected) on which the two networks differ. If the optional argument cov is specified, then the weighted Hamming distance is computed instead, where each pair $(i,j)$ contributes a pre-specified weight toward the distance when the two networks differ on that pair. The argument cov is either a matrix of edgewise weights or a network; if the latter, the optional argument attrname provides the name of the edge attribute to use for weight values.}

hammingmix(attrname, x, base=0, contrast=FALSE){Hamming distance within mixing: This term adds one statistic to the model equal to the Hamming distance of the subnetwork of dyads for each possible pairing of attribute values of the network from the network specified by x. The statistic equals the number of dyads in the subnetwork that differ in tie value. This term produces one statistic for every entry in the mixing matrix for the attribute. The ordering of the attribute values is alphabetical. If the option contrast=TRUE is used, then a statistic for the first pairing is not included, making it the de facto reference category. The option base gives the index of statistics to be omitted from the tabulation. For example base=2 will omit the second statistic, making it the de facto reference category.}

idegree(d, attrname){In-degree: The d argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d; the $i$th such statistic equals the number of nodes in the network of in-degree d[i], i.e. the number of nodes with exactly d[i] in-edges. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count only considers edges in which both nodes have the same value of the attribute. This term can only be used with directed networks; for undirected networks see degree.}

intransitive{Intransitive triads: This term adds one statistic to the model, equal to the number of intransitive triads in the network. These are defined as the triads of types 111D, 201, 111U, 021C, and 030C in the triad census of Davis and Leinhardt (1972). For details about triad types, see triad.classify in the sna package. Note the distinction from the ctriple term. This term can only be used with directed networks.}

isolates{Isolates: This term adds one statistic to the model equal to the number of isolates in the network. For an undirected network, an isolate is defined to be any node with degree zero. For a directed network, an isolate is any node with both in-degree and out-degree equal to zero.}

istar(k, attrname){In-stars: The k argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k. The $i$th such statistic counts the number of distinct k[i]-instars in the network, where a $k$-instar is defined to be a node $N$ and a set of $k$ different nodes $\{O_1, \dots, O_k\}$ such that the ties $(O_j{\rightarrow}N)$ exist for $j=1, \dots, k$. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is over the number of $k$-instars where all nodes have the same value of the attribute. This term can only be used for directed networks; for undirected networks see kstar. Note that istar(1) is equal to both ostar(1) and edges.}

kstar(k, attrname){k-Stars: The k argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k. The $i$th such statistic counts the number of distinct k[i]-stars in the network, where a $k$-star is defined to be a node $N$ and a set of $k$ different nodes $\{O_1, \dots, O_k\}$ such that the ties $\{N, O_i\}$ exist for $i=1, \dots, k$. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is over the number of $k$-stars where all nodes have the same value of the attribute. This term can only be used for undirected networks; for directed networks, see istar, ostar, twopath and m2star. Note that kstar(1) is equal to edges.}
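
One way to compute the k-star counts without attrname is to note that a node of degree $m$ is the center of $\binom{m}{k}$ distinct $k$-stars. The plain-Python sketch below uses that identity for $k \ge 2$ and, following the note above that kstar(1) equals edges, counts each edge once for $k = 1$. The function name and data are illustrative only.

```python
from collections import Counter
from math import comb

def kstar(edges, k):
    """Count k-stars in an undirected edge list, for each size in k."""
    deg = Counter()
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    stats = []
    for size in k:
        if size == 1:
            # Per the text, kstar(1) equals edges: each edge is counted once.
            stats.append(len(edges))
        else:
            # A node of degree m centers comb(m, size) distinct stars.
            stats.append(sum(comb(m, size) for m in deg.values()))
    return stats

star = [(1, 2), (1, 3), (1, 4)]     # node 1 tied to nodes 2, 3 and 4
counts = kstar(star, [1, 2, 3])
```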

localtriangle(x){Triangles within neighborhoods: This term adds one statistic to the model equal to the number of triangles in the network between nodes close to each other. For an undirected network, a local triangle is defined to be any set of three edges between nodal pairs $\{(i,j), (j,k), (k,i)\}$ that are in the same neighborhood. For a directed network, a triangle is defined as any set of three edges $(i{\rightarrow}j), (j{\rightarrow}k)$ and either $(k{\rightarrow}i)$ or $(k{\leftarrow}i)$ where again all nodes are within the same neighborhood. The argument x is a network or an adjacency matrix that specifies whether the two nodes are in the same neighborhood. Note that this is technically a special case of triangle.}

m2star{Mixed 2-stars, a.k.a. 2-paths: This term adds one statistic to the model, equal to the number of mixed 2-stars in the network, defined as a pair of edges $(i{\rightarrow}j), (j{\rightarrow}k)$. A mixed 2-star is sometimes called a 2-path because it is a directed path of length 2 from $i$ to $k$ via $j$. This term can only be used with directed networks; for undirected networks see kstar(2). See also twopath.}

match(attrname, diff=FALSE){Uniform homophily and differential homophily: This is an alias for nodematch(attrname, diff=FALSE).}

meandeg{Mean vertex degree: This term adds one network statistic to the model equal to the average degree of a node. Note that this term is a constant multiple of both edges and density.}

mutual{Mutuality: This term adds one network statistic to the model, equaling the number of pairs of actors $i$ and $j$ for which $(i{\rightarrow}j)$ and $(j{\rightarrow}i)$ both exist. This term can only be used with directed networks.}

nearsimmelian{Near simmelian triads: This term adds one statistic to the model equal to the number of near Simmelian triads, as defined by Krackhardt and Handcock (2006). This is a sub-graph of size three which is exactly one tie short of being complete. This term can only be used with directed networks.}

nodecov(attrname){Main effect of a covariate: The attrname argument is a character string giving the name of a numeric (not categorical) attribute in the network's vertex attribute list. This term adds a single network statistic to the model equaling the sum of attrname(i) and attrname(j) for all edges $(i,j)$ in the network. For categorical attributes, see nodefactor. Note that for directed networks, nodecov equals nodeicov plus nodeocov.}

nodefactor(attrname, base=1){Factor attribute effect: The attrname argument is a character string giving the name of a categorical attribute in the network's vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname attribute. Each of these statistics gives the number of times a node with that attribute appears in an edge in the network. In particular, for edges whose endpoints both have the same attribute value, this value is counted twice. To include all attribute values is usually not a good idea, because the sum of all such statistics equals twice the number of edges and hence a linear dependency would arise in any model also including edges. Thus, the base argument tells which value(s), numbered in order according to the sort function, should be omitted. The default value, one, means that the smallest (i.e., first in sorted order) attribute value is omitted, making this value the reference category to which all other values are compared. For example, if the fruit factor has levels orange, apple, banana, and pear, then to add just two terms, one for apple and one for pear, set banana and orange to the base (remember to sort the values first) by using nodefactor("fruit", base=2:3). For an analogous term for quantitative vertex attributes, see nodecov.}
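
The fruit example for base can be traced with a plain-Python sketch. This illustrates the description only and is not ergm code; the function name, node labels, and edge list are made up, and Python's `sorted` stands in for R's sort function (the two agree for these lowercase labels).

```python
def nodefactor(edges, attr, base=(1,)):
    """Endpoint counts per attribute level, omitting the levels listed in base.

    base holds 1-based positions in the sorted list of levels.
    """
    levels = sorted(set(attr.values()))
    kept = [lv for idx, lv in enumerate(levels, start=1) if idx not in base]
    return {
        lv: sum((attr[i] == lv) + (attr[j] == lv) for i, j in edges)
        for lv in kept
    }

fruit = {1: "orange", 2: "apple", 3: "banana", 4: "pear", 5: "apple"}
edges = [(1, 2), (2, 5), (3, 4), (4, 5)]

# Sorted levels are apple, banana, orange, pear, so base=(2, 3)
# omits banana and orange and keeps apple and pear.
kept_stats = nodefactor(edges, fruit, base=(2, 3))
all_stats = nodefactor(edges, fruit, base=())
```

With no level omitted, the statistics sum to twice the number of edges, which is the linear dependency the text warns about.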

nodeicov(attrname){Main effect of a covariate for in-edges: The attrname argument is a character string giving the name of a numeric (not categorical) attribute in the network's vertex attribute list. This term adds a single network statistic to the model equaling the total value of attrname(j) for all edges $(i,j)$ in the network. This term may only be used with directed networks. For categorical attributes, see nodeifactor.}

nodeifactor(attrname, base=1){Factor attribute effect for in-edges: The attrname argument is a character string giving the name of a categorical attribute in the network's vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname attribute. Each of these statistics gives the number of times a node with that attribute appears as the terminal node of a directed tie. The base argument tells which value(s), numbered in order according to the sort function, should be omitted. The default value, one, means that the smallest (i.e., first in sorted order) attribute value is omitted, making this value the reference category to which all other values are compared. For an example, see the nodefactor entry. The nodeifactor term may only be used with directed networks.}

nodematch(attrname, diff=FALSE, keep=NULL){Uniform homophily and differential homophily: The attrname argument is a character string giving the name of an attribute in the network's vertex attribute list. When diff=FALSE, this term adds one network statistic to the model, which counts the number of edges $(i,j)$ for which attrname(i)==attrname(j). When diff=TRUE, $p$ network statistics are added to the model, where $p$ is the number of unique values of the attrname attribute. The $k$th such statistic counts the number of edges $(i,j)$ for which attrname(i) == attrname(j) == value(k), where value(k) is the $k$th smallest unique value of the attrname attribute. The optional keep argument determines which values of k will be considered for matches; other values are ignored. Default is that all values are considered. If keep is set to non-NULL, it should be a vector of positive integers giving the values of k to keep, or a vector of negative integers giving the (negative) values of k to ignore. Note that this works for both diff=FALSE and diff=TRUE. For example, to add two statistics, counting the matches for just the 2nd and 4th categories, use nodematch with diff=TRUE and keep=c(2,4).}
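
The difference between diff=FALSE, diff=TRUE, and keep can be seen by counting matches by hand. The following base R sketch uses a made-up undirected edge list and a hypothetical grade attribute; it mirrors the description above rather than the ergm code.

```r
# Toy undirected network: each row (i, j) is one edge.
edges <- rbind(c(1, 2), c(1, 3), c(2, 4), c(4, 5), c(5, 6), c(3, 6))
grade <- c(7, 7, 7, 8, 8, 9)          # hypothetical vertex attribute

g_i  <- grade[edges[, 1]]
g_j  <- grade[edges[, 2]]
same <- g_i == g_j                    # TRUE for edges whose endpoints match

# diff = FALSE: a single statistic, the number of matching edges
nodematch_uniform <- sum(same)

# diff = TRUE: one statistic per unique value, in sorted order
vals <- sort(unique(grade))
nodematch_diff <- sapply(vals, function(v) sum(same & g_i == v))
names(nodematch_diff) <- vals

# keep = c(1, 3): retain only the 1st and 3rd sorted values
nodematch_keep <- nodematch_diff[c(1, 3)]

print(nodematch_uniform)
print(nodematch_diff)
print(nodematch_keep)
```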

nodemix(attrname, contrast=FALSE){Nodal attribute mixing: The attrname argument is a character string giving the name of a categorical attribute in the network's vertex attribute list. This term adds one network statistic to the model for each possible pairing of attribute values. The statistic equals the number of edges in the network in which the nodes have that pairing of values. In other words, this term produces one statistic for every entry in the mixing matrix for the attribute. The ordering of the attribute values is alphabetical. If the option contrast=TRUE is used, then a statistic for the first pairing is not included, making it the de facto reference category.}
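
A hand-built mixing matrix shows what the nodemix statistics count. This is a base R sketch for an undirected toy network with a hypothetical sex attribute; the order in which ergm reports the pairings may differ from the order used here.

```r
# Toy undirected network: each row (i, j) is one edge.
edges <- rbind(c(1, 2), c(1, 3), c(2, 4), c(4, 5), c(5, 6), c(3, 6))
sex   <- c("F", "F", "M", "M", "F", "M")   # hypothetical categorical attribute

a <- sex[edges[, 1]]
b <- sex[edges[, 2]]

# In an undirected network the pairing (F, M) is the same as (M, F),
# so put the alphabetically smaller value first.
lo <- ifelse(a <= b, a, b)
hi <- ifelse(a <= b, b, a)

lev    <- sort(unique(sex))
mixmat <- unclass(table(factor(lo, levels = lev), factor(hi, levels = lev)))

# One statistic per pairing: the upper triangle, including the diagonal
nodemix_stat <- mixmat[upper.tri(mixmat, diag = TRUE)]

# contrast = TRUE would drop the first pairing as the reference category
nodemix_contrast <- nodemix_stat[-1]

print(mixmat)
print(nodemix_stat)
```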

nodeocov(attrname){Main effect of a covariate for out-edges: The attrname argument is a character string giving the name of a numeric (not categorical) attribute in the network's vertex attribute list. This term adds a single network statistic to the model equaling the total value of attrname(i) for all edges $(i,j)$ in the network. This term may only be used with directed networks. For categorical attributes, see nodeofactor.}
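
The relationship between nodeicov, nodeocov, and nodecov on a directed network can be checked by hand. The sketch below uses base R with a made-up list of directed ties and a hypothetical wealth attribute.

```r
# Toy directed network: row (i, j) is the tie i -> j.
arcs   <- rbind(c(1, 2), c(2, 1), c(1, 3), c(3, 4), c(4, 2))
wealth <- c(5, 20, 8, 3)              # hypothetical numeric attribute

# nodeocov: total attribute value of the sending node over all ties
nodeocov_stat <- sum(wealth[arcs[, 1]])

# nodeicov: total attribute value of the receiving node over all ties
nodeicov_stat <- sum(wealth[arcs[, 2]])

# nodecov: sum of both endpoints over all ties
nodecov_stat <- sum(wealth[arcs[, 1]] + wealth[arcs[, 2]])

print(c(nodeocov = nodeocov_stat, nodeicov = nodeicov_stat, nodecov = nodecov_stat))
```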

nodeofactor(attrname, base=1){Factor attribute effect for out-edges: The attrname argument is a character string giving the name of a categorical attribute in the network's vertex attribute list. This term adds multiple network statistics to the model, one for each of (a subset of) the unique values of the attrname attribute. Each of these statistics gives the number of times a node with that attribute appears as the node of origin of a directed tie. The base argument tells which value(s), numbered in order according to the sort function, should be omitted. The default value, one, means that the smallest (i.e., first in sorted order) attribute value is omitted, making this value the reference category to which all other values are compared. For an example, see the nodefactor entry. The nodeofactor term may only be used with directed networks.}

odegree(d, attrname){Out-degree: The d argument is a vector of distinct integers. This term adds one network statistic to the model for each element in d; the $i$th such statistic equals the number of nodes in the network of out-degree d[i], i.e. the number of nodes with exactly d[i] out-edges. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count only considers edges in which both nodes have the same value of the attribute. This term can only be used with directed networks; for undirected networks see degree. }
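
For illustration, the out-degree counts can be tabulated directly from a list of directed ties. This base R sketch uses a made-up five-node network and a hypothetical grp attribute, and follows the description above rather than the ergm code.

```r
# Toy directed network on n = 5 nodes: row (i, j) is the tie i -> j.
arcs <- rbind(c(1, 2), c(1, 3), c(1, 4), c(2, 3), c(3, 1), c(4, 1))
n    <- 5

# Out-degree of every node (node 5 sends no ties)
outdeg <- tabulate(arcs[, 1], nbins = n)

# odegree(0:3): number of nodes with exactly 0, 1, 2, and 3 out-edges
d <- 0:3
odegree_stat <- sapply(d, function(k) sum(outdeg == k))
names(odegree_stat) <- d

# With an attribute: only ties between nodes sharing the same value count
grp  <- c("a", "a", "b", "a", "b")    # hypothetical categorical attribute
same <- grp[arcs[, 1]] == grp[arcs[, 2]]
outdeg_same <- tabulate(arcs[same, 1], nbins = n)

print(outdeg)
print(odegree_stat)
print(outdeg_same)
```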

ostar(k, attrname){k-Outstars: The k argument is a vector of distinct integers. This term adds one network statistic to the model for each element in k. The $i$th such statistic counts the number of distinct k[i]-outstars in the network, where a $k$-outstar is defined to be a node $N$ and a set of $k$ different nodes $\{O_1, \dots, O_k\}$ such that the ties $(N{\rightarrow}O_j)$ exist for $j=1, \dots, k$. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is the number of $k$-outstars where all nodes have the same value of the attribute. This term can only be used with directed networks; for undirected networks see kstar. Note that ostar(1) is equal to both istar(1) and edges. }
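
Because a $k$-outstar is a node together with any $k$ of its out-neighbors, a node of out-degree $d$ is the center of choose(d, k) outstars. The base R sketch below uses that counting argument on a made-up directed network; it is an illustration of the definition, not the ergm implementation.

```r
# Toy directed network on n = 5 nodes: row (i, j) is the tie i -> j.
arcs <- rbind(c(1, 2), c(1, 3), c(1, 4), c(2, 3), c(3, 1), c(4, 1))
n    <- 5

outdeg <- tabulate(arcs[, 1], nbins = n)
indeg  <- tabulate(arcs[, 2], nbins = n)

# ostar(1:3): each node of out-degree d contributes choose(d, k) k-outstars
k <- 1:3
ostar_stat <- sapply(k, function(kk) sum(choose(outdeg, kk)))
names(ostar_stat) <- k

# istar(1), computed the same way from in-degrees
istar1 <- sum(choose(indeg, 1))

print(ostar_stat)
print(c(edges = nrow(arcs), istar1 = istar1))
```

Here ostar(1), istar(1), and the number of ties all coincide, as noted above.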

receiver{Receiver effect: This term adds one network statistic for each node equal to the number of in-ties for that node. This measures the popularity of the node. The term for the first node is omitted because of redundancy, but the coefficient can be computed as the negative of the sum of the coefficients of all the other actors. That is, the average coefficient is zero, following the Holland-Leinhardt parametrization of the $p_1$ model. This term can only be used with directed networks. For undirected networks, see sociality.}

sender{Sender effect: This term adds one network statistic for each node equal to the number of out-ties for that node. This measures the activity of the node. The term for the first node is omitted because of redundancy, but the coefficient can be computed as the negative of the sum of the coefficients of all the other actors. That is, the average coefficient is zero, following the Holland-Leinhardt parametrization of the $p_1$ model. This term can only be used with directed networks. For undirected networks, see sociality.}

simmelian{Simmelian triads: This term adds one statistic to the model equal to the number of Simmelian triads, as defined by Krackhardt and Handcock (2007). A Simmelian triad is a complete sub-graph of size three. This term can only be used with directed networks.}

simmelianties{Ties in Simmelian triads: This term adds one statistic to the model equal to the number of ties in the network that are associated with Simmelian triads, as defined by Krackhardt and Handcock (2007). Each Simmelian triad has six ties in it but, because Simmelian triads can overlap in terms of nodes (and associated ties), the total number of ties in these triads may be less than six times the number of Simmelian triads. Hence this is a measure of the clustering of Simmelian triads (given the number of Simmelian triads). This term can only be used with directed networks.}
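
The effect of overlap can be seen on a small example in which two Simmelian triads share a pair of nodes. The base R sketch below treats a Simmelian triad as three nodes with all six directed ties present, as described above; the adjacency matrix is made up and the matrix formulas are one possible way to count, not the ergm code.

```r
# Toy directed network on 4 nodes, stored as an adjacency matrix.
n <- 4
A <- matrix(0, n, n)
mutual_pairs <- rbind(c(1, 2), c(1, 3), c(2, 3), c(2, 4), c(3, 4))
A[mutual_pairs] <- 1              # ties i -> j
A[mutual_pairs[, 2:1]] <- 1       # reciprocating ties j -> i
A[1, 4] <- 1                      # one asymmetric tie, in no Simmelian triad

# M[i, j] = 1 when i and j are tied in both directions
M <- A * t(A)

# Each complete mutual triad yields 6 closed walks of length 3 in M
simmelian_stat <- sum(diag(M %*% M %*% M)) / 6

# A directed tie i -> j is "Simmelian" when i and j are mutually tied
# and share at least one mutual neighbor
in_triad <- (M %*% M) * M > 0
simmelianties_stat <- sum(in_triad)

print(c(simmelian = simmelian_stat, simmelianties = simmelianties_stat))
```

The triads {1,2,3} and {2,3,4} share the two ties between nodes 2 and 3, so the tie count is 10 rather than 12.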

smalldiff(attrname, cutoff){Small difference: The attrname argument is a character string giving the name of a quantitative attribute in the network's vertex attribute list and cutoff is any real number. This term adds one network statistic to the model, equal to the number of edges $(i,j)$ for which abs(attrname(i)-attrname(j)) is less than or equal to cutoff. }
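
As a small worked case, the smalldiff statistic can be computed by hand. The edge list, the age attribute, and the cutoff of 2 in this base R sketch are all made up for illustration.

```r
# Toy undirected network: each row (i, j) is one edge.
edges  <- rbind(c(1, 2), c(1, 3), c(2, 4), c(3, 4), c(2, 3))
age    <- c(10, 12, 15, 11)       # hypothetical quantitative attribute
cutoff <- 2

# Absolute attribute difference across each edge
absdiff <- abs(age[edges[, 1]] - age[edges[, 2]])

# smalldiff: number of edges whose difference is at most the cutoff
smalldiff_stat <- sum(absdiff <= cutoff)

print(absdiff)
print(smalldiff_stat)
```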

sociality(attrname){Undirected degree: This term adds one network statistic for each node equal to the number of ties of that node. The optional attrname is a character string giving the name of an attribute in the network's vertex attribute list that takes categorical values. If provided, this term only counts ties between nodes with the same value of the attribute. This term can only be used with undirected networks. For directed networks, see sender and receiver. }

transitive{Transitive triads: This term adds one statistic to the model, equal to the number of transitive triads in the network. These are defined as the triads of types 120D, 030T, 120U, and 300 in the triad census of Davis and Leinhardt (1972). For details about triad types, see triad.classify in the sna package. Note the distinction from the ttriple term. This term can only be used with directed networks.}

triadcensus{Triad census: This term adds one network statistic for each of the 16 types of triads categorized by Davis and Leinhardt (1972). Each statistic is the count of the corresponding triad type in the network. Every unoriented directed triad may occupy one of 16 distinct states. These states were used by Davis and Leinhardt (1972) as a basis for classifying triads within a larger structure. For details see triad.classify in the sna package, on which this code is based. For an undirected graph the triad census is over the four types defined by the number of ties (i.e., 0, 1, 2, and 3).}
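
For the undirected case, the four-type census can be obtained by visiting every set of three nodes and counting the ties among them. The following base R sketch does this by brute force on a made-up four-node network; it is meant only to illustrate the definition and does not scale to large networks.

```r
# Toy undirected network on 4 nodes: each row (i, j) is one edge.
edges <- rbind(c(1, 2), c(1, 3), c(2, 4), c(3, 4), c(2, 3))
n <- 4

# Symmetric adjacency matrix
A <- matrix(0, n, n)
A[edges] <- 1
A <- A + t(A)

# Number of ties (0, 1, 2, or 3) within every triple of nodes
ties_per_triple <- combn(n, 3, function(v) sum(A[v, v]) / 2)

# Undirected triad census: how many triples have 0, 1, 2, and 3 ties
census <- table(factor(ties_per_triple, levels = 0:3))

print(census)
```

The last entry of the census, the number of triples with three ties, is the triangle count of the network.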

triangle(attrname){Triangles: This term adds one statistic to the model equal to the number of triangles in the network. For an undirected network, a triangle is defined to be any set $\{(i,j), (j,k), (k,i)\}$ of three edges. For a directed network, a triangle is defined as any set of three edges $(i{\rightarrow}j)$ and $(j{\rightarrow}k)$ and either $(k{\rightarrow}i)$ or $(k{\leftarrow}i)$. Note that for directed networks, triangle equals ttriple plus ctriple, so at most two of these three terms can be in a model. The optional argument attrname restricts the count to those triples of nodes with equal values of the vertex attribute specified by attrname. }

tripercent(attrname){Triangle percentage: This term adds one statistic to the model equal to the percentage of triangles in the network relative to the number of potential triangles. For the definition of triangle, see triangle. A potential triangle is a 2-star. The optional argument attrname restricts the counts (both numerator and denominator) to those triples of nodes with equal values of the vertex attribute specified by attrname. This term can only be used with undirected networks.}

ttriple(attrname){Transitive triples: This term adds one statistic to the model, equal to the number of transitive triples in the network, defined as a set of edges $\{(i{\rightarrow}j), (j{\rightarrow}k), (i{\rightarrow}k)\}$. Note that triangle equals ttriple+ctriple for a directed network, so at most two of the three terms can be in a model. The optional argument attrname is a character string giving the name of an attribute in the network's vertex attribute list. If this is specified then the count is over the number of transitive triples where all three nodes have the same value of the attribute. This term can only be used with directed networks.}
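
The two components of the directed triangle count can be computed from an adjacency matrix. The base R sketch below counts transitive triples and cyclic triples on a made-up network, assuming each 3-cycle is counted once; the matrix formulas are one way of applying the definitions and are not taken from the ergm source.

```r
# Toy directed network on 4 nodes; A[i, j] = 1 means i -> j.
n <- 4
A <- matrix(0, n, n)
A[rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 4), c(4, 1))] <- 1

# (A %*% A)[i, k] counts the intermediaries j with i -> j -> k;
# multiplying by A keeps only those pairs where i -> k also exists.
ttriple_stat <- sum((A %*% A) * A)

# Each cycle i -> j -> k -> i gives 3 closed walks of length 3,
# one starting from each of its nodes.
ctriple_stat <- sum(diag(A %*% A %*% A)) / 3

# Per the note above, the directed triangle count is the sum of the two
triangle_stat <- ttriple_stat + ctriple_stat

print(c(ttriple = ttriple_stat, ctriple = ctriple_stat, triangle = triangle_stat))
```

In this network the only transitive triple is 1 -> 2 -> 3 with 1 -> 3, and the only cycle is 1 -> 3 -> 4 -> 1.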

twopath{2-Paths: This term adds one statistic to the model, equal to the number of 2-paths in the network. For directed networks, a 2-path is defined as a pair of edges $(i{\rightarrow}j), (j{\rightarrow}k)$. That is, it is a directed path of length 2 from $i$ to $k$ via $j$. For directed networks a 2-path is also a mixed 2-star. For undirected networks, a 2-path is defined as a pair of edges $\{i,j\}, \{j,k\}$. That is, it is an undirected path of length 2 from $i$ to $k$ via $j$, also known as a 2-star.}
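
Both versions of the 2-path count can be worked out by hand. The base R sketch below assumes that the endpoints $i$ and $k$ of a directed 2-path must be distinct, so that a reciprocated pair $i{\rightarrow}j{\rightarrow}i$ is not counted; the networks are made up and the code is an illustration rather than the ergm implementation.

```r
# Directed case: A[i, j] = 1 means i -> j (note the mutual pair 1 <-> 2).
n <- 4
A <- matrix(0, n, n)
A[rbind(c(1, 2), c(2, 1), c(2, 3), c(1, 3), c(3, 4), c(4, 1))] <- 1

# (A %*% A)[i, k] counts walks i -> j -> k; the diagonal holds i -> j -> i
A2 <- A %*% A
twopath_directed <- sum(A2) - sum(diag(A2))

# Undirected case: each row (i, j) is one edge.
edges <- rbind(c(1, 2), c(1, 3), c(2, 4), c(3, 4), c(2, 3))
deg   <- tabulate(as.vector(edges), nbins = n)

# A node of degree d is the middle of choose(d, 2) undirected 2-paths (2-stars)
twopath_undirected <- sum(choose(deg, 2))

print(c(directed = twopath_directed, undirected = twopath_undirected))
```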

References

Davis, J. A. and Leinhardt, S. (1972). "The Structure of Positive Interpersonal Relations in Small Groups." In J. Berger (Ed.), Sociological Theories in Progress, Volume 2, 218-251. Boston: Houghton Mifflin.

Hunter, D. R. and M. S. Handcock (2006). "Inference in curved exponential family models for networks." Journal of Computational and Graphical Statistics, 15: 565-583.

Hunter, D. R. (2007). "Curved exponential family models for social networks." Social Networks, 29: 216-230.

Krackhardt, D. and Handcock, M. S. (2007). "Heider versus Simmel: Emergent Features in Dynamic Structures." Lecture Notes in Computer Science, 4503: 14-27.

Snijders, T. A. B., P. E. Pattison, G. L. Robins, and M. S. Handcock (2006). "New specifications for exponential random graph models." Sociological Methodology, 36(1): 99-153.

See Also

ergm, network, %v%, %n%, sna, summary.ergm, print.ergm

Examples

data(florentine)  # provides the flomarriage network
ergm(flomarriage ~ kstar(1:2) + absdiff("wealth") + triangle)

data(molecule)
ergm(molecule ~ edges + kstar(2:3) + triangle + nodematch("atomic type",diff=TRUE) + absdiff("atomic type"))