gof-methods: Conduct Goodness-of-Fit Diagnostics on ERGMs, TERGMs, SAOMs, and logit models

Description

Assess goodness of fit of btergm and other network models.

Usage

## S3 method for class 'btergm':
gof(object, target = NULL, formula = getformula(object), 
    nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, 
    parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, 
    statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, 
    walktrap.modularity), verbose = TRUE, ...)
## S3 method for class 'mtergm':
gof(object, target = NULL, formula = getformula(object), 
    nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, 
    parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, 
    statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, 
    walktrap.modularity), verbose = TRUE, ...)
## S3 method for class 'ergm':
gof(object, target = NULL, formula = getformula(object), 
    nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, 
    parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, 
    statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, 
    walktrap.modularity), verbose = TRUE, ...)
## S3 method for class 'matrix':
gof(object, covariates, coef, target = NULL, nsim = 100, 
    mcmc = FALSE, MCMC.interval = 1000, MCMC.burnin = 10000, 
    parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, 
    statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, 
    walktrap.modularity), verbose = TRUE, ...)
## S3 method for class 'network':
gof(object, covariates, coef, target = NULL, 
    nsim = 100, mcmc = FALSE, MCMC.interval = 1000, 
    MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), 
    ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, 
    geodesic, rocpr, walktrap.modularity), verbose = TRUE, ...)
## S3 method for class 'sienaAlgorithm':
gof(object, siena.data, siena.effects, 
    predict.period = NULL, nsim = 50, parallel = c("no", 
    "multicore", "snow"), ncpus = 1, cl = NULL, target.na = NA, 
    target.na.method = "remove", target.structzero = 10, 
    statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, 
    walktrap.modularity), verbose = TRUE, ...)
## S3 method for class 'sienaModel':
gof(object, siena.data, siena.effects, 
    predict.period = NULL, nsim = 50, parallel = c("no", 
    "multicore", "snow"), ncpus = 1, cl = NULL, 
    target.na = NA, target.na.method = "remove", 
    target.structzero = 10, statistics = c(dsp, esp, deg, ideg, 
    geodesic, rocpr, walktrap.modularity), verbose = TRUE, ...)

Arguments

cl

An optional parallel or snow cluster for use if parallel = "snow". If not supplied, a cluster on the local machine is created temporarily.

coef

A vector of coefficients.

covariates

A list of matrices or network objects that serve as covariates for the dependent network. The covariates in this list are automatically added to the formula as edgecov terms.

formula

A model formula from which networks are simulated for comparison. By default, the formula from the btergm object supplied as object is used. It is possible to hand over a formula with only a single response network and/or dyad or edge covariates

mcmc

Should statnet's MCMC methods be used for simulating new networks? If mcmc = FALSE, new networks are simulated based on predicted tie probabilities of the regression equation.

MCMC.burnin

Internally, this package uses the simulation facilities of the ergm package to create new networks against which to compare the original network(s) for goodness-of-fit assessment. This argument sets the MCMC burnin to be passed over to the simu

MCMC.interval

ncpus

The number of CPU cores used for parallel GOF assessment (only if parallel is activated). If the number of cores should be detected automatically on the machine where the code is executed, one can try the detectCores() function f

nsim

The number of networks to be simulated at each time step. Example: If there are six time steps in the formula and nsim = 100, a total of 600 new networks is simulated. The comparison between simulated and observed networks is onl

object

A btergm, ergm, sienaAlgorithm, or sienaModel object (for the btergm, ergm, sienaAlgorithm, and sienaModel methods, respectively). Or a network object

parallel

Use multiple cores in a computer or nodes in a cluster to speed up the simulations. The default value "no" means parallel computing is switched off. If "multicore" is used (only available for sienaAlgorithm and

predict.period

Which time period should be predicted? By default, the last time period is predicted based on the last simulation of the second-last time period. The time period can be provided as a numeric, e.g., predict.period = 4 for predicting the fourth

siena.data

An object of the class siena, which is usually created using the sienaDataCreate function in the RSiena package.

siena.effects

An object of the class sienaEffects, which is usually created using the getEffects() and the includeEffects() function in the RSiena package.

statistics

A list of functions used for comparison of observed and simulated networks. Note that the list should contain the actual functions, not a character representation of them. See gof-statistics for details.

target

A network or list of networks to which the simulations are compared. If left empty, the original networks from the btergm object supplied as object are used as observed networks.

target.na

Which value was used for missing data in the dependent variable?

target.na.method

How should missing data be handled when comparing the simulations to the empirical (= observed) network? Two options are possible: remove drops nodes with missing ties both from the simulations (after running the simulations) and from the obs

target.structzero

Which value was used for structural zeros (usually nodes that have dropped out of the network or have not yet joined the network) in the dependent variable? These nodes are removed from the observed network and the simulations before comparison.

verbose

Print details?

...

Arbitrary further arguments.

Details

The generic gof function provides goodness-of-fit measures and degeneracy checks for btergm, mtergm, ergm, SAOM, and custom dyadic-independence models. The user can provide a list of network statistics that are used to compare networks simulated from the estimated model with the observed network(s); see gof-statistics. The objects created by these methods can be displayed using various plot and print methods (see gof-plot).

In-sample GOF assessment is the default, which means that the same time steps are used for creating simulations and for comparison with the observed network(s). It is possible to do out-of-sample prediction by specifying a (list of) target network(s) using the target argument. If a formula is provided, the simulations are based on the networks and covariates specified in the formula. This is helpful in situations where complex out-of-sample predictions have to be evaluated. A usage scenario could be to simulate from a network at time t (provided through the formula argument) and compare to an observed network at time t + 1 (the target argument). This can be done, for example, to assess predictive performance between time steps of the original networks, or to check whether the model performs well with regard to a newly measured network given the old data from the previous time step.
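The out-of-sample scenario described above could look roughly like the following sketch. It assumes that the btergm and network packages are installed. The object names (train_nw, train_cov, nw_t, cov_t, g_oos) are illustrative and not part of the package, and holding out the last time step is only one possible design.

```r
library("network")
library("btergm")
set.seed(5)

# Toy data: ten random directed networks and matching covariate matrices
networks <- lapply(1:10, function(i) {
  mat <- matrix(rbinom(100, 1, 0.25), nrow = 10, ncol = 10)
  diag(mat) <- 0                      # loops are excluded
  network(mat)
})
covariates <- lapply(1:10, function(i) {
  matrix(rnorm(100), nrow = 10, ncol = 10)
})

# Estimate the model on time steps 1 to 9 only, holding out step 10
train_nw  <- networks[1:9]
train_cov <- covariates[1:9]
fit <- btergm(train_nw ~ edges + istar(2) + edgecov(train_cov), R = 100)

# Simulate from time step 9 (formula) and compare the simulations
# with the observed network at time step 10 (target)
nw_t  <- networks[[9]]
cov_t <- covariates[[9]]
g_oos <- gof(fit, target = networks[[10]],
             formula = nw_t ~ edges + istar(2) + edgecov(cov_t),
             nsim = 50, statistics = c(esp, deg, rocpr))
g_oos
```

The formula passed to gof must contain the same terms, in the same order, as the estimated model so that the coefficients line up with the terms.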

Predictive fit can also be assessed for stochastic actor-oriented models (SAOM) as implemented in the RSiena package. After compiling the usual objects (model, data, effects), one of the time steps can be predicted based on the previous time step and the SAOM using the sienaAlgorithm (for RSiena >= 1.1-227) or sienaModel (for RSiena < 1.1-227) method of the gof function.

The gof methods for networks and matrices serve to assess the goodness of fit of a dyadic-independence model. To do this, the method requires a vector of coefficients (one coefficient for the intercept or edges term and one coefficient for each covariate), a list of covariates (in matrix or network shape), and a dependent network or matrix. This is useful for assessing the goodness of fit of QAP-adjusted logistic regression models (as implemented in the netlogit function in the sna package) or other dyadic-independence models, such as models fitted using glm. Note that this method only works with cross-sectional models and does not accept lists of networks as input data.
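The idea behind a goodness-of-fit check for a dyadic-independence model can be sketched in base R. This is an illustration under stated assumptions, not the package's internal implementation: the toy data, the coefficient values, and all object names (x, beta, p_hat, deg_dist, sim_deg) are hypothetical, and the outdegree distribution stands in for whatever statistics one would pass to gof.

```r
set.seed(1)
n <- 10                                    # number of actors

# Hypothetical dyadic covariate and "true" coefficients for the toy data
x    <- matrix(rnorm(n * n), nrow = n)
beta <- c(edges = -1, x = 0.5)

# Observed network: independent Bernoulli ties with logit-linear probability
p <- plogis(beta["edges"] + beta["x"] * x)
diag(p) <- 0                               # no loops
y <- matrix(rbinom(n * n, 1, p), nrow = n)

# Fit a logit model to the off-diagonal dyads
off <- row(y) != col(y)
fit <- glm(y[off] ~ x[off], family = binomial)

# Predicted tie probabilities, arranged as an n x n matrix
p_hat <- matrix(0, n, n)
p_hat[off] <- fitted(fit)

# Outdegree distribution: number of nodes with outdegree 0, ..., n - 1
deg_dist <- function(m) tabulate(rowSums(m) + 1, nbins = n)

# Simulate networks from the predicted probabilities and collect the statistic
nsim <- 100
sim_deg <- replicate(nsim, {
  s <- matrix(rbinom(n * n, 1, p_hat), nrow = n)
  deg_dist(s)
})

# Compare the observed distribution with the simulated mean per outdegree
data.frame(outdegree = 0:(n - 1),
           observed  = deg_dist(y),
           sim_mean  = rowMeans(sim_deg))
```

Given the signatures listed under Usage, the corresponding package call would presumably take the form gof(y, covariates = list(x), coef = coef(fit)), with the intercept serving as the edges coefficient.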

Examples

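# First, create data and fit a TERGM...
library("network")         # provides network()
library("btergm")          # provides btergm() and gof()
set.seed(5)                # make the random example reproducible

networks <- list()
for (i in 1:10) {          # create 10 random networks with 10 actors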
  mat <- matrix(rbinom(100, 1, .25), nrow = 10, ncol = 10)
  diag(mat) <- 0           # loops are excluded
  nw <- network(mat)       # create network object
  networks[[i]] <- nw      # add network to the list
}

covariates <- list()
for (i in 1:10) {          # create 10 matrices as covariate
  mat <- matrix(rnorm(100), nrow = 10, ncol = 10)
  covariates[[i]] <- mat   # add matrix to the list
}

fit <- btergm(networks ~ edges + istar(2) +
    edgecov(covariates), R = 100)

# Then assess the goodness of fit:
g <- gof(fit, statistics = c(triad.directed, esp, maxmod.modularity, 
    rocpr), nsim = 50)
g
plot(g)  # see ?"gof-plot" for details
