"gof"(object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ...)
"gof"(object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ...)
"gof"(object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ...)
"gof"(object, covariates, coef, target = NULL, nsim = 100, mcmc = FALSE, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ...)
"gof"(object, covariates, coef, target = NULL, nsim = 100, mcmc = FALSE, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ...)
"gof"(object, period = NULL, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, structzero = 10, statistics = c(esp, deg, ideg, geodesic, rocpr, walktrap.modularity), groupName = object$f$groupNames[[1]], varName = NULL, outofsample = FALSE, sienaData = NULL, sienaEffects = NULL, nsim = NULL, verbose = TRUE, ...)parallel = "snow". If not supplied, a cluster on the local machine is created temporarily. edgecov terms. btergm object x is used. It is possible to hand over a formula with only a single response network and/or dyad or edge covariates or with lists of response networks and/or covariates. It is also possible to use indices like networks[[4]] or networks[3:5] inside the formula. mcmc = FALSE, new networks are simulated based on predicted tie probabilities of the regression equation. 10000. There is no general rule of thumb on the selection of this parameter, but if the results look suspicious (e.g., when the model fit is perfect), increasing this value may be helpful. 1000, which means that every 1000th simulation outcome from the MCMC sequence is used. There is no general rule of thumb on the selection of this parameter, but if the results look suspicious (e.g., when the model fit is perfect), increasing this value may be helpful. parallel is activated). If the number of cores should be detected automatically on the machine where the code is executed, one can try the detectCores() function from the parallel package. On some HPC clusters, the number of available cores is saved as an environment variable; for example, if MOAB is used, the number of available cores can sometimes be accessed using Sys.getenv("MOAB_PROCCOUNT"), depending on the implementation. Note that the maximum number of connections in a single R session (i.e., to other cores or for opening files etc.) is 128, so fewer than 128 cores should be used at a time. formula and nsim = 100, a total of 600 new networks is simulated. 
The comparison between simulated and observed networks is only done within time steps. For example, the first 100 simulations are compared with the first observed network, simulations 101-200 with the second observed network etc. btergm, ergm, or sienaFit object (for the btergm, ergm, and sienaFit methods, respectively). Or a network object or matrix (for the network and matrix methods, respectively). sienaData, sienaEffects, and nsim. The sienaData object must contain a base and a target network for out-of-sample prediction. The sienaEffects must contain the effects to be used for the simulations. The estimates will be taken from the estimated object, and they will be injected into a new SAOM and fixed during the sampling procedure. nsim determines how many simulations are used for the out-of-sample comparison. "no" means parallel computing is switched off. If "multicore" is used (only available for sienaAlgorithm and sienaModel objects), the mclapply function from the parallel package (formerly in the multicore package) is used for parallelization. This should run on any kind of system except MS Windows because it is based on forking. It is usually the fastest type of parallelization. If "snow" is used, the parLapply function from the parallel package (formerly in the snow package) is used for parallelization. This should run on any kind of system including cluster systems and including MS Windows. It is slightly slower than the former alternative if the same number of cores is used. However, "snow" provides support for MPI clusters with a large amount of cores, which multicore does not offer (see also the cl argument). Note that "multicore" will only work if all cores are on the same node. For example, if there are three nodes with eight cores each, a maximum of eight CPUs can be used. Parallel computing is described in more detail on the help page of btergm. 
period = 4 for extracting the simulations between time steps 4 and 5 (= the fourth transition) and predicting the fifth network. Values lower than 1 or larger than the number of consecutive networks minus 1 are therefore not permitted. This argument is only used if out-of-sample prediction is switched off. siena, which is usually created using the sienaDataCreate function in the RSiena package. This argument is only used for out-of-sample prediction. The object must be based on a sienaDependent object that contains two networks: the base network from which to simulate forward, and the target network which you want to predict out-of-sample. The object can contain further objects for storing covariates etc. that are necessary for estimating new networks. The best practice is to create an object that is identical to the siena object used for estimating the model, except that it contains the base and the target network instead of the dependent variable/networks. sienaEffects, which is usually created using the getEffects() and the includeEffects() functions in the RSiena package. The best practice is to provide a sienaEffects object that is identical to the object used to create the original model (that is, it should contain the same effects), except that it should be based on the siena object provided through the sienaData argument. In other words, the sienaEffects object should be based on the base and target network used for out-of-sample prediction, and it should contain the same effects as those used for the original estimation. This argument is used only for out-of-sample prediction. btergm object x are used as observed networks. 10 is used for structural zeros in Siena. gof function provides goodness-of-fit measures and degeneracy checks for btergm, mtergm, ergm, sienaFit, and custom dyadic-independent models. The user can provide a list of network statistics for comparing simulated networks based on the estimated model with the observed network(s). 
See gof-statistics. The objects created by these methods can be displayed using various plot and print methods (see gof-plot).

In-sample GOF assessment is the default, which means that the same time steps are used for creating simulations and for comparison with the observed network(s). It is possible to do out-of-sample prediction by specifying a (list of) target network(s) using the target argument. If a formula is provided, the simulations are based on the networks and covariates specified in the formula. This is helpful in situations where complex out-of-sample predictions have to be evaluated. A usage scenario could be to simulate from a network at time t (provided through the formula argument) and compare to an observed network at time t + 1 (the target argument). This can be done, for example, to assess predictive performance between time steps of the original networks, or to check whether the model performs well with regard to a newly measured network given the old data from the previous time step.
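The out-of-sample scenario described above (simulate from time t, compare with time t + 1) might look roughly like the following sketch. It assumes that the btergm and network packages are installed and that single-network indices such as networks[[9]] are accepted inside the formula argument. The data are random, and the object names adjacency, fit, and g_oos are illustrative only.

```r
# Sketch only: the data are random, so the fit itself is not meaningful.
set.seed(12345)
n_actors <- 10
n_steps <- 10

# Ten random directed adjacency matrices without loops
adjacency <- lapply(seq_len(n_steps), function(i) {
  mat <- matrix(rbinom(n_actors^2, 1, 0.25), nrow = n_actors, ncol = n_actors)
  diag(mat) <- 0
  mat
})

# Ten random dyadic covariates, one per time step
covariates <- lapply(seq_len(n_steps), function(i) {
  matrix(rnorm(n_actors^2), nrow = n_actors, ncol = n_actors)
})

# Only run the model-based part if the required packages are available
if (requireNamespace("btergm", quietly = TRUE) &&
    requireNamespace("network", quietly = TRUE)) {
  library("network")
  library("btergm")
  networks <- lapply(adjacency, network)

  # Estimate the model on time steps 1 to 9 only
  fit <- btergm(networks[1:9] ~ edges + istar(2) +
      edgecov(covariates[1:9]), R = 100)

  # Assumption: simulate from time step 9 via the formula argument and
  # compare the simulations with the held-out network at time step 10
  g_oos <- gof(fit, target = networks[[10]],
      formula = networks[[9]] ~ edges + istar(2) + edgecov(covariates[[9]]),
      statistics = c(esp, ideg, rocpr), nsim = 50)
  print(g_oos)
}
```

Holding out the last time step during estimation keeps the comparison truly out-of-sample; fitting on all ten networks and then using the tenth as the target would mix in-sample and out-of-sample information.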
Predictive fit can also be assessed for stochastic actor-oriented models (SAOM) as implemented in the RSiena package. After compiling the usual objects (model, data, effects), one of the time steps can be predicted based on the previous time step and the SAOM using the sienaFit method of the gof function. By default, however, within-sample fit is used for SAOMs, just like for (T)ERGMs.
The gof methods for networks and matrices serve to assess the goodness of fit of a dyadic-independence model. To do this, the method requires a vector of coefficients (one coefficient for the intercept or edges term and one coefficient for each covariate), a list of covariates (in matrix or network shape), and a dependent network or matrix. This is useful for assessing the goodness of fit of QAP-adjusted logistic regression models (as implemented in the netlogit function in the sna package) or other dyadic-independence models, such as models fitted using glm. Note that this method only works with cross-sectional models and does not accept lists of networks as input data.
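As a rough illustration of the matrix method, the sketch below fits a dyadic-independence logit model with glm and passes its coefficients to gof. It assumes the btergm package is installed; the data are random, and the names mat, cov1, dyads, model, and g_dyadic are illustrative only. The coefficient vector holds the intercept first and then one coefficient per covariate, matching the order of the covariates list.

```r
# Sketch only: random data for a single cross-sectional network.
set.seed(54321)
n_actors <- 10

# Dependent network as an adjacency matrix without loops
mat <- matrix(rbinom(n_actors^2, 1, 0.25), nrow = n_actors, ncol = n_actors)
diag(mat) <- 0

# One dyadic covariate
cov1 <- matrix(rnorm(n_actors^2), nrow = n_actors, ncol = n_actors)

# Logistic regression on all off-diagonal dyads
offdiag <- row(mat) != col(mat)
dyads <- data.frame(tie = mat[offdiag], x = cov1[offdiag])
model <- glm(tie ~ x, data = dyads, family = binomial)

# Only call gof() if btergm is available
if (requireNamespace("btergm", quietly = TRUE)) {
  library("btergm")
  # coef(model) contains the intercept followed by the coefficient for cov1
  g_dyadic <- gof(mat, covariates = list(cov1), coef = coef(model),
      statistics = c(esp, ideg, rocpr), nsim = 50)
  print(g_dyadic)
}
```

The same pattern should apply to coefficients from a QAP-adjusted model, as long as the coefficient order matches the intercept followed by the covariates in the list.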
## Not run:
library("network")  # provides network()
library("btergm")   # provides btergm() and gof()

# First, create data and fit a TERGM...
networks <- list()
for (i in 1:10) {              # create 10 random networks with 10 actors
  mat <- matrix(rbinom(100, 1, .25), nrow = 10, ncol = 10)
  diag(mat) <- 0               # loops are excluded
  nw <- network(mat)           # create network object
  networks[[i]] <- nw          # add network to the list
}

covariates <- list()
for (i in 1:10) {              # create 10 matrices as covariate
  mat <- matrix(rnorm(100), nrow = 10, ncol = 10)
  covariates[[i]] <- mat       # add matrix to the list
}

fit <- btergm(networks ~ edges + istar(2) +
    edgecov(covariates), R = 100)

# Then assess the goodness of fit:
g <- gof(fit, statistics = c(triad.directed, esp, maxmod.modularity,
    rocpr), nsim = 50)
g
plot(g)                        # see ?"gof-plot" for details
## End(Not run)