sdsm: The stochastic degree sequence model (sdsm) for backbone probabilities

Description

`sdsm` computes the probability of edge weights being above or below the observed edge weights in a bipartite projection using the stochastic degree sequence model. Once computed, use backbone.extract to return the backbone matrix for a given alpha value.

Usage

sdsm(B, model = "polytope", trials = 1000)

Arguments

B

graph: Bipartite graph object of class matrix, sparse matrix, igraph, edgelist, or network object.

model

String: The method used to compute the probabilities for generating random bipartite graphs. One of "logit", "probit", "cauchit", "log", "cloglog", "scobit", "oldlogit", "lpm", "chi2", "curveball", or "polytope". Default is "polytope".

trials

Integer: If 'model' = 'curveball', the number of random bipartite graphs generated using curveball to compute probabilities. Default is 1000.

Value

backbone, a list(positive, negative, summary). Here `positive` is a matrix of probabilities of edge weights being equal to or above the observed value in the projection, `negative` is a matrix of probabilities of edge weights being equal to or below the observed value in the projection, and `summary` is a data frame summarizing the input matrix and the model used, including: model name, number of rows, skew of row sums, number of columns, skew of column sums, and running time.
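As a rough illustration of how the `positive` matrix relates to a backbone at a given alpha, the sketch below thresholds a small matrix of made-up upper-tail probabilities. The values and the name `backbone_sketch` are hypothetical, and this is only a simplified one-tailed version of the idea; it is not the logic of backbone.extract, which should be used in practice.

```r
# Hypothetical upper-tail probabilities for a 3-node projection (made-up values).
positive <- matrix(c(NA,   0.01, 0.40,
                     0.01, NA,   0.03,
                     0.40, 0.03, NA),
                   nrow = 3, byrow = TRUE)

alpha <- 0.05

# Keep an edge when its observed weight is improbably large under the null model.
backbone_sketch <- (positive < alpha) * 1

# Self-loops are not meaningful in a projection backbone.
diag(backbone_sketch) <- 0

print(backbone_sketch)
```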

Details

Specifically, the sdsm function compares an edge's observed weight in the projection B %*% t(B) to the distribution of weights expected in a projection obtained from a random bipartite network where both the row vertex degrees and column vertex degrees are approximately fixed.
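To make the observed quantities concrete, the following base R sketch builds a small made-up bipartite matrix and computes its projection along with the row and column degrees that the null model holds approximately fixed. The matrix and the variable names are illustrative assumptions, not part of the package.

```r
# A small made-up bipartite matrix: 4 row vertices, 3 column vertices.
B <- matrix(c(1, 1, 0,
              1, 0, 1,
              0, 1, 1,
              1, 1, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("r", 1:4), paste0("c", 1:3)))

# Observed projection: entry [i, j] counts the column vertices shared by rows i and j.
P <- B %*% t(B)

# Degree sequences that the random bipartite networks approximately preserve.
row_degrees <- rowSums(B)
col_degrees <- colSums(B)

print(P)
```

The diagonal of `P` holds each row vertex's own degree; only the off-diagonal entries are edge weights in the projection.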

If the 'model' parameter is one of c('logit', 'probit', 'cauchit', 'log', 'cloglog', 'scobit'), then this model is used as a 'link' function for a binary outcome model conditioned on the row degrees and column degrees, as described by glm and family. If the 'model' parameter is 'oldlogit', then a logit link function is used but the model is conditioned on the row degrees, column degrees, and their product. If 'model' = 'lpm', a linear probability model is used. If 'model' = 'chi2', a chi-squared model is used.
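The link-function approach can be sketched in base R as a binary outcome model fit to the cells of the bipartite matrix, with each cell's row degree and column degree as predictors. This is a minimal illustration under assumed data (a random matrix) and assumed variable names; it is not the package's internal code.

```r
set.seed(1)

# Made-up random bipartite matrix for illustration only.
B <- matrix(rbinom(80, 1, 0.4), nrow = 8, ncol = 10)

# One record per cell, in column-major order to match as.vector(B).
cells <- data.frame(
  value  = as.vector(B),
  rowdeg = rowSums(B)[row(B)],  # degree of the cell's row vertex
  coldeg = colSums(B)[col(B)]   # degree of the cell's column vertex
)

# Binary outcome model with a logit link, conditioned on row and column degrees.
fit <- glm(value ~ rowdeg + coldeg,
           family = binomial(link = "logit"),
           data = cells)

# Fitted values reshaped into a matrix of cell probabilities.
probs <- matrix(fitted(fit), nrow = nrow(B))
```

Swapping `link = "logit"` for `"probit"`, `"cauchit"`, `"log"`, or `"cloglog"` uses links that `binomial()` supports directly; the `'scobit'` option is not a built-in `binomial()` link and is not covered by this sketch.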

If 'model' = 'curveball' and 'trials' > 0, the probabilities are computed by running the curveball function `trials` times. The proportion of the resulting random matrices in which a cell equals 1 is used as that cell's probability. If 'model' = 'polytope', the polytope function is used to find the matrix of probabilities that maximizes the entropy function while having the same row and column sums as the input.
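The proportion-based estimate can be sketched without the package as follows. The helper `curveball_trade` is a hypothetical, simplified pairwise trade written for this illustration; it is not the package's curveball function, and the sketch samples matrices along a single chain of trades rather than generating each one independently.

```r
# Hypothetical helper: one curveball-style trade between two random rows.
# Columns held by exactly one of the two rows are pooled and redistributed,
# which preserves every row sum and every column sum.
curveball_trade <- function(M) {
  rows <- sample(nrow(M), 2)
  a <- which(M[rows[1], ] == 1)
  b <- which(M[rows[2], ] == 1)
  only_a <- setdiff(a, b)
  only_b <- setdiff(b, a)
  pool <- c(only_a, only_b)
  if (length(only_a) > 0 && length(only_b) > 0) {
    new_a <- pool[sample.int(length(pool), length(only_a))]
    new_b <- setdiff(pool, new_a)
    M[rows[1], pool] <- 0
    M[rows[2], pool] <- 0
    M[rows[1], new_a] <- 1
    M[rows[2], new_b] <- 1
  }
  M
}

set.seed(42)

# Made-up bipartite matrix for illustration only.
B <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 1, 1),
            nrow = 3, byrow = TRUE)

trials <- 200
total <- matrix(0, nrow(B), ncol(B))
M <- B
for (i in seq_len(trials)) {
  for (s in 1:10) M <- curveball_trade(M)  # several trades between samples
  total <- total + M
}

# Proportion of sampled matrices in which each cell equals 1.
probs <- total / trials
```

Because every sampled matrix has the same row and column sums as `B`, the rows and columns of `probs` sum to the original degrees.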

The "backbone" S3 class object returned is composed of two matrices, a summary dataframe and (optionally, if generated by using fdsm) a 'dyad_values' vector.

References

Neal, Z. P. (2014). The backbone of bipartite projections: Inferring relationships from co-authorship, co-sponsorship, co-attendance, and other co-behaviors. Social Networks, 39, 84-97. DOI: 10.1016/j.socnet.2014.06.001

Examples

# NOT RUN {
sdsm_probs <- sdsm(davis)
# }
# NOT RUN {
sdsm_probs2 <- sdsm(davis, model = "curveball", trials = 1000)
# }