kpset: Selecting the Most Central Group of Players in a Network

Description

kpset helps identify the most central group of players in a social network given a specified centrality measure and a target group size.

Usage

kpset(
  adj.matrix,
  size,
  type = "degree",
  M = Inf,
  T = ncol(adj.matrix),
  method = "min",
  binary = FALSE,
  cmode = "total",
  large = TRUE,
  geodist.precomp = NULL,
  seed = "top",
  parallel = FALSE,
  cluster = 2,
  round = 10,
  iteration = ncol(adj.matrix)
)

Value

kpset returns the column indices of the players who form the most central set and its centrality score.

Arguments

adj.matrix: Adjacency matrix of the network or, in the case of diffusion centrality, a probability matrix.
size: Integer indicating the target size of the player set.
type: A string indicating the type of centrality measure to be used. Should be one of "degree" for degree centrality, "closeness" for closeness centrality, "betweenness" for betweenness centrality, "evcent" for eigenvector centrality, "mreach.degree" for M-reach degree centrality, "mreach.closeness" for M-reach closeness centrality, "fragment" for fragmentation centrality, and "diffusion" for diffusion centrality.
M: Positive number indicating the maximum geodistance between two nodes, above which the two nodes are considered disconnected. The default is Inf. The option is applicable to M-reach degree, M-reach closeness, and fragmentation centralities.
T: Integer indicating the maximum number of iterations in the communication process. By default, T is the network size.
method: Indication of which grouping criterion should be used.
"min" indicates the "minimum" criterion and is suggested for betweenness, closeness, fragmentation, and M-reach centralities.
"max" indicates the "maximum" criterion and is suggested for degree and eigenvector centralities.
"add" indicates the "addition" criterion and is suggested for degree and eigenvector centralities as an alternative to "max".
"union" indicates the "union" criterion and is suggested for diffusion centrality.
The default is "min". See kpcent Details section for explanations on grouping method.
binary: If TRUE, the input matrix is binarized. If FALSE, the edge values are considered. The default is FALSE.
cmode: String indicating the type of centrality being evaluated. The option is applicable to degree and M-reach centralities. "outdegree", "indegree", and "total" refer to outdegree, indegree, and total degree, respectively. "all" reports all the above measures. The default is to report the total degree. Note that for closeness centrality, the Gil-Schmidt power index is used when large=FALSE. See closeness for explanation. When large=TRUE, the function reports the standard closeness score.
large: Logical scalar. If TRUE (the default), the method implemented in igraph is used for computing geodistance and related centrality measures; otherwise the method in sna is used.
geodist.precomp: Geodistance precomputed for the network to be analyzed (optional).
seed: String indicating the seeding method or a vector of the seeds specified by the user. If "top", players with the highest individual centrality are used as the seeds. If "random", seeds are randomly sampled. The default is "top" for efficiency.
parallel: Logical scalar. If TRUE, parallel computation is used. The default is FALSE.
cluster: Integer indicating the number of CPU cores to be used for parallel computation.
round: Integer indicating the "length" of search, namely, the number of loops over the nodes in the candidate set.
iteration: Integer indicating the "width" of search in each round, namely, the number of loops over the nodes in the residual set.

Author

Weihua An weihua.an@emory.edu; Yu-Hsin Liu ugeneliu@meta.com

Details

The most central group of players in a network is not necessarily the set of players who are the most central as individuals, because there may be redundancy in their connections. Currently a greedy search algorithm is implemented in this package to identify the most central group of key players. The basic steps are as follows.

Select an initial candidate set C. The residual set is denoted as R.
Update the candidate set C.
- Start with the first node in C. Try to swap it with nodes in R sequentially (loop 1). Make the swap if it improves the centrality score of the resulting C. The number of passes through loop 1 is the number of iterations (over the nodes in the residual set).
- Repeat the previous step for each node in C sequentially (loop 2). The number of passes through loop 2 is the number of rounds (over the nodes in the candidate set).
- Stop if (a) the change in C's centrality score is negligible (i.e., it is smaller than a pre-specified threshold determined by both the network size and edge values) or (b) the process reaches the specified number of rounds.
Return the final candidate set and the centrality score.
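
The swap search described above can be sketched in base R. This is a minimal illustration, not the code used by kpset: the helper names group_degree and greedy_set are hypothetical, the score is a simplified binary, undirected group degree (the number of outside nodes tied to at least one member), the initial candidate set is simply the first nodes, and the stopping tolerance is a fixed constant rather than the threshold kpset derives from network size and edge values.

```r
# Hypothetical scoring helper (an assumption, not keyplayer's scoring):
# number of distinct outside nodes tied to at least one member of `set`.
group_degree <- function(A, set) {
  B <- (A + t(A)) > 0                        # symmetrize and binarize
  outside <- setdiff(seq_len(ncol(A)), set)
  sum(colSums(B[set, outside, drop = FALSE]) > 0)
}

# Greedy swap search over a candidate set C and its residual set R.
greedy_set <- function(A, size, score = group_degree, rounds = 10, tol = 1e-8) {
  n <- ncol(A)
  C <- seq_len(size)                         # naive initial candidate set
  best <- score(A, C)
  for (r in seq_len(rounds)) {
    start <- best
    for (i in seq_along(C)) {                # loop 2: nodes in C
      for (v in setdiff(seq_len(n), C)) {    # loop 1: nodes in R
        trial <- C
        trial[i] <- v
        s <- score(A, trial)
        if (s > best + tol) {                # keep the swap only if it helps
          C <- trial
          best <- s
        }
      }
    }
    if (best - start <= tol) break           # negligible change: stop early
  }
  list(keyplayers = sort(C), centrality = best)
}

# Toy network: nodes 1 and 2 share the neighbours 3 and 4; node 5 reaches 6 and 7.
A <- matrix(0, 7, 7)
edges <- rbind(c(1, 2), c(1, 3), c(1, 4), c(2, 3), c(2, 4), c(5, 6), c(5, 7))
A[edges] <- 1
res <- greedy_set(A, size = 2)
print(res)
```

In this toy network, nodes 1 and 2 have the highest individual degree but largely redundant ties, so the search replaces one of them with node 5, which is the redundancy effect described at the start of this section.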

It is recommended to run kpset several times with different seeds so that the algorithm is less likely to be trapped in a local optimum. To facilitate the search in large networks, users may specify a reasonable number of iterations or rounds and/or utilize parallel computation. During parallel computation, for each cluster and each iteration the algorithm randomly picks one node from the candidate set and one from the residual set, and swaps the two if doing so improves the centrality score of the candidate set. It repeats this process until the specified iterations and rounds are exhausted, and then compares and combines the results from the clusters.
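
One way to follow the multiple-seed recommendation is sketched below. It assumes the keyplayer package is installed (the block is skipped otherwise) and that the returned list has an element named centrality holding a single score. That element name is an assumption, not something stated on this page, so check the structure of the returned object before relying on it.

```r
# Skipped entirely if the keyplayer package is not installed.
if (requireNamespace("keyplayer", quietly = TRUE)) {
  W <- matrix(
    c(0,1,3,0,0,
      0,0,0,4,0,
      1,1,0,2,0,
      0,0,0,0,3,
      0,2,0,0,0),
    nrow = 5, ncol = 5, byrow = TRUE)

  set.seed(123)  # intended to make the random seeding repeatable
  runs <- lapply(1:5, function(i) {
    keyplayer::kpset(W, size = 2, type = "degree", seed = "random")
  })

  # Assumption: each result carries its score in an element named `centrality`.
  scores <- vapply(runs, function(r) as.numeric(r$centrality), numeric(1))
  best <- runs[[which.max(scores)]]
  print(best)
}
```

Keeping all runs rather than only the best one makes it easy to see whether different seeds converge to the same set.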

References

An, Weihua. (2015). "Multilevel Meta Network Analysis with Application to Studying Network Dynamics of Network Interventions." Social Networks 43: 48-56.

An, Weihua and Yu-Hsin Liu (2016). "keyplayer: An R Package for Locating Key Players in Social Networks." The R Journal, 8(1): 257-268.

Borgatti, Stephen P. (2006). "Identifying Sets of Key Players in a Network." Computational and Mathematical Organization Theory, 12(1): 21-34.

Butts, Carter T. (2014). sna: Tools for Social Network Analysis. R package version 2.3-2. https://CRAN.R-project.org/package=sna

Csardi, G and Nepusz, T (2006). "The igraph software package for complex network research." InterJournal, Complex Systems 1695. https://igraph.org

Examples

# Create a 5x5 weighted and directed adjacency matrix
W <- matrix(
  c(0,1,3,0,0,
    0,0,0,4,0,
    1,1,0,2,0,
    0,0,0,0,3,
    0,2,0,0,0),
  nrow = 5, ncol = 5, byrow = TRUE)

# Find the most central player set sized 2 in terms of the degree centrality
kpset(W,size=2,type="degree")

# Find the two most central players in terms of indegree
# via parallel computation using 2 CPU cores
kpset(W, size = 2, type = "degree", cmode = "indegree", parallel = TRUE, cluster = 2)
