scg: All-in-one Function for the SCG of Matrices and Graphs

Description

This function handles all the steps involved in the Spectral Coarse Graining (SCG) of some matrices and graphs as described in the reference below.

Usage

scg(X, ev, nt, groups = NULL, mtype = c("symmetric", "laplacian",
      "stochastic"), algo = c("optimum", "interv_km", "interv",
      "exact_scg"), norm = c("row", "col"), direction = c("default",
      "left", "right"), evec = NULL, p = NULL, use.arpack = FALSE,
    maxiter = 300, sparse = getIgraphOpt("sparsematrices"), output =
    c("default", "matrix", "graph"), semproj = FALSE, epairs = FALSE,
    stat.prob = FALSE)

Arguments

The input graph or square matrix. Can be of class igraph, matrix or Matrix.

A vector of positive integers giving the indexes of the eigenpairs to be preserved. For real eigenpairs, 1 designates the eigenvalue with largest algebraic value, 2 the one with second largest algebraic value, etc. In the complex case, it is t

A vector of positive integers of length one or equal to length(ev). When algo = optimum, nt contains the number of groups used to partition each eigenvector separately. When algo

groups

A vector of nrow(X) or vcount(X) integers labeling each group vertex in the partition. If this parameter is supplied most part of the function is bypassed.

mtype

Character scalar. The type of semi-projector to be used for the SCG. For now symmetric, laplacian and stochastic are available.

algo

Character scalar. The algorithm used to solve the SCG problem. Possible values are optimum, interv_km, interv and exact_scg.

norm

Character scalar. Either row or col. If set to row the rows of the Laplacian matrix sum up to zero and the rows of the stochastic matrix sum up to one; otherwise it is the columns.

direction

Character scalar. When set to right, resp. left, the parameters ev and evec refer to right, resp. left eigenvectors. When passed default it is the SCG described in th

evec

A numeric matrix of (eigen)vectors to be preserved by the coarse graining (the vectors are to be stored column-wise in evec). If supplied, the eigenvectors should correspond to the indexes in ev as no cross-check will

A probability vector of length nrow(X) (or vcount(X)). p is the stationary probability distribution of a Markov chain when mtype = stochastic. This parameter is ignored in al

use.arpack

Logical scalar. When set to TRUE uses the function arpack to compute eigenpairs. This parameter should be set to TRUE if one deals with large (over a few thousands) AND spa

maxiter

A positive integer giving the maximum number of iterations for the k-means algorithm when algo = interv_km. This parameter is ignored in all other cases.

sparse

Logical scalar. Whether to return sparse matrices in the result, if matrices are requested.

output

Character scalar. Set this parameter to default to retrieve a coarse-grained object of the same class as X.

semproj

Logical scalar. Set this parameter to TRUE to retrieve the semi-projectors of the SCG.

epairs

Logical scalar. Set this to TRUE to collect the eigenpairs computed by scg.

stat.prob

Logical scalar. This is to collect the stationary probability p when dealing with stochastic matrices.

Value

XtThe coarse-grained graph, or matrix, possibly a sparse matrix.
groupsA vector of nrow(X) or vcount(X) integers giving the group label of each object (vertex) in the partition.
LThe semi-projector $L$ if semproj = TRUE.
RThe semi-projector $R$ if semproj = TRUE.
valuesThe computed eigenvalues if epairs = TRUE.
vectorsThe computed or supplied eigenvectors if epairs = TRUE.
pThe stationary probability vector if mtype = stochastic and stat.prob = TRUE. For other matrix types this is missing.

concept

Spectral coarse graining

Details

Please see SCG for an introduction. In the following $V$ is the matrix of eigenvectors for which the SCG is solved. $V$ is calculated from X, if it is not given in the evec argument. The algorithm optimum solves exactly the SCG problem for each eigenvector in V. The running time of this algorithm is $O(\max nt \cdot m^2)$ for the symmetric and laplacian matrix problems (i.e. when mtype is symmetric or laplacian. It is $O(m^3)$ for the stochastic problem. Here $m$ is the number of rows in V. In all three cases, the memory usage is $O(m^2)$.

The algorithms interv and interv_km solve approximately the SCG problem by performing a (for now) constant binning of the components of the eigenvectors, that is nt[i] constant-size bins are used to partition V[,i]. When algo = interv_km, the (Lloyd) k-means algorithm is run on each partition obtained by interv to improve accuracy.

Once a minimizing partition (either exact or approximate) has been found for each eigenvector, the final grouping is worked out as follows: two vertices are grouped together in the final partition if they are grouped together in each minimizing partition. In general the size of the final partition is not known in advance when ncol(V)>1.

Finally, the algorithm exact_scg groups the vertices with equal components in each eigenvector. The last three algorithms essentially have linear running time and memory load.

References

D. Morton de Lachapelle, D. Gfeller, and P. De Los Rios, Shrinking Matrices while Preserving their Eigenpairs with Application to the Spectral Coarse Graining of Graphs. Submitted to SIAM Journal on Matrix Analysis and Applications, 2008. http://people.epfl.ch/david.morton

Examples

Run this code

## We are not running these examples any more, because they
## take a long time (~20 seconds) to run and this is against the CRAN
## repository policy. Copy and paste them by hand to your R prompt if
## you want to run them.

# SCG of a toy network
g <- graph.full(5) %du% graph.full(5) %du% graph.full(5)
g <- add.edges(g, c(1,6, 1,11, 6, 11))
cg <- scg(g, 1, 3, algo="exact_scg")

#plot the result
layout <- layout.kamada.kawai(g)
nt <- vcount(cg$Xt)
col <- rainbow(nt)
vsize <- table(cg$groups)
ewidth <- round(E(cg$Xt)$weight,2)

op <- par(mfrow=c(1,2))
plot(g, vertex.color = col[cg$groups], vertex.size = 20,
		vertex.label = NA, layout = layout)
plot(cg$Xt, edge.width = ewidth, edge.label = ewidth, 
	vertex.color = col, vertex.size = 20*vsize/max(vsize),
	vertex.label=NA, layout = layout.kamada.kawai)
par(op)

## SCG of real-world network
library(igraphdata)
data(immuno)
summary(immuno)
n <- vcount(immuno)
interv <- c(100,100,50,25,12,6,3,2,2)
cg <- scg(immuno, ev= n-(1:9), nt=interv, mtype="laplacian",
                        algo="interv", epairs=TRUE)

## are the eigenvalues well-preserved?
gt <- cg$Xt
nt <- vcount(gt)
Lt <- graph.laplacian(gt)
evalt <- eigen(Lt, only.values=TRUE)$values[nt-(1:9)]
res <- cbind(interv, cg$values, evalt)
res <- round(res,5)
colnames(res) <- c("interv","lambda_i","lambda_tilde_i")
rownames(res) <- c("N-1","N-2","N-3","N-4","N-5","N-6","N-7","N-8","N-9")
print(res)

## use SCG to get the communities
com <- scg(graph.laplacian(immuno), ev=n-c(1,2), nt=2)$groups
col <- rainbow(max(com))
layout <- layout.auto(immuno)

plot(immuno, layout=layout, vertex.size=3, vertex.color=col[com],
                vertex.label=NA)

## display the coarse-grained graph
gt <- simplify(as.undirected(gt))
layout.cg <- layout.kamada.kawai(gt)
com.cg <- scg(graph.laplacian(gt), nt-c(1,2), 2)$groups
vsize <- sqrt(as.vector(table(cg$groups)))

op <- par(mfrow=c(1,2))
plot(immuno, layout=layout, vertex.size=3, vertex.color=col[com],
                vertex.label=NA)
plot(gt, layout=layout.cg, vertex.size=15*vsize/max(vsize), 
                vertex.color=col[com.cg],vertex.label=NA)
par(op)

Run the code above in your browser using DataLab