
Last chance! 50% off unlimited learning
Sale ends in
The function computes the Adjusted Rand Index (ARI) to compare two graphs' partitions into modules or clusters more generally. Both graphs must have the same number of nodes, but not necessarily the same number of links. They must also have the same node names and in the same order.
graph_modul_compar(
x,
y,
mode = "graph",
nb_modul = NULL,
algo = "fast_greedy",
node_inter = "distance",
data = NULL
)
The value of the ARI
The first graph object
If mode = 'graph'
(default), x
is a graph object of
class igraph
.
Then, its nodes must have the same names as in graph y
.
If mode = 'data.frame'
, x
refers to a column of
the data.frame
'data'.
Then x
must be a character string indicating the name of the
column of 'data' with the modules' labels of the nodes in the first graph.
In that case, the column can be of class numeric
, character
or factor
but will be converted into a numeric
vector
in any case.
If mode = 'vector'
, x
is a vector of
class character
, factor
or numeric
.
In that case, it must have the same length as vector y
and
will be converted into a numeric
vector.
The second graph object
Same classes possible as for x
. Must be of the same format as x
A character string indicating whether x and y are igraph objects,
vectors or columns from a data.frame. mode
can be 'graph',
'data.frame' or 'vector'.
(if x and y are igraph objects) A numeric or integer value or a numeric vector with 2 elements indicating the number of modules to create in both graphs.
If nb_modul
is a numeric value, then the same number of modules
are created in both graphs.
If nb_modul
is a numeric vector of length 2, then the
numbers of modules created in graphs x
and y
are the
first and second elements of nb_modul
, respectively.
(if x and y are igraph objects) A character string indicating the algorithm used to create the modules with igraph.
If algo = 'fast_greedy'
(default),
function cluster_fast_greedy
from igraph is used (Clauset et al., 2004).
If algo = 'walktrap'
(default), function cluster_walktrap
from igraph is used (Pons et Latapy, 2006) with
4 steps (default options).
If algo = 'louvain'
, function cluster_louvain
from igraph is used (Blondel et al., 2008).
In that case, the number of modules created in each graph is imposed.
If algo = 'optimal'
, function cluster_optimal
from igraph is used (Brandes et al., 2008) (can be very long).
In that case, the number of modules created in each graph is imposed.
(optional, if x and y are igraph objects, default is 'none') A character string indicating whether the links of the graph are weighted by distances or by similarity indices. It is only used to compute the modularity index. It can be:
'distance': Link weights correspond to distances. Nodes that are close to each other will more likely be in the same module.
'similarity': Link weights correspond to similarity indices. Nodes that are similar to each other will more likely be in the same module. Inverse link weights are then used to compute the modularity index.
'none': Links are not weighted for the computation, which is only based on graph topology.
Two different weightings can be used to create the modules of the two graphs.
If node_inter
is a character string, then the same link
weighting is used for both graphs.
If node_inter
is a character vector of length 2, then
the link weighting used by the algorithm to create the modules of
graphs x
and y
is determined by the first and second elements
of node_inter
, respectively.
(if x and y are columns from a data.frame) An object of class data.frame with at least two columns and as many rows as there are nodes in the graphs compared. The columns indicate the modules of each node in 2 different classifications.
P. Savary
This index takes values between -1 and 1. It measures how often
pairs of nodes pertaining to the same module in one graph also pertain to
the same module in the other graph.
Therefore, large values indicate that both partitions are similar.
The Rand Index can be defined as the frequency of agreement between two
classifications into discrete classes. It is the number of times a pair of
elements are classified into the same class or in two different classes
in both compared classifications, divided by the total number of possible
pairs of elements. The Rand Index is between 0 and 1 but its maximum value
depends on the number of elements. Thus, another 'adjusted' index was
created, the Adjusted Rand Index. According to the Hubert et
Arabie's formula, the ARI is computed as follows:
adjustedRandIndex
from package mclust which
applies the Hubert and Arabie's formula for the ARI.
This function works for undirected graphs only.
dyer2004populationgraph4lg hubert1985comparinggraph4lg clauset2004findinggraph4lg blondel2008fastgraph4lg brandes2008modularitygraph4lg pons2006computinggraph4lg
data(data_ex_genind)
data(pts_pop_ex)
mat_dist <- suppressWarnings(graph4lg::mat_geo_dist(data=pts_pop_ex,
ID = "ID",
x = "x",
y = "y"))
mat_dist <- mat_dist[order(as.character(row.names(mat_dist))),
order(as.character(colnames(mat_dist)))]
graph_obs <- gen_graph_thr(mat_w = mat_dist, mat_thr = mat_dist,
thr = 24000, mode = "larger")
mat_gen <- mat_gen_dist(x = data_ex_genind, dist = "DPS")
graph_pred <- gen_graph_topo(mat_w = mat_gen, mat_topo = mat_dist,
topo = "gabriel")
ARI <- graph_modul_compar(x = graph_obs, y = graph_pred)
Run the code above in your browser using DataLab