> By Jean-Loup Guillaume
Many community detection algorithms are non deterministic and can therefore give different partitions for the same graph. Depending on the context, it can be important to obtain stable results so as to identify very pertinent communities, but it can also be interesting to find some less stable ones.
For non deterministic algorithms, comparing two partitions of a given graph is not so easy. Some parameters can be calculated to estimate the similarity between two partitions: rand index, Jaccard index or the mutual information. However these parameters give only an aggregated value which can be hard to interpret.
In the spirit of the rand index, the plot above shows the similarity between 10,000 computations of communities on the same network, the famous Zachary’s karate club. The plot is a distribution of the proportion of pairs of nodes which are in the same group, the point (6213 ; 0.014) for instance means that there is 1.4% of pairs of nodes which are placed in the same community 62% of the time.
A deterministic algorithm would always place nodes either together or not, the curve would therefore exhibit two peaks, one on 0 and one on 10,000. However, the algorithm used (the Louvain method) is not deterministic and therefore some pairs are sometimes grouped and sometimes not. Despite the non-determinism, we can see that most pairs are nearly always grouped or separated, but that around 10% of pairs of nodes are nearly as often together than separated. These nodes are centainely specific and their position have to be investigated.