Extracting and Visualizing Tree-Like Structures from Concept Lattices

Cassio Melo, Benedicte Le Grand, Marie-Aude Aufaure, and Anastazia Berezianos

IV’2011 conference on Information Visualization, London, July 2011.

Concept lattices built with Formal Concept Analysis are usually represented by Hasse diagrams illustrating the groupings of objects described by common attributes. Hasse diagrams display the relations of partial order between concepts in a hierarchical fashion, where each concept may have several parent concepts. Lattice visualization becomes a problem as the number of clusters grows significantly with the number of objects and attributes. Interpreting the lattice through a direct visualization of the line diagram rapidly becomes impossible and more synthetic representations are needed. In this work we propose several methods to enhance the readability of concept lattices firstly though colouring and distortion techniques, and secondly by extracting and visualizing trees derived from concept lattices structures.

Download

Citations among blogs in a hierarchy of communities: method and case study

Abdelhamid Salah brahim, Bénédicte Le Grand, Lionel Tabourier, Matthieu Latapy

Journal of Computational Science, Vol 2(3), 2011

How does the structure of a network (e.g. its organization into groups or communities) impact the interaction among its nodes? In this paper we propose a generic methodology to study the correlation between complex networks interactions and their community structure. We illustrate it on a blog network and focus on citation links. We first define a homophily probability evaluating the tendency of blogs to ite blogs from the same communities. We then introduce the notion of community distance to capture if a blog cites (or is cited by) blogs distant or not from its community. We analyze the distribution of distances corresponding to each citation link, and use it to build maps of relevant communities which help interpreting blogs interactions. This community-oriented approach allows to study citation links at various abstraction levels, and conversely, enable us to characterize communities with regard to their citation behaviour.

Download

Extraction hiérarchique de fenêtres de temps basée sur la structure communautaire

Thomas Aynaud and Jean-Loup Guillaume

in Proceedings of MARAMI 2011

Dans cet article nous décrivons une méthode de décomposition du temps en fenêtres de temps dans un graphe dynamique. Une particularité de la méthode est que le résultat est un regroupement hiérarchique : les fenêtres de temps sont elles-mêmes susceptibles d’en contenir. En outre, les fenêtres n’ont pas besoin d’être contiguës ce qui permet par exemple de détecter une structure se répétant. De plus, chaque fenêtre est associée à une décomposition en communautés représentant la structure topologique du réseau durant cette fenêtre. Nous appliquons ensuite cette méthode à trois graphes de terrain dynamiques ayant des caractéristiques différentes pour montrer que les fenêtres identifiées correspondent bien à des phénomènes observables. In this paper, we describe a way to cluster the time in time windows in a dynamic network. The result is a tree and thus time windows can themselves contain smaller ones. Moreover, the windows do not have to be consecutive and this allows for instance to detect repeated structure. Each window is also associated to a community decomposition that represents the topological structure of the network during this window. We then apply the method to three dynamic networks to show that observed time windows correspond to observable phenomena.

Download

Phénomènes de diffusion dans les réseaux dynamiques : simulation et modélisation

Alice Albano

JFGG 2011

Les phénomènes de diffusion sont présents dans de nombreux contextes: diffusion d’épidémies, de virus informatiques, d’information dans des réseaux sociaux, etc. Bien que les réseaux où se produit la diffusion soient souvent dynamiques, cette dynamique n’est pas prise en compte dans la plupart des modèles existants. L’objectif de ces travaux est de proposer des modèles de diffusion, et d’étudier l’impact de la dynamique du réseau sur la diffusion.

Download

Internal links prediction: a new approach for predicting links in bipartite graphs

Oussama Allali, Clémence Magnien and Matthieu Latapy

Dynamic Networks and Knowledge Discovery, special issue of Intelligent Data Analysis, 7 (1), 5-25, 2013

Many real-world complex networks, like actor-movie or le-provider relations, have a bipartite nature and evolve over time. Predicting links that will appear in them is one of the main approach to understand their dynamics. Only few works address the bipartite case, though, despite its high practical interest and the specic challenges it raises. We dene in this paper the notion of internal links in bipartite graphs and propose a link prediction method based on them. We thoroughly describe the method and its variations, and experimentally compare it to a basic collaborative ltering approach. We present results obtained for a typical practical case. We reach the conclusion that our method performs very well, and we study in details how its parameters may inuence obtained results.

Download

Multi-Step Community Detection and Hierarchical Time Segmentation in Evolving Networks

Thomas Aynaud and Jean-Loup Guillaume

proceedings of the Fifth SNA-KDD Workshop Social Network Mining and Analysis, in conjunction with the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2011)

Many complex systems composed of interacting objects  like social networks or the web can be modeled as graphs. They can usually be divided in dense sub-graphs with few links between them, called communities and detecting this underlying community structure may have a major impact in the understanding of these systems. We focus here on evolving graphs, for which the usual approach is to represent the state of the system at different time steps and to compute communities independently on the graph obtained at each time step. We propose in this paper to use a different framework: instead of detecting communities on each time step, we detect a unique decomposition in communities that is relevant for (almost) every time step during a given period called the time window.  We propose a definition of this new decomposition and two algorithms to detect it quickly. We validate both the approach and the algorithms on three evolving networks of different kinds showing that the quality loss at each time step is very low despite the constraint of maximization on several time steps. Since the time window length is a crucial parameter of our technique, we also propose an unsupervised hierarchical clustering algorithm to build automatically a hierarchical time segmentation into time windows. This clustering relies on a new similarity measure based on community structure. We show that it is very efficient in detecting meaningful windows.

Download

A Real-World Spreading Experiment in the Blogosphere

Adrien Friggeri, Jean-Philippe Cointet and Matthieu Latapy

Complex Systems 19, 2011

We designed an experiment to observe a spreading phenomenon in the blogosphere. This experiment relies on a small applet that participants copy on their own web page. We present the obtained dataset, which we freely provide for study, and conduct basic analysis. We conclude that, despite the classical assumption, in this experiment famous blogs do not necessarily act as super spreaders.

Download

Post-processing hierarchical community structures: Quality improvements and multi-scale view

Pascal Pons, Matthieu Latapy

Theoretical Computer Science (TCS) 412(8-10): 892-900 (2011)

Dense sub-graphs of sparse graphs (communities), which appear in most real-world complex networks, play an important role in many contexts. Most existing community detection algorithms produce a hierarchical structure of communities and seek a partition into communities that optimizes a given quality function. We propose new methods to improve the results of any of these algorithms. First we show how to optimize a general class of additive quality functions (containing the modularity, the performance, and a new similarity-based quality function which we propose) over a larger set of partitions than the classical methods. Moreover, we define new multi-scale quality functions which make it possible to detect different scales at which meaningful community structures appear, while classical approaches find only one partition.

Download

A Radar for the Internet

Matthieu Latapy, Clémence Magnien and Frédéric Ouédraogo

Complex Systems, 20 (1), 23-30, 2011.

Mapping the internet’s topology is a challenge in itself, and studying its dynamics is even more difficult. Achieving this would however provide key information on the nature of the internet, crucial for modeling and simulation. Moreover, detecting anomalies in this dynamics is a key issue for security. We introduce here a new measurement approach which makes it possible to capture internet dynamics at a scale of a few minutes in a radar-like manner. By conducting and analyzing large-scale measurements of this kind, we rigorously and automatically detect events in the observed dynamics, which is totally out of reach of previous approaches.

Download

Link prediction in bipartite graphs using internal links and weighted projection

Oussama Allali, Clémence Magnien and Matthieu Latapy

Proceedings of the third International Workshop on Network Science for Communication Networks (Netscicom 2011), In conjunction with IEEE Infocom 2011.

Many real-world complex networks, like client-product or file-provider relations, have a bipartite nature and evolve during time. Predicting links that will appear in them is one of the main approach to understand their dynamics. Only few works address the bipartite case, though, despite its high practical interest and the specific challenges it raises. We define in this paper the notion of internal links in bipartite graphs and propose a link prediction method based on them. We describe the method and experimentally compare it to a basic collaborative filtering approach. We present results obtained for two typical practical cases. We reach the conclusion that our method performs very well, and that internal links play an important role in bipartite graphs and their dynamics.

Download

Quantifying paedophile queries in a large P2P system

Matthieu Latapy, Clémence Magnien and Raphaël Fournier

IEEE Infocom Mini-Conference 2011, Shanghai

Increasing knowledge of paedophile activity in P2P systems is a crucial societal concern, with important consequences on child protection, policy making, and internet regulation. Because of a lack of traces of P2P exchanges and rigorous analysis methodology, however, current knowledge of this activity remains very limited. We consider here a widely used P2P system, eDonkey, and focus on two key statistics: the fraction of paedophile queries entered in the system and the fraction of users who entered such queries. We collect hundreds of millions of keyword-based queries; we design a paedophile query detection tool for which we establish false positive and false negative rates using assessment by experts; with this tool and these rates, we then estimate the fraction of paedophile queries in our data; finally, we design and apply methods for quantifying users who entered such queries. We conclude that approximately 0.25 % of queries are paedophile, and that more than 0.2 % of users enter such queries. These statistics are by far the most precise and reliable ever obtained in this domain.

Download

Optimisation locale multi-niveaux de la modularité

Thomas Aynaud, Vincent Blondel, Jean-Loup Guillaume and Renaud Lambiotte

in Partitionnement de graphe : optimisation et applications, Traité IC2, Hermes-Lavoisier 2011

Dans ce chapitre, nous présentons une méthode gloutonne pour optimiser la modularité d’un graphe. Cette méthode de partionnement permet de traiter avec une excellente précision des systèmes de taille inégalée, allant jusqu’à plusieurs milliards de liens. Notre algorithme a de surcroît l’avantage de ne pas être limité à l’optimisation de la modularité puisqu’il peut être généralisé à d’autres fonctions de qualité, et de découvrir des communautés à différentes échelles. Les performances de l’algorithme sont évaluées sur des graphes artificiels pour lesquels la structure communautaire est connue, ainsi que sur des graphes de terrain réels.

Long range community detection

Thomas Aynaud and Jean-Loup Guillaume

Latin-American Workshop on Dynamic Networks (LAWDN), Buenos Aires, 2010

Complex networks can usually be divided in dense subnetworks called communities. In evolving networks, the usual way to detect communities is to find several partitions independently, one for each time step. However, this generally causes troubles when trying to track communities from one time step to the next. We propose here a new method to detect only one decomposition in communities that is good for (almost) every time step. We show that this unique partition can be computed with a modification of the Louvain method and that the loss of quality at each time step is generally low despite the constraint of global maximization. We also show that some specific modifications of the networks topology can be identified using this unique partition in the case of the Internet topology.

Download

Détection de communautés à long terme dans les graphes dynamiques

Thomas Aynaud and Jean-Loup Guillaume

Journée thématique Fouille de grands graphes, en conjonction avec la première conférence sur les Modèles et l’Analyse des Réseaux : Approches Mathématiques et Informatique (MARAMI), Toulouse, France, 2010

La plupart des graphes de terrain peuvent être décomposés en sous graphes denses appelés communautés. Habituellement, dans des graphes dynamiques, les communautés sont détectées pour chaque instant indépendamment ce qui pose de nombreux problèmes tels que la stabilité ou le suivi de des communautés entre deux décompositions successives. Nous proposons ici une méthode pour trouver une partition unique, de qualité, couvrant une longue période. Cette décomposition peut être trouvée efficacement via une adaptation de la méthode de Louvain et la perte de qualité à chaque instant due à la contrainte de détecter des communautés globales s’avère assez faible.

Download

Termination of Multipartite Graph Series Arising from Complex Network Modelling

Matthieu Latapy, Thi Ha Duong Phan, Christophe Crespelle, Thanh Qui Nguyen

COCOA 2010

An intense activity is nowadays devoted to the definition of models capturing the properties of complex networks. Among the most promising approaches, it has been proposed to model these graphs via their clique incidence bipartite graphs. However, this approach has, until now, severe limitations resulting from its incapacity to reproduce a key property of this object: the overlapping nature of cliques in complex networks. In order to get rid of these limitations we propose to encode the structure of clique overlaps in a network thanks to a process consisting in iteratively factorising the maximal bicliques between the upper level and the other levels of a multipartite graph. We show that the most natural definition of this factorising process leads to infinite series for some instances. Our main result is to design a restriction of this process that terminates for any arbitrary graph. Moreover, we show that the resulting multipartite graph has remarkable combinatorial properties and is closely related to another fundamental combinatorial object. Finally, we show that, in practice, this multipartite graph is computationally tractable and has a size that makes it suitable for complex network modelling.

Download

Evaluation of a new method for measuring the internet degree distribution: Simulation results

Christophe Crespelle and Fabien Tarissan

in Complex Networks, special issue of Computer Communications, 34-5, pp. 635-648 (2011), DOI: 10.1016/j.comcom.2010.06.006

Many contributions rely on the degree distribution of the Internet topology. However, current knowledge of this property is based on biased and erroneous measurements and is subject to much debate. Recently, a new approach, referred to as the Neighborhood Flooding method, was proposed to avoid issues raised by classical measurements. It aims at measuring the neighborhood of Internet core routers by sending traceroute probes from many monitors distributed in the Internet towards a given target router. In this paper, we investigate the accuracy of this method with simulations. Our results show that Neighborhood Flooding is free from the bias highlighted in the classical approach and is able to observe properly the exact degree of a vast majority of nodes in the core of the network. We show how the quality of the estimation depends on the number of monitors used and we carefully examine the influence of parameters of the simulations on our results. We also point out some limitations of the Neighborhood Flooding method and discuss their impact on the observed distribution.

Download

Static community detection algorithms for evolving networks

Thomas Aynaud and Jean-Loup Guillaume

Proceedings of International Workshop on Dynamic Networks (WDN), in conjunction with WiOpt 2010, pages 508-514

Complex networks can often be divided in dense sub-networks called communities. We study, using a partition edit distance, how three community detection algorithms transform their outputs if the input network is sligthly modified. The instabilities appear to be important and we propose a modification of one algorithm to stabilize it and to allow the tracking of the communities in a dynamic network. This modification has one parameter which is a tradeoff between stability and quality. The resulting algorithm appears to be very effective. We finally use it on a dynamic network of blogs.

Download

Some Insight on Dynamics of Posts and Citations in Different Blog Communities

Abdelhamid Salah Brahim, Bénédicte Le Grand and Matthieu Latapy

IEEE ICC 2010 workshop « SocNets », Cape Town, South Africa, May 2010

This paper explores new approaches and methods to characterize post and citation dynamics in different blog communities. In particular, evolution of post popularity over time is studied, as well as information spreading cascades. This methodology goes beyond traditional approaches by defining classes of dynamic behaviors based on topological features of the post network, and by investigating the impact of topical communities on post popularity dynamics and on information spreading cascades. This methodology has been applied to a corpus of active French blogs monitored during 4 months.

Download

Impact of Sources and Destinations on the Observed Properties of the Internet Topology

Frédéric Ouédraogo, Clémence Magnien

Computer Communications, 34, 670-679, 2011

Maps of the internet topology are generally obtained by measuring the routes from a given set of sources to a given set of destinations (with tools such as traceroute). It has been shown that this approach misses some links and nodes. Worse, in some cases it can induce a bias in the obtained data, i.e. the properties of the obtained maps are significantly different from those of the real topology. In order to reduce this bias, the general approach consists in increasing the number of sources. Some works have studied the relevance of this approach. Most of them have used theoretical results, or simulations on network models. Some papers have used real data obtained from actual measurement procedures to evaluate the importance of the number of sources and destinations, but no work to our knowledge has studied extensively the importance of the choice of sources or destinations. Here, we use real data from internet topology measurements to study this question: by comparing partial measurements to our complete data, we can evaluate the impact of adding sources or destinations on the observed properties. We show that the number of sources and destinations used plays a role in the observed properties, but that their choice, and not only their number, also has a strong influence on the observations. We then study common statistics used to describe the internet topology, and show that they behave differently: some can be trusted once the number of sources and destinations are not too small, while others are difficult to evaluate.

Download

Age, Gender and Communication Networks

Alina Stoica, Zbigniew Smoreda, Christophe Prieur and Jean-Loup Guillaume

Proceedings of the Workshop on the Analysis of Mobile Phone Networks, satellite workshop to NetSci 2010

In this paper, we address some sociological andtopological issues associated with mobile phone communication.Based on a dataset of a few million users, we use customers’ ageand gender information to study relation between theseparameters and the average behavior of users in terms of numberof calls, number of SMS and calls duration. We also study thedataset from a networking point of view: we define differentprofiles based on the topological properties of the personalnetwork of each individual and study the relations between theseprofiles and the age of customers

Download