Structure multi-échelle de grands graphes de terrain

Thomas Aynaud and Jean-Loup Guillaume

in Technique et Science Informatiques, vol. 30/2, pp. 137-154, 2011

Most complex networks can be divided into dense sub-graphs called communities. These communities may also be divided recursively producing a hierarchical structure of communities, summarized in a tree named dendrogram. In this article we analyze this structure extracted from several complex networks. First we study the shape of the tree and, in particular, communities imbrications. Then we show that an excessive decomposition of communities can result in meaningless communities. We propose a couple of approaches to solve this problem.

Download

Detecting Events in the Dynamics of Ego-centered Measurements of the Internet Topology

Assia Hamzaoui, Matthieu Latapy and Clémence Magnien

Proceedings of International Workshop on Dynamic Networks (WDN), in conjunction with WiOpt 2010

Detecting events such as major routing changes or congestions in the dynamics of the internet topology is an important but challenging task. We explore here a top-down approach based on a notion of statistically significant events. It consists in identifying statistics which exhibit a homogeneous distribution with outliers, which correspond to events. We apply this approach to ego-centerd measurements of the internet topology (views obtained from a single monitor) and show that it succeeds in detecting meaningful events. Finally, we give some hints for the interpretation of such events in terms of network events.

Download

Estimating properties in dynamic systems: the case of churn in P2P networks

Lamia Benamara and Clémence Magnien

Proceedings of the third International Workshop on Network Science for Communication Networks (NetSciCom 2010), In conjunction with IEEE Infocom 2010.

In many systems, such as P2P systems, the dynamicity of participating elements, or churn, has a strong impact. As a consequence, many efforts have been made to characterize it, and in particular to capture the session length distribution. However in most cases, estimating it rigorously is difficult. One of the reasons is that, because the observation window is by definition finite, parts of the sessions that begin before the window and/or end after it are missed. This induces a bias. Although it tends to decrease when the observation window length increases, it is difficult to quantify its importance, or how fast it decreases. Here, we introduce a general methodology that allows us to know if the observation window is long enough to characterize a given property. This methodology is not specific to one study case and may be applied to any property in a dynamic system. We apply this methodology to the study of session lengths in a massive measurement of P2P activity in the eDonkey system. We show that the measurement needs to last for at least one week in order to obtain representative results. We also show that our methodology allows us to precisely characterize the shape of the session length distribution.

Download

Rigorous measurement of IP-level neighborhood of Internet core routers

Christophe Crespelle, Matthieu Latapy, Elie Rotenberg

Proceedings of the third International Workshop on Network Science for Communication Networks (NetSciCom 2010), In conjunction with IEEE Infocom 2010.

Many contributions use the degree distribution of IP-level internet topology. However, current knowledge of this property relies on biased and erroneous measurements, and so it is subject to much debate. We introduce here a new approach, dedicated to the core of the internet, which avoids the issues raised by classical measurements. It is based on the measurement of IP-level neighborhood of internet core routers, for which we design and implement a rigorous method. It consists in sending traceroute probes from many monitors distributed in the internet towards a given target router and carefully selecting the relevant information in collected data. Using simulations, we provide strong evidence of the accuracy of our approach. We then conduct real-world measurements illustrating the practical effectiveness of our method. This constitutes a significant step towards reliable knowledge of the IP-level degree distribution of the core of the internet.

Download

Interactive multiscale visualization of huge graphs: application to a network of weblogs

Massoud Seifi, Jean-loup Guillaume, Matthieu Latapy and Bénédicte Le Grand

Proceedings of the 8th Workshop on Visualization and Knowledge Extraction (EGC 2010)

De nombreux réseaux du monde réel peuvent être modélisés par des grands graphes. Réduire la complexité d’un graphe de manière à ce qu’il puisse être facilement interprété par l’oeil humain est une aide précieuse pour comprendre et analyser ce type de données. Nous proposons une méthodologie de visualisation interactive multi-échelle de grands graphes basée sur une classification des sommets qui nous permet de représenter ces graphes de manière lisible et interprétable. Nous appliquons notre méthodologie à un réseau de blogs francophones.

Download

Impact of Random Failures and Attacks on Poisson and Power-Law Random Networks

Clémence Magnien, Matthieu Latapy and Jean-Loup Guillaume

ACM Computing Surveys, 43 (3), 2011. Extended abstract published in LNCS, proceedings of the 8-th International Conference On Principles Of Distributed Systems OPODIS’04, 2004, Grenoble, France (title: Comparison of Failures and Attacks on Random and Scale-free Networks)

It appeared recently that the underlying degree distribution of a network may play a crucial role concerning its robustness. Empiric and analytic results have been obtained, based on asymptotic and mean-field approximations. Previous work insisted on the fact that power-law degree distributions induce a high resilience to random failure but a high sensitivity to some attack strategies, while Poisson degree distributions are quite sensitive in both cases. Then, much work has been done to extend these results. We focus here on these basic results, with the aim of deepening significantly our understanding of their origin and their limitations. We review in detail previous contributions and we give full proofs in a unified framework, in which the approximations on which these results rely are well identified. We then add to these known results a set of new ones aimed at enlightening some important aspects. We also provide extensive and rigorous experiments which make it possible to evaluate the relevance of the analytic results. We reach the conclusion that, even if it is clear that the basic results of the field are true and important, they are in practice much less striking than generally thought. The differences between random failures and attacks are not so huge and can be explained with simple facts. Likewise, the differences in the behaviors induced by power-law and Poisson distributions are not as striking as often claimed.

Download

Mobile IPv6 Deployments: Graph-based Analysis and practical Guidelines

Guillaume Valadon, Clémence Magnien and Ryuji Wakikawa

Computer Communications, 32, 2009

The Mobile IPv6 protocol is a major solution to supply mobility services on the Internet. Many networking vendors have already implemented it in their operating systems and equipments. Moreover, it was recently selected to provide permanent IP addresses to end users of WiMAX and 3GPP2. Mobile IPv6 relies on a specific router called the home agent that hides location changes of the mobile nodes from the rest of the Internet. To do so, the mobile nodes’ traffic must flow through the home agent. This mandatory deviation produces longer paths and higher communication delays. In order to solve these problems, we describe a new approach to address deployments of Mobile IPv6 based on graph theory and could be applied to any operator’s network. In particular, we use notions of centrality in graphs to quantify increases of communication distances induced by dogleg routing and identify relevant home agents locations. We evaluate this approach using real-world network topologies and show that the obtained Mobile IPv6 performance could be close to direct paths ones. The proposed algorithm is generic and can be used to achieve efficient deployments of Mobile IPv6 as well as Home Agent Migration: a new Mobile IPv6 architecture using several distributed home agents.

Download

Efficient Measurement of Complex Networks Using Link Queries

Fabien Tarissan, Matthieu Latapy and Christophe Prieur

In Proceedings of the IEEE International Workshop on Network Science For Communication Networks (NetSciCom’09)

Complex networks are at the core of an intense research activity. However, in most cases, intricate and costly measurement procedures are needed to explore their structure. In some cases, these measurements rely on link queries: given two nodes, it is possible to test the existence of a link between them. These tests may be costly, and thus minimizing their number while maximizing the number of discovered links is a key issue. This is a challenging task, though, as initially no information is known on the network. This paper studies this problem: we observe that properties classically observed on real-world complex networks give hints for their efficient measurement; we derive simple principles and several measurement strategies based on this, and experimentally evaluate their efficiency on real-world cases. In order to do so, we introduce methods to evaluate the efficiency of strategies. We also explore the bias that different measurement strategies may induce.

Download

Ten weeks in the life of an eDonkey server

Frédéric Aidouni, Matthieu Latapy and Clémence Magnien

Sixth International Workshop on Hot Topics in Peer-to-Peer Systems (Hot-P2P 2009), May 29, 2009, Rome, Italy

This paper presents a capture of the queries managed by an eDonkey server during almost 10 weeks, leading to the observation of almost 9 billion messages involving almost 90 million users and more than 275 million distinct files. Acquisition and management of such data raises several challenges, which we discuss as well as the solutions we developed. We obtain a very rich dataset, orders of magnitude larger than previously avalaible ones, which we provide for public use. We finally present basic analysis of the obtained data, which already gives evidence of non-trivial features.

Download

Fast dynamics in Internet topology: preliminary observations and explanations

Clémence Magnien, Frédéric Ouedraogo, Guillaume Valadon, Matthieu Latapy

Fourth International Conference on Internet Monitoring and Protection (ICIMP 2009), May 24-28, 2009, Venice, Italy

By focusing on what can be observed by running traceroute-like measurements at a high frequency from a single monitor to a fixed destination set, we show that the observed view of the topology is constantly evolving at a pace much higher than expected. Repeated measurements discover new IP addresses at a constant rate, for long period of times (up to several months). In order to provide explanations, we study this phenomenon both at the IP, and at the Autonomous System levels. We show that this renewal of IP addresses is partially caused by a BGP routing dynamics, altering paths between existing ASes. Furthermore, we conjecture that an intra AS routing dynamics is another cause of this phenomenon.

Download

Measurement of eDonkey Activity with Distributed Honeypots

Oussama Allali, Matthieu Latapy and Clémence Magnien

Sixth International Workshop on Hot Topics in Peer-to-Peer Systems (Hot-P2P 2009), May 29, 2009, Rome, Italy

Collecting information about user activity in peer-to-peer systems is a key but challenging task. We describe here a distributed platform for doing so on the eDonkey network, relying on a group of honeypot peers which claim to have certain files and log queries they receive for these files. We then conduct some measurements with typical scenarios and use the obtained data to analyze the impact of key parameters like measurement duration, number of honeypots involved, and number of advertised files. This illustrates both the possible uses of our measurement system, and the kind of data one may collect using it.

Download

Empreintes conceptuelles et spatiales pour la caractérisation des réseaux sociaux

Bénédicte Le Grand, Marie-Aude Aufaure et Michel Soto

Conférence EGC 2009 (Extraction et Gestion des Connaissances), Strasbourg, France, 27-30 janvier 2009

In this paper, Formal Concept Analysis and Galois lattices are used for the analysis of complex datasets, online social networks in particular. Lattice-inspired statistics computed on the objects of the lattice provide their « conceptual distribution ». An experimentation conducted on four social networks’ samples shows how these statistics may be used to characterize these networks and filter them automatically.

Download

Main-memory Triangle Computations for Very Large (Sparse (Power-Law)) Graphs

Matthieu Latapy

Theoretical Computer Science (TCS) 407 (1-3), pages 458-473, 2008

Finding, counting and/or listing triangles (three vertices with three edges) in massive graphs are natural fundamental problems, which received recently much attention because of their importance in complex network analysis. We provide here a detailed survey of proposed main-memory solutions to these problems, in an unified way. We note that previous authors paid surprisingly little attention to space complexity of main-memory solutions, despite its both fundamental and practical interest. We therefore detail space complexities of known algorithms and discuss their implications. We also present new algorithms which are time optimal for triangle listing and beats previous algorithms concerning space needs. They have the additional advantage of performing better on power-law graphs, which we also detail. We finally show with an experimental study that these two algorithms perform very well in practice, allowing to handle cases which were previously out of reach.

Download

A Radar for the Internet

Matthieu Latapy, Clémence Magnien and Frédéric Ouédraogo

Proceedings of ADN’08: 1st International Workshop on Analysis of Dynamic Networks, in conjunction with IEEE ICDM 2008

In contrast with most internet topology measurement research, our concern here is not to obtain a map as complete and precise as possible of the whole internet. Instead, we claim that each machine’s view of this topology, which we call ego-centered view, is an object worth of study in itself. We design and implement an ego-centered measurement tool, and perform radar-like measurements consisting of repeated measurements of such views of the internet topology. We conduct long-term (several weeks) and high-speed (one round every few minutes) measurements of this kind from more than one hundred monitors, and we provide the obtained data. We also show that these data may be used to detect events in the dynamics of internet topology.

Download

Fast Computation of Empirically Tight Bounds for the Diameter of Massive Graphs

Clémence Magnien, Matthieu Latapy and Michel Habib

ACM Journal of Experimental Algorithmics (JEA), 13, 2009

The diameter of a graph is among its most basic parameters. Since a few years, it moreover became a key issue to compute it for massive graphs in the context of complex network analysis. However, known algorithms, including the ones producing approximate values, have too high a time and/or space complexity to be used in such cases. We propose here a new approach relying on very simple and fast algorithms that compute (upper and lower) bounds for the diameter. We show empirically that, on various real-world cases representative of complex networks studied in the literature, the obtained bounds are very tight (and even equal in some cases). This leads to rigorous and very accurate estimations of the actual diameter in cases which were previously untractable in practice.

Download

Fast unfolding of communities in large networks

Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre

J. Stat. Mech. (october 2008) P10008

We propose a simple method to extract the community structure of large networks. Our method is a heuristic method that is based on modularity optimization. It is shown to outperform all other known community detection method in terms of computation time. Moreover, the quality of the communities detected is very good, as measured by the so-called modularity. This is shown first by identifying language communities in a Belgian mobile phone network of 2.6 million customers and by analyzing a web graph of 118 million nodes and more than one billion links. The accuracy of our algorithm is also verified on ad-hoc modular networks

Download

Description and simulation of dynamic mobility networks

Pierre Borgnat, Eric Fleury, Jean-Loup Guillaume, Céline Robardet and Antoine Scherrer

Computer Networks 52 (2008), pp. 2842-2858

During the last decade, the study of large scale complex networks has attracted a substantial amount of attention and works from several domains: sociology, biology, computer science, epidemiology. Most of such complex networks are inherently dynamic, with new vertices and links appearing while some old ones disappear. Until recently, the dynamic of these networks was less studied and there is a strong need for dynamic network models in order to sustain protocol performance evaluations and fundamental analyzes in all the research domains listed above. We propose in this paper a novel framework for the study of dynamic mobility networks. We address the characterization of dynamics by proposing an in-depth description and analysis of two real-world data sets. We show in particular that links creation and deletion processes are independent of other graph properties and that such networks exhibit a large number of possible configurations, from sparse to dense. From those observations, we propose simple yet very accurate models that allow to generate random mobility graphs with similar temporal behavior as the one observed in experimental data.

Download

Evolving networks

Pierre Borgnat, Eric Fleury, Jean-Loup Guillaume, Clémence Magnien, Céline Robardet et Antoine Scherrer

Proceedings of NATO Advanced Study Institute on Mining Massive Data Sets for Security, IOS Press, 2008

Most real networks often evolve through time: changes of topology can occur if some nodes and/or edges appear and/or disappear, and the types or weights of nodes and edges can also change even if the topology stays static. Mobile devices with wireless capabilities (mobile phones, laptops, etc.) are a typical example of evolving networks where nodes or users are spread in the environment and connections between users can only occur if they are near each other. This whois- near-whom network evolves every time users move and communication services (such as the spread of any information) will deeply rely on the mobility and on the characteristics of the underlying network. This paper presents some recent results concerning the characterization of the dynamics of complex networks through three different angles: evolution of some parameters on snapshots of the network, parameters describing the evolution itself, and intermediate approaches consisting in the study of specific phenomena or users of interest through time.

Download

Basic Notions for the Analysis of Large Two-mode Networks

Matthieu Latapy, Clémence Magnien and Nathalie Del Vecchio

Social Networks (2008), vol. 30, no1, pp. 31-48

Many large real-world networks actually have a 2-mode nature: their nodes may be separated into two classes, the links being between nodes of different classes only. Despite this, and despite the fact that many ad-hoc tools have been designed for the study of special cases, very few exist to analyse (describe, extract relevant information) such networks in a systematic way. We propose here an extension of the most basic notions used nowadays to analyse large 1-mode networks (the classical case) to the 2-mode case. To achieve this, we introduce a set of simple statistics, which we discuss by comparing their values on a representative set of real-world networks and on their random versions. This makes it possible to evaluate their relevance in capturing properties of interest in 2-mode networks.

Download

Detection, Understanding, and Prevention of Traceroute Measurement Artifacts

Fabien Viger, Brice Augustin, Xavier Cuvellier, Clémence Magnien, Matthieu Latapy, Timur Friedman, and Renata Teixeira

Computer Networks 52-5 (2008), pp. 998-1018. Extended abstract published in the proceedings of the 6-th Internet Measurement Conference IMC’06, 2006, Rio de Janeiro, Brazil

Test of time award IMC 2022

Traceroute is widely used: from the diagnosis of network problems to the assemblage of internet maps. Unfortunately, there are a number of problems with traceroute methodology, which lead to the inference of erroneous routes. This paper studies particular structures arising in nearly all traceroute measurements. We characterize them as « loops », « cycles », and « diamonds ». We identify load balancing as a possible cause for the appearance of false loops, cycles and diamonds, i.e., artifacts that do not represent the internet topology. We provide a new publicly-available traceroute, called Paris traceroute, which, by controlling the packet header contents, provides a truer picture of the actual routes that packets follow. We performed measurements, from the perspective of a single source tracing towards multiple destinations, and Paris traceroute allowed us to show that many of the particular structures we observe are indeed traceroute measurement artifacts.