Evaluation of a new method for measuring the internet degree distribution: Simulation results

Christophe Crespelle and Fabien Tarissan

in Complex Networks, special issue of Computer Communications, 34-5, pp. 635-648 (2011), DOI: 10.1016/j.comcom.2010.06.006

Many contributions rely on the degree distribution of the Internet topology. However, current knowledge of this property is based on biased and erroneous measurements and is subject to much debate. Recently, a new approach, referred to as the Neighborhood Flooding method, was proposed to avoid issues raised by classical measurements. It aims at measuring the neighborhood of Internet core routers by sending traceroute probes from many monitors distributed in the Internet towards a given target router. In this paper, we investigate the accuracy of this method with simulations. Our results show that Neighborhood Flooding is free from the bias highlighted in the classical approach and is able to observe properly the exact degree of a vast majority of nodes in the core of the network. We show how the quality of the estimation depends on the number of monitors used and we carefully examine the influence of parameters of the simulations on our results. We also point out some limitations of the Neighborhood Flooding method and discuss their impact on the observed distribution.

Download

Static community detection algorithms for evolving networks

Thomas Aynaud and Jean-Loup Guillaume

Proceedings of International Workshop on Dynamic Networks (WDN), in conjunction with WiOpt 2010, pages 508-514

Complex networks can often be divided in dense sub-networks called communities. We study, using a partition edit distance, how three community detection algorithms transform their outputs if the input network is sligthly modified. The instabilities appear to be important and we propose a modification of one algorithm to stabilize it and to allow the tracking of the communities in a dynamic network. This modification has one parameter which is a tradeoff between stability and quality. The resulting algorithm appears to be very effective. We finally use it on a dynamic network of blogs.

Download

Some Insight on Dynamics of Posts and Citations in Different Blog Communities

Abdelhamid Salah Brahim, Bénédicte Le Grand and Matthieu Latapy

IEEE ICC 2010 workshop « SocNets », Cape Town, South Africa, May 2010

This paper explores new approaches and methods to characterize post and citation dynamics in different blog communities. In particular, evolution of post popularity over time is studied, as well as information spreading cascades. This methodology goes beyond traditional approaches by defining classes of dynamic behaviors based on topological features of the post network, and by investigating the impact of topical communities on post popularity dynamics and on information spreading cascades. This methodology has been applied to a corpus of active French blogs monitored during 4 months.

Download

Impact of Sources and Destinations on the Observed Properties of the Internet Topology

Frédéric Ouédraogo, Clémence Magnien

Computer Communications, 34, 670-679, 2011

Maps of the internet topology are generally obtained by measuring the routes from a given set of sources to a given set of destinations (with tools such as traceroute). It has been shown that this approach misses some links and nodes. Worse, in some cases it can induce a bias in the obtained data, i.e. the properties of the obtained maps are significantly different from those of the real topology. In order to reduce this bias, the general approach consists in increasing the number of sources. Some works have studied the relevance of this approach. Most of them have used theoretical results, or simulations on network models. Some papers have used real data obtained from actual measurement procedures to evaluate the importance of the number of sources and destinations, but no work to our knowledge has studied extensively the importance of the choice of sources or destinations. Here, we use real data from internet topology measurements to study this question: by comparing partial measurements to our complete data, we can evaluate the impact of adding sources or destinations on the observed properties. We show that the number of sources and destinations used plays a role in the observed properties, but that their choice, and not only their number, also has a strong influence on the observations. We then study common statistics used to describe the internet topology, and show that they behave differently: some can be trusted once the number of sources and destinations are not too small, while others are difficult to evaluate.

Download

Age, Gender and Communication Networks

Alina Stoica, Zbigniew Smoreda, Christophe Prieur and Jean-Loup Guillaume

Proceedings of the Workshop on the Analysis of Mobile Phone Networks, satellite workshop to NetSci 2010

In this paper, we address some sociological andtopological issues associated with mobile phone communication.Based on a dataset of a few million users, we use customers’ ageand gender information to study relation between theseparameters and the average behavior of users in terms of numberof calls, number of SMS and calls duration. We also study thedataset from a networking point of view: we define differentprofiles based on the topological properties of the personalnetwork of each individual and study the relations between theseprofiles and the age of customers

Download

Structure multi-échelle de grands graphes de terrain

Thomas Aynaud and Jean-Loup Guillaume

in Technique et Science Informatiques, vol. 30/2, pp. 137-154, 2011

Most complex networks can be divided into dense sub-graphs called communities. These communities may also be divided recursively producing a hierarchical structure of communities, summarized in a tree named dendrogram. In this article we analyze this structure extracted from several complex networks. First we study the shape of the tree and, in particular, communities imbrications. Then we show that an excessive decomposition of communities can result in meaningless communities. We propose a couple of approaches to solve this problem.

Download

Detecting Events in the Dynamics of Ego-centered Measurements of the Internet Topology

Assia Hamzaoui, Matthieu Latapy and Clémence Magnien

Proceedings of International Workshop on Dynamic Networks (WDN), in conjunction with WiOpt 2010

Detecting events such as major routing changes or congestions in the dynamics of the internet topology is an important but challenging task. We explore here a top-down approach based on a notion of statistically significant events. It consists in identifying statistics which exhibit a homogeneous distribution with outliers, which correspond to events. We apply this approach to ego-centerd measurements of the internet topology (views obtained from a single monitor) and show that it succeeds in detecting meaningful events. Finally, we give some hints for the interpretation of such events in terms of network events.

Download

Risk diversification and default cascades in financial networks

Stefano Battiston

28 avril 2011 de 11h à 12h en salle 25-26/101

We introduce a discrete dynamics of distress propagation in networks. The formulation combines a diffusion and a contact process which are relevant to complex networks in general. In particular it is more suitable than previous models to describe contagion processes occurring in the financial system. Our results indicate that diversification of risk at the individual level can have an ambiguous effect on the global level, leading in some cases to higher systemic risk. Moreover, we show that the same structure can be resilient or fragile depending on the relative strength of the two processes at play. In the financial context, this corresponds to the level of optimism in the market. Thus, our work has interesting implications for the debate on the network architecture that is most resilient to financial crises.

Structure et dynamique des graphes de terrain bipartis : liens internes et prédiction de liens

Oussama Allali

26 mai 2011 à 11h : salle 25-26/101

Slides

Beaucoup de graphes de terrain, comme les relations acteur-film ou fichier-fournisseur, ont une nature bipartie et sont modélisés par des graphes bipartis, dont les nœuds sont divisés en deux ensembles avec des liens entre des nœuds de différents ensembles seulement. Cependant, il y a actuellement un manque de méthodes pour analyser correctement ces graphes, la plupart des méthodes existantes étant conçues pour des graphes classiques. Une approche courante, mais qui a des limites, consiste à transformer les graphes bipartis en graphes classique, par un procédé appelé projection. Cependant ceci entraîne une perte importante d’informations. Dans cet exposé je présenterai les liens internes et je les proposerai comme une nouvelle notion importante pour analyser les graphes de terrain bipartis : elle permet de mesurer la redondance dans ces graphes, et de mesurer la perte d’information entre un graphe biparti et ses projections. Jemontreraien étudiant différents jeux de données que les liens internes sont très fréquents, et que les statistiques associées permettent de souligner les ressemblances et les différences entre les graphes deterrainbipartis et les graphes bipartis aléatoire. La plupart des graphes de terrain sont de plus dynamiques, c’est-à-dire que leur structure évolue au fil du temps par l’ajout et/ou le retrait de nœuds et/ou de liens. L’étude de la dynamique des graphes de terrain peut s’aborder par le problème de la prédiction de nouveaux liens dans ces graphes. Plusieurs travaux ont étudié le problème de la prédiction de liens dans les graphes classiques (non-bipartis). Toutefois, leurs méthodes ne sont pas directement applicables à, ou appropriées pour, les graphes bipartis. Je présenterai une approche basée sur les liens internes pour la prédiction dans les graphes bipartis. Je montrerai que notre méthode fonctionne très bien, beaucoup mieux que l’approche de recommandation classique.

Estimating properties in dynamic systems: the case of churn in P2P networks

Lamia Benamara and Clémence Magnien

Proceedings of the third International Workshop on Network Science for Communication Networks (NetSciCom 2010), In conjunction with IEEE Infocom 2010.

In many systems, such as P2P systems, the dynamicity of participating elements, or churn, has a strong impact. As a consequence, many efforts have been made to characterize it, and in particular to capture the session length distribution. However in most cases, estimating it rigorously is difficult. One of the reasons is that, because the observation window is by definition finite, parts of the sessions that begin before the window and/or end after it are missed. This induces a bias. Although it tends to decrease when the observation window length increases, it is difficult to quantify its importance, or how fast it decreases. Here, we introduce a general methodology that allows us to know if the observation window is long enough to characterize a given property. This methodology is not specific to one study case and may be applied to any property in a dynamic system. We apply this methodology to the study of session lengths in a massive measurement of P2P activity in the eDonkey system. We show that the measurement needs to last for at least one week in order to obtain representative results. We also show that our methodology allows us to precisely characterize the shape of the session length distribution.

Download

Dynamique des réseaux de conseil : les mécanismes sociaux à lorigine dune stabilisation précaire

Emmanuel Lazega

17 mars 2011 de 11h à 12h : salle 25-26/101

La question de savoir si la structure dun collectif reste stable malgré un fort turnover de ses membres et un fort turnover des relations entre ces membres est une question classique de la sociologie. Nous proposons une réponse à cette question en procédant à une analyse longitudinale dun réseau de conseil intra-organisationnel explorant la relation entre la structure formelle de lorganisation et un processus dapprentissage collectif par ses membres. Nous identifions une structure centre-périphérie et un mouvement cyclique de centralisation-décentralisation du réseau de conseil. Nous expliquons cette dynamique comme leffet dune oscillation des membres de lorganisation entre surcharge et conflits épistémiques. La question de la stabilité de la structure malgré un fort turnover de ses membres et un fort turnover des relations entre eux est ainsi traduite en termes de stabilisation plus ou moins précaire de cette dynamique par les acteurs du système eux-mêmes.

Rigorous measurement of IP-level neighborhood of Internet core routers

Christophe Crespelle, Matthieu Latapy, Elie Rotenberg

Proceedings of the third International Workshop on Network Science for Communication Networks (NetSciCom 2010), In conjunction with IEEE Infocom 2010.

Many contributions use the degree distribution of IP-level internet topology. However, current knowledge of this property relies on biased and erroneous measurements, and so it is subject to much debate. We introduce here a new approach, dedicated to the core of the internet, which avoids the issues raised by classical measurements. It is based on the measurement of IP-level neighborhood of internet core routers, for which we design and implement a rigorous method. It consists in sending traceroute probes from many monitors distributed in the internet towards a given target router and carefully selecting the relevant information in collected data. Using simulations, we provide strong evidence of the accuracy of our approach. We then conduct real-world measurements illustrating the practical effectiveness of our method. This constitutes a significant step towards reliable knowledge of the IP-level degree distribution of the core of the internet.

Download

Bluebear : Exploration des risques d’atteinte à la vie privée sur Internet

Arnaud Legout

10 mars 2011 de 11h à 12h : salle 25-26/101

Les risques d’atteinte à la vie privée sur Internet sont largement sous estimés. Il est connu que les internautes laissent énormément d’informations personnelles sur, par exemple, les réseaux sociaux. Cependant, peu de gens savent que toute activité sur Internet laisse des traces. Dans cet exposé, nous allons montrer qu’il est possible de collecter ces traces à grande échelle et sans infrastructure dédiée. Plus précisément, nous allons montrer qu’il est possible de surveiller l’activité sur BitTorrent de centaines de millions d’internautes. Nous allons également montrer qu’en exploitant des mesures faites sur BitTorrent et sur un réseaux de voix sur IP, il est non seulement possible de lier à une identité réelle une liste de téléchargement de manière automatique, mais qu’il est également possible de suivre les déplacements d’un grand nombre de personnes sans que ces personnes n’aient aucun moyen de détecter ni de bloquer ces mesures.

Learning Propagation Schemes in Multi-Graphs

Ludovic Denoyer

10 février 2011 de 11h à 12h : salle 25-26/101

We consider the problem of labeling nodes in a multi-graph where the nodes may be connected with different types of relations. This type of problem occurs in many situations like for example the analysis of social networks or bibliographic data. Relations may be provided (e.g. friends) or computed (e.g. similarity). We propose a new learning algorithm for exploiting the rich multi-relational information in the labeling task. This method is able to optimally learn combining the influence of different relation types. It is one of the very first algorithms able to handle multi-graph information for this classi fication task. We describe experiments on four datasets which show the model ability to deal with complex relationships and to take bene fit of multi-relational information for complex labeling problems.

Accurate characterizing of session lengths in P2P system

Accurate characterizing of session lengths in P2P system

Interactive multiscale visualization of huge graphs: application to a network of weblogs

Massoud Seifi, Jean-loup Guillaume, Matthieu Latapy and Bénédicte Le Grand

Proceedings of the 8th Workshop on Visualization and Knowledge Extraction (EGC 2010)

De nombreux réseaux du monde réel peuvent être modélisés par des grands graphes. Réduire la complexité d’un graphe de manière à ce qu’il puisse être facilement interprété par l’oeil humain est une aide précieuse pour comprendre et analyser ce type de données. Nous proposons une méthodologie de visualisation interactive multi-échelle de grands graphes basée sur une classification des sommets qui nous permet de représenter ces graphes de manière lisible et interprétable. Nous appliquons notre méthodologie à un réseau de blogs francophones.

Download

Link prediction in a file-provider network

Link prediction in a file-provider network

Complex network measurement with link queries

Complex network measurement with link queries

Quantifying paedophile users on a P2P system

Quantifying paedophile users on a P2P system

Keywords popularity in eDonkey queries

Keywords pupularity in eDonkey queries