Temporal Reachability Graphs

J. Whitbeck, M. Dias de Amorim, V. Conan and J.-L. Guillaume

The 18th Annual International Conference on Mobile Computing and Networking, Mobicom’12, pp. 377-388

While a natural fit for modeling and understanding mobile networks, time-varying graphs remain poorly understood. Indeed, many of the usual concepts of static graphs have no obvious counterpart in time-varying ones. In this paper, we introduce the notion of temporal reachability graphs. A (tau, sigma)-reachability graph is a time-varying directed graph derived from an existing connectivity graph. An edge exists from one node to another in the reachability graph at time t if there exists a journey (i.e., a spatiotemporal path) in the connectivity graph from the first node to the second, leaving after t, with a positive edge traversal time tau, and arriving within a maximum delay sigma. We make three contributions. First, we develop the theoretical framework around temporal reachability graphs. Second, we harness our theoretical findings to propose an algorithm for their efficient computation. Finally, we demonstrate the analytic power of the temporal reachability graph concept by applying it to synthetic and real-life data sets. On top of defining clear upper bounds on communication capabilities, reachability graphs highlight asymmetric communication opportunities and offloading potential.
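The definition above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes, for illustration only, that contacts are undirected and instantaneous (a triple `(u, v, s)` means u and v meet at instant s), that a hop started at s completes at s + tau, and that "leaving after t" means a first hop at s >= t. The function name `reachability_edges` is hypothetical.

```python
def reachability_edges(contacts, t, tau, sigma):
    """Directed edges (a, b) of the (tau, sigma)-reachability graph at time t.

    Assumption: contacts are undirected, instantaneous triples (u, v, s).
    """
    if tau <= 0:
        raise ValueError("tau must be positive for the single-pass scan to be valid")
    # Keep only hops that start at or after t and finish within the delay sigma.
    window = sorted(
        ((s, u, v) for u, v, s in contacts if t <= s and s + tau <= t + sigma),
        key=lambda contact: contact[0],
    )
    nodes = {n for _, u, v in window for n in (u, v)}
    edges = set()
    for source in nodes:
        # Earliest time at which each node can hold a message emitted by source.
        arrival = {source: t}
        for s, u, v in window:
            for a, b in ((u, v), (v, u)):
                if arrival.get(a, float("inf")) <= s and s + tau < arrival.get(b, float("inf")):
                    arrival[b] = s + tau
        edges.update((source, b) for b in arrival if b != source)
    return edges


contacts = [("a", "b", 1), ("b", "c", 2), ("c", "d", 10)]
edges = reachability_edges(contacts, t=0, tau=1, sigma=5)
print(sorted(edges))
```

In this toy trace, a reaches c through b, but c cannot reach a because the a-b contact happens before the b-c contact. This is the kind of asymmetry the abstract refers to.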

Dynamic networks: from measurement to modeling

Alain Barrat

Friday, September 21, 2012 at 2 pm, room 25-26/101

Over the last decade, a substantial body of research has developed around complex networks, largely motivated by the fact that many systems can be represented as networks, that is, as a set of sites or nodes connected by links. I will present the issues surrounding dynamic complex networks through several examples: infrastructure networks and social networks. In the latter setting, I will present in particular the SocioPatterns project (http://www.sociopatterns.org/), which in recent years has developed an infrastructure able to measure social interactions in real time in a bounded space, such as a conference, offices or a hospital, and which studies the corresponding dynamic social networks. I will present the results obtained from deployments of this infrastructure, which reveal unexpected regularities in social interactions. I will also present a model of social dynamics that reproduces a number of empirically observed facts, and I will discuss some consequences of network dynamics for the processes that take place on the network. I will conclude with the perspectives opened up by the field of dynamic networks.

Towards multi-ego-centered communities: a node similarity approach

M. Danisch, J.-L. Guillaume and B. Le Grand

Int. J. of Web Based Communities, Vol. 9, No. 3, pp. 299-322, 2012

The community structure of a graph is defined in various ways in the literature: (i) Partition, where nodes can belong to only one community. This vision is unrealistic and may lead to poor results because most nodes belong to several communities in real-world networks. (ii) Overlapping community structure, which is the most natural view, but is often very difficult to identify in practice due to the complex structure of real-world networks and the huge number of such possible communities. (iii) Ego-centered community, which focuses on individual nodes' communities and seems to be a good compromise. In this paper we investigate the third vision; we propose a new similarity measure between nodes, based on opinion dynamics, to unfold ego-centered communities. We call it the carryover opinion. In addition to being parameter-free, the carryover opinion can be calculated in a very time-efficient way and can thus be used in huge graphs. We also go further in the idea of ego-centered communities by introducing the new concept of multi-ego-centered communities, i.e., focusing on the communities of a set of nodes rather than of a single node. A key idea is that, although one node generally belongs to numerous communities, a small set of appropriate nodes can fully characterize a single community.

Diffusion Cascades: Spreading Phenomena in Blog Network Communities

Abdelhamid Salah Brahim, Bénédicte Le Grand and Matthieu Latapy

Parallel Processing Letters 22(1): (2012)

A diffusion cascade occurs when information spreads from one node to the rest of the network through a succession of diffusion events. So far, diffusion phenomena have mostly been considered at a macroscopic scale, i.e., by studying all nodes of the network. We give a complementary way to analyse network interactions by considering the problem at different scales. To that end, we use the community structure of the network to characterize diffusion between nodes (and between communities) and to identify interaction behaviour patterns.

Random graph models for network analysis

Pierre Latouche

Thursday, June 14, 2012 at 11 am, room 26-00/101

Networks are widely used in the social sciences to describe interactions between individuals. In this context, many unsupervised clustering methods have been developed to extract information from network topology. Most of them partition the nodes into disjoint classes according to their connection profiles. Recent studies have highlighted the limits of these techniques: they showed that many real-world networks contain nodes known to belong to several groups simultaneously. To address this problem, we propose the Overlapping Stochastic Block Model (OSBM). This approach allows nodes to belong to more than one class and, under certain assumptions, generalizes the well-known Stochastic Block Model. We propose an inference algorithm for classifying the nodes of a network, as well as a model selection criterion for estimating the number of classes. We use this work to analyze the French political blogosphere.

Deciding on the type of the degree distribution of a graph from traceroute-like measurements

Xiaomin Wang, Matthieu Latapy, Michèle Soria

International Journal of Computer Networks & Communications (IJCNC), May 2012, Volume 4. Number 3

The degree distribution of the Internet topology is considered one of its main properties. However, it is only known through a measurement procedure which gives a biased estimate. This measurement may, to a first approximation, be modeled by a BFS (Breadth-First Search) tree. We explore here our ability to infer the type (Poisson or power-law) of the degree distribution from such limited knowledge. We design procedures which estimate the degree distribution of a graph from a BFS of it, and show experimentally (on models and real-world data) that this approach succeeds in distinguishing between Poisson and power-law degree distributions.
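The measurement bias can be made concrete with a small sketch. It only shows the degrees a BFS tree reveals and does not reproduce the estimation procedures of the paper. The function name `bfs_tree_degrees` and the toy graph are illustrative assumptions.

```python
from collections import deque


def bfs_tree_degrees(adjacency, root):
    """Degree of each node as seen in a BFS tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    # Each tree edge (child, parent) contributes one to both endpoints.
    degree = {node: 0 for node in parent}
    for node, par in parent.items():
        if par is not None:
            degree[node] += 1
            degree[par] += 1
    return degree


# Complete graph on four nodes: every true degree is 3.
complete = {n: [m for m in range(4) if m != n] for n in range(4)}
print(bfs_tree_degrees(complete, 0))
```

On this complete graph the BFS tree is a star, so only the root shows its true degree and every other node appears with degree 1.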

Relevance of SIR Model for Real-world Spreading Phenomena: Experiments on a Large-scale P2P System

Daniel F. Bernardes, Matthieu Latapy, Fabien Tarissan

Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012), Istanbul, Turkey

Understanding the spread of information on complex networks is a key issue from both a theoretical and an applied perspective. Despite the effort in developing theoretical models for this phenomenon, gauging them with large-scale real-world data remains an important challenge due to the scarcity of open, extensive and detailed data. In this paper, we explain how traces of peer-to-peer file sharing may be used for this purpose. We also perform simulations to assess the relevance of the standard SIR model to mimic key properties of spreading cascades. We examine the impact of the network topology on observed properties and finally turn to the evaluation of two heterogeneous versions of the SIR model. We conclude that all the models tested failed to reproduce key properties of such cascades: typically, real spreading cascades are relatively “elongated” compared to simulated ones. We have also observed some interesting similarities common to all SIR models tested.
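For readers unfamiliar with the SIR model, a minimal discrete-time sketch follows. It is not the simulation setup of the paper: the synchronous update, the parameter names `beta` (per-contact infection probability) and `gamma` (per-step recovery probability), and the function name `simulate_sir` are assumptions made for illustration.

```python
import random


def simulate_sir(adjacency, seed_node, beta, gamma, rng=None):
    """Run a discrete-time SIR cascade; return the infection step of each reached node."""
    if not 0 < gamma <= 1:
        raise ValueError("gamma must be in (0, 1] so that the cascade terminates")
    if rng is None:
        rng = random.Random(0)
    state = {node: "S" for node in adjacency}
    state[seed_node] = "I"
    infected_at = {seed_node: 0}
    step = 0
    while any(s == "I" for s in state.values()):
        step += 1
        infectious = [n for n, s in state.items() if s == "I"]
        newly_infected = set()
        for node in infectious:
            for neighbour in adjacency[node]:
                if state[neighbour] == "S" and rng.random() < beta:
                    newly_infected.add(neighbour)
        # Infectious nodes recover independently with probability gamma.
        for node in infectious:
            if rng.random() < gamma:
                state[node] = "R"
        for node in newly_infected:
            state[node] = "I"
            infected_at[node] = step
    return infected_at


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(simulate_sir(path, seed_node=0, beta=1.0, gamma=1.0))
```

The largest infection step gives a rough idea of how deep, or "elongated", a simulated cascade is on a given topology.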

Outskewer: Using Skewness to Spot Outliers in Samples and Time Series

Sébastien Heymann, Matthieu Latapy, Clémence Magnien

Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012), Istanbul, Turkey

Finding outliers in datasets is a classical problem of high interest for (dynamic) social network analysis. However, most methods rely on assumptions which are rarely met in practice, such as prior knowledge of some outliers or about normal behavior. We propose here Outskewer, a new approach based on the notion of skewness (a measure of the symmetry of a distribution) and its evolution when extremal values are removed one by one. Our method is easy to set up, it requires no prior knowledge on the system, and it may be used on-line. We illustrate its performance on two data sets representative of many use-cases: evolution of ego-centered views of the internet topology, and logs of queries entered into a search engine.
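The idea of tracking skewness while extremal values are removed can be sketched as follows. This is not the Outskewer method itself: the removal rule (drop the maximum when skewness is positive, the minimum otherwise) and the function names are assumptions chosen for illustration.

```python
def skewness(values):
    """Population skewness: third central moment over the 1.5 power of the variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def skewness_trace(values, steps):
    """Skewness after removing 0, 1, ..., `steps` extremal values, one at a time."""
    remaining = sorted(values)
    trace = [skewness(remaining)]
    for _ in range(steps):
        if len(remaining) < 3:
            break
        # Assumed rule: remove the extreme on the side of the longer tail.
        if trace[-1] > 0:
            remaining.pop()
        else:
            remaining.pop(0)
        trace.append(skewness(remaining))
    return trace


print(skewness_trace([1, 2, 3, 4, 5, 100], steps=1))
```

In this toy sample the skewness is strongly positive because of the value 100 and drops to zero once it is removed, which suggests that this value is an outlier.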

Complex Networks approach to Mutualistic Ecosystems

Laura Hernandez

Thursday, May 24, 2012 at 11 am, room 25-26/101

Mutualistic ecosystems are usually groups of animals and plants helping each other to fulfil essential biological functions such as feeding or reproduction, as in seed dispersal or pollination networks. Such systems may be described in terms of a complex network, where the nodes represent the animal or plant species and the links represent the existence of a contact between a plant and an animal species. As only contacts between nodes belonging to different guilds are allowed, the corresponding network is bipartite. Coding this information in a bipartite adjacency matrix, it is observed that real ecosystems are not a random collection of interacting species, but instead display a high degree of internal organization. Different hypotheses are discussed in the ecological literature to explain this particular order. It is fairly obvious that a detailed explanation of the interaction behaviour of individual species can be of little help in understanding the generalized pattern that is found across ecological systems of very different sizes and types, which involve plants of different nature and animals that range from insects to birds. The tools commonly used by ecologists to study these systems are based on the statistical analysis of observed data. In this talk I will present an alternative way to study this problem, by introducing an algorithm that allows us to test different hypotheses in the form of a Contact Preference Rule (CPR) that governs the dynamics of the system. Starting from a random configuration, the system is evolved under the studied CPR, and the comparison of the ordered state reached by this artificial system with the order observed in real systems allows us to decide whether or not a CPR may be considered responsible for the observed order.
In particular, I will introduce a new way to measure the order of mutualistic ecosystems and I will discuss the relationship between the phylogenetic proximity of the members of each guild and the observed order.

How to detect causality effects on large dynamical communication networks: A case study

L. Tabourier, A. Stoica and F. Peruani

2012 Fourth International Conference on Communication Systems and Networks (COMSNETS)

Here we propose a set of dynamical measures to detect causality effects on communication datasets. Using appropriate comparison models, we are able to enumerate patterns containing causality relationships. This approach is illustrated on a large cellphone call dataset: we show that specific patterns such as short chain-like trees and directed loops are more frequent in real networks than in comparison models at short time scales. We argue that these patterns – which involve a node and its close neighborhood – constitute indirect evidence of active spreading of information only at a local level. This suggests that mobile phone networks are used almost exclusively to communicate information to a closed group of individuals. Furthermore, our study reveals that the bursty activity of the callers promotes larger patterns at small time scales.

Classifying Relationships in Social Networks

Aline Carneiro Viana

Monday, May 14, 2012 at 11 am, room 25-26/101

The constant advancement of information systems has allowed more data to be generated and stored from the most diverse situations. It is fascinating that, behind these records, we see the reflection of the environment itself, since every record represents a decision made by some entity. In this work, we model real-world mobility scenarios using temporal complex networks. The analysis assumes that these systems are composed of entities able to interact in a rational manner, reflecting their interests and activity dynamics. To this end, we propose a technique for analyzing mobility scenarios based on random graphs. This technique examines how the real system would evolve if the agents' decisions were random; from there, one can check, for example, which edges are random and which derive from social relationships, such as friendship or professional ties.

Stable community cores in complex networks

Massoud Seifi, Jean-Loup Guillaume, Ivan Junier, Jean-Baptiste Rouquier and Svilen Iskrov

Proceedings of the 3rd Workshop on Complex Networks (CompleNet 2012), Melbourne, Florida

Complex networks are generally composed of dense sub-networks called communities. Many algorithms have been proposed to automatically detect such communities. However, they are often unstable and behave non-deterministically. We propose here to use this non-determinism in order to compute groups of nodes on which community detection algorithms agree most of the time. We show that these groups of nodes, called community cores, are more similar to the ground truth than communities in real and artificial networks. Furthermore, we show that, contrary to the classical approaches, we can reveal the absence of community structure in random graphs.
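One way to read "groups of nodes on which algorithms agree most of the time" is sketched below. It is an assumption-laden illustration, not the authors' procedure: it takes the partitions from several runs as input, counts how often each pair of nodes is classified together, and groups nodes connected by pairs whose agreement reaches a hypothetical `threshold`.

```python
from itertools import combinations


def community_cores(partitions, threshold):
    """Group nodes that share a community in at least `threshold` of the runs.

    partitions: list of dicts mapping node -> community label, one dict per run.
    """
    nodes = sorted(partitions[0])
    runs = len(partitions)
    parent = {n: n for n in nodes}  # union-find forest over nodes

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(nodes, 2):
        together = sum(p[a] == p[b] for p in partitions)
        if together / runs >= threshold:
            parent[find(a)] = find(b)

    cores = {}
    for n in nodes:
        cores.setdefault(find(n), []).append(n)
    return sorted(cores.values())


runs = [
    {1: "x", 2: "x", 3: "y", 4: "y"},
    {1: "a", 2: "a", 3: "a", 4: "b"},
    {1: "p", 2: "p", 3: "q", 4: "q"},
]
print(community_cores(runs, threshold=1.0))
print(community_cores(runs, threshold=0.6))
```

With a strict threshold only nodes 1 and 2 form a core, since nodes 3 and 4 are separated in one of the three runs. Lowering the threshold merges them.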

Community Cores in Evolving Networks

Massoud Seifi and Jean-Loup Guillaume

Proceedings of the Mining Social Network Dynamic 2012 Workshop (MSND), in conjunction with the international conference World Wide Web WWW 2012, Lyon, France, pp. 1173-1180

Community structure is a key property of complex networks. Many algorithms have been proposed to automatically detect communities in static networks but few studies have considered the detection and tracking of communities in an evolving network. Tracking the evolution of a given community over time requires a clustering algorithm that produces stable clusters. However, most community detection algorithms are very unstable and therefore unusable for evolving networks. In this paper, we apply the methodology proposed in [14] to detect what we call community cores in evolving networks. We show that cores are much more stable than "classical" communities and that we can overcome the disadvantages of the stabilized methods.

R tip: Make a ggplot output ready for publication

Intrinsically dynamic communities from evolving, directed network data

Bivas Mitra, Lionel Tabourier and Camille Roth

Computer Networks, Vol. 56(3), 2012.

Community finding algorithms for networks have recently been extended to dynamic data. Most of these recent methods aim at exhibiting community partitions from successive graph snapshots and thereafter connecting or smoothing these partitions using clever time-dependent features and sampling techniques. These approaches are nonetheless achieving longitudinal rather than dynamic community detection. We assume that communities are fundamentally defined by the repetition of interactions among a set of nodes over time. According to this definition, analyzing the data by considering successive snapshots induces a significant loss of information: we suggest that it blurs essentially dynamic phenomena, such as communities based on repeated inter-temporal interactions, nodes switching from a community to another across time, or the possibility that a community survives while its members are being integrally replaced over a longer time period. We propose a formalism which aims at tackling this issue in the context of time-directed datasets (such as citation networks), and present several illustrations on both empirical and synthetic dynamic networks. We eventually introduce intrinsically dynamic metrics to qualify temporal community structure and emphasize their possible role as an estimator of the quality of the community detection, taking into account the fact that various empirical contexts may call for distinct ‘community’ definitions and detection criteria.

Impact of clustering on epidemics in random networks

Emilie Coupechoux

Monday, April 2, 2012 at 2 pm, room 55-65/211

Motivated by the analysis of social networks, we study a model of network that has both a given degree distribution and a tunable clustering coefficient. We analyze two types of epidemic processes on this random graph model: a diffusion process, which is characterized by an infection probability, each neighbor transmitting the epidemic independently, and a contagion model, which is inspired by a simple coordination game played on the network. Both types of processes have been used to model spread of new ideas, technologies, viruses or worms and results have been obtained for random graphs with no clustering. In this talk, we are interested in the impact of clustering on the growth processes. In both cases, we characterize conditions under which a global cascade is possible, and compute the cascade size explicitly, as a function of the degree distribution and the clustering coefficient. While clustering inhibits the diffusion process (in power-law and regular graphs), its impact for the contagion process is more subtle and depends on the connectivity of the graph: in a low connectivity regime, clustering also inhibits the contagion, while in a high connectivity regime, clustering favors the appearance of global cascades but reduces their size.

File Diffusion in a Dynamic Peer-to-peer Network

Alice Albano, Jean-Loup Guillaume, and Bénédicte Le Grand

Proceedings of the Mining Social Network Dynamic 2012 Workshop (MSND), in conjunction with the international conference World Wide Web WWW 2012, Lyon, France

Many studies have been made on diffusion in the field of epidemiology, and in the last few years, the development of social networking has induced new types of diffusion. In this paper, we focus on file diffusion on a dynamic peer-to-peer network using the eDonkey protocol. On this network, we observe a linear behavior of the actual file diffusion. This result is interesting, because most diffusion models exhibit exponential behaviors. We propose a new model of diffusion, based on the SI (Susceptible / Infected) model, which produces results close to the linear behavior of the observed diffusion. We then justify the linearity of this model, and we study its behavior in more detail.

Quantifying Paedophile Activity in a Large P2P System

Matthieu Latapy, Clémence Magnien and Raphaël Fournier

in Information Processing and Management, Volume 49, Issue 1, January 2013, Pages 248–263

Increasing knowledge of paedophile activity in P2P systems is a crucial societal concern, with important consequences on child protection, policy making, and internet regulation. Because of a lack of traces of P2P exchanges and rigorous analysis methodology, however, current knowledge of this activity remains very limited. We consider here a widely used P2P system, eDonkey, and focus on two key statistics: the fraction of paedophile queries entered in the system and the fraction of users who entered such queries. We collect hundreds of millions of keyword-based queries; we design a paedophile query detection tool for which we establish false positive and false negative rates using assessment by experts; with this tool and these rates, we then estimate the fraction of paedophile queries in our data; finally, we design and apply methods for quantifying users who entered such queries. We conclude that approximately 0.25% of queries are paedophile, and that more than 0.2% of users enter such queries. These statistics are by far the most precise and reliable ever obtained in this domain.
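The step from a detection tool with known error rates to an estimated fraction can be illustrated with a standard correction. It is not necessarily the estimator used in the paper. If a fraction p of items is truly positive, the tool flags a fraction q = p(1 - FNR) + (1 - p)FPR, which can be inverted for p. The function name and all numbers below are hypothetical.

```python
def corrected_fraction(observed, false_positive_rate, false_negative_rate):
    """Estimate the true positive fraction from the flagged fraction.

    Inverts observed = p * (1 - FNR) + (1 - p) * FPR for p.
    """
    denominator = 1.0 - false_positive_rate - false_negative_rate
    if denominator <= 0:
        raise ValueError("error rates too high: the tool is no better than chance")
    estimate = (observed - false_positive_rate) / denominator
    # Clamp to [0, 1]: sampling noise can push the raw estimate outside the range.
    return min(1.0, max(0.0, estimate))


# Hypothetical numbers: true fraction 0.1, FPR 0.02, FNR 0.2.
flagged = 0.1 * (1 - 0.2) + (1 - 0.1) * 0.02
print(corrected_fraction(flagged, 0.02, 0.2))
```

The correction matters most when the true fraction is small, because false positives can then account for a large share of the flagged items.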

Dynamics on and of subway networks

Camille Roth

Friday, March 2, 2012 at 2 pm, room 25/26-101

Subway networks shape, to some extent, the structure of movements of individuals across a city; similarly, they are being partially shaped by the presence of these individuals in the city. This talk will present two complementary studies describing the dynamic processes which subway networks both host and undergo. The first analysis focuses on dynamic processes occurring on the subway network of a large city (London) in terms of its commuting patterns. It uses the large-scale, real-time electronic ticketing data from the Oyster Card system, introduced less than a decade ago, to reveal a part of the structure and organization of the city. More precisely, this study shows that patterns of intraurban movement are strongly heterogeneous in terms of volume, but not in terms of distance travelled, and that there is a polycentric structure composed of large flows organized around a limited number of activity centers. For smaller flows, the pattern of connections becomes richer and more complex and is not strictly hierarchical, since it mixes different levels consisting of different orders of magnitude. The second study investigates the temporal evolution of the major subway networks in the world over the last century. The main result is that most of these networks tend to converge to a shape which shares some generic features, despite their geographical and economic differences. These features include a core with branches radiating from it to cover about twice the average radial extension of the core. The core generally includes about 60% of the network stations and exhibits an average degree of order 2.5. Interestingly, core and branches define two distinct and universal regimes in terms of the number of stations at a given distance from the barycenter. This result, which was difficult to interpret in the framework of fractal geometry, finds here a natural explanation.
More broadly, these two types of studies open the way to more integrated analyses of the coevolution between the dynamics on and of subway networks.