Discovering Patterns of Interest in IP Traffic Using Cliques in Bipartite Link Streams

Tiphaine Viard, Raphaël Fournier-S’niehotta, Clémence Magnien and Matthieu Latapy

International Conference on Complex Networks (CompleNet), 2018.

Studying IP traffic is crucial for many applications. We focus here on the detection of (structurally and temporally) dense sequences of interactions, that may indicate botnets or coordinated network scans. More precisely, we model a MAWI capture of IP traffic as a link streams, i.e. a sequence of interactions (t1,t2,u,v) meaning that devices u and v exchanged packets from time t1 to time t2 . This traffic is captured on a single router and so has a bipartite structure: links occur only between nodes in two disjoint sets. We design a method for finding interesting bipartite cliques in such link streams, i.e. two sets of nodes and a time interval such that all nodes in the first set are linked to all nodes in the second set throughout the time interval. We then explore the bipartite cliques present in the considered trace. Comparison with the MAWILab classification of anomalous IP addresses shows that the found cliques succeed in detecting anomalous network activity.

Stream Graphs and Link Streams for the Modeling of Interactions over Time

Matthieu Latapy, Tiphaine Viard and Clémence Magnien

Social Networks Analysis and Mining, 8: 61, 2018

Graph theory provides a language for studying the structure of relations, and it is often used to study interactions over time too. However, it poorly captures the both temporal and structural nature of interactions, that calls for a dedicated formalism. In this paper, we generalize graph concepts in order to cope with both aspects in a consistent way. We start with elementary concepts like density, clusters, or paths, and derive from them more advanced concepts like cliques, degrees, clustering coefficients, or connected components. We obtain a language to directly deal with interactions over time, similar to the language provided by graphs to deal with relations. This formalism is self-consistent: usual relations between different concepts are preserved. It is also consistent with graph theory: graph concepts are special cases of the ones we introduce. This makes it easy to generalize higher-level objects such as quotient graphs, line graphs, k-cores, and centralities. This paper also considers discrete versus continuous time assumptions, instantaneous links, and extensions to more complex cases.

Comparaison de mesures de centralité basées sur les plus courts chemins dans les réseaux dynamiques

Marwan Ghanem , Clémence Magnien, Fabien Tarissan

Extraction et Gestion de Connaissances (EGC 2018)

Définir l’importance des noeuds dans les réseaux statiques est une question de recherche très étudiée depuis de nombreuses années. Dernièrement, des adaptations des métriques classiques ont été proposées pour les réseaux dynamiques. Ces méthodes reposent sur des approches très différentes dans leur façon d’évaluer l’importance des noeuds à un instant donné. Il est donc nécessaire de pouvoir les évaluer et les comparer. Dans cet article, nous comparons trois approches existes pour mieux comprendre ce qui les différencie. Nous montrons que la nature des jeux de données influe grandement sur le comportement des méthodes, et que pour certains d’entre eux, la notion d’importance n’est pas toujours  pertinente.

Culture et (in)dépendance – Les enjeux de lindépendance dans les industries culturelles

Olivier Alexandre, Sophie Noël and Aurélie Pinto

ICCA – Industries culturelles, création, numérique, 2017

« Indépendant », « alternatif », « indie », « underground », « avant-garde », « de création »… Depuis les années 1970, la revendication d’indépendance a pris une importance grandissante dans les univers de production culturelle. Qu’elle se rapporte à des contenus, des méthodes de travail ou des dispositifs de médiation, cette revendication propose une alternative à la domination des groupes et des productions mainstream. Son succès conduit cependant à s’interroger sur la cohérence même d’une notion progressivement transformée en label de qualité. À travers douze contributions traitant de l’édition, du cinéma, de la musique, des médias et de la vulgarisation scientifique, cet ouvrage montre en quoi l’indépendance relève d’une construction sociale tributaire de son environnement institutionnel et marchand. Des ondes aux écrans, de l’Europe aux États-Unis, des managers aux artistes, il met en évidence le balancement entre artisanat de création et recherche d’une structuration économique pérenne. En mettant à distance la dénonciation ritualisée de l’hégémonie des majors et autres « grands groupes » et en s’appuyant sur des terrains ancrés dans différents contextes nationaux, ce livre fait le pari d’une approche transversale pour mieux saisir la manière dont l’indépendance irrigue et structure des filières trop souvent envisagées de manière cloisonnée. Il éclaire ainsi une catégorie de référence des industries culturelles paradoxalement peu étudiée par les sciences sociales, et permet de saisir l’évolution des rapports de force dans des secteurs confrontés à une rationalisation économique et à des mutations technologiques de grande ampleur.

Tracking bitcoin users activity using community detection on a network of weak signals.

Rémy Cazabet, Rym Baccour and Matthieu Latapy

The 6th International Conference on Complex Networks and Their Applications, Nov 2017, Lyon, France.

Abstract Bitcoin is a cryptocurrency attracting a lot of interest both from the gen- eral public and researchers. There is an ongoing debate on the question of users? anonymity: while the Bitcoin protocol has been designed to ensure that the activ- ity of individual users could not be tracked, some methods have been proposed to partially bypass this limitation. In this article, we show how the Bitcoin transac- tion network can be studied using complex networks analysis techniques, and in particular how community detection can be efficiently used to re-identify multiple addresses belonging to a same user.

Download

Dynamic Community Detection

Cazabet, Rémy, Giulio Rossetti, and Fredéric Amblard. 

Encyclopedia of Social Network Analysis and Mining, 2017

Impact of temporal features of cattle exchanges on the size and speed of epidemic outbreaks

Aurore Payen, Lionel Tabourier and Matthieu Latapy

ICCSA 2017, Part II, LNCS 10405 proceedings

Databases recording cattle exchanges offer unique opportunities for a better understanding and fighting of disease spreading. Most studies model contacts with (sequences of) networks, but this approach neglects important dynamical features of exchanges, that are known to play a key role in spreading. We use here a fully dynamic modeling of contacts and empirically compare the spreading outbreaks obtained with it to the ones obtained with network approaches. We show that neglecting time information leads to significant over-estimates of actual sizes of spreading cascades, and that these sizes are much more heterogeneous than generally assumed. Our approach also makes it possible to study the speed of spreading, and we show that the observed speeds vary greatly, even for a same cascade size.

Download

Combining structural and dynamic information to predict activity in link streams

Thibaud Arnoux, Lionel Tabourier and Matthieu Latapy

In proceedings of the International Symposium on Foundations and Applications of Big Data Analytics (FAB), in conjunction with ASONAM, 2017.

A link stream is a sequence of triplets (t, u, v) meaning that nodes u and v have interacted at time t. Capturing both the structural and temporal aspects of interactions is crucial for many real world datasets like contact between individuals. We tackle the issue of activity prediction in link streams, that is to say predicting the number of links occurring during a given period of time and we present a protocol that takes advantage of the temporal and structural information contained in the link stream. We introduce a way to represent the information captured using different features and combine them in a prediction function which is used to evaluate the future activity of links.

Download

Ego-betweenness centrality in link streams

Marwan Ghanem, Florent Coriat and Lionel Tabourier

6th Workshop on Social Network Analysis in Applications (SNAA 2017), in conjunction with ASONAM, 2017.

The ability of a node to relay information in a network is often measured using betweenness centrality. In order to take into account the fact that the role of the nodes vary through time, several adaptations of this concept have been proposed to time- evolving networks. However, these definitions are demanding in terms of computational cost, as they call for the computation of time-ordered paths. We propose a definition of centrality in link streams which is node-centric, in the sense that we only take into account the direct neighbors of a node to compute its centrality. This restriction allows to carry out the computation in a shorter time compared to a case where any couple of nodes in the network should be considered. Tests on empirical data show that this measure is relatively highly correlated to the number of times a node would relay information in a flooding process. We suggest that this is a good indication that this measurement can be of use in practical contexts where a node has a limited knowledge of its environment, such as routing protocols in delay tolerant networks.

Download

Tracking the evolution of temporal patterns of usage in bicycle-Sharing systems using nonnegative matrix factorization on multiple sliding windows

Remy Cazabet, Pablo Jensen, Pierre Borgnat

International Journal of Urban Sciences, 2017, p. 1-15

Bicycle-Sharing Systems (BSS) are growing quickly in popularity all over the world. In this article, we propose a method based on Nonnegative Matrix Factorization to study the typical temporal patterns of usage of the BSS of Lyon, France, by studying logs of rentals. First, we show how this approach allows us to understand the spatial and temporal usage of the system. Second, we show how we can track the evolution of these temporal patterns over several years, and how this information can be used to better understand the BSS, but also changes in the city itself, by considering the stations as social sensors.

Download

Time Weight Content-Based Extensions of Temporal Graphs for Personalized Recommendation

Armel Jacques Nzekon Nzeko’o, Maurice Tchuente, Matthieu Latapy

In WEBIST 2017

Recommender systems are an answer to information overload on the web. They filter and present to customer, a small subset of items that he is most likely to be interested in. Since user’s interests may change over time, accurately capturing these dynamics is important, though challenging. The Session-based Temporal Graph (STG) has been proposed by Xiang et al. to provide temporal recommendations by combining long- and short-term preferences. Later, Yu et al. have introduced an extension called Topic-STG, which takes into account topics extracted from tweets’ textual information. Recently, we pushed the idea further and proposed Content-based STG. However, in all these frameworks, the importance of links does not depend on their arrival time, which is a strong limitation: at any given time, purchases made last week should have a greater influence than purchases made a year ago. In this paper, we address this problem by proposing Time Weight Content-based STG, in which we assign a time-decreasing weight to edges. Using Time-Averaged Hit Ratio, we show that this approach outperforms all previous ones in real-world situations.

Download

Using Degree Constrained Gravity Null-Models to understand the structure of journeys networks in Bicycle Sharing Systems

Remy Cazabet, Pierre Borgnat, Pablo jensen

In ESANN 2017

Bicycle Sharing Systems are now ubiquitous in large cities around the world. In most of these systems, journeys’ data can be ex- tracted, providing rich information to better understand it. Recent works have used network based-machine learning, and in particular space-corrected node clustering, to analyse such datasets. In this paper, we show that spatial-null models used in previous methods have a systematic bias, and we propose a degree-contrained null-model to improve the results. We finally apply the proposed method on the BSS of a city.

Download

Enhancing Space-Aware Community Detection Using Degree Constrained Spatial Null Model

Remy Cazabet, Pierre Borgnat, Pablo Jensen

In Complenet 2017

Null models have many applications on networks, from testing the significance of observations to the conception of algorithms such as community detection. They usually preserve some network properties , such as degree distribution. Recently, some null-models have been proposed for spatial networks, and applied to the community detection problem. In this article, we propose a new null-model adapted to spatial networks, that, unlike previous ones, preserves both the spatial structure and the degrees of nodes. We show the efficacy of this null-model in the community detection case both on synthetic and collected networks.

Download

Characterising inter and intra-community interactions in link streams using temporal motifs

Jean Creusefond, Remy Cazabet

In Complenet 2017

The analysis of dynamic networks has received a lot of attention in recent years, thanks to the greater availability of suitable datasets. One way to analyze such dataset is to study temporal motifs in link streams , i.e. sequences of links for which we can assume causality. In this article, we study the relationship between temporal motifs and communities , another important topic of complex networks. Through experiments on several real-world networks, with synthetic and ground truth community partitions, we identify motifs that are overrepresented at the frontier –or inside of– communities.

Download

Rigorous Measurement of the Internet Degree Distribution

Matthieu Latapy, Elie Rotenberg, Christophe Crespelle, Fabien Tarissan

In Complex Systems, volume 26, number 1 (2017)

The degree distribution of the internet, i.e. the fraction of routers with k links for any k, is its most studied property. It has a crucial influence on network robustness, spreading phenomena, and protocol design. In practice, however, this distribution is observed on partial, biased and erroneous maps. This raises serious concerns about the true knowledge we actually have of this key property. Here, we design and run a drastically new measurement approach for the reliable estimation of the degree distribution of the internet, without resorting to any map. It consists in sampling random core routers and precisely estimating their degree with probes sent from many monitors scattered over the internet. Our measurement shows that the true degree distribution significantly differs from classical assumptions: it is heterogeneous but it decreases sharply, in a way incompatible with a heavy-tailed power law.

Download

Finding remarkably dense sequences of contacts in link streams

Noé Gaumont, Clémence Magnien and Matthieu Latapy

In Social Network Analysis and Mining (2016) 6: 87. doi:10.1007/s13278-016-0396-z

A link stream is a set of quadruplets (b, e, u, v) meaning that a link exists between u and v from time b to time e. Link streams model many real-world situations like contacts between individuals, connections between devices, and others. Much work is currently devoted to the generalization of classical graph and network concepts to link streams. We argue that the density is a valuable notion for understanding and characterizing links streams. We propose a method to capture specific groups of links that are structurally and temporally densely connected and show that they are meaningful for the description of link streams. To find such groups, we use classical graph community detection algorithms, and we assess obtained groups. We apply our method to several real-world contact traces (captured by sensors) and demonstrate the relevance of the obtained structures.

Download

P2PTV Multi-channel Peers Analysis

Marwan Ghanem, Olivier Fourmaux, Fabien Tarissan and Takumi Miyoshi

In The 18th Asia-Pacific Network Operations and Management Symposium (APNOMS’16), Kanazawa, Japan, 2016.

After being the support of the data and voice convergence, the Internet has become one of the main video providers such as TV-stream. As an  alternative to limited or expensive technologies, P2PTV has turned out to be a promising support for such applications. This infrastructure strongly relies on the overlay composed by the peers that consume and diffuse video contents at the same time. Understanding the dynamical properties of this overlay, and in particular how the users switch from one overlay to another, appears to be a key aspect if one wants to improve the quality of P2PTV. In this paper, we investigate the question of relying on non-invasive measurement techniques to track the presence of users on several channels of P2PTV. Using two datasets obtained by using network measurement on P2PTV infrastructure, we show that such an approach contains sufficient information to track the presence of users on several channels. Besides, exploiting the view provided by sliding time windows, we are able to refine the analysis and track users that switch from one channel to another, leading to the detection of super-peers and providing explanations of the different roles they can play in the infrastructure. In addition, by comparing the results obtained on the two datasets, we show how such analyses can shed some light on the evolution of the infrastructure policy.

Download

Predicting links in ego-networks using temporal information

Lionel Tabourier, Anne-Sophie Libert and Renaud Lambiotte

In EPJ Data Science (2016) 5: 1

Link prediction appears as a central problem of network science, as it calls for unfolding the mechanisms that govern the micro-dynamics of the network. In this work, we are interested in ego-networks, that is the mere information of interactions of a node to its neighbors, in the context of social relationships. As the structural information is very poor, we rely on another source of information to predict links among egos’ neighbors: the timing of interactions. We define several features to capture different kinds of temporal information and apply machine learning methods to combine these various features and improve the quality of the prediction. We demonstrate the efficiency of this temporal approach on a cellphone interaction dataset, pointing out features which prove themselves to perform well in this context, in particular the temporal profile of interactions and elapsed time between contacts.

Download

Characterizing and predicting mobile application usage

Keun-Woo Lim, Stefano Secci, Lionel Tabourier and Badis Tebbani

In Computer Communications, 2016, vol. 95, p. 82-94

In this paper, we propose data clustering techniques to predict temporal characteristics of data consumption behavior of different mobile applications via wireless communications. While most of the research on mobile data analytics focuses on the analysis of call data records and mobility traces, our analysis concentrates on mobile application usages, to characterize them and predict their behavior. We exploit mobile application usage logs provided by a Wi-Fi local area network service provider to characterize temporal behavior of mobile applications. More specifically, we generate daily profiles of “what” types of mobile applications users access and “when” users access them. From these profiles, we create usage classes of mobile applications via aggregation of similar profiles depending on data consumption rate, using three clustering techniques that we compare. Furthermore, we show that we can utilize these classes to analyze and predict future usages of each mobile application through progressive comparison using distance and similarity comparison techniques. Finally, we also detect and exploit outlying behavior in application usage profiles and discuss methods to efficiently predict them.

Download

Analysis of the temporal and structural features of threads in a mailing-list

Noé Gaumont, Tiphaine Viard, Raphaël Fournier-S’niehotta, Qinna Wang and Matthieu Latapy

In Complex Networks VII: Proceedings of the 2016 Workshop on Complex Networks.

A link stream is a collection of triplets (t,u,v) indicating that an interaction occurred between u and v at time t. Link streams model many real-world situations like email exchanges between individuals, connections between devices, and others. Much work is currently devoted to the generalization of classical graph and network concepts to link streams. In this paper, we generalize the existing notions of intra-community density and inter-community density. We focus on emails exchanges in the Debian mailing list, and show that threads of emails, like communities in graphs, are dense subsets loosely connected from a link stream perspective.