Combining structural and dynamic information to predict activity in link streams

Thibaud Arnoux, Lionel Tabourier and Matthieu Latapy

In proceedings of the International Symposium on Foundations and Applications of Big Data Analytics (FAB), in conjunction with ASONAM, 2017.

A link stream is a sequence of triplets (t, u, v) meaning that nodes u and v have interacted at time t. Capturing both the structural and temporal aspects of interactions is crucial for many real world datasets like contact between individuals. We tackle the issue of activity prediction in link streams, that is to say predicting the number of links occurring during a given period of time and we present a protocol that takes advantage of the temporal and structural information contained in the link stream. We introduce a way to represent the information captured using different features and combine them in a prediction function which is used to evaluate the future activity of links.

Download

Ego-betweenness centrality in link streams

Marwan Ghanem, Florent Coriat and Lionel Tabourier

6th Workshop on Social Network Analysis in Applications (SNAA 2017), in conjunction with ASONAM, 2017.

The ability of a node to relay information in a network is often measured using betweenness centrality. In order to take into account the fact that the role of the nodes vary through time, several adaptations of this concept have been proposed to time- evolving networks. However, these definitions are demanding in terms of computational cost, as they call for the computation of time-ordered paths. We propose a definition of centrality in link streams which is node-centric, in the sense that we only take into account the direct neighbors of a node to compute its centrality. This restriction allows to carry out the computation in a shorter time compared to a case where any couple of nodes in the network should be considered. Tests on empirical data show that this measure is relatively highly correlated to the number of times a node would relay information in a flooding process. We suggest that this is a good indication that this measurement can be of use in practical contexts where a node has a limited knowledge of its environment, such as routing protocols in delay tolerant networks.

Download

Tracking the evolution of temporal patterns of usage in bicycle-Sharing systems using nonnegative matrix factorization on multiple sliding windows

Remy Cazabet, Pablo Jensen, Pierre Borgnat

International Journal of Urban Sciences, 2017, p. 1-15

Bicycle-Sharing Systems (BSS) are growing quickly in popularity all over the world. In this article, we propose a method based on Nonnegative Matrix Factorization to study the typical temporal patterns of usage of the BSS of Lyon, France, by studying logs of rentals. First, we show how this approach allows us to understand the spatial and temporal usage of the system. Second, we show how we can track the evolution of these temporal patterns over several years, and how this information can be used to better understand the BSS, but also changes in the city itself, by considering the stations as social sensors.

Download

Time Weight Content-Based Extensions of Temporal Graphs for Personalized Recommendation

Armel Jacques Nzekon Nzeko’o, Maurice Tchuente, Matthieu Latapy

In WEBIST 2017

Recommender systems are an answer to information overload on the web. They filter and present to customer, a small subset of items that he is most likely to be interested in. Since user’s interests may change over time, accurately capturing these dynamics is important, though challenging. The Session-based Temporal Graph (STG) has been proposed by Xiang et al. to provide temporal recommendations by combining long- and short-term preferences. Later, Yu et al. have introduced an extension called Topic-STG, which takes into account topics extracted from tweets’ textual information. Recently, we pushed the idea further and proposed Content-based STG. However, in all these frameworks, the importance of links does not depend on their arrival time, which is a strong limitation: at any given time, purchases made last week should have a greater influence than purchases made a year ago. In this paper, we address this problem by proposing Time Weight Content-based STG, in which we assign a time-decreasing weight to edges. Using Time-Averaged Hit Ratio, we show that this approach outperforms all previous ones in real-world situations.

Download

Using Degree Constrained Gravity Null-Models to understand the structure of journeys networks in Bicycle Sharing Systems

Remy Cazabet, Pierre Borgnat, Pablo jensen

In ESANN 2017

Bicycle Sharing Systems are now ubiquitous in large cities around the world. In most of these systems, journeys’ data can be ex- tracted, providing rich information to better understand it. Recent works have used network based-machine learning, and in particular space-corrected node clustering, to analyse such datasets. In this paper, we show that spatial-null models used in previous methods have a systematic bias, and we propose a degree-contrained null-model to improve the results. We finally apply the proposed method on the BSS of a city.

Download

Enhancing Space-Aware Community Detection Using Degree Constrained Spatial Null Model

Remy Cazabet, Pierre Borgnat, Pablo Jensen

In Complenet 2017

Null models have many applications on networks, from testing the significance of observations to the conception of algorithms such as community detection. They usually preserve some network properties , such as degree distribution. Recently, some null-models have been proposed for spatial networks, and applied to the community detection problem. In this article, we propose a new null-model adapted to spatial networks, that, unlike previous ones, preserves both the spatial structure and the degrees of nodes. We show the efficacy of this null-model in the community detection case both on synthetic and collected networks.

Download

Characterising inter and intra-community interactions in link streams using temporal motifs

Jean Creusefond, Remy Cazabet

In Complenet 2017

The analysis of dynamic networks has received a lot of attention in recent years, thanks to the greater availability of suitable datasets. One way to analyze such dataset is to study temporal motifs in link streams , i.e. sequences of links for which we can assume causality. In this article, we study the relationship between temporal motifs and communities , another important topic of complex networks. Through experiments on several real-world networks, with synthetic and ground truth community partitions, we identify motifs that are overrepresented at the frontier –or inside of– communities.

Download

Rigorous Measurement of the Internet Degree Distribution

Matthieu Latapy, Elie Rotenberg, Christophe Crespelle, Fabien Tarissan

In Complex Systems, volume 26, number 1 (2017)

The degree distribution of the internet, i.e. the fraction of routers with k links for any k, is its most studied property. It has a crucial influence on network robustness, spreading phenomena, and protocol design. In practice, however, this distribution is observed on partial, biased and erroneous maps. This raises serious concerns about the true knowledge we actually have of this key property. Here, we design and run a drastically new measurement approach for the reliable estimation of the degree distribution of the internet, without resorting to any map. It consists in sampling random core routers and precisely estimating their degree with probes sent from many monitors scattered over the internet. Our measurement shows that the true degree distribution significantly differs from classical assumptions: it is heterogeneous but it decreases sharply, in a way incompatible with a heavy-tailed power law.

Download

Finding remarkably dense sequences of contacts in link streams

Noé Gaumont, Clémence Magnien and Matthieu Latapy

In Social Network Analysis and Mining (2016) 6: 87. doi:10.1007/s13278-016-0396-z

A link stream is a set of quadruplets (b, e, u, v) meaning that a link exists between u and v from time b to time e. Link streams model many real-world situations like contacts between individuals, connections between devices, and others. Much work is currently devoted to the generalization of classical graph and network concepts to link streams. We argue that the density is a valuable notion for understanding and characterizing links streams. We propose a method to capture specific groups of links that are structurally and temporally densely connected and show that they are meaningful for the description of link streams. To find such groups, we use classical graph community detection algorithms, and we assess obtained groups. We apply our method to several real-world contact traces (captured by sensors) and demonstrate the relevance of the obtained structures.

Download

P2PTV Multi-channel Peers Analysis

Marwan Ghanem, Olivier Fourmaux, Fabien Tarissan and Takumi Miyoshi

In The 18th Asia-Pacific Network Operations and Management Symposium (APNOMS’16), Kanazawa, Japan, 2016.

After being the support of the data and voice convergence, the Internet has become one of the main video providers such as TV-stream. As an  alternative to limited or expensive technologies, P2PTV has turned out to be a promising support for such applications. This infrastructure strongly relies on the overlay composed by the peers that consume and diffuse video contents at the same time. Understanding the dynamical properties of this overlay, and in particular how the users switch from one overlay to another, appears to be a key aspect if one wants to improve the quality of P2PTV. In this paper, we investigate the question of relying on non-invasive measurement techniques to track the presence of users on several channels of P2PTV. Using two datasets obtained by using network measurement on P2PTV infrastructure, we show that such an approach contains sufficient information to track the presence of users on several channels. Besides, exploiting the view provided by sliding time windows, we are able to refine the analysis and track users that switch from one channel to another, leading to the detection of super-peers and providing explanations of the different roles they can play in the infrastructure. In addition, by comparing the results obtained on the two datasets, we show how such analyses can shed some light on the evolution of the infrastructure policy.

Download

Predicting links in ego-networks using temporal information

Lionel Tabourier, Anne-Sophie Libert and Renaud Lambiotte

In EPJ Data Science (2016) 5: 1

Link prediction appears as a central problem of network science, as it calls for unfolding the mechanisms that govern the micro-dynamics of the network. In this work, we are interested in ego-networks, that is the mere information of interactions of a node to its neighbors, in the context of social relationships. As the structural information is very poor, we rely on another source of information to predict links among egos’ neighbors: the timing of interactions. We define several features to capture different kinds of temporal information and apply machine learning methods to combine these various features and improve the quality of the prediction. We demonstrate the efficiency of this temporal approach on a cellphone interaction dataset, pointing out features which prove themselves to perform well in this context, in particular the temporal profile of interactions and elapsed time between contacts.

Download

Characterizing and predicting mobile application usage

Keun-Woo Lim, Stefano Secci, Lionel Tabourier and Badis Tebbani

In Computer Communications, 2016, vol. 95, p. 82-94

In this paper, we propose data clustering techniques to predict temporal characteristics of data consumption behavior of different mobile applications via wireless communications. While most of the research on mobile data analytics focuses on the analysis of call data records and mobility traces, our analysis concentrates on mobile application usages, to characterize them and predict their behavior. We exploit mobile application usage logs provided by a Wi-Fi local area network service provider to characterize temporal behavior of mobile applications. More specifically, we generate daily profiles of “what” types of mobile applications users access and “when” users access them. From these profiles, we create usage classes of mobile applications via aggregation of similar profiles depending on data consumption rate, using three clustering techniques that we compare. Furthermore, we show that we can utilize these classes to analyze and predict future usages of each mobile application through progressive comparison using distance and similarity comparison techniques. Finally, we also detect and exploit outlying behavior in application usage profiles and discuss methods to efficiently predict them.

Download

Analysis of the temporal and structural features of threads in a mailing-list

Noé Gaumont, Tiphaine Viard, Raphaël Fournier-S’niehotta, Qinna Wang and Matthieu Latapy

In Complex Networks VII: Proceedings of the 2016 Workshop on Complex Networks.

A link stream is a collection of triplets (t,u,v) indicating that an interaction occurred between u and v at time t. Link streams model many real-world situations like email exchanges between individuals, connections between devices, and others. Much work is currently devoted to the generalization of classical graph and network concepts to link streams. In this paper, we generalize the existing notions of intra-community density and inter-community density. We focus on emails exchanges in the Debian mailing list, and show that threads of emails, like communities in graphs, are dense subsets loosely connected from a link stream perspective.

Computing maximal cliques in link streams

Tiphaine Viard, Matthieu Latapy and Clémence Magnien

Theoretical Computer Science (TCS), Volume 609, Part 1, 4 January 2016, Pages 245–252.

A link stream is a collection of triplets (t, u, v) indicating that an interaction occurred between u and v at time t. We generalize the classical notion of cliques in graphs to such link streams: for a given , a -clique is a set of nodes and a time interval such that all pairs of nodes in this set interact at least once during each sub-interval of duration . We propose an algorithm to enumerate all maximal (in terms of nodes or time interval) cliques of a link stream, and illustrate its practical relevance on a real-world contact trace.

Efficient and simple generation of random simple connected graphs with prescribed degree sequence

Fabien Viger and Matthieu Latapy

Journal of Complex Networks (2015) 4 (1): 15-37

We address here the problem of generating random graphs uniformly from the set of simple connected graphs having a prescribed degree sequence. Our goal is to provide an algorithm designed for practical use both because of its ability to generate very large graphs (efficiency) and because it is easy to implement (simplicity). We focus on a family of heuristics for which we introduce optimality conditions, and show how this optimality can be reached in practice. We then propose a different approach, specifically designed for real-world degree distributions, which outperforms the first one. Based on a conjecture which we argue rigorously and which was confirmed by strong empirical evidence, we finally reduce the best asymptotic complexity bound known so far.

RankMerging: Apprentissage supervisé de classements pour la prédiction de liens dans les grands réseaux sociaux

Lionel Tabourier, Anne-Sophie Libert et Renaud Lambiotte

EGC 2015, 15ème conférence internationale sur l’extraction et la gestion des connaissances

Trouver les liens manquants dans un grand réseau social est une tâche difficile, car ces réseaux sont peu denses, et les liens peuvent correspondre à des environnements structurels variés. Dans cet article, nous décrivons RankMerging, une méthode d’apprentissage supervisé simple pour combiner l’information obtenue par différentes méthodes de classement. Afin d’illustrer son intérêt, nous l’appliquons à un réseau d’utilisateurs de téléphones portables, pour montrer comment un opérateur peut détecter des liens entre les clients de ses concurrents. Nous montrons que RankMerging surpasse les méthodes à disposition pour prédire un nombre variable de liens dans un grand graphe épars.

Revealing intricate properties of communities in the bipartite structure of online social networks

Raphaël Tackx, Jean-Loup Guillaume and Fabien Tarissan

In IEEE Ninth International Conference on Research Challenges in Information Science (RCIS’15), Athènes, Greece, 2015

Many real-world networks based on human activities exhibit a bipartite structure. Although bipartite graphs seem appropriate to analyse and model their properties, it has been shown that standard metrics fail to reproduce intricate patterns observed in real networks. In particular, the overlapping of the neighbourhood of communities is difficult to capture precisely. In this work, we tackle this issue by analysing the structure of 4 real-world networks coming from online social activities. We first analyse their structure using standard metrics. Surprisingly, the clustering coefficient turns out to be less relevant than the redundancy coefficient to account for overlapping patterns. We then propose new metrics, namely the dispersion and the monopoly coefficients, and show that they help refining the study of bipartite overlaps. Finally, we compare the results obtained on real networks with the ones obtained on random bipartite models. This shows that the patterns captured by the redundancy and the dispersion coefficients are strongly related to the real nature of the observed overlaps.

Download

Temporal properties of legal decision networks: a case study from the International Criminal Court

Fabien Tarissan and Raphaëlle Nollez-Goldbach

In 28th International Conference on Legal Knowledge and Information Systems (JURIX’15), Braga, Portugal, 2015.

Many studies have proposed to apply artificial intelligence techniques to legal networks, whether it be for highlighting legal reasoning, resolving conflict or extracting information from legal databases. In this context, a new line of research has recently emerged which consists in considering legal decisions as elements of complex networks and conduct a structural analysis of the relations between the decisions. It has proved to be efficient for detecting important decisions in legal rulings. In this paper, we follow this approach and propose to extend structural analyses with temporal properties. We define in particular the notion of relative in-degree, temporal distance and average longevity and use those metrics to rank the legal decisions of the two first trials of the International Criminal Court. The results presented in this paper highlight non trivial temporal properties of those legal networks, such as the presence of decisions with an unexpected high longevity, and show the relevance of the proposed relative in-degree property to detect landmark decisions. We validate the outcomes by confronting the results to the one obtained with the standard in-degree property and provide juridical explanations of the decisions identified as important by our approach.

Download

Augmenter les retweets sur Twitter : comment tirer parti des mentions ?

Soumajit Pramanik, Qinna Wang, Maximilien Danisch, Mohit Sharma, Sumanth Bandi, Jean-Loup Guillaume, Stéphane Raux and Bivas Mitra

6ème conférence sur les Modèles et l’Analyse des Réseaux : Approches Mathématiques et Informatique (MARAMI), Paris, 2015

Alors que Twitter est devenu incontournable, la propagation des tweets et hashtags est toujours largement incomprise. Le propagation d’information sur Twitter est principalement due aux retweets et aux mentions mais, alors que les retweets ne permettent d’atteindre que les abonnés d’un individu, les mentions permettent d’atteindre n’importe qui directement. De nombreuses études ont montré que les mentions sont largement utilisées sur Twitter, mais surtout qu’elles sont fondamentales pour augmenter la popularité des tweets et hashtags. Des méthodes automatiques pour choisir les bons utilisateurs à mentionner pourraient donc permettre d’augmenter la visibilité des tweets. Dans cet article nous proposons un système de recommandation de mentions en temps réel pour aug- menter la popularité d’un tweet. Ce système est basé sur un modèle de propagation de tweet dans un graphe multiplexe construit à partir d’une étude de données réelles. Il permet de clairement faire la différence entre les propagations dues aux mentions et celles dues aux abonnements. Les simulations du modèle donnent des résultats similaires aux observations empiriques et sont également fondées sur des résultats analytiques. En utilisant ces différents résultats nous proposons une stratégie de recom- mandation effective et une application Twitter associée.

Download

Déplier la structure communautaire dun réseau en mesurant la proximité aux représentants de communauté

Maximilien Danisch, Jean-Loup Guillaume and Bénédicte Le Grand

6ème conférence sur les Modèles et l’Analyse des Réseaux : Approches Mathématiques et Informatique (MARAMI), Paris, 2015

Nous proposons un algorithme pour déplier la structure communautaire des grands graphes de terrain. L’algorithme est basé sur la détection de la communauté de chaque représentant communau- taire : nœud contenu dans une seule communauté et important en son sein. Cette détection est faite avec une approche à base de mesure de proximité développée récemment. Par comparaison avec d’autres méthodes de l’état de l’art nous montrons que notre algorithme a des performances équivalentes voire meilleures et est capable de traiter les plus grands graphes de terrain.

Download