When Diversity Is Needed… But Not Expected!

Sylvain Castagnos

Thursday, May 17th, 2018, 11h, salle 26-00/101, Campus Jussieu

Slides

Recent studies have highlighted the correlation between users’ satisfaction and diversity within recommenders, especially the fact that diversity increases users’ confidence when choosing an item. Understanding the reasons of this positive impact on recommenders is now becoming crucial. Based on this assumption, we designed a user study that focuses on the utility of this new dimension, as well as its perceived qualities. This study has been conducted on 250 users and it compared 5 recommendation approaches, based on collaborative filtering, content-based filtering and popularity, along with various degrees of diversity. Results show that, when recommendations are made explicit, diversity may reduce users’ acceptance rate. However, it helps increasing users’ satisfaction. Moreover, this study highlights the need to build users’ preference models that are diverse enough, so as to generate good recommendations.

Using network analysis to identify generic legal trends in European Human Rights Law

Henrik Palmer Olsen

Jeudi 16 Novembre 2017, 14h, salle 26-00/332, Campus Jussieu

This presentation explores how the complete collected set of judgments from the European Court of Human Rights can be presented as a network of case to references, and how this network can be analysed via the use of already known network analysis approaches. The presentation will first show how network analysis can be used to calculate the main legal content of a case (i.e. which legal right the case is mostly concerned with and for which the case is therefore a precedent). It will then move on to showing how, Page-rank of a case in the total network of cases can be compared to the Page-rank of the same case in its local network and how this comparison can be used to further calculate generic centrality in the overall network. In using this novel approach, it is possible to identify cases that have strong precedent, not for a specific right, but for more generic legal content that the Court uses in dealing with applications from citizens.

Gendarmes, Voleurs et Topologie algébrique

David Ellison

Jeudi 15 Juin 2017 à 11h, Salle 24-25/405, Campus Jussieu

Le jeu du gendarme et du voleur, introduit par Alain Quilliot dans sa thèse en 1978, est un jeu à deux joueurs sur un graphe. Le gendarme commence en choisissant son point de départ sur un sommet du graphe ; puis le voleur choisit le sien. Ensuite, ils se déplacent chacun leur tour le long des arêtes du graphe. La question est de savoir si le gendarme a une stratégie qui lui permet d’attraper le voleur. Dans le cas contraire, la question devient : combien faut-il de gendarmes pour attraper le voleur ? Quilliot a démontré dans sa thèse qu’un seul gendarme suffit à attraper le voleur si et seulement si le graphe est démontable, c’est-à-dire si et seulement si on peut le réduire à un seul sommet en retirant successivement des sommets où le voleur peut être coincé. Il s’ensuit que les graphes démontables correspondent à la classe d’homotopie du point, et que certains invariants homotopiques, comme les groupes d’homologie, permettent de découvrir des propriétés structurelles des graphes où le gendarme peut attraper le voleur.

Community detection in attributed graphs

Christine Largeron

Mardi 25 Avril 2017 à 11h, Salle 24-25/405, Campus Jussieu

A wide variety of applications require data modeled by a graph of interconnected nodes, known as social networks or information networks. When nodes are described by attributes, this is an attributed network.Examples of such networks are citation or collaboration networks or social media. One fundamental property of these networks is that they tend to naturally form communities and, many mining methods have been proposed to identify these communities but these methods are typically basedsolely on relationships between nodes in the graph. This is notably the case of Louvain, a greedy agglomerative clustering method which optimizes the modularity measure. Introduced by Newman, the modularity allows to estimate the quality of a partition butwithout taking into account the attributes associated with the nodes. For this reason, we have designed a complementary measure, based on inertia and specially conceived to evaluate the quality of a partition basedon real attributes describing the vertices. We have also proposed I-Louvain, a method for community detection in attributed graphs where real attributes are associated with the vertices. Our experiments showedthat combining the relational information with the attributes allows to detect the communities more efficiently with poor data.

Structure and dynamics of multiplex networks

Federico Battiston

11 Avril 2017, 11h, Salle 24-25/405

Many real-world complex systems consist of a set of elementary units connected by relationships of different kinds. All such systems are better described in terms of multiplex networks, where the links at each layer represent a different type of interaction between the same set of nodes, rather than in terms of (single-layer) networks. In this talk I will introduce multiplex networks and several metrics to measure multiplexity at different scales. Measures are validated and applied to real-world systems with examples from collaboration networks, terrorist networks and the brain. I will also show how multiplexity can produce the emergence of qualitatively novel dynamical behavior, focusing on the case of social dynamics. Resume I am currently a PostDoc in the Aramis Lab at the Brain & Spine Institute working with Mario Chavez and Fabrizio De Vico Fallani. I am also the Chair of the Young Researchers Network on Complex Systems, the younger branch of the Complex Systems Society. I earned my PhD in applied mathematics from Queen Mary University of London working in the complex systems and networks group of Vito Latora and as a part of the European Project LASAGNE on multilayer networks. Before that, I got an undergraduate and a master degree in theoretical physics from Sapienza University of Rome. You can find more information about me at: http://www.maths.qmul.ac.uk/~battiston/

Contribution à la qualité des informations dans les réseaux sociaux: Identifier et analyser les motifs récurrents pour détecter les phénomènes sociaux

Mezghani Manel

16 Mars 2017 – Salle 24-25/405

Lanalyse des données issues des réseaux sociaux pose des questions au niveau de la qualité des informations (cest à dire des informations identifiées, fiables et exactes) car on y trouve par exemple du spam social et des rumeurs, ce qui accroît la difficulté dobtenir des informations précises et pertinentes. Mon projet de recherche est axé sur la qualité des données dans les réseaux sociaux. Cette direction de recherche est très intéressante car elle touche différents domaines et peut être utile dans beaucoup dapplications. En effet, pouvoir évaluer la qualité de linformation peut être utilisé par exemple pour éliminer des spams/spammeurs, pour éviter danalyser des rumeurs, etc. Ceci permet par exemple davoir un meilleur résultat de recherche, déviter de mauvaises recommandations de produits ou de diffuser une bonne information aux utilisateurs. Une voie pertinente pour contribuer à lamélioration de la qualité de linformation et que je compte explorer est la détection et lanalyse de motifs récurrents des phénomènes sociaux (par exemple: spam, rumeur) au cours du temps dans les graphes de données issus des réseaux sociaux. Ceci permet de filtrer des données erronées et ainsi elles contribuent à fournir des données nettoyées qui peuvent être utilisées dans dautres applications. Ce séminaire sera loccasion de présenter mes travaux de recherche associé à lanalyse des réseaux sociaux; et de discuter de mon projet de recherche.

Un nouvel algorithme de clustering pour la détection danomalies

Damien Nogues

salle 24-25/405 à 11h00

La détection dintrusion est une problématique importante en cyber-sécurité. Cependant, de nombreuses approches classiques utilisent une approche à base de règles. Pour chaque attaque, unesignature de détection est créée. Ces approches ne permettent donc pas de détecter de nouvelles attaques, ou des attaques connues mais modifiées. Pour cette raison, nous nous sommesintéressés aux approches de détection danomalies. Ces méthodes utilisent des techniques dapprentissage non supervisé pour détecter des déviations à la norme. Ainsi, les techniques declustering sont fréquemment utilisées. Mais les algorithmes classiques de clustering, tel que k-means, ne sont pas très adaptés à ce problème. En effet, ces derniers nécessitent souvent de fixera priori le nombre de classes désiré. De plus, ces algorithmes sont en général conçus pour traiter des données quantitatives, ou parfois qualitatives. En détection dintrusions, les données sontsouvent hétérogènes. Nous détaillerons donc un nouveau critère de clustering adapté aux données hétérogènes, ainsi quun algorithme doptimisation de ce critère. Enfin, nous verronscomment il est possible dutiliser la partition obtenue pour effectuer une détection danomalies efficace sur des flux réseaux.

Towards understanding and leveraging the structure of real-world graphs

Maximilien Danisch

Campus Jussieu, Salle 26-00/124 à 11h00

Real-world graphs (a.k.a. complex networks) are ubiquitous: the web, Facebook, Twitter, brain networks, protein interaction networks, etc. Studying these graphs and solving computational problems on them (say maximum clique, partitioning or dense subgraph) has applications in many fields.I will first show that the structure of some real-world graphs is special. In particular, they are not adversarial and some difficult problems (say NP-complete or NP-hard problems) can be solved on some huge real-world graphs (say 2G edges or more).I will then present two works along the lines of « understanding and leveraging the structure of real-world graphs in order to build better algorithms »: (i) Enumerating all k-cliques and (ii) Computing the density-friendly graph decomposition.

Coalition games on interaction graphs

Nicolas Bousquet

15 Novembre 2016, salle 26-00/124

We consider cooperative games where the viability of a coalition is determined by whether or not its members have the ability to communicate amongst themselves. This necessary condition for viability was proposed by Myerson and is modeled via an interaction graph; a coalition S of vertices is then viable if and only if the graph induced graph S is connected. The non-emptiness of the core of a coalition game can be tested by a well-known covering LP. Moreover, the integrality gap of its dual packing LP defines exactly the multiplicative least-core and the relative cost of stability of the coalition game. This gap is upper bounded by the packing-covering ratio which is known to be at most the treewidth of the interaction graph plus one. We examine the packing-covering ratio and integrality gaps of graphical coalition games in more detail. First we introduce a new graph parameter, called the vinewidth (a parameter derived from the treewidth), which characterizes the worst packing-covering ratio. Then we will show that this new parameter correctly evaluates both primal and dual integrality gaps. Joint work with Zhentao Li and Adrian Vetta.

Mining Time-stamped Network Data Spectral Methods and Centrality Measures

Ingo Scholtes

Mercredi 19 Octobre 2016à 11h00 salle 24-25/405

Recent research has highlighted limitations of studying complex systems with time-varying topologies from the perspective of static, time-aggregated networks. Non-Markovian characteristics resulting from the specific ordering of interactions in temporal networks were identified as one important mechanism that alters causality and affects dynamical processes. So far, an analytical explanation for this phenomenon and for the significant variations observed across different systems is missing. Summarizing our recent research in this area, in this talk I will introduce a framework that allows to analyze temporal networks with non-Markovian characteristics. The framework is based on higher-order aggregate networks, a simple generalization of the commonly used static representation of temporal network data. I will show that spectral properties of such higher-order aggregate networks can explain the slow-down of diffusion processes compared to aggregate networks, which has been observed in a number of empirical data sets. I further show that we can derive an exact analytical prediction for the magnitude of this change compared to the weighted, time-aggregate network. I finally present recent results on the analysis of node centralities in non-Markovian temporal networks, concluding that this approach provides interesting perspectives for (i) temporal community detection by spectral clustering, (ii) refined measures of centrality for time-evolving networks, and (iii) analytical studies of dynamical processes in complex systems with time-evolving interaction topologies.

On the notion of hyperbolicity in graphs

David Coudert

Salle 24-25/405

The Gromov hyperbolicity is an important parameter for analyzing complex networks since it is a measure of the tree-likeness of a graph from a metric perspective. This notion has been recently applied in different contexts, such as the design of routing schemes, network security, computational biology, the analysis of graph algorithms, and the classification of complex networks. For instance, it gives bounds on the best possible stretch of some greedy-routing schemes in Internet-like graphs. The best known algorithm for computing hyperbolicity has running-time O(n^{3.69}), which is clearly prohibitive for big graphs. In this talk, we will present recent advances on the computation of this parameter. We will present an algorithm that performs well in practice. Although its time complexity is in O(n^4), it can compute the hyperbolicity of graphs with 100.000 nodes in short time. We will discuss the limitations of this algorithm and related open problems. We will also provide some results on the time complexity lower bound. In addition, we will show how to use some simple properties of the graph to get tight bounds on its hyperbolicity, e.g., being bipartite, a line graph, vertex transitive, etc. These properties are used for establishing bounds on the hyperbolicity of various interconnection network topologies proposed for large data centers.

Classification of online discussions: is tree structure sufficient enough?

Mattias Mano

Vendredi 30 Septembre à 10h30, Salle 26-00/332

Slides

Over the past years, online communities have considerably grown. Especially, a new kind of forums emerged: the « Questions and Answers » (Q&A) forums. They cover lots of different fields, from opinion questions (such as Yahoo! Answers, Reddit, Quora, …) to really technical questions in Computer Science (Stack Overflow, Bugzilla, …) or in Mathematics (Math Overflow). We study the opinion forum « Reddit – Change My View ». Participants debate on society subject. Using graph theory modeling, I wonder if structural informations of the discussion (graph) are sufficient enough to be characteristic of what happen in the discussion.

Parameterized complexity: from graph minor theory to efficient algorithms

Christophe Paul

Vendredi 08 juillet 2016 à 11h, salle 24-25/405

Slides

Parameterized complexity suggests a multi-parameter analysis of the computational complexity of hard problems. The idea is to understand the influence of parameters, distinct from the input size, in the resolution of a problem. Such parameters could be the solution size or the structural parameters such as width parameters. After an introduction to parameterized complexity, we will present some of the algorithmic consequences of the graph minor theory. From the work of Robertson and Seymour, it is known that every graph family closed under minor can be recognized in cubic time. However for most of such graph family, such a result is existential only. Since then constructive meta-algorithmic theorems have been proposed (including Courcelles theorem) within the framework of parameterized complexity. We will discuss recent developments in this line of research that led to efficient algorithms for large family of problems.

Graphs: is there something between theory and practice?

Gilles Tredan

Vendredi 24 juin 2016 à 11h, salle 24-25/405

My first part will focus on the problem of charting graphs: can we believe the maps we build for complex systems? My second part will present efforts to capture AFK social interaction networks (the Souk project), and figure out what to do with the obtained dynamic graphs. My third part is yet to be defined. My hole presentation will try to find a balance between algorithmic perspectives and data analysis.

Detection and classification of network traffic anomalies

Johan Mazel

Mercredi 8 juin 2016 à 11h, salle 24-25/405

Internet plays a central role in our lives. However, it is an extremely diverse and complex system. Ranging from non-malicious unexpected events such as flash-crowds and failures, to network attacks such as denials-of-service and network scans, network traffic anomalies can have serious detrimental effects on the performance and integrity of the network. Anomaly detection is thus paramount in order to guarantee users’ access to Internet resources. In this talk, we will address recent advances in network traffic anomaly detection and classification, that leverage graph analysis techniques, machine learning techniques and big data.

Suppressing diffusion processes on arbitrary networks using treatment resources of limited efficiency

Argyris Kalogeratos

Lundi 13 juin 2016 à 11h, salle 26-00/332

Slides

In many real-life situations, it is critical to dynamically suppress or remove an undesired diffusion process (viruses, information, behaviors, etc.). The talk will present a framework for Dynamic Resource Allocation (DRA) assuming a continuous-time SIS epidemic model, and that a budget of treatment resources of limited efficiency are at the disposal of authorities. Special emphasis will be given on the macro- and microscopic (or local) properties of the network structure for the problem and two strategies will be presented that fall in this framework: a) a simple yet effective greedy approach, and b) a more sophisticated one that uses a precomputed priority plan of how the healing strategy should proceed on a specific network. Additionally, extensions in competitive scenarios will be discussed.

Kempe equivalence of colourings

Marthe Bonamy

Vendredi 3 juin 2016 à 11h, salle 24-25/405

Slides

Given a colouring of a graph, a Kempe change is the operation of picking a maximal bichromatic subgraph and switching the two colours in it. Two colourings are Kempe equivalent if they can be obtained from each other through a series of Kempe changes. Kempe changes were first introduced in a failed attempt to prove the Four Colour Theorem, but they proved to be a powerful tool for other colouring problems. They are also relevant for more applied questions, most notably in theoretical physics. Consider a graph with no vertex of degree more than some integer D. In 2007, Mohar conjectured that all its D-colourings are Kempe-equivalent. Feghali, Johnson and Paulusma proved in 2015 that this is true for D=3, with the exception of one single graph which disproves the conjecture in its generality. We settle the remaining cases by proving the conjecture holds for every integer D at least 4. This is a joint work with Nicolas Bousquet (LIRIS, Ecole Centrale Lyon, France), Carl Feghali (Durham University, UK) and Matthew Johnson (Durham University, UK).

Temporal density of complex networks and ego-community dynamics

Sergey Kirgizov

Lundi 04 julliet 2016 à 11h, salle 24-25/405

Slides

At first, we say that a ego-community structure is a probability measure defined on the set of network nodes. Any subset of nodes may engender its own ego-community structure around. Many community detection algorithms can be modified to yield a result of this type, for instance, the personalized pagerank. We also recall that community detection algorithms (including personalized pagerank) can be viewed from different perspectives: random walks, convergence of markov chain, spectral clustering, optimization, mincut(s), discrete cheeger inequality(ies), etc. Next, we present a continuous version of Viard-Latapy-Magnien link streams, that we call « temporal density ». Classical kernel density estimation is used to move from discrete link streams towards their continuous counterparts. Using matrix perturbation theory we can prove that ego-community structure changes smoothly when the network evolves smoothly. This is very important, for example, for visualization purposes. Combining the temporal density and personalized pagerank methods, we are able to visualize and study the evolution of the ego-community structures of complex networks with a large number of temporal links in order to extract interacting information. For example, we can detect events, trace the evolution of (ego-)community structure, etc. We illustrate and validate our approach using « Primary school temporal network data » provided by sociopatterns.org, and we show how the temporal density can be applied to the study of very large datasets, such as a collection of tweets written by European Parliament candidates during European Parliament election in 2014.

La structure communautaire : évaluation et analyse de motifs dans les flux de liens

Jean Creusefond

Vendredi 13 mai 2016 à 11h, salle 24-25/405

Slides

Cet exposé sera en deux parties relativement indépendantes : l’évaluation des structures communautaires par le biais des vérités de terrain et l’analyse des l’appartenance communautaire des motifs dans les flux de liens L’évaluation de structures communautaires de manière théorique est très délicate : de multiples propriétés structurelles sont considérées comme importantes, par conséquent considérer une structure comme meilleure qu’une autre implique des choix arbitraires sur ces préférences, matérialisé par le choix d’une fonction de qualité ou de benchmarks. Afin d’éviter ces problèmes, beaucoup de chercheurs évaluent maintenant leurs résultats par comparaison avec des structures communautaires extraites en même temps que des jeux de données, en argumentant que la proximité entre leurs résultats et la vérité de terrain est une preuve significative de pertinence. Dans cette partie, je vais discuter d’une méthodologie permettant de concilier les deux approches et d’identifier quelles vérités de terrain favorisent quelles fonctions de qualité. Je soulignerai notamment le choix de la fonction de comparaison de partitions, souvent considéré comme anodin, mais changeant en fait radicalement les résultats. Pour référence, le programme développé (incluant un grand nombre d’algorithmes de détection de communautés et de fonctions de qualité) est entièrement disponible à l’adresse suivante : https://codacom.greyc.fr/ En seconde partie, je discuterai de travaux en cours d’analyse de flots de liens : des graphe dont chaque arc est étiqueté par un temps et où les multiarcs sont possibles. Les flots de liens qui nous intéressent ici représentent des réseaux de communication, c’est-à-dire que chaque arc représente une interaction orientée entre deux utilisateurs. Fréquemment, les algorithmes de détection de communautés qui tentent de les analyser agglomèrent le réseau de communication sur des fenêtres temporelles, où des méthodes traditionnelles (ou adaptées) peuvent êtres appliquées. Dans ce cas, une information est perdue : la causalité entre les liens. Par exemple, si un ensemble de personnes ont systématiquement la même structure de communication (ex : quand « A » interagit avec « B », celui-ci intéragit ensuite systématiquement avec « C » et « D »), peut-on en déduire la structure communautaire associée? Afin d’évaluer l’impact de cette information, je me suis intéressé aux motifs : des chaînes de communication dont la causalité semble probable (la première interaction a probablement entraîné la suivante, etc.). Le lien entre ces motifs et la structure communautaire reste donc à analyser, et je présenterai les outils mis au point à ce dessein ainsi que quelques résultats préliminaires.

Degeneracy-based mining of social and information networks: dynamics and applications

Fragkiskos Malliaros

Vendredi 01 avril 2016 à 11h, salle 24-25/405

Slides

Networks have become ubiquitous as data from diverse disciplines can naturally be mapped to graph structures. The problem of extracting meaningful information from large scale graph data in an efficient and effective way has become crucial and challenging with several important applications and towards this end, graph mining and analysis methods constitute prominent tools. In this talk, I will present part of my work that builts upon computationally efficient graph mining methods in order to: (i) design models for analyzing the structure and dynamics of real-world networks towards unraveling properties that can further be used in practical applications; (ii) develop algorithmic tools for large-scale analytics on data with inherent (e.g., social networks) or without inherent (e.g., text) graph structure. Our approaches rely on the concepts of graph degeneracy and core decomposition in graphs. In particular, for the former point I will show how to model the engagement dynamics of large social networks and how to assess their vulnerability with respect to user departures from the network. In both cases, by unraveling the dynamics of real social networks, regularities and patterns about their structure and formation can be identified; such knowledge can further be used in various applications including churn prediction and anomaly detection. For the latter, I will present a core decomposition-based approach for locating influential nodes in complex networks, with direct applications to epidemic control and viral marketing.