Tag Archives: p2p

Papers

Relevance of SIR Model for Real-world Spreading Phenomena: Experiments on a Large-scale P2P System

Daniel F. Bernardes, Matthieu Latapy, Fabien Tarissan

Understanding the spread of information on complex networks is a key issue from both a theoretical and an applied perspective. Despite the effort in developing theoretical models for this phenomenon, gauging them with large-scale real-world data remains an important challenge due to the scarcity of open, extensive and detailed data. In this paper, we explain how traces of peer-to-peer file sharing may be used for this purpose. We also perform simulations to assess the relevance of the standard SIR model to mimic key properties of spreading cascades. We examine the impact of the network topology on observed properties and finally turn to the evaluation of two heterogeneous versions of the SIR model. We conclude that all the models tested fail to reproduce key properties of such cascades: typically, real spreading cascades are relatively “elongated” compared to simulated ones. We have also observed some interesting similarities common to all SIR models tested.
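
As a rough illustration of what simulating an SIR cascade involves, here is a minimal sketch. It is not the authors' simulation setup: the toy graphs, the parameter `beta`, and the rule that a node recovers after one round of infection attempts are all assumptions made for this example. The helper `size_and_depth` is likewise hypothetical; it reports the two quantities one might compare to judge whether a cascade is "elongated".

```python
import random
from collections import deque


def sir_cascade(adjacency, seed, beta, rng):
    """Simulate one SIR cascade and return each infected node's depth.

    Assumption: an infected node gets a single chance to infect each
    susceptible neighbour (with probability beta), then recovers for good.
    """
    depth = {seed: 0}          # node -> distance from the seed in the cascade tree
    infected = deque([seed])   # nodes that still have to try infecting neighbours
    while infected:
        node = infected.popleft()
        for neighbour in adjacency[node]:
            # Nodes already in `depth` are infected or recovered, never susceptible.
            if neighbour not in depth and rng.random() < beta:
                depth[neighbour] = depth[node] + 1
                infected.append(neighbour)
    return depth


def size_and_depth(depth):
    """Summarise a cascade by its number of nodes and its maximum depth."""
    return len(depth), max(depth.values())


# Two toy graphs (hypothetical data): a path and a star, both with 4 nodes.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

rng = random.Random(42)
path_summary = size_and_depth(sir_cascade(path, 0, 1.0, rng))
star_summary = size_and_depth(sir_cascade(star, 0, 1.0, rng))
```

With `beta = 1.0` both cascades reach all four nodes, but the path cascade has depth 3 and the star cascade has depth 1. Comparing size against depth in this way is one simple reading of "elongated", not necessarily the measure used in the paper.
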
Posted in Papers
Papers

File Diffusion in a Dynamic Peer-to-peer Network

Alice Albano, Jean-Loup Guillaume, and Bénédicte Le Grand

Many studies have addressed diffusion in the field of epidemiology, and in the last few years, the development of social networking has induced new types of diffusion. In this paper, we focus on file diffusion on a dynamic peer-to-peer network using the eDonkey protocol. On this network, we observe a linear behavior of the actual file diffusion. This result is interesting, because most diffusion models exhibit exponential behaviors. We propose a new model of diffusion, based on the SI (Susceptible / Infected) model, which produces results close to the linear behavior of the observed diffusion. We then justify the linearity of this model, and we study its behavior in more detail.
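
To see why a linear diffusion curve is surprising, it helps to look at the standard SI model. The sketch below is the textbook well-mixed, discrete-time SI model, not the new model proposed in the paper; the population size, the contact rate `beta`, and the number of steps are assumptions chosen for illustration.

```python
def si_curve(population, initially_infected, beta, steps):
    """Return the number of infected individuals at each time step.

    Standard well-mixed SI update: every infected individual infects
    susceptibles at rate beta, scaled by the susceptible fraction.
    """
    infected = float(initially_infected)
    curve = [infected]
    for _ in range(steps):
        susceptible_fraction = (population - infected) / population
        infected += beta * infected * susceptible_fraction
        curve.append(infected)
    return curve


# Hypothetical parameters: one initial provider among a million peers.
curve = si_curve(population=1_000_000, initially_infected=1, beta=0.5, steps=60)
early_growth_ratio = curve[1] / curve[0]
```

While almost everyone is still susceptible, the count is multiplied by roughly `1 + beta` at every step, which is exponential growth; it only flattens once the population saturates.
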

Posted in Papers
Papers

Quantifying paedophile queries in a large P2P system

Matthieu Latapy, Clémence Magnien and Raphaël Fournier

Increasing knowledge of paedophile activity in P2P systems is a crucial societal concern, with important consequences on child protection, policy making, and internet regulation. Because of a lack of traces of P2P exchanges and rigorous analysis methodology, however, current knowledge of this activity remains very limited. We consider here a widely used P2P system, eDonkey, and focus on two key statistics: the fraction of paedophile queries entered in the system and the fraction of users who entered such queries. We collect hundreds of millions of keyword-based queries; we design a paedophile query detection tool for which we establish false positive and false negative rates using assessment by experts; with this tool and these rates, we then estimate the fraction of paedophile queries in our data; finally, we design and apply methods for quantifying users who entered such queries. We conclude that approximately 0.25 % of queries are paedophile, and that more than 0.2 % of users enter such queries. These statistics are by far the most precise and reliable ever obtained in this domain.
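
The step from detector output to an estimated fraction can be sketched as follows. This is the standard correction for an imperfect classifier, given here under stated assumptions, and the paper may define its rates and its estimator differently. We assume the false positive rate is the probability that an ordinary query is flagged, and the false negative rate is the probability that a targeted query is missed. The function name and the numbers are hypothetical.

```python
def corrected_fraction(observed_fraction, false_positive_rate, false_negative_rate):
    """Estimate the true fraction of targeted queries from the flagged fraction.

    Assumed model: observed = true * (1 - fn) + (1 - true) * fp,
    solved here for `true`.
    """
    denominator = 1.0 - false_positive_rate - false_negative_rate
    if denominator <= 0:
        raise ValueError("error rates are too high to invert the detector")
    return (observed_fraction - false_positive_rate) / denominator


# Hypothetical numbers, not taken from the paper.
true_fraction = 0.01
fp_rate, fn_rate = 0.02, 0.10
flagged = true_fraction * (1 - fn_rate) + (1 - true_fraction) * fp_rate
estimate = corrected_fraction(flagged, fp_rate, fn_rate)
```

In this toy case the detector flags 2.88 % of queries, yet the corrected estimate recovers the 1 % that was put in, which shows how much the raw flagged fraction can mislead when the target is rare.
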

Posted in Papers
Papers

Estimating properties in dynamic systems: the case of churn in P2P networks

Lamia Benamara and Clémence Magnien

In many systems, such as P2P systems, the dynamicity of participating elements, or churn, has a strong impact. As a consequence, many efforts have been made to characterize it, and in particular to capture the session length distribution. However, in most cases, estimating it rigorously is difficult. One of the reasons is that, because the observation window is by definition finite, parts of the sessions that begin before the window and/or end after it are missed. This induces a bias. Although it tends to decrease when the observation window length increases, it is difficult to quantify its importance, or how fast it decreases.

Here, we introduce a general methodology that allows us to know if the observation window is long enough to characterize a given property. This methodology is not specific to one study case and may be applied to any property in a dynamic system. We apply this methodology to the study of session lengths in a massive measurement of P2P activity in the eDonkey system. We show that the measurement needs to last for at least one week in order to obtain representative results. We also show that our methodology allows us to precisely characterize the shape of the session length distribution.
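
The window bias, and the idea of checking whether a property stabilises as the window grows, can be sketched in a few lines. This is an illustration under assumptions rather than the methodology of the paper: sessions are hypothetical `(start, end)` pairs, the property tracked is simply the mean observed length, and both function names are invented for this example.

```python
def observed_lengths(sessions, window_start, window_end):
    """Return (observed_length, is_complete) for sessions overlapping the window.

    A session is complete only if it both starts and ends inside the window;
    otherwise part of it is missed and its observed length is too short.
    """
    observed = []
    for start, end in sessions:
        if end <= window_start or start >= window_end:
            continue  # entirely outside the window: never seen at all
        seen_start = max(start, window_start)
        seen_end = min(end, window_end)
        complete = start >= window_start and end <= window_end
        observed.append((seen_end - seen_start, complete))
    return observed


def mean_length_by_window(sessions, window_start, window_lengths):
    """Mean observed session length for each candidate window length."""
    means = {}
    for length in window_lengths:
        seen = observed_lengths(sessions, window_start, window_start + length)
        if seen:
            means[length] = sum(l for l, _ in seen) / len(seen)
    return means


# Hypothetical sessions, in arbitrary time units.
sessions = [(0, 10), (5, 8), (9, 30)]
clipped = observed_lengths(sessions, 4, 12)
means = mean_length_by_window(sessions, 0, [10, 40])
```

Here the short window truncates the long session and pulls the mean down; once the estimate stops moving as the window is extended, the window can be considered long enough for that property.
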

Posted in Papers
Plots

Accurate characterizing of session lengths in P2P system

> By Lamia Benamara and Clémence Magnien When trying to characterize the dynamics of a system, we are faced with two problems. First, the observation window must be long enough to be representative. Second, the fact that it is finite still induces a bias in the observations: sessions beginning before or ending after the measurement window are not seen […]

Posted in Plots
Plots

Link prediction in a file-provider network

> By Oussama Allali, Matthieu Latapy and Clémence Magnien Link prediction is a key research problem within the analysis of network dynamics. It aims at predicting the links which will appear in future evolution of the network. We consider here a set of peers and files, where each peer is linked to the files it […]

Posted in Plots
Plots

Quantifying paedophile users on a P2P system

> By Raphaël Fournier and Matthieu Latapy P2P systems are known to host a large amount of paedophile activity. Thus, quantifying the number of paedophile users on a P2P system is crucial, for many reasons: easy access to such content is a major societal concern, policy making and law-enforcement budgeting rely on this figure and […]

Posted in Plots
Plots

Keywords popularity in eDonkey queries

> By Bénédicte Le Grand This plot represents the evolution of the popularity of the keywords ‘avi’, ‘madonna’ and ‘jackson’ in eDonkey queries (captured on an eDonkey server over 102 days in 2009). The values on the y-axis represent the proportion of occurrences of these keywords in eDonkey queries for each day of the capture (x-axis). […]

Posted in Plots
Plots

Number of file-id discovered in a client-side eDonkey measurement

> By Christophe Berger, Clémence Magnien, Matthieu Latapy, Firas Bessadok and Phillipe Jarlov We conduct a measurement of files available in eDonkey as follows. Our client connects to all eDonkey servers it discovers (it knows an initial list of servers and explores the set of all servers reachable from these). Then it sends every 12 […]

Posted in Plots
Videos

Evolution of degree distribution during measurement

> By Clémence Magnien and Matthieu Latapy When one wants to study a complex network, one generally first has to conduct an intricate and expensive measurement. This measurement gives a sample of the network which is generally partial and may be biased. In Complex Network Measurements: Estimating the Relevance of Observed Properties we propose […]

Posted in Videos