Daniel F. Bernardes, Matthieu Latapy, Fabien Tarissan
Alice Albano, Jean-Loup Guillaume, and Bénédicte Le Grand
Diffusion has been studied extensively in epidemiology, and in the
last few years the development of social networking has induced new
types of diffusion. In this paper, we focus on file diffusion in a
dynamic peer-to-peer network using the eDonkey protocol. On this
network, we observe that actual file diffusion behaves linearly. This
result is interesting because most diffusion models exhibit
exponential behavior. We propose a new diffusion model, based on the
SI (Susceptible / Infected) model, which produces results close to the
linear behavior of the observed diffusion. We then justify the
linearity of this model and study its behavior in more detail.
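
For readers unfamiliar with the SI model mentioned in this abstract, the following is a minimal sketch of the textbook discrete-time version, not the modified model the authors propose. The random graph, the infection probability `p`, and all function names are assumptions chosen for illustration only.

```python
import random

def random_graph(n, avg_degree, rng):
    """Build an undirected random graph as an adjacency dict (illustrative only)."""
    adjacency = {u: set() for u in range(n)}
    p_edge = avg_degree / (n - 1)
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p_edge:
                adjacency[u].add(v)
                adjacency[v].add(u)
    return adjacency

def simulate_si(adjacency, seeds, p, steps, rng):
    """Textbook SI: at each step, every infected node infects each
    susceptible neighbour independently with probability p.
    Infected nodes never recover. Returns the infected count per step."""
    infected = set(seeds)
    history = [len(infected)]
    for _ in range(steps):
        newly_infected = set()
        for u in infected:
            for v in adjacency[u]:
                if v not in infected and rng.random() < p:
                    newly_infected.add(v)
        infected |= newly_infected
        history.append(len(infected))
    return history

rng = random.Random(0)  # fixed seed so the run is repeatable
graph = random_graph(300, 6, rng)
history = simulate_si(graph, seeds=[0], p=0.1, steps=40, rng=rng)
print(history)
```

In this plain form the infected count typically grows quickly at first and then saturates, which is the kind of behavior the abstract contrasts with the linear diffusion observed on eDonkey.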
Matthieu Latapy, Clémence Magnien and Raphaël Fournier
Increasing knowledge of paedophile activity in P2P systems is a crucial societal
concern, with important consequences on child protection, policy making, and
internet regulation. Because of a lack of traces of P2P exchanges and rigorous
analysis methodology, however, current knowledge of this activity remains very
limited. We consider here a widely used P2P system, eDonkey, and focus on two
key statistics: the fraction of paedophile queries entered in the system and the
fraction of users who entered such queries. We collect hundreds of millions of
keyword-based queries; we design a paedophile query detection tool for which we
establish false positive and false negative rates using assessment by experts;
with this tool and these rates, we then estimate the fraction of paedophile
queries in our data; finally, we design and apply methods for quantifying users
who entered such queries. We conclude that approximately 0.25 % of queries are
paedophile, and that more than 0.2 % of users enter such queries. These
statistics are by far the most precise and reliable ever obtained in this
domain.
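
The step from a detection tool with known error rates to an estimated fraction of paedophile queries can be sketched as follows. This is a standard misclassification correction, not necessarily the estimator used by the authors. It assumes the false positive rate is the probability that an ordinary query is flagged and the false negative rate is the probability that a paedophile query is missed; the function name and all numbers are hypothetical.

```python
def corrected_fraction(observed, false_positive_rate, false_negative_rate):
    """Recover the true fraction from the flagged fraction.

    Assumed definitions (not taken from the paper):
      false_positive_rate = P(flagged | query is not of the target class)
      false_negative_rate = P(not flagged | query is of the target class)
    so that
      observed = true * (1 - fn) + (1 - true) * fp
    which is solved for `true` below.
    """
    denominator = 1.0 - false_positive_rate - false_negative_rate
    if denominator <= 0:
        raise ValueError("error rates too high for the correction to be meaningful")
    return (observed - false_positive_rate) / denominator

# Hypothetical numbers, chosen only to check the algebra round-trips.
true_fraction = 0.02
fp, fn = 0.01, 0.10
observed = true_fraction * (1 - fn) + (1 - true_fraction) * fp
estimate = corrected_fraction(observed, fp, fn)
print(observed, estimate)
```

When the true fraction is small, the false positive rate dominates the flagged fraction, which is why establishing both rates matters before reporting a figure.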
In many systems, such as P2P systems, the dynamicity of participating elements, or churn, has a strong impact. As a consequence, many efforts have been made to characterize it, and in particular to capture the session length distribution. However, in most cases, estimating it rigorously is difficult. One reason is that, because the observation window is by definition finite, the parts of sessions that begin before the window and/or end after it are missed. This induces a bias. Although the bias tends to decrease as the observation window grows, it is difficult to quantify its importance or how fast it decreases.
Here, we introduce a general methodology that allows us to know if the observation window is long enough to characterize a given property. This methodology is not specific to one study case and may be applied to any property in a dynamic system. We apply this methodology to the study of session lengths in a massive measurement of P2P activity in the eDonkey system. We show that the measurement needs to last for at least one week in order to obtain representative results. We also show that our methodology allows us to precisely characterize the shape of the session length distribution.
Lamia Benamara and Clémence Magnien
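
The window bias described in the abstract above can be illustrated on synthetic data. The sketch below is not the authors' methodology: it assumes exponentially distributed session lengths with a known mean, keeps only sessions that fall entirely inside the observation window, and shows how the measured mean depends on the window length. All names and parameters are hypothetical.

```python
import random

def complete_session_mean(sessions, window_start, window_length):
    """Mean length of the sessions that both start and end inside the window.
    Sessions cut by either window boundary are discarded, as their true
    length cannot be observed."""
    window_end = window_start + window_length
    lengths = [end - start for start, end in sessions
               if start >= window_start and end <= window_end]
    return sum(lengths) / len(lengths) if lengths else float("nan")

rng = random.Random(1)  # fixed seed so the run is repeatable
TRUE_MEAN = 10.0        # assumed true mean session length (arbitrary units)

# Synthetic sessions: uniform start times, exponential durations.
sessions = []
for _ in range(20000):
    start = rng.uniform(0.0, 2000.0)
    sessions.append((start, start + rng.expovariate(1.0 / TRUE_MEAN)))

# Measure the same statistic with increasingly long windows.
means_by_window = {w: complete_session_mean(sessions, 500.0, w)
                   for w in (20, 100, 1000)}
for window_length, mean_length in means_by_window.items():
    print(window_length, round(mean_length, 2))
```

Short windows can only contain short complete sessions, so the measured mean is biased downwards and approaches the true value as the window grows. Watching for the point where a statistic stops changing with the window length is in the spirit of the question the abstract raises.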
> Lamia Benamara and Clémence Magnien When trying to characterize the dynamics of a system, we are faced with two problems. First, the observation window must be long enough to be representative. Second, the fact that it is finite still induces a bias in the observations: sessions beginning before or ending after the measurement window are not seen […]
> By Oussama Allali, Matthieu Latapy and Clémence Magnien Link prediction is a key research problem within the analysis of network dynamics. It aims at predicting the links which will appear in the future evolution of the network. We consider here a set of peers and files, where each peer is linked to the files it […]
> By Raphaël Fournier and Matthieu Latapy P2P systems are known to host a large amount of paedophile activity. Thus, quantifying the number of paedophile users on a P2P system is crucial, for many reasons: easy access to such content is a major societal concern, policy making and law-enforcement budgeting rely on this figure and […]
> Bénédicte Le Grand This plot represents the evolution of the popularity of the keywords ‘avi’, ‘madonna’ and ‘jackson’ in eDonkey queries (captured on an eDonkey server during 102 days in 2009). The values on the y-axis represent the proportion of occurrences of these keywords in eDonkey queries for each day of the capture (x-axis). […]
> By Christophe Berger, Clémence Magnien, Matthieu Latapy, Firas Bessadok and Phillipe Jarlov We conduct a measurement of files available in eDonkey as follows. Our client connects to all eDonkey servers it discovers (it knows an initial list of servers and explores the set of all servers reachable from these). Then it sends every 12 […]
> By Clémence Magnien and Matthieu Latapy When one wants to study a complex network, one generally first has to conduct an intricate and expensive measurement. This measurement gives a sample of the network which is generally partial and may be biased. In Complex Network Measurements: Estimating the Relevance of Observed Properties we propose […]