> By Matthieu Latapy and Clémence Magnien
In the names of files exchanged in a P2P system, and in the keyword-based queries sent by users, age indications often appear as a number followed by "yo" (years old).
The plot gives the cumulative distribution of these numbers in filenames and queries, as observed in a 10-week measurement. For each n from 1 to 20, we selected all filenames and queries containing the string "nyo" (the number n immediately followed by "yo"), and we plotted, for each x, the fraction of these strings for which n is lower than or equal to x.
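The construction of this curve can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the measurement: the regular expression, the function names `extract_ages` and `cumulative_fractions`, and the choice to count every occurrence of the pattern (rather than every string once) are all hypothetical, and the input strings are neutral placeholders.

```python
import re
from collections import Counter

# Assumed pattern: a 1- or 2-digit number immediately followed by "yo",
# not embedded in a longer number or word (case-insensitive).
AGE_PATTERN = re.compile(r"(?<![0-9a-z])(\d{1,2})yo(?![0-9a-z])", re.IGNORECASE)


def extract_ages(strings, max_age=20):
    """Return every n in 1..max_age found as 'nyo' in the given strings."""
    ages = []
    for s in strings:
        for match in AGE_PATTERN.finditer(s):
            n = int(match.group(1))
            if 1 <= n <= max_age:
                ages.append(n)
    return ages


def cumulative_fractions(ages, max_age=20):
    """Map each x in 1..max_age to the fraction of ages that are <= x."""
    total = len(ages)
    if total == 0:
        return {}
    counts = Counter(ages)
    running = 0
    fractions = {}
    for x in range(1, max_age + 1):
        running += counts[x]
        fractions[x] = running / total
    return fractions


# Placeholder input; real input would be the observed filenames or queries.
sample = ["alpha 3yo", "beta 10yo gamma", "delta 10YO", "epsilon 25yo", "zeta 18yo"]
curve = cumulative_fractions(extract_ages(sample))
```

Running the same two functions separately on the filenames and on the queries gives the two curves being compared; plotting `curve[x]` against `x` reproduces the kind of plot described above.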
The vast majority of these indications are ages below 18: 92% of the age indications in filenames and 98% of those in queries concern ages strictly below 18. Young, and even very young, ages are also very frequent: about half of the queries and 40% of the filenames refer to ages of 10 years or less, and approximately 15% of the queries and 7% of the filenames refer to ages of 5 years or less.
One striking observation is that queries focus on younger ages than filenames: for all ages up to 11, the proportion of queries for this age is larger than the proportion of filenames containing this age.