Ages in queries and filenames

> By Matthieu Latapy and Clémence Magnien

In a P2P system, the names of exchanged files and the keyword-based queries sent by users often contain age indications in the form of a number followed by "yo".

The plot gives the distribution of these numbers in filenames and queries as observed in a 10-week measurement: for each n from 1 to 20, we selected all filenames and queries containing the string nyo, and we plotted, for each x, the fraction of these strings with n lower than or equal to x.
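The computation behind such a plot can be sketched as follows. This is only a minimal illustration, not the code used for the measurement: the function names `age_counts` and `cumulative_fractions` are hypothetical, and the matching rules (case-insensitive, no digit immediately before the number, each string counted once per distinct age) are assumptions that the text above does not specify.

```python
import re
from collections import Counter

# Assumption: an age indication is one or two digits immediately followed
# by "yo", with no digit right before it (so "112yo" is not read as "12yo").
AGE_PATTERN = re.compile(r"(?<!\d)(\d{1,2})yo")


def age_counts(strings, max_age=20):
    """Count, for each n in 1..max_age, the strings containing 'nyo'."""
    counts = Counter()
    for s in strings:
        # A string is counted at most once for each distinct age it contains.
        ages = {int(m) for m in AGE_PATTERN.findall(s.lower())}
        for n in ages:
            if 1 <= n <= max_age:
                counts[n] += 1
    return counts


def cumulative_fractions(counts, max_age=20):
    """For each x in 1..max_age, return the fraction of selected strings with n <= x."""
    total = sum(counts.values())
    if total == 0:
        return {x: 0.0 for x in range(1, max_age + 1)}
    running = 0
    cdf = {}
    for x in range(1, max_age + 1):
        running += counts.get(x, 0)
        cdf[x] = running / total
    return cdf
```

Applying `cumulative_fractions(age_counts(filenames))` and `cumulative_fractions(age_counts(queries))` to the two collections separately would give the two curves to compare, where `filenames` and `queries` stand for lists of strings from the measurement.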

The vast majority of these indications concern ages below 18: 92% of the age indications in filenames and 98% of those in queries are strictly below 18. Young, and even very young, ages are also very frequent: about half of the queries and 40% of the filenames refer to ages of 10 years or less, and approximately 15% of queries and 7% of filenames refer to ages of 5 years or less.

One striking observation is that queries focus on younger ages than filenames: for every age up to 11, the proportion of queries mentioning that age is larger than the proportion of filenames containing it.
