> By Clémence Magnien and Frédéric Ouédraogo
traceroute is a tool that gives the internet path followed by your packets towards a given destination, in the form of a series of IP addresses. Consider a given set of destinations, and let us call the result of running traceroute towards each of these destinations a measurement round. We are interested in studying how the set of IP addresses we can observe in a measurement round evolves with time. Therefore, we repeat these measurement rounds periodically (approximately every 15 minutes), using a fixed destination set (for more details about these measurements, see this paper).
For each IP address we see at least once during our measurements, we record two things: the number of rounds in which we have observed it, and its number of appearances, i.e. the number of rounds in which the IP address is present but was not present in the round before. For instance, an IP address that we see at rounds 1, 5, 6, 7 and 10 has been observed in 5 distinct rounds, and has appeared three times (at rounds 1, 5 and 10).
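The two quantities can be computed from the list of rounds in which an address was seen. The following is a minimal sketch, not the authors' actual code: the function name and the input format (a list of round numbers) are assumptions made for illustration. Following the example above, the first round in which an address is seen is counted as an appearance.

```python
def count_rounds_and_appearances(observed_rounds):
    """Return (number of rounds observed, number of appearances).

    observed_rounds: iterable of integer round numbers in which the
    IP address was seen (hypothetical input format).
    """
    rounds = sorted(set(observed_rounds))
    appearances = 0
    previous = None
    for r in rounds:
        # An appearance is a round where the address is present
        # but was not present in the round just before.
        if previous is None or r != previous + 1:
            appearances += 1
        previous = r
    return len(rounds), appearances


# The example from the text: rounds 1, 5, 6, 7 and 10.
print(count_rounds_and_appearances([1, 5, 6, 7, 10]))  # (5, 3)
```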
We then make the above plot: each dot corresponds to an IP address. The coordinate of the dot on the x-axis is the number of rounds the IP address was observed in; its coordinate on the y-axis is its number of appearances. Surprisingly, the plot exhibits a clear geometric shape: we can see a triangle and a circle-like shape in it. These shapes can however be explained.
By definition of the plot, no dot can appear outside of the triangle. First, no IP address can appear a larger number of times than the number of rounds it was observed in (therefore we cannot have y > x). Conversely, no IP address can appear a larger number of times than the number of rounds it was not observed in, since an appearance is defined as a round in which the IP address is not observed, followed by a round in which it is observed (therefore we cannot have y > 4676 – x, 4676 being the total number of measurement rounds). This defines the borders of the triangle.
The circle is in fact a parabola. Consider an IP address that was observed in exactly x distinct rounds during our measurements. If we suppose the rounds this address was observed in were chosen at random among our 4676 measurement rounds, then we can compute the expected number of appearances of this address. A given round corresponds to an appearance with the probability that the address was observed in this round, multiplied by the probability that it was not observed in the previous round, which gives (x / 4676) * ((4676 – x) / 4676). To obtain the expected number of appearances, just multiply this probability by the total number of rounds, giving the equation of the parabola: y = x * (4676 – x) / 4676.
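As a rough check of this reasoning, the sketch below compares the parabola with a small simulation in which the x observed rounds are drawn uniformly at random. The function names and the simulation itself are illustrative assumptions, not part of the original study. The formula treats consecutive rounds as independent, so the simulated average is only expected to be close to it, not identical.

```python
import random

TOTAL_ROUNDS = 4676  # total number of measurement rounds, from the text


def expected_appearances(x, total_rounds=TOTAL_ROUNDS):
    """Parabola from the text: total * (x / total) * ((total - x) / total)."""
    return x * (total_rounds - x) / total_rounds


def simulate_appearances(x, total_rounds=TOTAL_ROUNDS, rng=None):
    """Pick x rounds at random and count the resulting appearances."""
    rng = rng or random.Random()
    observed = set(rng.sample(range(total_rounds), x))
    # A round is an appearance if the previous round is not in the set.
    return sum(1 for r in observed if r - 1 not in observed)


if __name__ == "__main__":
    rng = random.Random(0)
    x = 2338  # half of the rounds, the top of the parabola
    trials = [simulate_appearances(x, rng=rng) for _ in range(50)]
    print("formula:   ", expected_appearances(x))
    print("simulation:", sum(trials) / len(trials))
```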
The fact that the parabola can clearly be observed means that a large number of IP addresses seem to behave randomly in our observations: they appear the same number of times as they would if they were observed at random times. Dots above the parabola correspond to addresses that tend to blink on and off more than expected. Finally, the large number of dots below the parabola means that many IP addresses tend to be more stable than expected: they appear fewer times than expected, meaning that when they appear they tend to remain visible for a large number of consecutive rounds before disappearing.
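This reading of the plot can be sketched as a simple comparison between a dot and the parabola. The function name, the labels and the 10% tolerance band are hypothetical choices made for illustration; the text does not define any threshold.

```python
def classify(x, y, total_rounds=4676, tolerance=0.1):
    """Compare an address (x rounds observed, y appearances) to the parabola.

    tolerance is an assumed relative margin around the expected value.
    """
    expected = x * (total_rounds - x) / total_rounds
    if y > expected * (1 + tolerance):
        return "blinking"      # appears more often than expected
    if y < expected * (1 - tolerance):
        return "stable"        # appears less often than expected
    return "random-like"       # close to the parabola


print(classify(2338, 1169))  # random-like
print(classify(2338, 100))   # stable
```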