> By Bénédicte Le Grand
These figures represent Galois lattices built from two samples of online social networks (Myspace and DailyMotion).
Galois lattices, from Formal Concept Analysis, cluster data (here the members of a social network sample) according to their common properties (here the members’ contacts).
Each node of the lattice is called a concept; it contains a set of members who have some common contacts. Concepts are linked by generalization / specialization relationships which provide intuitive navigation paths. The lower a concept is in the lattice, the more specific it is (i.e. its members have a greater number of common contacts than the members of concepts located at higher levels).
Members of the social network samples may appear in several concepts because clusters may overlap. This avoids any information loss during the clustering process. Moreover, each cluster is explicitly labeled by its list of common properties, which makes interpretation easier.
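As a minimal sketch of how such concepts can be derived, the following Python code enumerates every concept of a tiny, made-up sample by brute force. The member names, contact names, and function names are assumptions introduced for illustration; they are not taken from the Myspace or DailyMotion data, and practical tools rely on far more efficient algorithms.

```python
from itertools import combinations

# Hypothetical sample: member -> set of contacts (not data from the study).
contacts = {
    "ann": {"x", "y", "z"},
    "bob": {"x", "y"},
    "cat": {"y", "z"},
    "dan": {"z"},
}

def common_contacts(members, context):
    """Contacts shared by every member in `members` (all contacts if empty)."""
    shared = set().union(*context.values())
    for m in members:
        shared &= context[m]
    return shared

def members_having(shared, context):
    """Members whose contact list includes every contact in `shared`."""
    return {m for m, cs in context.items() if shared <= cs}

def concepts(context):
    """Enumerate all concepts by closing every subset of members (brute force)."""
    found = set()
    names = sorted(context)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            intent = common_contacts(subset, context)
            extent = members_having(intent, context)
            found.add((frozenset(extent), frozenset(intent)))
    return found

lattice = concepts(contacts)

# List concepts from the most general (few common contacts) to the most specific.
for extent, intent in sorted(lattice, key=lambda c: (len(c[1]), sorted(c[0]))):
    print(sorted(extent), "share", sorted(intent))
```

On this sample the enumeration yields six concepts. The member `ann` appears in all of them, which illustrates the overlap between clusters, and each concept is labeled by the contacts its members share.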
However, the number of concepts can grow exponentially with the number of members and contacts, so Galois lattices rapidly become too complex to be interpreted at a glance. Moreover, the shape of a lattice is not always meaningful: the two lattices shown here, although quite different, do not necessarily imply radically different structures for the underlying datasets.
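The growth in size can be illustrated with a small experiment. The sketch below counts the concepts of an artificial worst-case sample in which each of n members has every contact except one; the construction and the function names are assumptions chosen for illustration, not a description of the real datasets, which are typically far sparser.

```python
def count_concepts(context, attributes):
    """Count concepts by closing the members' contact sets under intersection."""
    intents = {frozenset(attributes)}
    for attrs in context.values():
        # Intersect the new member's contacts with every intent found so far.
        intents |= {frozenset(attrs) & intent for intent in intents}
    return len(intents)

def worst_case(n):
    """Hypothetical sample: member i has every contact except contact i."""
    attributes = set(range(n))
    context = {i: attributes - {i} for i in range(n)}
    return context, attributes

for n in range(1, 9):
    context, attributes = worst_case(n)
    print(n, "members ->", count_concepts(context, attributes), "concepts")
```

In this construction every subset of members yields a distinct set of common contacts, so the number of concepts doubles with each additional member.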
One way to compare such lattices is to define measures that capture as much information as possible from the lattice. Two conceptual measures have been defined so far: Relatedness and Closeness. Relatedness expresses the tendency of a member to be clustered with many (or few) other members of the social network. Closeness expresses the tendency of a member to share many (or few) contacts with the members with whom they are clustered.
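The text above describes the two measures only informally. The sketch below shows one possible way to turn those descriptions into numbers; both formulas are assumptions made for illustration and should not be read as the author's actual definitions. The concepts are a hard-coded, hypothetical example.

```python
# Hypothetical concepts as (members, common contacts) pairs; not data from the study.
concepts = [
    ({"ann"}, {"x", "y", "z"}),
    ({"ann", "bob"}, {"x", "y"}),
    ({"ann", "cat"}, {"y", "z"}),
    ({"ann", "bob", "cat"}, {"y"}),
    ({"ann", "cat", "dan"}, {"z"}),
    ({"ann", "bob", "cat", "dan"}, set()),
]
members = set().union(*(extent for extent, _ in concepts))

def shared_concepts(member):
    """Concepts grouping `member` with at least one other member around at least one contact."""
    return [(e, i) for e, i in concepts if member in e and len(e) > 1 and i]

def relatedness(member):
    """ASSUMED formula: fraction of the other members clustered with `member`."""
    peers = set().union(*(e for e, _ in shared_concepts(member))) - {member}
    return len(peers) / (len(members) - 1)

def closeness(member):
    """ASSUMED formula: mean number of common contacts over those concepts."""
    sizes = [len(i) for _, i in shared_concepts(member)]
    return sum(sizes) / len(sizes) if sizes else 0.0

for m in sorted(members):
    print(m, round(relatedness(m), 2), round(closeness(m), 2))
```

Under these assumed formulas, a member with high relatedness is grouped with most of the sample, while a member with high closeness shares many contacts with the members of their clusters. The top concept, whose members have no contact in common, is deliberately ignored in both measures.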
Such metrics make it possible to characterize each member of the social network, and therefore the whole dataset, and can be used to identify hot topics, leaders and outliers. Community detection is another goal of this work and is currently under study.