Presentation by Esteban Bautista on Fractional PageRank

Esteban Bautista

Monday, January 25, 2021, at 2 pm. In person: room 25-26-105.


Graph-Based Semi-Supervised Learning (G-SSL) techniques learn from both labelled and unlabelled data to build better classifiers. This classification paradigm has received considerable attention because modern applications make it relatively easy and inexpensive to collect large amounts of unlabelled but structured data, naturally encoded by a graph, whereas annotated data is expensive to obtain. Despite these successes, the performance of G-SSL techniques can still be improved, particularly in challenging data settings or unbalanced scenarios. To address such limitations, I will present a novel G-SSL method based on the positive γ-th powers of the graph Laplacian, referred to as the Lγ-PageRank method. I will present a theoretical analysis of the new method, showing that (i) for γ < 1, it extends the standard PageRank algorithm to Lévy processes, in which random walkers can perform far-distant jumps in a single step; and (ii) for γ > 1, it classifies data on signed graphs, in which nodes belonging to the same class are more likely to share positive edges, while nodes from different classes are more likely to be connected by negative edges. I will also show that there exists an optimal Laplacian power that maximizes performance, and I will propose an algorithm for its automatic estimation. Lastly, I will show the significant classification improvements that the proposed approach achieves on several real-world datasets commonly used for classification.
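To make the idea of "powers of the graph Laplacian" concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the speaker's implementation: the abstract does not give the scoring formula, so the regularized system solved below, f = μ (L^γ + μ D_γ)^{-1} y with D_γ taken as the diagonal of L^γ, is an assumed formulation, and the function names and the parameter `mu` are hypothetical. The sketch also assumes a small, symmetric, non-negative weight matrix, so that L^γ can be computed through a dense eigendecomposition.

```python
import numpy as np


def laplacian_power(W, gamma):
    """Return L**gamma for the combinatorial Laplacian L = D - W.

    Assumes W is a symmetric, non-negative weight matrix (dense array).
    """
    L = np.diag(W.sum(axis=1)) - W
    # L is symmetric positive semidefinite, so eigh gives real eigenpairs.
    eigvals, eigvecs = np.linalg.eigh(L)
    # Clip tiny negative values caused by round-off before taking the power.
    eigvals = np.clip(eigvals, 0.0, None)
    # Rebuild the matrix with each eigenvalue raised to the power gamma.
    return (eigvecs * eigvals**gamma) @ eigvecs.T


def lgamma_pagerank_scores(W, labelled, gamma=1.0, mu=1.0):
    """Score every node with respect to one class (assumed formulation).

    `labelled` lists the indices of the nodes known to belong to the class.
    Solves (L^gamma + mu * D_gamma) f = mu * y, where D_gamma is the
    diagonal of L^gamma. This system is an assumption of the sketch.
    """
    L_gamma = laplacian_power(W, gamma)
    D_gamma = np.diag(np.diag(L_gamma))
    y = np.zeros(W.shape[0])
    y[labelled] = 1.0  # indicator vector of the labelled nodes
    return mu * np.linalg.solve(L_gamma + mu * D_gamma, y)
```

With γ = 1 the matrix L^γ is the ordinary Laplacian, so the sketch reduces to a standard regularized, PageRank-like scoring. For γ > 1 the off-diagonal entries of L^γ can take both signs, which is the sense in which the abstract speaks of signed graphs. A classifier would typically compute one score vector per class and assign each node to the class with the highest score; how the abstract's optimal γ is estimated is not covered here.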