Supervised learning of sheared distributions using linearized optimal transport
Varun Khurana 1@, Harish Kannan 1, Caroline Moosmüller 1, Alexander Cloninger 1, 2
1 : Department of Mathematics [San Diego]
2 : Halıcıoğlu Data Science Institute

Detecting differences and building classifiers between distributions, given only finite samples, are important tasks in a number of scientific fields. Optimal transport (OT) has emerged as the most natural concept for measuring the distance between distributions and has gained significant importance in machine learning in recent years. However, OT often fails to exploit reduced complexity when the family of distributions is generated by simple group actions. In this talk, we discuss how optimal transport embeddings can be used to address this issue, on both a theoretical and a computational level. In particular, we embed the space of distributions into an L^2 space by mapping each distribution to its OT map with respect to a fixed reference distribution. We further give an exact characterization of the distributions for which this embedding is an isometry. In the embedding space, we use standard machine learning techniques to achieve linear separability when the classes of distributions are generated by a family of shearings, and we describe conditions under which two classes of sheared distributions can be linearly separated. We also give necessary bounds on these shearing transformations to achieve a pre-specified separation level. Furthermore, embedding into multiple L^2 spaces allows not only for larger families of transformations but also for a greater level of linear separation. Finally, our theoretical results are verified empirically on image classification tasks.
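
As an illustration of the embedding idea described above, here is a minimal sketch restricted to one dimension, where the OT map from a reference to a target distribution is the composition of the target quantile function with the reference CDF. Evaluated at the reference quantiles, the embedding is therefore just a vector of target quantiles. The function names `lot_embedding_1d` and `lot_distance`, the Gaussian example, and the number of quantile levels are assumptions made for this illustration; they are not taken from the talk, and the sketch does not cover the higher-dimensional shearing setting the abstract addresses.

```python
import numpy as np

def lot_embedding_1d(samples, n_quantiles=200):
    """Hypothetical helper: embed a 1D empirical distribution as its
    quantile function sampled at evenly spaced levels. In 1D this equals
    the OT map from the reference, evaluated at the reference quantiles."""
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    return np.quantile(samples, levels)

def lot_distance(emb_a, emb_b):
    """Discretized L^2 distance between two embeddings
    (uniform weights over the quantile levels)."""
    return np.sqrt(np.mean((emb_a - emb_b) ** 2))

rng = np.random.default_rng(0)

# Template distribution: samples from N(0, 1).
mu = rng.normal(0.0, 1.0, size=5000)
# Transformed distribution: scale by 2 and shift by 3, a 1D stand-in
# for the shearing transformations discussed in the abstract.
nu = 2.0 * mu + 3.0

emb_mu = lot_embedding_1d(mu)
emb_nu = lot_embedding_1d(nu)

# Distance between the two distributions measured in the embedding space.
dist = lot_distance(emb_mu, emb_nu)
print(dist)
```

In one dimension the embedding is an isometry, so the printed value should be close to the 2-Wasserstein distance between N(0, 1) and N(3, 4), which is sqrt(10) ≈ 3.16. Each embedding is an ordinary vector, so a collection of them can be passed to any standard linear classifier.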
