[Lecture] Jian Huang: Deep Optimal Transport and Generative Learning
Time: 8:30-9:30 am, Monday, Oct. 26, 2020
Venue: Zoom (Meeting ID: 689 9105 6390)
Speaker: Jian Huang, Professor, University of Iowa
Abstract: The long-standing methodology for learning an underlying distribution in statistics and machine learning relies on a prescribed statistical model, which can be difficult to specify in many modern applications such as image analysis, computer vision, and natural language processing. In contrast, generative models aim to learn the underlying generating mechanism of the data by estimating a nonlinear map that transforms a reference distribution into the target distribution. This modeling approach has achieved impressive successes in many machine learning tasks. In this work, we propose an optimal transport approach for generative learning with theoretical guarantees. We formulate the problem of generative learning as that of finding an optimal transport map from a reference to the target, which is equivalent to solving a fully nonlinear Monge-Ampère equation. Interpreting the infinitesimal linearization of the Monge-Ampère equation from the perspective of gradient flows leads to a stochastic McKean-Vlasov equation that defines a map pushing forward the reference to the target. We apply the forward Euler method to solve this equation. The resulting forward Euler map is the composition of a sequence of simple residual maps, which are computationally stable and easy to train. The key task in training is the estimation of the density ratios that determine the residual maps. We estimate the density ratios based on the Bregman divergence with a gradient penalty, using deep density-ratio fitting. We show that the proposed density-ratio estimators do not suffer from the "curse of dimensionality" when the data are supported on a lower-dimensional manifold. Numerical experiments with multi-mode synthetic datasets, and comparisons with existing methods on real benchmark datasets, support our theoretical results and demonstrate the effectiveness of the proposed method.
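The forward Euler scheme sketched in the abstract composes simple residual maps of the form x ↦ x + step · ∇log r(x), where r is the density ratio between the target and the current pushforward distribution. Below is a minimal illustrative sketch of that iteration on a one-dimensional Gaussian toy problem, where the density ratio (and hence its score difference) is known in closed form; in the actual method the ratio would be estimated by deep density-ratio fitting, and all names here are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_map(x, current_mean, target_mean, step):
    """One forward Euler step x -> x + step * grad log(p_target(x)/p_current(x)).

    Toy setting: both densities are unit-variance Gaussians, so
    grad log p_target(x) - grad log p_current(x) = target_mean - current_mean,
    a constant shift (in general this gradient would come from a learned
    density-ratio estimator, not a closed form).
    """
    grad_log_ratio = target_mean - current_mean
    return x + step * grad_log_ratio

# Reference distribution N(0, 1); target N(4, 1).
x = rng.normal(0.0, 1.0, size=10_000)
target_mean, step = 4.0, 0.5

# Compose 20 residual maps; each step halves the gap between the
# current sample mean and the target mean.
for _ in range(20):
    x = residual_map(x, x.mean(), target_mean, step)

print(round(x.mean(), 2))  # close to the target mean 4.0
```

Each residual map here is a small perturbation of the identity, mirroring the abstract's point that such maps are computationally stable; the real difficulty, absent from this toy, is estimating the density ratio in high dimensions.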
Speaker's Bio: Professor Huang's current research interests include high-dimensional statistics, bioinformatics, and statistical machine learning. Through innovative research and the development of methods and algorithms, he has made contributions to high-dimensional statistics, computational statistics, statistical genetics and genomics, semiparametric models, and survival analysis. Many of his publications have appeared in top-ranked journals, including the Annals of Statistics, Bioinformatics, Biometrics, Biometrika, Econometrica, the Journal of the American Statistical Association, the Journal of Machine Learning Research, PNAS, and The American Journal of Human Genetics. Professor Huang received a National Institutes of Health Research Scientist Development Award in 1998 and was elected a Fellow of the American Statistical Association in 2009. He has served as an associate editor of the Annals of Statistics (2013-2015), Statistica Sinica, and Statistics and Its Interface. He was designated a Highly Cited Researcher from 2015 to 2019 by Clarivate Analytics (formerly the Intellectual Property & Science business of Thomson Reuters), ranking among the top one percent of researchers for most-cited papers in the field of Mathematics over the period 2003-2019.