Student Probability Seminar
Efficient Training of Energy-Based Models Using Jarzynski Equality
Speaker: Davide Carbone, Courant (NYU)
Location: Warren Weaver Hall 517
Date: Monday, October 23, 2023, 12:30 p.m.
Synopsis:
Energy-based models (EBMs) are generative models inspired by statistical physics, with a wide range of applications in unsupervised learning. Classical training methods, e.g. contrastive divergence, are strongly interconnected with the problem of sampling from high-dimensional probability densities; this task is often plagued by sampling biases induced by the slow mixing of standard routines such as the unadjusted Langevin algorithm (ULA). A novel approach based on nonequilibrium thermodynamics is presented: the Jarzynski equality, combined with tools from sequential Monte Carlo sampling, can be used to train an EBM in an efficient and controlled way. Specifically, we introduce a modification of ULA in which each walker acquires a weight that enables the estimation of the gradient of the cross-entropy at any step during training. We illustrate these results with numerical experiments on Gaussian mixture distributions as well as the MNIST and CIFAR-10 datasets, and we show that the proposed approach outperforms methods based on the contrastive divergence algorithm in all the considered situations.
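To give a concrete flavor of the weighted-walker idea described above, here is a minimal illustrative sketch in Python. It is not the algorithm from the talk: the toy quadratic energy U_theta, the step sizes, and the simplified (annealed-importance-sampling-style) weight update are assumptions made for this example, and the paper's correction for the ULA discretization error is omitted. The weighted walkers supply the model-side expectation in the cross-entropy gradient, E_data[grad_theta U_theta] - E_model[grad_theta U_theta].

```python
# Illustrative sketch only (assumptions noted above), not the paper's exact method.
import numpy as np

rng = np.random.default_rng(0)

# Toy energy U_theta(x) = (x - theta)^2 / 2, a Gaussian with learnable mean.
def U(theta, x):
    return 0.5 * (x - theta) ** 2

def grad_x_U(theta, x):        # gradient in x, used by the Langevin step
    return x - theta

def grad_theta_U(theta, x):    # gradient in theta, used by the training step
    return -(x - theta)

data = rng.normal(loc=2.0, scale=1.0, size=5000)   # toy "dataset"

theta = 0.0
n_walkers, h, lr = 1000, 0.05, 0.1
X = rng.normal(size=n_walkers)   # walkers start from the initial model N(0, 1)
logw = np.zeros(n_walkers)       # Jarzynski log-weights, initially zero

for step in range(200):
    # Weighted estimate of E_model[grad_theta U] from the current walkers.
    w = np.exp(logw - logw.max()); w /= w.sum()
    grad = grad_theta_U(theta, data).mean() - np.sum(w * grad_theta_U(theta, X))

    # Gradient step on the cross-entropy.
    theta_new = theta - lr * grad

    # Jarzynski-style weight update: change of potential at fixed walker positions.
    logw += U(theta, X) - U(theta_new, X)
    theta = theta_new

    # One ULA step under the new potential (unadjusted: no Metropolis test).
    X = X - h * grad_x_U(theta, X) + np.sqrt(2 * h) * rng.normal(size=n_walkers)

    # Resample when the effective sample size collapses (standard SMC device).
    w = np.exp(logw - logw.max()); w /= w.sum()
    if 1.0 / np.sum(w ** 2) < n_walkers / 2:
        X = X[rng.choice(n_walkers, size=n_walkers, p=w)]
        logw[:] = 0.0

print(f"learned theta = {theta:.3f} (data mean = {data.mean():.3f})")
```

In this toy setting the learned theta converges to the data mean; the resampling step is the standard sequential Monte Carlo device for preventing the weight distribution from degenerating as the parameters move.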