Probability and Mathematical Physics Seminar
Products of Many Random Matrices and Gradients in Deep Neural Networks
Speaker: Boris Hanin, Texas A&M / Facebook AI
Location: Warren Weaver Hall 512
Date: Friday, February 8, 2019, 11 a.m.
Synopsis:
Neural networks have experienced a renaissance in recent years, finding success in tasks from machine vision (e.g. self-driving cars) to natural language processing (e.g. Alexa or Siri) and reinforcement learning (e.g. AlphaGo). A mathematical theory of how and why they work is still in its earliest stages.
The purpose of this talk is to address an important numerical stability issue for neural networks, known as the exploding and vanishing gradient problem. I will explain what this problem is and how it translates precisely into a question about products of many random matrices in the regime where both the matrix sizes and the number of matrices tend to infinity together. I will present joint work with Mihai Nica on the behavior of such matrix products in this regime.
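As a minimal numerical sketch of the phenomenon the abstract describes (my own illustration, not code from the talk), the Python snippet below multiplies many i.i.d. Gaussian matrices with the standard 1/width variance scaling, a common proxy for the input-output Jacobian of a deep network at initialization, and prints the norm of the product. Once the depth is comparable to the width, the norm fluctuates over many orders of magnitude: the exploding and vanishing gradient problem in miniature.

```python
import numpy as np

def product_norm(width, depth, seed=0):
    """Frobenius norm of a product of `depth` i.i.d. Gaussian matrices
    of size width x width, with entries of variance 1/width so that each
    factor preserves vector norms on average.

    Illustrative proxy only: a stand-in for the Jacobian of a deep
    network at initialization, not the setup used in the talk.
    """
    rng = np.random.default_rng(seed)
    prod = np.eye(width)
    for _ in range(depth):
        w = rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, width))
        prod = w @ prod
    return np.linalg.norm(prod)

# Fixed width, growing depth: when depth/width is large, the norm of the
# product swings across orders of magnitude from run to run.
for depth in (10, 100, 1000):
    norms = [product_norm(width=32, depth=depth, seed=s) for s in range(5)]
    print(depth, [f"{x:.2e}" for x in norms])
```

The relevant scale here is the ratio depth/width, which is why the regime where both the sizes and the number of matrices grow together is the natural one to study.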