Probability and Mathematical Physics Seminar
Products of Many Random Matrices and Gradients in Deep Neural Networks
Speaker: Boris Hanin, Texas A&M / Facebook AI
Location: Warren Weaver Hall 512
Date: Friday, February 8, 2019, 11 a.m.
Synopsis:
Neural networks have experienced a renaissance in recent years, finding success in tasks from machine vision (e.g. self-driving cars) to natural language processing (e.g. Alexa or Siri) and reinforcement learning (e.g. AlphaGo). A mathematical theory of how and why they work is still in its earliest stages.
The purpose of this talk is to address an important numerical stability issue for neural networks, known as the exploding and vanishing gradient problem. I will explain what this problem is and how it translates precisely into a question about products of many random matrices in the regime where both the matrix sizes and the number of matrices tend to infinity together. I will present joint work with Mihai Nica on the behavior of such matrix products in this regime.
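As a minimal numerical sketch of the phenomenon the abstract describes (my own illustration, not code from the talk), the Python snippet below multiplies many i.i.d. Gaussian matrices with the standard 1/width variance scaling, a common proxy for the input-output Jacobian of a deep network at initialization, and prints the norm of the product. Once the depth is comparable to the width, the norm fluctuates over many orders of magnitude: the exploding and vanishing gradient problem in miniature.

```python
import numpy as np

def product_norm(width, depth, seed=0):
    """Frobenius norm of a product of `depth` i.i.d. Gaussian matrices
    of size width x width, with entries of variance 1/width so that each
    factor preserves vector norms on average.

    Illustrative proxy only: a stand-in for the Jacobian of a deep
    network at initialization, not the setup used in the talk.
    """
    rng = np.random.default_rng(seed)
    prod = np.eye(width)
    for _ in range(depth):
        w = rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, width))
        prod = w @ prod
    return np.linalg.norm(prod)

# Fixed width, growing depth: when depth/width is large, the norm of the
# product swings across orders of magnitude from run to run.
for depth in (10, 100, 1000):
    norms = [product_norm(width=32, depth=depth, seed=s) for s in range(5)]
    print(depth, [f"{x:.2e}" for x in norms])
```

The relevant scale here is the ratio depth/width, which is why the regime where both the sizes and the number of matrices grow together is the natural one to study.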