Special Seminar
Theory and Practice of Efficient Learning at Scale (Pretraining and Finetuning)
Speaker: Soufiane Hayou, Simons Institute (UC Berkeley)
Location: Online
Videoconference link: https://nyu.zoom.us/j/98715394371
Date: Tuesday, November 12, 2024, 2 p.m.
Synopsis:
State-of-the-art performance is achieved via a series of engineered modifications to existing neural architectures and their training procedures. A common feature of these networks is their large-scale nature: modern neural networks consist of billions, if not hundreds of billions, of trainable parameters. Moreover, empirical evaluations generally support the claim that increasing the scale of neural networks (e.g. width and depth) boosts model performance, provided it is done correctly. However, given a neural network model, it is not straightforward to answer the crucial question: how do we adjust the training hyperparameters (initialization, learning rate, etc.) as we scale the network? In this talk, I will show how we can leverage different mathematical results to efficiently scale and train neural networks, with applications in both pretraining and fine-tuning.
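For readers unfamiliar with the kind of scaling rule the synopsis alludes to, the sketch below illustrates one widely used convention (roughly in the spirit of the maximal-update parametrization for width scaling with an Adam-style optimizer): hidden weights are initialized with variance 1/fan_in, and the hidden-layer learning rate shrinks in proportion to 1/width as the model is widened. This is only an illustrative example; the function name, base width, and base learning rate are hypothetical, and it is not necessarily the approach presented in the talk.

```python
# Illustrative sketch (not the speaker's code): one common convention for
# adjusting initialization and learning rates as network width grows.
# The base_width and base_lr values below are hypothetical placeholders.

def scaled_hyperparams(width, base_width=256, base_lr=1e-3):
    """Hyperparameters for a model widened from `base_width` to `width`
    (hidden layers, Adam-style optimizer)."""
    ratio = width / base_width
    return {
        # Initialize hidden weights with variance 1/fan_in so activations
        # stay O(1) as the layer gets wider.
        "hidden_init_std": (1.0 / width) ** 0.5,
        # Shrink the hidden-layer learning rate ~ 1/width so the size of
        # feature updates stays comparable at every width.
        "hidden_lr": base_lr / ratio,
        # Input-layer and bias parameters typically keep the base rate.
        "input_lr": base_lr,
    }

if __name__ == "__main__":
    for w in (256, 1024, 4096):
        print(w, scaled_hyperparams(w))
```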