Atmosphere Ocean Science Colloquium

Machine learning techniques to construct patched analog ensembles for data assimilation

Speaker: Minah Yang, Courant, NYU

Location: Warren Weaver Hall 1302

Date: Wednesday, October 6, 2021, 3:30 p.m.

Synopsis:

The use of generative models from the machine learning literature to create artificial ensemble members for data assimilation schemes was introduced in [Grooms QJRMS, 2020] as constructed analog ensemble optimal interpolation (cAnEnOI). Specifically, we study general and variational autoencoders for the machine learning component of this method, and combine the ideas of constructed analogs and ensemble optimal interpolation in the data assimilation component. To make cAnEnOI scale to data assimilation on complex dynamical models, we propose patching schemes that divide the global spatial domain into digestible chunks. Patching makes training the generative models feasible and has the added benefit of allowing parallelism in the generative step. Testing this new algorithm on a 1D toy model, we find that larger patch sizes make it harder to train an accurate generative model (i.e., a model whose reconstruction error is small), while data assimilation performance improves at larger patch sizes. There is thus a sweet spot where the patch size is large enough to enable good data assimilation performance, but not so large that it becomes difficult to train an accurate generative model. In our tests, the new patched cAnEnOI method outperforms both the original (unpatched) cAnEnOI and the ensemble square root filter. I will also present some preliminary results from applying cAnEnOI to the Quasi-Geostrophic Coupled Model, a medium-complexity 2D model.
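
As a rough illustration of the patching idea, the sketch below splits a 1D state into patches, generates each patch independently, and stitches the results back into full-domain ensemble members. It is not the speaker's implementation: the abstract does not specify the patching scheme, so non-overlapping, equal-size patches are assumed, and the names `split_into_patches`, `generate_patched_members`, and `noisy_copy` are hypothetical. In particular, `noisy_copy` is only a placeholder for a trained autoencoder.

```python
import numpy as np


def split_into_patches(state, patch_size):
    """Split a 1D state vector into non-overlapping, equal-size patches.

    Assumption: patch_size divides the state dimension exactly.
    """
    n = state.shape[0]
    if n % patch_size != 0:
        raise ValueError("patch_size must divide the state dimension")
    return state.reshape(n // patch_size, patch_size)


def generate_patched_members(state, patch_size, n_members, generate_patch, rng):
    """Build artificial ensemble members patch by patch.

    generate_patch(patch, rng) is a stand-in for the generative model.
    Each patch is handled independently, so this loop could run in parallel.
    """
    patches = split_into_patches(state, patch_size)
    members = np.empty((n_members, state.shape[0]))
    for k in range(n_members):
        new_patches = [generate_patch(p, rng) for p in patches]
        members[k] = np.concatenate(new_patches)
    return members


def noisy_copy(patch, rng, scale=0.1):
    """Placeholder generator (hypothetical): adds Gaussian noise to a patch.

    A trained autoencoder would replace this function.
    """
    return patch + scale * rng.standard_normal(patch.shape)


rng = np.random.default_rng(0)
state = np.sin(np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False))
ensemble = generate_patched_members(
    state, patch_size=8, n_members=5, generate_patch=noisy_copy, rng=rng
)
```

Here `ensemble` holds five full-length members built from five patches of eight grid points each. Because the patches in this sketch do not overlap, nothing smooths the seams between them; a practical scheme would likely need to handle patch boundaries in some way.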