Monday, February 24, 2014 at 5:10 pm

Cantor Film Center (36 E. 8th Street), Room 101

Michael I. Jordan (University of California, Berkeley)

On the Computational and Statistical Interface & 'Big Data'

The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend the statistical and computational sciences. That classical perspectives from these fields are not adequate to address emerging problems in "Big Data" is apparent from their sharply divergent nature at an elementary level: in computer science, the growth of the number of data points is a source of "complexity" that must be tamed via algorithms or hardware, whereas in statistics, the growth of the number of data points is a source of "simplicity" in that inferences are generally stronger and asymptotic results can be invoked. We wish to blend these perspectives. In this talk we show how statistical decision theory provides a mathematical point of departure for achieving such a blending. We develop theoretical tradeoffs between statistical risk, amount of data and "externalities" such as computation, communication and privacy. We develop procedures that allow one to choose desired operating points along such tradeoff curves. [Joint work with Venkat Chandrasekaran, John Duchi, Martin Wainwright and Yuchen Zhang].