# Algorithms for Approximate Sparse Regression and Closest Induced Flats

## Jean Cardinal, Université Libre de Bruxelles

## Date and time: 2pm (New York time), Tuesday, November 30, 2021

## Place: On Zoom (details provided on the seminar mailing list)

The sparse regression problem, also known as the best subset selection
problem, can be cast as follows: given a set $S$ of $n$ points in $ℝ^d$, a
point $y ∈ ℝ^d$, and an integer $2 ≤ k ≤ d$, find an affine combination of
at most $k$ points of $S$ that is nearest to $y$. We describe an $O(n^{k-1}
\mathop{\mathrm{polylog}} n)$-time randomized $(1+ε)$-approximation algorithm for this
problem for constant $d$ and $ε$. This is the first algorithm for this
problem running in time $o(n^k)$. Its running time is similar to the
query time of a data structure proposed by Har-Peled, Indyk, and
Mahabadi, while not requiring any preprocessing. Up to polylogarithmic
factors, it matches a conditional lower bound relying on a conjecture
about affine degeneracy testing.
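To make the problem statement concrete, here is a minimal brute-force sketch (not the talk's algorithm): it enumerates every subset of at most $k$ points and projects $y$ onto the affine flat they span, which is exactly the exhaustive $Θ(n^k)$ baseline that the announced result improves on. The function name and parameterization are illustrative assumptions, not from the talk.

```python
from itertools import combinations

import numpy as np


def sparse_regression_brute_force(S, y, k):
    """Exhaustive baseline for sparse regression (illustrative only).

    Tries every subset of at most k points of S, computes the nearest
    point to y on the affine flat spanned by the subset, and returns the
    best subset together with its distance to y. Runs in O(n^k) time.
    """
    best_dist, best_idx = float("inf"), None
    n = len(S)
    for m in range(1, k + 1):
        for idx in combinations(range(n), m):
            P = np.array([S[i] for i in idx])  # m x d matrix of chosen points
            p0 = P[0]
            if m == 1:
                # An affine combination of a single point is the point itself.
                x = p0
            else:
                # Parameterize the flat as p0 + span{p_i - p0}, then project y
                # onto it via least squares.
                A = (P[1:] - p0).T  # d x (m-1) basis of the flat's directions
                t, *_ = np.linalg.lstsq(A, y - p0, rcond=None)
                x = p0 + A @ t
            dist = np.linalg.norm(y - x)
            if dist < best_dist:
                best_dist, best_idx = dist, idx
    return best_idx, best_dist
```

For instance, with $S = \{(0,0), (1,0), (0,1)\}$, $y = (0.5, 0)$, and $k = 2$, the pair $\{(0,0), (1,0)\}$ spans a flat containing $y$, so the distance is $0$.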