Report
Distribution-Free Robust Linear Regression
Title: | Distribution-Free Robust Linear Regression |
---|---|
Authors: | Mourtada, Jaouad; Vaškevičius, Tomas; Zhivotovskiy, Nikita |
Source: | Math. Stat. Learn. 4 (2021), 253-292 |
Publication Year: | 2021 |
Collection: | Computer Science; Mathematics; Statistics |
Subject Terms: | Mathematics - Statistics Theory, Computer Science - Machine Learning, Statistics - Machine Learning |
Description: | We study random design linear regression with no assumptions on the distribution of the covariates and with a heavy-tailed response variable. In this distribution-free regression setting, we show that boundedness of the conditional second moment of the response given the covariates is a necessary and sufficient condition for achieving nontrivial guarantees. As a starting point, we prove an optimal version of the classical in-expectation bound for the truncated least squares estimator due to Györfi, Kohler, Krzyżak, and Walk. However, we show that this procedure fails with constant probability for some distributions despite its optimal in-expectation performance. Then, combining the ideas of truncated least squares, median-of-means procedures, and aggregation theory, we construct a non-linear estimator achieving excess risk of order $d/n$ with an optimal sub-exponential tail. While existing approaches to linear regression for heavy-tailed distributions focus on proper estimators that return linear functions, we highlight that the improperness of our procedure is necessary for attaining nontrivial guarantees in the distribution-free setting. Comment: 29 pages, to appear in Mathematical Statistics and Learning |
Document Type: | Working Paper |
DOI: | 10.4171/MSL/27 |
Access URL: | http://arxiv.org/abs/2102.12919 |
Accession Number: | edsarx.2102.12919 |
Database: | arXiv |
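
The truncated least squares estimator named in the description can be illustrated with a minimal sketch: fit ordinary least squares, then clip every prediction to the interval [-m, m]. The function name `fit_truncated_least_squares`, the threshold parameter `m`, and the synthetic data are assumptions made for illustration and are not taken from the paper; the paper's final estimator additionally combines median-of-means and aggregation steps, which are not shown here.

```python
import numpy as np


def fit_truncated_least_squares(X, y, m):
    """Fit least squares, then return a predictor whose outputs are clipped to [-m, m].

    Hypothetical helper for illustration; `m` is an assumed truncation threshold.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Minimum-norm least squares solution (also works for rank-deficient X).
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

    def predict(X_new):
        X_new = np.asarray(X_new, dtype=float)
        # Clipping makes the predictor a non-linear function of the covariates.
        return np.clip(X_new @ w, -m, m)

    return predict


# Synthetic example (assumed data): linear signal plus heavy-tailed Student-t noise.
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.standard_normal((n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.standard_t(df=3, size=n)

predict = fit_truncated_least_squares(X, y, m=2.0)
preds = predict(X)
```

Because of the clipping step, the returned predictor is not a linear function of the covariates, which gives a simple instance of the improper (non-linear) estimators the description refers to.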