The Essential Histogram

Bibliographic Details
Title: The Essential Histogram
Authors: Li, Housen; Munk, Axel; Sieling, Hannes; Walther, Guenther
Source: Biometrika, 2020
Publication Year: 2016
Collection: Mathematics
Statistics
Subject Terms: Mathematics - Statistics Theory, Statistics - Methodology, 62G10, 62H30
Description: The histogram is widely used as a simple, exploratory display of data, but it is usually not clear how to choose the number and size of bins. We construct a confidence set of distribution functions that optimally address the two main tasks of the histogram: estimating probabilities and detecting features such as increases and modes in the distribution. We define the essential histogram as the histogram in the confidence set with the fewest bins. Thus the essential histogram is the simplest visualization of the data that optimally achieves the main tasks of the histogram. The only assumption we make is that the data are independent and identically distributed. We provide a fast algorithm for the essential histogram, and illustrate our methodology with examples. An R-package is available on CRAN.
Comment: An extension to discrete data is included. An R package, "essHist", is available from https://CRAN.R-project.org/package=essHist
Document Type: Working Paper
DOI: 10.1093/biomet/asz081
Access URL: http://arxiv.org/abs/1612.07216
Accession Number: edsarx.1612.07216
Database: arXiv