With Shared Microexponents, A Little Shifting Goes a Long Way

Bibliographic Details
Title: With Shared Microexponents, A Little Shifting Goes a Long Way
Authors: Rouhani, Bita; Zhao, Ritchie; Elango, Venmugil; Shafipour, Rasoul; Hall, Mathew; Mesmakhosroshahi, Maral; More, Ankit; Melnick, Levi; Golub, Maximilian; Varatkar, Girish; Shao, Lei; Kolhe, Gaurav; Melts, Dimitry; Klar, Jasmine; L'Heureux, Renee; Perry, Matt; Burger, Doug; Chung, Eric; Deng, Zhaoxia; Naghshineh, Sam; Park, Jongsoo; Naumov, Maxim
Publication Year: 2023
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Hardware Architecture
Description: This paper introduces Block Data Representations (BDR), a framework for exploring and evaluating a wide spectrum of narrow-precision formats for deep learning. BDR enables comparison of popular quantization standards, and through it new formats based on shared microexponents (MX) are identified that outperform other state-of-the-art quantization approaches, including narrow-precision floating-point and block floating-point. MX uses multiple levels of quantization scaling, with ultra-fine scaling factors based on shared microexponents in hardware. The effectiveness of MX is demonstrated on real-world models, including large-scale generative pretraining and inferencing, and production-scale recommendation systems.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2302.08007
Accession Number: edsarx.2302.08007
Database: arXiv
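The core idea described in the abstract, groups of values sharing a common scaling exponent, can be illustrated with a minimal block floating-point sketch. This is a simplified stand-in for the paper's MX formats, not the actual specification: the block size, mantissa width, and power-of-two scaling rule below are illustrative assumptions.

```python
import numpy as np

def block_quantize(x, block_size=16, mantissa_bits=4):
    """Quantize x in blocks that each share one power-of-two exponent
    (classic block floating-point); a simplified illustration of the
    shared-scaling idea behind MX, not the paper's exact format."""
    x = np.asarray(x, dtype=np.float64)
    pad = (-len(x)) % block_size          # pad so length divides evenly
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    out = np.empty_like(blocks)
    qmax = 2 ** (mantissa_bits - 1) - 1   # largest signed mantissa value
    for i, b in enumerate(blocks):
        m = np.max(np.abs(b))
        if m == 0:
            out[i] = 0
            continue
        # Shared exponent: smallest power of two whose scale covers
        # the block's largest magnitude with the available mantissa bits.
        e = np.ceil(np.log2(m / qmax))
        scale = 2.0 ** e
        out[i] = np.round(b / scale) * scale  # integer mantissas, rescaled
    return out.reshape(-1)[:len(x)]
```

Because every element in a block reuses the same exponent, only the narrow mantissas are stored per element, which is the storage saving that finer-grained shared scaling factors exploit.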