Bibliographic Details
Title:
With Shared Microexponents, A Little Shifting Goes a Long Way
Authors:
Rouhani, Bita; Zhao, Ritchie; Elango, Venmugil; Shafipour, Rasoul; Hall, Mathew; Mesmakhosroshahi, Maral; More, Ankit; Melnick, Levi; Golub, Maximilian; Varatkar, Girish; Shao, Lei; Kolhe, Gaurav; Melts, Dimitry; Klar, Jasmine; L'Heureux, Renee; Perry, Matt; Burger, Doug; Chung, Eric; Deng, Zhaoxia; Naghshineh, Sam; Park, Jongsoo; Naumov, Maxim
Publication Year:
2023
Collection:
Computer Science
Subject Terms:
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Hardware Architecture
Description:
This paper introduces Block Data Representations (BDR), a framework for exploring and evaluating a wide spectrum of narrow-precision formats for deep learning. BDR enables comparison of popular quantization standards and identifies new formats based on shared microexponents (MX) that outperform other state-of-the-art quantization approaches, including narrow-precision floating-point and block floating-point. MX employs multiple levels of quantization scaling, with ultra-fine scaling factors based on shared microexponents in the hardware. The effectiveness of MX is demonstrated on real-world models, including large-scale generative pretraining and inferencing, and production-scale recommendation systems.
Document Type:
Working Paper
Access URL:
http://arxiv.org/abs/2302.08007
Accession Number:
edsarx.2302.08007
Database:
arXiv
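The description above refers to quantization with a scaling factor shared across a block of values. As a rough illustration of that idea only, here is a minimal single-level block floating-point sketch in plain Python: each block of values shares one power-of-two exponent, and mantissas are rounded to a fixed bit width. This is a hypothetical simplification for intuition (the function name `bfp_quantize` and its parameters are invented here); the paper's MX formats additionally share fine-grained microexponents across sub-blocks, which this sketch does not model.

```python
import math

def bfp_quantize(values, block_size=16, mantissa_bits=7):
    """Simplified block floating-point quantize/dequantize round trip.

    Each block of `block_size` values shares a single power-of-two
    exponent chosen from the block's largest magnitude; mantissas are
    rounded to `mantissa_bits` bits. Illustrative sketch only, not the
    paper's exact MX format.
    """
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        if max_abs == 0.0:
            # All-zero block: nothing to scale.
            out.extend(0.0 for _ in block)
            continue
        # Shared exponent: power of two covering the block's largest value.
        shared_exp = math.floor(math.log2(max_abs))
        # Step size implied by the shared exponent and mantissa width.
        scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
        # Quantize each value to an integer mantissa, then dequantize.
        out.extend(round(v / scale) * scale for v in block)
    return out
```

Because all values in a block reuse one exponent, per-block rounding error is bounded by half the block's step size, at the cost of lost precision for small values that share a block with large ones; MX's extra levels of shared microexponents are aimed at reducing exactly that cost.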