Conference
Best of both, Structured and Unstructured Sparsity in Neural Networks
Title: | Best of both, Structured and Unstructured Sparsity in Neural Networks |
---|---|
Authors: | Schulte-Schüren, Christopher; Wagner, Sven; Runge, Armin; Bariamis, Dimitrios; Hammer, Barbara; Yoneki, Eiko; Nardi, Luigi |
Publication Info: | ACM |
Publication Year: | 2023 |
Collection: | PUB - Publications at Bielefeld University |
Description: | Schulte-Schüren C, Wagner S, Runge A, et al. Best of both, Structured and Unstructured Sparsity in Neural Networks. In: Proceedings of the 3rd Workshop on Machine Learning and Systems. New York, NY, USA: ACM; 2023: 104-108. ; Besides quantization, pruning has been shown to be one of the most effective methods to reduce the inference time and required energy of Deep Neural Networks (DNNs). In this work, we propose a sparsity definition that reflects the number of operations saved by pruned parameters, and use it to guide the pruning process so that as many operations as possible are saved. Based on this, we show the importance of the baseline model's size and quantify the overhead of unstructured sparsity for a commercial off-the-shelf AI Hardware Accelerator (HWA) in terms of latency reductions. Furthermore, we show that a combination of both structured and unstructured sparsity can mitigate this effect. |
Document Type: | conference object report |
Language: | English |
Relation: | info:eu-repo/semantics/altIdentifier/isbn/9798400700842; https://pub.uni-bielefeld.de/record/2984048 |
Availability: | https://pub.uni-bielefeld.de/record/2984048 |
Rights: | info:eu-repo/semantics/closedAccess |
Accession Number: | edsbas.95E7E97E |
Database: | BASE |