Best of both, Structured and Unstructured Sparsity in Neural Networks

Bibliographic Details
Title: Best of both, Structured and Unstructured Sparsity in Neural Networks
Authors: Schulte-Schüren, Christopher; Wagner, Sven; Runge, Armin; Bariamis, Dimitrios; Hammer, Barbara; Yoneki, Eiko; Nardi, Luigi
Publisher Information: ACM
Publication Year: 2023
Collection: PUB - Publications at Bielefeld University
Description: Schulte-Schüren C, Wagner S, Runge A, et al. Best of both, Structured and Unstructured Sparsity in Neural Networks. In: Proceedings of the 3rd Workshop on Machine Learning and Systems. New York, NY, USA: ACM; 2023: 104-108. ; Besides quantization, pruning has been shown to be one of the most effective methods to reduce the inference time and required energy of Deep Neural Networks (DNNs). In this work, we propose a sparsity definition that reflects the number of operations saved by pruned parameters, and use it to guide the pruning process so that as many operations as possible are saved. Based on this, we show the importance of the baseline model's size and quantify the overhead of unstructured sparsity for a commercial off-the-shelf AI Hardware Accelerator (HWA) in terms of latency reductions. Furthermore, we show that a combination of both structured and unstructured sparsity can mitigate this effect.
Document Type: conference object
report
Language: English
Relation: info:eu-repo/semantics/altIdentifier/isbn/9798400700842; https://pub.uni-bielefeld.de/record/2984048
Availability: https://pub.uni-bielefeld.de/record/2984048
Rights: info:eu-repo/semantics/closedAccess
Accession Number: edsbas.95E7E97E
Database: BASE