Academic Journal
Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices
Title: Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices
Authors: Alessio Burrello, Matteo Risso, Beatrice Alessandra Motetti, Enrico Macii, Luca Benini, Daniele Jahier Pagliari
Contributors: Burrello, Alessio; Risso, Matteo; Motetti, Beatrice Alessandra; Macii, Enrico; Benini, Luca; Jahier Pagliari, Daniele
Publisher: IEEE
Publication year: 2024
Collection: PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)
Subject terms: Deep Learning, TinyML, Computing Architecture, Energy-efficiency, Neural Architecture Search, Hardware-aware NAS, IoT
Description: The rapid proliferation of computing domains relying on Internet of Things (IoT) devices has created a pressing need for efficient and accurate deep-learning (DL) models that can run on low-power devices. However, traditional DL models tend to be too complex and computationally intensive for typical IoT end-nodes. To address this challenge, Neural Architecture Search (NAS) has emerged as a popular design automation technique for co-optimizing the accuracy and complexity of deep neural networks. Nevertheless, existing NAS techniques require many iterations to produce a network that adheres to specific hardware constraints, such as the maximum memory available on the hardware or the maximum latency allowed by the target application. In this work, we propose a novel approach to incorporate multiple constraints into so-called Differentiable NAS optimization methods, enabling the single-shot generation of a model that respects user-defined constraints on both memory and latency, in a time comparable to a single standard training run. The proposed approach is evaluated on five IoT-relevant benchmarks, including the MLPerf Tiny suite and Tiny ImageNet, demonstrating that a single search can reduce memory and latency by 87.4% and 54.2%, respectively (as defined by our targets), while ensuring accuracy non-inferior to state-of-the-art hand-tuned deep neural networks for TinyML.
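The description above outlines the paper's key idea: folding multiple hardware constraints directly into a Differentiable NAS (DNAS) loss, so that one search yields a model that already meets its memory and latency targets. As a rough illustration only, the sketch below shows one common way such constraints can be expressed as differentiable penalty terms added to the task loss; the function name, the hinge-style formulation, and the penalty weights are assumptions for illustration, not the paper's actual formulation.

```python
# Minimal sketch (not the paper's released code) of a Differentiable NAS
# objective with two hardware constraints. Each penalty is zero while its
# constraint is met and grows linearly with the normalized overshoot, so a
# single gradient-based search is steered toward user-defined memory and
# latency targets. The cost estimates `mem`/`lat` and the lambda weights
# are illustrative assumptions.
import torch


def constrained_dnas_loss(task_loss: torch.Tensor,
                          mem: torch.Tensor,
                          lat: torch.Tensor,
                          mem_target: float,
                          lat_target: float,
                          lambda_mem: float = 1.0,
                          lambda_lat: float = 1.0) -> torch.Tensor:
    """Task loss plus hinge penalties for memory and latency constraints."""
    mem_penalty = torch.relu(mem - mem_target) / mem_target  # normalized overshoot
    lat_penalty = torch.relu(lat - lat_target) / lat_target
    return task_loss + lambda_mem * mem_penalty + lambda_lat * lat_penalty


# Example: an architecture estimated at 512 kB and 95 ms against targets of
# 256 kB and 100 ms is penalized for memory only.
loss = constrained_dnas_loss(task_loss=torch.tensor(0.8),
                             mem=torch.tensor(512.0), lat=torch.tensor(95.0),
                             mem_target=256.0, lat_target=100.0)
print(loss)  # 0.8 + 1.0 * (512 - 256) / 256 = tensor(1.8000)
```

In a full DNAS setup, `mem` and `lat` would be differentiable functions of the architecture parameters (e.g., layer-selection weights), which is what allows both constraints to be enforced within a single search rather than over repeated trial runs.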
Document type: article in journal/newspaper
File description: Electronic
Language: English
Relation: info:eu-repo/semantics/altIdentifier/wos/WOS:001309973800006; IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, vol. 12, no. 3, pp. 780-794 (15 pages); https://hdl.handle.net/11583/2982927; https://ieeexplore.ieee.org/document/10278089
DOI: 10.1109/TETC.2023.3322033
Availability: https://hdl.handle.net/11583/2982927; https://doi.org/10.1109/TETC.2023.3322033; https://ieeexplore.ieee.org/document/10278089
Rights: info:eu-repo/semantics/openAccess
Accession number: edsbas.35459562
Database: BASE