Academic Journal

Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices

Bibliographic Details
Title: Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices
Authors: Alessio Burrello, Matteo Risso, Beatrice Alessandra Motetti, Enrico Macii, Luca Benini, Daniele Jahier Pagliari
Contributors: Burrello, Alessio; Risso, Matteo; Motetti, Beatrice Alessandra; Macii, Enrico; Benini, Luca; Jahier Pagliari, Daniele
Publisher: IEEE
Publication Year: 2024
Collection: PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)
Subject Terms: Deep Learning, TinyML, Computing Architecture, Energy-efficiency, Neural Architecture Search, Hardware-aware NAS, IoT
Description: The rapid proliferation of computing domains relying on Internet of Things (IoT) devices has created a pressing need for efficient and accurate deep-learning (DL) models that can run on low-power devices. However, traditional DL models tend to be too complex and computationally intensive for typical IoT end-nodes. To address this challenge, Neural Architecture Search (NAS) has emerged as a popular design automation technique for co-optimizing the accuracy and complexity of deep neural networks. Nevertheless, existing NAS techniques require many iterations to produce a network that adheres to specific hardware constraints, such as the maximum memory available on the hardware or the maximum latency allowed by the target application. In this work, we propose a novel approach to incorporate multiple constraints into so-called Differentiable NAS optimization methods, which allows the generation, in a single shot, of a model that respects user-defined constraints on both memory and latency, in a time comparable to a single standard training. The proposed approach is evaluated on five IoT-relevant benchmarks, including the MLPerf Tiny suite and Tiny ImageNet, demonstrating that, with a single search, it is possible to reduce memory and latency by 87.4% and 54.2%, respectively (as defined by our targets), while ensuring non-inferior accuracy with respect to state-of-the-art hand-tuned deep neural networks for TinyML.
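As a rough illustration of the approach the abstract describes (folding user-defined memory and latency constraints into a Differentiable NAS loss so that a compliant network emerges from a single search), the sketch below is offered. It is an assumption, not the paper's actual formulation: the candidate channel widths, the linear memory and quadratic latency cost models, and the hinge-penalty weights are all hypothetical stand-ins.

    # Hedged sketch of multi-constraint Differentiable NAS (DNAS).
    # All cost models and hyper-parameters below are illustrative
    # assumptions, not the formulation from the paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class WidthSearchLayer(nn.Module):
        """A layer whose channel width is selected by trainable
        architecture logits (one per candidate width), in the style
        of mask-based DNAS."""
        def __init__(self, candidate_widths=(16, 32, 64)):
            super().__init__()
            self.widths = torch.tensor(candidate_widths, dtype=torch.float32)
            self.alpha = nn.Parameter(torch.zeros(len(candidate_widths)))

        def expected_cost(self, per_width_cost):
            # Differentiable expectation of a hardware cost over the
            # softmax relaxation of the discrete width choice.
            probs = F.softmax(self.alpha, dim=0)
            return (probs * per_width_cost).sum()

    def constrained_loss(task_loss, layers, mem_target, lat_target,
                         lambda_mem=1.0, lambda_lat=1.0):
        # Assumed cost models: memory grows linearly with channels,
        # latency roughly quadratically (hypothetical choices).
        mem = sum(l.expected_cost(l.widths) for l in layers)
        lat = sum(l.expected_cost(l.widths ** 2) for l in layers)
        # Hinge penalties: zero once a constraint is satisfied, so the
        # search stops trading accuracy for cost beyond the targets.
        return (task_loss
                + lambda_mem * F.relu(mem - mem_target)
                + lambda_lat * F.relu(lat - lat_target))

    # Minimal usage example with a stand-in task loss.
    layers = [WidthSearchLayer(), WidthSearchLayer()]
    task_loss = torch.tensor(0.7)  # placeholder for cross-entropy on a batch
    loss = constrained_loss(task_loss, layers, mem_target=96.0, lat_target=5000.0)
    loss.backward()  # gradients flow into each layer's architecture logits

Minimizing such a loss trains the architecture logits alongside the network weights; once the expected memory and latency fall below the user-defined targets, the penalty terms vanish and the search optimizes accuracy alone, which is consistent with the single-shot behaviour the abstract claims.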
Document Type: article in journal/newspaper
File Description: Electronic
Language: English
Relation: WOS:001309973800006; IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, vol. 12, no. 3, pp. 780-794 (15 pages); https://hdl.handle.net/11583/2982927; https://ieeexplore.ieee.org/document/10278089
DOI: 10.1109/TETC.2023.3322033
Availability: https://hdl.handle.net/11583/2982927
https://doi.org/10.1109/TETC.2023.3322033
https://ieeexplore.ieee.org/document/10278089
Rights: Open Access (info:eu-repo/semantics/openAccess)
Accession Number: edsbas.35459562
Database: BASE