MemPool-3D: Boosting Performance and Efficiency of Shared-L1 Memory Many-Core Clusters with 3D Integration

Bibliographic Details
Title: MemPool-3D: Boosting Performance and Efficiency of Shared-L1 Memory Many-Core Clusters with 3D Integration
Authors: Matheus Cavalcante, Anthony Agnesina, Samuel Riedel, Moritz Brunion, Alberto Garcia-Ortiz, Dragomir Milojevic, Francky Catthoor, Sung Kyu Lim, Luca Benini
Contributors: Matheus Cavalcante, Anthony Agnesina, Samuel Riedel, Moritz Brunion, Alberto Garcia-Ortiz, Dragomir Milojevic, Francky Catthoor, Sung Kyu Lim, Luca Benini
Publication Information: IEEE
345 E 47TH ST, NEW YORK, NY 10017 USA
Publication Year: 2022
Collection: IRIS Università degli Studi di Bologna (CRIS - Current Research Information System)
Subject Terms: Many-core, 3D Integration, 3D-ICs
Description: Three-dimensional integrated circuits promise power, performance, and footprint gains compared to their 2D counterparts, thanks to drastic reductions in the interconnects' length through their smaller form factor. We can leverage the potential of 3D integration by enhancing MemPool, an open-source many-core design with 256 cores and a shared pool of L1 scratchpad memory connected with a low-latency interconnect. MemPool's baseline 2D design is severely limited by routing congestion and wire propagation delay, making the design ideal for 3D integration. In architectural terms, we increase MemPool's scratchpad memory capacity beyond the sweet spot for 2D designs, improving performance in a common digital signal processing kernel. We propose a 3D MemPool design that leverages a smart partitioning of the memory resources across two layers to balance the size and utilization of the stacked dies. In this paper, we explore the architectural and the technology parameter spaces by analyzing the power, performance, area, and energy efficiency of MemPool instances in 2D and 3D with 1 MiB, 2 MiB, 4 MiB, and 8 MiB of scratchpad memory in a commercial 28 nm technology node. We observe a performance gain of 9.1% when running a matrix multiplication on MemPool-3D with 4 MiB of scratchpad memory compared to the MemPool 2D counterpart. In terms of energy efficiency, we can implement the MemPool-3D instance with 4 MiB of L1 memory on an energy budget 15% smaller than its 2D counterpart, and 3.7% smaller than the MemPool-2D instance with a fourth of the L1 scratchpad memory capacity.
Document Type: conference object
File Description: ELETTRONICO (electronic)
Language: English
Relation: info:eu-repo/semantics/altIdentifier/isbn/978-3-9819263-6-1; info:eu-repo/semantics/altIdentifier/wos/WOS:000819484300076; ispartofbook:2022 Design, Automation & Test in Europe Conference & Exhibition (DATE); 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE); firstpage:394; lastpage:399; numberofpages:6; serie:PROCEEDINGS - DESIGN, AUTOMATION, AND TEST IN EUROPE CONFERENCE AND EXHIBITION; https://hdl.handle.net/11585/905384; info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-85130791268
DOI: 10.23919/date54114.2022.9774726
Availability: https://hdl.handle.net/11585/905384
https://doi.org/10.23919/date54114.2022.9774726
Accession Number: edsbas.6024FD1E
Database: BASE