Hardware Accelerators for Machine Learning: From 3D Manycore to Processing-in-Memory Architectures

Author: Aqeeb Iqbal Arka
Release: 2022
ISBN-10: 9798352956595
Rating: 4/5

Book Synopsis Hardware Accelerators for Machine Learning: From 3D Manycore to Processing-in-Memory Architectures by Aqeeb Iqbal Arka

Download or read book Hardware Accelerators for Machine Learning: From 3D Manycore to Processing-in-Memory Architectures, written by Aqeeb Iqbal Arka and released in 2022. Available in PDF, EPUB and Kindle. Book excerpt: Big data applications such as deep learning and graph analytics require hardware platforms that are energy-efficient yet computationally powerful. 3D manycore architectures are key to efficiently executing such compute- and data-intensive applications. Through-silicon-via (TSV)-based 3D manycore systems are a promising solution in this direction, as they enable the integration of disparate heterogeneous computing cores on a single system. Recent industry trends show the viability of 3D integration in real products (e.g., the Intel Lakefield SoC, the AMD Radeon R9 Fury X graphics card, and the Xilinx Virtex-7 2000T/H580T). However, the achievable performance of conventional TSV-based 3D systems is ultimately bottlenecked by the horizontal wires (the wires within each planar die), and current TSV-based 3D architectures also suffer from thermal limitations. Hence, TSV-based architectures do not realize the full potential of 3D integration. Monolithic 3D (M3D) integration, a breakthrough technology for achieving "More Moore and More Than Moore," opens up the possibility of designing cores and their associated network routers across multiple layers by utilizing monolithic inter-tier vias (MIVs), thereby reducing the effective wire length. Compared to TSV-based 3D ICs, M3D offers the "true" benefits of the vertical dimension for system integration: an MIV is over 100x smaller than a TSV. However, designing these new architectures often involves optimizing multiple conflicting objectives (e.g., performance and thermal behavior) due to the presence of a mix of computing elements and communication methodologies, each with different requirements for high performance. To overcome the difficult optimization challenges arising from the large design space and the complex interactions among the heterogeneous components (CPU, GPU, last-level cache, etc.) in an M3D-based manycore chip, machine learning algorithms can be explored as a promising solution.

The first part of this dissertation focuses on the design of high-performance and energy-efficient architectures for big-data applications, enabled by M3D vertical integration and data-driven machine learning algorithms. As an example, we consider heterogeneous manycore architectures with CPUs, GPUs, and caches as the hardware platform in this part of the work. The disparate nature of these processing elements introduces conflicting design requirements that need to be satisfied simultaneously. Moreover, the on-chip traffic patterns exhibited by different big-data applications (such as many-to-few-to-many traffic in CPU/GPU-based manycore architectures) need to be incorporated into the design process for an optimal power-performance trade-off. In this dissertation, we first design an M3D-enabled heterogeneous manycore architecture and demonstrate the efficacy of machine learning algorithms for efficiently exploring a large design space. For large design-space exploration problems, the proposed machine learning algorithm finds good solutions in significantly less time than existing state-of-the-art counterparts.
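To make the machine-learning-guided design-space exploration idea concrete, the following is a minimal sketch in Python of surrogate-assisted search over core-to-tier assignments in an M3D chip: a small set of designs is evaluated with an expensive objective, a cheap learned model is fitted to those evaluations, and only the candidates the model ranks best are simulated next. The design encoding, the cost proxies, the weights, and the linear surrogate are all illustrative assumptions; the dissertation's actual algorithm and objectives are not reproduced here.

# A minimal sketch (not the dissertation's actual algorithm) of surrogate-assisted
# design-space exploration for an M3D heterogeneous manycore chip. All constants
# and the cost model below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_CORES = 16   # CPUs + GPUs placed across M3D tiers (assumed)
N_TIERS = 4    # number of monolithic 3D tiers (assumed)

def random_design():
    # A design = tier assignment for each core (one axis of the real design space).
    return rng.integers(0, N_TIERS, size=N_CORES)

def evaluate(design):
    # Toy multi-objective cost: average tier distance between all core pairs as a
    # crude communication/latency proxy, plus a penalty for crowding the upper
    # (hotter) tiers as a crude thermal proxy. A real flow would call
    # cycle-accurate performance and thermal simulators here.
    comm_cost = np.mean(np.abs(np.subtract.outer(design, design)))
    thermal_cost = np.mean(design) / (N_TIERS - 1)
    return comm_cost + 2.0 * thermal_cost   # assumed scalarization weights

def featurize(design):
    return np.concatenate([design, [1.0]])  # raw encoding + bias term

# 1) Evaluate a small random sample of the design space with the expensive objective.
designs = [random_design() for _ in range(40)]
costs = np.array([evaluate(d) for d in designs])

# 2) Iteratively fit a cheap linear surrogate and only evaluate the candidate the
#    surrogate predicts to be best (the core idea of ML-guided exploration).
for _ in range(10):
    X = np.stack([featurize(d) for d in designs])
    w, *_ = np.linalg.lstsq(X, costs, rcond=None)
    candidates = [random_design() for _ in range(500)]
    preds = np.stack([featurize(c) for c in candidates]) @ w
    best_candidate = candidates[int(np.argmin(preds))]
    designs.append(best_candidate)
    costs = np.append(costs, evaluate(best_candidate))

best = designs[int(np.argmin(costs))]
print("best tier assignment:", best, "estimated cost:", float(costs.min()))

In a real flow the placeholder evaluate() would be replaced by expensive performance and thermal simulations, which is exactly why a predictive surrogate that prunes the candidate pool pays off.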
However, the M3D-enabled heterogeneous manycore architecture is still limited by the inherent memory-bandwidth bottlenecks of traditional von Neumann architectures. As a result, later in this dissertation we focus on Processing-in-Memory (PIM) architectures tailor-made to accelerate deep learning applications such as Graph Neural Networks (GNNs), since such architectures can achieve massive data parallelism and do not suffer from memory bandwidth-related issues. We choose GNNs as an example workload because they are more complex than traditional deep learning applications: they simultaneously exhibit attributes of both deep learning and graph computations, and are hence both compute- and data-intensive in nature. The high amount of data movement required by GNN computation poses a challenge to conventional von Neumann architectures (such as CPUs, GPUs, and heterogeneous systems-on-chip (SoCs)) with their limited memory bandwidth. Hence, we propose the use of PIM-based non-volatile memory such as Resistive Random Access Memory (ReRAM). We leverage the efficient matrix operations enabled by ReRAMs and design manycore architectures that can facilitate the unique computation and communication needs of large-scale GNN training. We then exploit techniques such as regularization to further accelerate GNN training on ReRAM-based manycore systems. Finally, we streamline the GNN training process by reducing the amount of redundant information in both the GNN model and the input graph.

Overall, this work focuses on the design challenges of high-performance and energy-efficient manycore architectures for machine learning applications. We propose novel architectures that use M3D integration or ReRAM-based PIM to accelerate such applications, and we focus on hardware/software co-design to ensure the best possible performance.
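As a rough illustration of why ReRAM-based PIM suits GNN workloads, the sketch below functionally simulates a ReRAM crossbar performing the feature-transform matrix multiplication of a single GNN layer: the weight matrix is programmed as cell conductances, and each output is read out as an accumulated bitline current, so the weights never have to be moved to a processor. The conductance range, quantization levels, toy graph, and non-negative weights are simplifying assumptions for illustration, not parameters from the book (signed weights would typically use paired crossbars).

# A minimal functional sketch of ReRAM-crossbar matrix multiplication applied to a
# toy GNN layer. Device parameters and the graph below are illustrative assumptions.
import numpy as np

G_MIN, G_MAX = 1e-6, 1e-4   # assumed programmable conductance range (siemens)
LEVELS = 16                 # assumed number of resistance levels per ReRAM cell

def program_crossbar(weights):
    # Quantize a non-negative weight matrix onto discrete conductance levels.
    w = weights / weights.max()
    q = np.round(w * (LEVELS - 1)) / (LEVELS - 1)
    return G_MIN + q * (G_MAX - G_MIN)

def crossbar_mvm(conductances, voltages):
    # Each output current is sum_i V_i * G_ij (Ohm's law + Kirchhoff's current law),
    # i.e., an analog matrix-vector product computed inside the memory array.
    return voltages @ conductances

rng = np.random.default_rng(0)

# Toy 3-node graph and a single GNN layer: aggregate neighbor features, then
# transform them with a weight matrix that stays resident in the crossbar.
adjacency = np.array([[0., 1., 1.],
                      [1., 0., 0.],
                      [1., 0., 0.]])
features = rng.random((3, 4))          # node features
weights = rng.random((4, 2))           # layer weights (non-negative for this sketch)

aggregated = adjacency @ features      # neighbor aggregation (digital step)
G = program_crossbar(weights)          # weights programmed once into the ReRAM array
outputs = crossbar_mvm(G, aggregated)  # feature transform performed "in memory"
print(outputs.shape)                   # (3, 2)

Because the expensive multiply-accumulate happens where the weights already reside, the data movement that dominates GNN training on von Neumann machines is largely eliminated.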


Hardware Accelerators for Machine Learning: From 3D Manycore to Processing-in-Memory Architectures Related Books

Hardware Accelerators for Machine Learning: From 3D Manycore to Processing-in-Memory Architectures
Language: en
Pages: 0
Authors: Aqeeb Iqbal Arka
Categories: Machine learning
Type: BOOK - Published: 2022

Big data applications such as deep learning and graph analytics require hardware platforms that are energy-efficient yet computationally powerful. 3D manycore ...
Towards Heterogeneous Multi-core Systems-on-Chip for Edge Machine Learning
Language: en
Pages: 199
Authors: Vikram Jain
Categories: Technology & Engineering
Type: BOOK - Published: 2023-09-15 - Publisher: Springer Nature

This book explores and motivates the need for building homogeneous and heterogeneous multi-core systems for machine learning to enable flexibility and energy-efficiency ...
In-Memory Computing Hardware Accelerators for Data-Intensive Applications
Language: en
Pages: 145
Authors: Baker Mohammad
Categories: Technology & Engineering
Type: BOOK - Published: 2023-10-27 - Publisher: Springer Nature

This book describes the state-of-the-art of technology and research on In-Memory Computing Hardware Accelerators for Data-Intensive Applications. ...
Hardware Accelerators in Data Centers
Language: en
Pages: 280
Authors: Christoforos Kachris
Categories: Technology & Engineering
Type: BOOK - Published: 2018-08-21 - Publisher: Springer

This book provides readers with an overview of the architectures, programming frameworks, and hardware accelerators for typical cloud computing applications ...
Machine Learning-Enabled Vertically Integrated Heterogeneous Manycore Systems for Big-Data Analytics
Language: en
Pages: 101
Authors: Biresh Kumar Joardar
Categories: Big data
Type: BOOK - Published: 2020

The rising use of deep learning and other big-data algorithms has led to an increasing demand for hardware platforms that are computationally powerful, yet energy-efficient ...