Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design

Author : Xiaowei Li
Publisher : Springer Nature
Total Pages : 318
Release : 2023-03-01
ISBN-10 : 9811985510
ISBN-13 : 9789811985515
Rating : 4/5 (510 Downloads)

Book Synopsis Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design by : Xiaowei Li

Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design was written by Xiaowei Li and published by Springer Nature. It was released on 2023-03-01 and has a total of 318 pages.

Book excerpt: With the end of Dennard scaling and Moore’s law, IC chips, especially large-scale ones, now face more reliability challenges, and reliability has become one of the key merits of VLSI designs. In this context, this book presents a built-in on-chip fault-tolerant computing paradigm that seeks to combine fault detection, fault diagnosis, and error recovery in large-scale VLSI design in a unified manner so as to minimize resource overhead and performance penalties. Following this computing paradigm, we propose a holistic solution based on three key components: self-test, self-diagnosis and self-repair, or “3S” for short. We then explore the use of 3S for general IC designs, general-purpose processors, network-on-chip (NoC) and deep learning accelerators, and present prototypes to demonstrate how 3S responds to in-field silicon degradation and recovers from various runtime faults caused by aging, process variations, or radiation particles. Moreover, we demonstrate that 3S not only offers a powerful backbone for various on-chip fault-tolerant designs and implementations, but also has farther-reaching implications such as maintaining graceful performance degradation, mitigating the impact of verification blind spots, and improving chip yield.

This book is the outcome of extensive fault-tolerant computing research pursued at the State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences over the past decade. The proposed built-in on-chip fault-tolerant computing paradigm has been verified in a broad range of scenarios, from small processors in satellite computers to large processors in HPCs. Hopefully, it will provide an alternative yet effective solution to the growing reliability challenges for large-scale VLSI designs.


Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design Related Books

Fault-Tolerance Techniques for High-Performance Computing
Language: en
Pages: 325
Authors: Thomas Herault
Categories: Computers
Type: BOOK - Published: 2015-07-01 - Publisher: Springer

This timely text presents a comprehensive overview of fault tolerance techniques for high-performance computing (HPC). The text opens with a detailed introducti
Fault Tolerant Computer Architecture
Language: en
Pages: 116
Authors: Daniel Sorin
Categories: Technology & Engineering
Type: BOOK - Published: 2009-07-08 - Publisher: Morgan & Claypool Publishers

For many years, most computer architects have pursued one primary goal: performance. Architects have translated the ever-increasing abundance of ever-faster tra
Use of On-chip Redundancy for Fault-tolerant Very Large Scale Integrated Circuit Design
Language: en
Pages: 264
Authors: Tülin Erdim Mangir
Categories: Integrated circuits
Type: BOOK - Published: 1981 - Publisher:

Fault-Tolerant Computer Architecture Modeling and Analysis
Language: en
Pages: 313
Authors: M. P. Kastner
Categories:
Type: BOOK - Published: 1989 - Publisher:

This document is the final report. It describes a three-year research effort to develop an engineering tool capable of supporting design and analysis of large-s