Soft-error Resilient On-chip Memory Structures

Author : Shuai Wang
Publisher :
Total Pages : 126
Release : 2010
OCLC : 935402734

Book Synopsis: Soft-error Resilient On-chip Memory Structures by Shuai Wang

Soft-error Resilient On-chip Memory Structures, written by Shuai Wang, was released in 2010 and runs 126 pages. Book excerpt: Soft errors induced by energetic particle strikes in on-chip memory structures, such as L1 data/instruction caches and register files, have become an increasing challenge in designing new-generation reliable microprocessors. Because of their transient, random nature, soft errors are unrelated to the correctness of the logic and so cannot be caught by traditional verification and testing. This dissertation therefore focuses on the reliability characterization and cost-effective reliable design of on-chip memories against soft errors. Given the differing performance, area, and energy constraints of different target systems, many existing unoptimized cache protection schemes may prove inadequate or ineffective. This work develops new lifetime models for the data and tag arrays in both the data and instruction caches. These models characterize how vulnerable the stored items are in each phase of their lifetime. The design methodology is then illustrated by proposed reliability schemes that target specific vulnerable phases, and benchmarking demonstrates the effectiveness of these approaches. Because the tag array is crucial to the correctness of cache accesses, it demands high reliability against soft errors even when the data array is fully protected. Exploiting the address locality of memory accesses, this work proposes a Tag Replication Buffer (TRB) that protects the integrity of the tag array in the data cache with low performance, energy, and area overheads.
To evaluate tag array reliability comprehensively, this work also proposes a refined metric, detected-without-replica TVF (DOR-TVF), which combines the TVF and access-with-replica (AWR) analyses. Based on the DOR-TVF analysis, a TRB scheme with early write-back (TRB-EWB) is proposed, which achieves zero DOR-TVF at negligible performance overhead. Recent research, like the optimization schemes proposed in this cache vulnerability study, has focused on designing reliable data caches that are cost-effective in performance, energy, and area, under the assumption of fixed error rates. However, for systems whose operating environments vary with time or location, those schemes will be either insufficient or over-designed as error rates change. This work explores the design of a self-adaptive reliable data cache that dynamically adapts the reliability schemes it employs to the changing operating environment in order to maintain a target reliability. The experimental evaluation shows that the self-adaptive data cache achieves reliability similar to that of a cache protected by the most reliable scheme while minimizing performance and power overheads. Besides the data/instruction caches, protecting the register file and its data buses is crucial to reliable computing in high-performance microprocessors. Since the register file is on the critical path of the processor pipeline, any reliable design that increases either the pressure on the register file or its access latency is undesirable. This work proposes to exploit narrow-width register values, which make up the majority of generated values, by duplicating each such value within the same register data item. A detailed architectural vulnerability factor (AVF) analysis shows that this in-register duplication (IRD) scheme significantly reduces the AVF of the register file compared to the conventional design.
The experimental evaluation also shows that, compared with previous reliability schemes, IRD provides superior read-with-duplicate (RWD) and error detection/recovery rates under heavy error injection while incurring only a small power overhead. Integrating the proposed reliable designs for data/instruction caches and register files dramatically reduces the vulnerability of the entire microprocessor. The new lifetime model, the self-adaptive design, and the narrow-width value duplication scheme proposed in this work can also guide architects toward highly efficient reliable system design.
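
The in-register duplication idea described in the excerpt can be illustrated with a small Python sketch. This is not the dissertation's design: the 64-bit register width, the 32-bit definition of "narrow", the separate narrow flag, and the function names `is_narrow`, `write_register`, and `read_register` are all assumptions made for illustration. The sketch only detects a mismatch between the two copies; recovery would need extra information (for example parity) to tell which copy is correct, and the flag itself is left unprotected here.

```python
# Hypothetical sketch of in-register duplication (IRD) for narrow-width values.
# Assumed, not taken from the source: 64-bit registers, "narrow" meaning the
# value is a sign-extended 32-bit quantity, and a one-bit narrow flag stored
# alongside each register.

WIDTH = 64
HALF = WIDTH // 2
HALF_MASK = (1 << HALF) - 1
FULL_MASK = (1 << WIDTH) - 1


def is_narrow(value):
    """Return True if the 64-bit pattern is a sign-extended 32-bit value."""
    value &= FULL_MASK
    upper = value >> HALF
    sign = (value >> (HALF - 1)) & 1
    return upper == (HALF_MASK if sign else 0)


def write_register(value):
    """Return (stored_bits, narrow_flag) for a value written to a register."""
    value &= FULL_MASK
    if is_narrow(value):
        low = value & HALF_MASK
        # The upper half holds only sign bits, so reuse it for a duplicate.
        return (low << HALF) | low, True
    return value, False


def read_register(stored, narrow):
    """Return (value, error_detected) for a register read."""
    if not narrow:
        # Wide values carry no duplicate, so errors go undetected here.
        return stored, False
    low = stored & HALF_MASK
    high = (stored >> HALF) & HALF_MASK
    if low != high:
        return None, True
    sign = (low >> (HALF - 1)) & 1
    # Rebuild the original value by sign-extending the lower half.
    return low | ((HALF_MASK << HALF) if sign else 0), False
```

For example, `write_register(5)` stores two copies of the 32-bit pattern for 5, and flipping any single bit of the stored word makes `read_register` report an error. A wide value such as `1 << 40` is stored unchanged and gets no such protection in this sketch.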


Soft-error Resilient On-chip Memory Structures Related Books

Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design
Language: en
Pages: 318
Authors: Xiaowei Li
Categories: Computers
Type: BOOK - Published: 2023-03-01 - Publisher: Springer Nature

With the end of Dennard scaling and Moore’s law, IC chips, especially large-scale ones, now face more reliability challenges, and reliability has become one o
Resilient On-chip Memory Design in the Nano Era
Language: en
Pages: 219
Authors: Abbas Banaiyanmofrad
Categories:
Type: BOOK - Published: 2015 - Publisher:

Aggressive technology scaling in the nano-scale regime makes chips more susceptible to failures. This causes multiple reliability challenges in the design of mo
Circuit and Layout Techniques for Soft-error-resilient Digital CMOS Circuits
Language: en
Pages: 156
Authors: Hsiao-Heng Kelin Lee
Categories:
Type: BOOK - Published: 2011 - Publisher: Stanford University

Radiation-induced soft errors are a major concern for modern digital circuits, especially memory elements. Unlike large Random Access Memories that can be prote
Software Design for Resilient Computer Systems
Language: en
Pages: 218
Authors: Igor Schagaev
Categories: Technology & Engineering
Type: BOOK - Published: 2016-02-13 - Publisher: Springer

This book addresses the question of how system software should be designed to account for faults, and which fault tolerance features it should provide for highe