Test and fault-tolerance for network-on-chip infrastructures

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Test and fault-tolerance for network-on-chip infrastructures Grecu, Cristian

Abstract

The demands of future computing, as well as the challenges of nanometer-era VLSI design, will require new design techniques and design styles that are simultaneously high performance, energy-efficient, and robust to noise and process variation. One of the emerging problems concerns the communication mechanisms between the increasing number of blocks, or cores, that can be integrated onto a single chip. The bus-based systems and point-to-point interconnection strategies in use today cannot be easily scaled to accommodate the large numbers of cores projected in the near future. Network-on-chip (NoC) interconnect infrastructures are one of the key technologies that will enable the emergence of many-core processors and systems-on-chip with increased computing power and energy efficiency. This dissertation is focused on testing, yield improvement and fault-tolerance of such NoC infrastructures. A fast, efficient test method is developed for NoCs, that exploits their inherent parallelism to reduce the test time by transporting test data on multiple paths and testing multiple NoC components concurrently. The improvement of test time varies, depending on the NoC architecture and test transport protocol, from 2X to 34X, compared to current NoC test methods. This test mechanism is used subsequently to perform detection of NoC link permanent faults, which are then repaired by an on-chip mechanism that replaces the faulty signal lines with fault-free ones, thereby increasing the yield, while maintaining the same wire delay characteristics. The solution described in this dissertation improves significantly the achievable yield of NoC inter-switch channels â from 4% improvement for an 8-bit wide channel, to a 71% improvement for a 128-bit wide channel. The direct benefit is an improved fault-tolerance and increased yield and long-term reliability of NoC based multicore systems.

Item Metadata

Title	Test and fault-tolerance for network-on-chip infrastructures
Creator	Grecu, Cristian
Publisher	University of British Columbia
Date Issued	2008
Description	The demands of future computing, as well as the challenges of nanometer-era VLSI design, will require new design techniques and design styles that are simultaneously high performance, energy-efficient, and robust to noise and process variation. One of the emerging problems concerns the communication mechanisms between the increasing number of blocks, or cores, that can be integrated onto a single chip. The bus-based systems and point-to-point interconnection strategies in use today cannot be easily scaled to accommodate the large numbers of cores projected in the near future. Network-on-chip (NoC) interconnect infrastructures are one of the key technologies that will enable the emergence of many-core processors and systems-on-chip with increased computing power and energy efficiency. This dissertation is focused on testing, yield improvement and fault-tolerance of such NoC infrastructures. A fast, efficient test method is developed for NoCs, that exploits their inherent parallelism to reduce the test time by transporting test data on multiple paths and testing multiple NoC components concurrently. The improvement of test time varies, depending on the NoC architecture and test transport protocol, from 2X to 34X, compared to current NoC test methods. This test mechanism is used subsequently to perform detection of NoC link permanent faults, which are then repaired by an on-chip mechanism that replaces the faulty signal lines with fault-free ones, thereby increasing the yield, while maintaining the same wire delay characteristics. The solution described in this dissertation improves significantly the achievable yield of NoC inter-switch channels â from 4% improvement for an 8-bit wide channel, to a 71% improvement for a 128-bit wide channel. The direct benefit is an improved fault-tolerance and increased yield and long-term reliability of NoC based multicore systems.
Extent	2104734 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2008-11-28
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0065579
URI	http://hdl.handle.net/2429/2816
Degree	Doctor of Philosophy - PhD
Program	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2009-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace

Open Collections

UBC Theses and Dissertations

UBC Theses and Dissertations

Test and fault-tolerance for network-on-chip infrastructures Grecu, Cristian

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights