2016
DOI: 10.1145/2886781

A Survey on Design Approaches to Circumvent Permanent Faults in Networks-on-Chip

Abstract: Increasing fault rates in current and future technology nodes coupled with on-chip components in the hundreds calls for robust and fault-tolerant Network-on-Chip (NoC) designs. Given the central role of NoCs in today's many-core chips, permanent faults impeding their original functionality may significantly influence performance, energy consumption, and correct operation of the entire system. As a result, fault-tolerant NoC design gained much attention in recent years. In this article, we review the vast resea…

Cited by 35 publications (21 citation statements)
References 119 publications

“…Moreover, approaches focus on different types of random hardware faults: transient and intermittent faults [16]-[19]; permanent faults [24]; or both [20]-[23]. Comprehensive overviews are found in [25] and [26]. The key technique varies with the approach: from retransmission protocols and adaptive routing to stochastic broadcasts.…”
Section: Related Work (mentioning)
confidence: 99%
“…In more detail, a chip exhibits different error rates in different periods of its lifetime. Without loss of generality, this variability has been modelled as a "bathtub" curve [39], as shown in Figure 3. In its infant period, there is a very high but decreasing failure rate, until a plateau of minimum, constant failure rate is reached at its grace period.…”
Section: Target Application Model (mentioning)
confidence: 99%
“…The distributed nature of the targeted systems on both Processing Elements (PEs) and Resource Management imposes extra design requirements and increased complexity to provide fault tolerance guarantees in an online and timely manner. Therefore, it has been identified that in order to effectively mitigate variability issues in kilo-core SoCs, it is mandatory to intervene and leverage techniques for increased dependability in all layers of system design ranging from hardware [39] to high level application development [18,31].…”
Section: Introduction (mentioning)
confidence: 99%
“…This issue also affects link and router of NoC that must require a specific attention, in order to maximize yield and to ensure correct operation. This emphasizes the significance of robust design solutions and has led to fault tolerance becoming a fundamental design constraint [3]. In this context, many fault tolerance techniques have been proposed at several levels (circuit/system and hardware/software) for critical applications.…”
Section: Introduction (mentioning)
confidence: 99%