2002
DOI: 10.1147/rd.461.0077

Fault-tolerant design of the IBM pSeries 690 system using POWER4 processor technology

Cited by 38 publications (15 citation statements)
References 5 publications
“…Redundancy for fault-tolerance can be incorporated in different levels of granularity. Commercial high-availability multiprocessor systems like the IBM p690 have been designed to exploit redundancy at chip and module level [25]. These systems can map detected failures to individual CPUs and achieve system recovery by de-configuring the faulty processor during runtime or bootup.…”
Section: Related Work (mentioning)
confidence: 99%
“…Redundancy is a commonly used technique for improving lifetime reliability as well as yield of processor systems [1] [6] [13]. When applied to microprocessors, chips can maintain operability in the presence of defects or failures by detecting and isolating, correcting, and/or replacing microarchitecture components reactively on a first-come, first-served basis after components become faulty.…”
Section: Proactive Use Of Redundancy (mentioning)
confidence: 99%
“…Some techniques are based on adjusting the operational characteristics (e.g., supply voltage, frequency, threshold voltage, or duty cycle) to reduce or recover from wearout stress conditions of failure mechanisms [10][12] [22]. Others are based on using some form of redundancy (e.g., component sparing) used reactively to tolerate the effects of wearout [6] [13].…”
Section: Introduction (mentioning)
confidence: 99%
“…Error correction codes are much more difficult to implement on latches, since latches are often used individually, or in small groups. Heavy weight microarchitectural techniques such as duplicated or triplicated computation units, time redundant computations, or watchdog processors, are applied in mission critical systems [28], [29], [30], [31]. In non-mission critical systems, derating factors alone may void the need for microarchitectural changes [32].…”
Section: B Ser Simulation Tool (mentioning)
confidence: 99%