Brief Announcement: TRIX: Low-Skew Pulse Propagation for Fault-Tolerant Hardware

Lenzen, Christoph; Wiederhake, Ben

doi:10.1007/978-3-030-64348-5_23

Cited by 1 publication

(4 citation statements)

References 11 publications


“…We provide a positive answer to the above question for the special case of 𝑓 = 1. This is achieved by using the same grid as in [20], but with a different rule for forwarding pulses. Our novel algorithm is designed as a discrete and fault-tolerant counterpart to the GCS algorithm from [18].…”

Section: Our Contribution (mentioning)

confidence: 99%

“…In summary, it is essential to get as close as possible to the minimum required connectivity. This train of thought led to the study of fault-tolerant clock distribution in low-degree networks [6,20]. Both of these works have in common that they assume that the clock signal is generated at a central location.…”

Section: Introduction (mentioning)

confidence: 99%

“…Even worse, for each fault this bound increases by 𝑑. • In [20], each fault adds at most 𝑢 to the local skew. Observe that the used grid also has the minimum required connectivity, as each node has only 3 incoming and outgoing edges each.…”

Section: Introduction (mentioning)

confidence: 99%

“…Fig. 1. TRIX [20] (top) and HEX [6] (bottom) grids. TRIX uses the naive pulse forwarding scheme of waiting for the second copy of each pulse before forwarding it.…”

mentioning

confidence: 99%
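The forwarding rule quoted above (wait for the second copy of a pulse, then forward) can be illustrated with a minimal sketch. It assumes a node with exactly three in-neighbors, represents a copy that never arrives as an infinite arrival time, and ignores the node's own processing delay. The name `forward_time` is hypothetical and does not come from the cited papers.

```python
import math


def forward_time(arrivals):
    """Return the time at which a node forwards a pulse.

    `arrivals` holds the arrival times of the pulse copies from the
    node's three in-neighbors (math.inf if a copy never arrives).
    The node forwards on the second copy, i.e. at the median time.
    """
    if len(arrivals) != 3:
        raise ValueError("this sketch assumes exactly three in-neighbors")
    return sorted(arrivals)[1]


# Two correct in-neighbors deliver at 10.0 and 12.0.
# A faulty third in-neighbor may stay silent or fire far too early;
# in both cases the forwarding time stays within [10.0, 12.0].
silent_fault = forward_time([10.0, 12.0, math.inf])
early_fault = forward_time([10.0, 12.0, -5.0])
no_fault = forward_time([10.0, 11.0, 12.0])
```

Because the second of three arrivals is the median, a single faulty in-neighbor cannot push the forwarding time outside the interval spanned by the two correct arrivals, which is why waiting for the second copy tolerates one fault per node.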


Gradient TRIX

Lenzen, Srinivas

2023

Preprint


Gradient clock synchronization (GCS) algorithms minimize the worst-case clock offset between the nodes in a distributed network of diameter 𝐷 and size 𝑛. They achieve optimal offsets of Θ(log 𝐷) locally, i.e. between adjacent nodes [18], and Θ(𝐷) globally [2]. As demonstrated in [3], this is a highly promising approach for improved clocking schemes for large-scale synchronous Systems-on-Chip (SoC). Unfortunately, in large systems, faults hinder their practical use. State-of-the-art fault-tolerant GCS [4] has a drawback that is fatal in this setting: it relies on node and edge replication. For 𝑓 = 1, this translates to at least 16-fold edge replication and high-degree nodes, far from the optimum of 2𝑓 + 1 = 3 for tolerating up to 𝑓 faulty neighbors.

In this work, we present a self-stabilizing GCS algorithm for a grid-like directed graph with optimal node in- and out-degrees of 3 that tolerates 1 faulty in-neighbor. If nodes fail with independent probability 𝑝 ∈ 𝑜(𝑛^{−1/2}), it achieves asymptotically optimal local skew of Θ(log 𝐷) with probability 1 − 𝑜(1); this holds under general worst-case assumptions on link delay and clock speed variations, provided they change slowly relative to the speed of the system. The failure probability is the largest possible ensuring that with probability 1 − 𝑜(1) for each node at most one in-neighbor fails. As modern hardware is clocked at gigahertz speeds and the algorithm can simultaneously sustain a constant number of arbitrary changes due to faults in each clock cycle, this results in sufficient robustness to dramatically increase the size of reliable synchronously clocked SoCs.
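The claim that a failure probability in 𝑜(𝑛^{−1/2}) leaves every node with at most one faulty in-neighbor can be checked with a short union bound. The sketch below is not taken from the paper; it assumes every node has exactly three in-neighbors that fail independently with probability 𝑝, and it only covers the sufficiency direction, not the claim that this probability is the largest possible.

```latex
% Probability that at least two of a node's three in-neighbors fail:
\Pr[\text{$\ge 2$ of 3 in-neighbors fail}]
  = 3p^2(1-p) + p^3 \le 3p^2 .

% Union bound over all n nodes:
\Pr[\text{some node has $\ge 2$ faulty in-neighbors}]
  \le 3np^2 .

% If p \in o(n^{-1/2}), then np^2 \in o(1), so with probability 1 - o(1)
% every node has at most one faulty in-neighbor.
```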

