Exact lower time bounds for computing Boolean functions on CREW PRAMs

Dietzfelbinger, Martin; Kutyłowski, M.; Reischuk, Rüdiger

doi:10.1016/s0022-0000(05)80003-0

Cited by 43 publications

(41 citation statements)

References 12 publications


“…Dietzfelbinger, Kutylowski, and Reischuk [17] later proved a similar lower bound for randomized crew pram algorithms. The difficulty in extending either of these results to the crqw pram is that in the crqw pram, the running time of a step may be different on different inputs.…”

Section: Observation 62 a P-processor Crqw Pram Deterministic Algor

mentioning

confidence: 80%

“…More recently, Cypher [14] analyzed the performance of a maximum-finding algorithm under assumptions similar to the simd-qrqw pram. Dietzfelbinger, Kutylowski, and Reischuk [17] defined the few-write pram, that permits one-step concurrent writing of up to κ writes, where κ is a parameter of the model, as well as unlimited concurrent reading. Valiant [61] introduced the bsp model (see section 5) and studied a specialization of the model with logarithmic periodicity and constant throughput, which we call here the standard bsp model.…”

Section: Related Work

mentioning

confidence: 99%

“…We can derive an Ω(log n/ log log n) lower bound for the or function using a lower bound result of Dietzfelbinger, Kutylowski, and Reischuk [17] for the few-write pram. Recall that the few-write pram models are parameterized by the number of concurrent writes to a location permitted in a unit-time step.…”

Section: Deterministic Algorithms

mentioning

confidence: 99%

“…Dietzfelbinger, Kutylowski, and Reischuk [17] proved an Ω(log n/ log κ) lower bound for the or function on the κ-write pram. Let T be the time for the or function on the crqw pram.…”

Section: Observation 62 a P-processor Crqw Pram Deterministic Algor

mentioning

confidence: 99%

“…The algorithms for both problems take linear work and O(log n/ log log n) time with high probability. In contrast, the or function requires Ω(log n) expected time on a randomized crew pram with arbitrarily many processors ([17], following [12]). Also presented is a linear work, O(√log n) time w.h.p.…”

mentioning

confidence: 99%


The Queue-Read Queue-Write PRAM Model: Accounting for Contention in Parallel Algorithms

Gibbons, Matias, Ramachandran

1998

SIAM J. Comput.


Abstract. This paper introduces the queue-read queue-write (qrqw) parallel random access machine (pram) model, which permits concurrent reading and writing to shared-memory locations, but at a cost proportional to the number of readers/writers to any one memory location in a given step. Prior to this work there were no formal complexity models that accounted for the contention to memory locations, despite its large impact on the performance of parallel programs. The qrqw pram model reflects the contention properties of most commercially available parallel machines more accurately than either the well-studied crcw pram or erew pram models: the crcw model does not adequately penalize algorithms with high contention to shared-memory locations, while the erew model is too strict in its insistence on zero contention at each step.

The qrqw pram is strictly more powerful than the erew pram. This paper shows a separation of log n between the two models, and presents faster and more efficient qrqw algorithms for several basic problems, such as linear compaction, leader election, and processor allocation. Furthermore, we present a work-preserving emulation of the qrqw pram with only logarithmic slowdown on Valiant's bsp model, and hence on hypercube-type noncombining networks, even when latency, synchronization, and memory granularity overheads are taken into account. This matches the best-known emulation result for the erew pram, and considerably improves upon the best-known efficient emulation for the crcw pram on such networks. Finally, the paper presents several lower bound results for this model, including lower bounds on the time required for broadcasting and for leader election.

Key words. models of parallel computation, parallel algorithms, pram, memory contention, work-time framework

AMS subject classifications. 68Q05, 68Q22, 68Q25

PII. S009753979427491

1. Introduction. The parallel random access machine (pram) model of computation is the most-widely used model for the design and analysis of parallel algorithms (see, e.g., [40,39,58]). The pram model consists of a number of processors operating in lock-step and communicating by reading and writing locations in a shared memory. Existing pram models can be distinguished by their rules regarding contention for shared memory locations. These rules are generally classified into two groups:

• Exclusive read/write: Each location can be read or written by at most one processor in each unit-time pram step.

• Concurrent read/write: Each location can be read or written by any number of processors in each unit-time pram step. For concurrent writing, the value written depends on the write-conflict rule of the model, e.g., in the arbitrary concurrent-write pram, an arbitrary processor succeeds in writing its value.
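The difference between the contention rules can be illustrated with a small Python sketch. This is not code from the paper: the function names `max_contention` and `step_cost` and the rule labels `"exclusive"`, `"concurrent"`, and `"queue"` are hypothetical, and the queue rule is simplified here to "the cost of a step equals the largest number of processors accessing any one location", following the abstract's description of a cost proportional to contention.

```python
from collections import Counter


def max_contention(addresses):
    """Largest number of processors touching one location in a single step.

    `addresses` holds one shared-memory address per participating processor.
    """
    counts = Counter(addresses)
    return max(counts.values(), default=0)


def step_cost(addresses, rule):
    """Cost of one step under a simplified contention rule (illustrative only).

    rule == "exclusive":  at most one access per location; the step costs 1.
    rule == "concurrent": any number of accesses per location; the step costs 1.
    rule == "queue":      any number of accesses, but the step costs the
                          maximum contention (assumed simplification).
    """
    k = max_contention(addresses)
    if rule == "exclusive":
        if k > 1:
            raise ValueError("exclusive rule forbids contention greater than 1")
        return 1
    if rule == "concurrent":
        return 1
    if rule == "queue":
        return max(1, k)
    raise ValueError(f"unknown rule: {rule!r}")
```

For example, if three processors access location 0 and one accesses location 5 in the same step, the concurrent rule charges 1, the queue rule charges 3, and the exclusive rule rejects the step.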

