Parallel Construction of Simultaneous Deterministic Finite Automata on Shared-Memory Multicores

Jung, Minyoung; Park, Jin-Woo; Blieberger, Johann; Burgstaller, Bernd

doi:10.1109/icpp.2017.36

Cited by 2 publications

(2 citation statements)

References 25 publications

“…There has also been some successful work on speeding up composition using multiple CPU cores (Jurish and Würzner, 2013; Mytkowicz et al, 2014; Jung et al, 2017). This is a challenge because many of the algorithms used in NLP do not parallelize in a straightforward way and previous work using multi-core implementations do not handle the reduction of identical edges generated during the composition.…”

Section: Introduction (mentioning)

confidence: 99%

“…To our knowledge, this is the first successful attempt to do so. Our approach treats the composed FST as a sparse graph and uses some techniques from the work of Merrill et al (2012); Jung et al (2017) to explore the graph and generate the composed edges during the search. We obtain a speedup of 4.5× against OpenFST's implementation and 6× against our own serial implementation.…”

Section: Introduction (mentioning)

confidence: 99%

Composing Finite State Transducers on GPUs

Argueta, Chiang

2018

Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Weighted finite state transducers (FSTs) are frequently used in language processing to handle tasks such as part-of-speech tagging and speech recognition. There has been previous work using multiple CPU cores to accelerate finite state algorithms, but limited attention has been given to parallel graphics processing unit (GPU) implementations. In this paper, we introduce the first (to our knowledge) GPU implementation of the FST composition operation, and we also discuss the optimizations used to achieve the best performance on this architecture. We show that our approach obtains speedups of up to 6× over our serial implementation and 4.5× over OpenFST.
