Accelerating the CKY Parsing Using FPGAs

Bordim, Jacir Luiz; Ito, Yasuaki; Nakano, Koji

doi:10.1007/3-540-36265-7_5

Cited by 26 publications

(27 citation statements)

References 6 publications


“…in a 32-bit integer for each i, j, and k. The CKY parsing can be done by iterative simulation of a combinational logic circuit [8], [9], the BPBC technique can be applied to it.…”

Section: The Idea of the BPBC Technique Is (mentioning)

confidence: 99%

“…In [16], it was shown that parsing can be accomplished on a one-way linear array of n^2 finite-state processors in linear time. Since these parallel algorithms need at least n processors, they are unrealistic for large n. Ciressan et al [17], [18] and Bordim et al [8], [9] have presented hardwares for the CKY parsing for context-free grammars and have tested them using FPGAs. In [8], it has been shown that the CKY parsing with 64 non-terminal symbols and 8192 production rules can be done in 162µs for an input string of length 32 using an APEX20K family FPGA.…”

Section: The Idea of the BPBC Technique Is (mentioning)

confidence: 99%

“…Since these parallel algorithms need at least n processors, they are unrealistic for large n. Ciressan et al [17], [18] and Bordim et al [8], [9] have presented hardwares for the CKY parsing for context-free grammars and have tested them using FPGAs. In [8], it has been shown that the CKY parsing with 64 non-terminal symbols and 8192 production rules can be done in 162µs for an input string of length 32 using an APEX20K family FPGA. Because the circuit can run in about 35MHz, by estimating that the performance of the latest FPGA is about 10 times higher than the previous one, we can expect that the same circuit implemented in the latest FPGA may run in approximately 350MHz.…”

Section: The Idea of the BPBC Technique Is (mentioning)

confidence: 99%


An Efficient GPU Implementation of CKY Parsing Using the Bitwise Parallel Bulk Computation Technique

Fujita, Nakano, Ito, et al. 2017. IEICE Trans. Inf. & Syst. (self-citation)

SUMMARY: The main contribution of this paper is to present an efficient GPU implementation of bulk computation of the CKY parsing for a context-free grammar, which determines if a context-free grammar derives each of a lot of input strings. The bulk computation is to execute the same algorithm for a lot of inputs in turn or at the same time. The CKY parsing is to determine if a context-free grammar derives a given string. We show that the bulk computation of the CKY parsing can be implemented in the GPU efficiently using Bitwise Parallel Bulk Computation (BPBC) technique. We also show the rule minimization technique and the dynamic scheduling method for further acceleration of the CKY parsing on the GPU. The experimental results using NVIDIA TITAN X GPU show that our implementation of the bitwise-parallel CKY parsing for strings of length 32 takes 395µs per string with 131072 production rules for 512 non-terminal symbols.
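The bulk, bitwise-parallel idea in this abstract can be illustrated with a minimal sketch. It is not the authors' GPU implementation: the toy grammar (Chomsky normal form for a^n b^n), the table layout, and the function name `bulk_cky` are all assumptions made for illustration. Each table entry is one integer in which bit b belongs to input string b, so a single AND/OR pair applies a production rule to every input at once.

```python
from typing import Dict, List, Tuple

# Hypothetical toy grammar in Chomsky normal form for a^n b^n (n >= 1):
#   S -> A B | A X,   X -> S B,   A -> 'a',   B -> 'b'
NONTERMINALS: List[str] = ["S", "A", "B", "X"]
BINARY_RULES: List[Tuple[str, str, str]] = [
    ("S", "A", "B"),
    ("S", "A", "X"),
    ("X", "S", "B"),
]
TERMINAL_RULES: Dict[str, List[str]] = {"a": ["A"], "b": ["B"]}


def bulk_cky(strings: List[str], start: str = "S") -> List[bool]:
    """Run CKY on several equal-length, non-empty strings at once.

    table[i][j][N] is an integer whose bit b is set iff nonterminal N
    derives the substring strings[b][i..j] (inclusive bounds).
    """
    n = len(strings[0])
    assert n > 0 and all(len(s) == n for s in strings)

    table = [[{nt: 0 for nt in NONTERMINALS} for _ in range(n)]
             for _ in range(n)]

    # Length-1 substrings: set bit b for every rule N -> character.
    for b, s in enumerate(strings):
        for i, ch in enumerate(s):
            for nt in TERMINAL_RULES.get(ch, []):
                table[i][i][nt] |= 1 << b

    # Longer substrings: try every split point k and every binary rule.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cell = table[i][j]
            for k in range(i, j):
                left, right = table[i][k], table[k + 1][j]
                for head, lhs, rhs in BINARY_RULES:
                    # One AND/OR updates the answer for all inputs together.
                    cell[head] |= left[lhs] & right[rhs]

    accepted = table[0][n - 1][start]
    return [bool((accepted >> b) & 1) for b in range(len(strings))]


if __name__ == "__main__":
    print(bulk_cky(["aabb", "abab", "abba", "aaab"]))
```

Python integers are unbounded, so this sketch accepts any number of strings per call; a fixed-width implementation would instead pack a fixed number of inputs (for example 32) into each machine word.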



“…We have used Nallatech Xtreme DSP kit [13], which is a PCI board with Xilinx VirtexII family FPGA XC2V3000-4 [6], and embedded a circuit to perform the local exhaustive search for a window of size . To reduce the amount of used FPGA resource and the delay, we use the instance-specific approach [3,4,11], which embeds a hardware depending on a part of the input instance. The instance-specific approach is applied as follows.…”

Section: Figure 1 Am Screening Fm Screening and Cluster-dot Fm Scr (mentioning)

confidence: 99%

“…Hence, we are going to define the Gaussian error of as follows. Gaussian error at each pixel location is defined by (3) and the total Gaussian error is defined by (4) Since the Gaussian filter approximates the characteristics of the human visual system, we can think that image reproduces original gray-scale image if is small non-cluster-dot screening 2-cluster-dot screening 3-cluster-dot screening 4-cluster-dot screening The best binary image may have dots with isolated pixels. For example, let be a binary image of size with every pixel having intensity .…”

Section: FM Screening Based on the Human Visual System (mentioning)

confidence: 99%

Cluster-dot Screening by Local Exhaustive Search with Hardware Acceleration

Ito, Nakano. 2007. 2007 IEEE International Parallel and Distributed Processing Symposium (self-citation)

Parallel CYK Membership Test on GPUs

Kim, Choi, Lee, et al. 2014. Advanced Information Systems Engineering

Nowadays general-purpose computing on graphics processing units (GPGPUs) performs computations that were formerly handled by the CPU using hundreds of cores on GPUs. It often improves the performance of sequential computation when the running program is well-structured and formulated for massive threading. The CYK algorithm is a well-known algorithm for the context-free language membership test and has been used in many applications including grammar inferences, compilers and natural language processing. We revisit the CYK algorithm and its structural properties suitable for parallelization. Based on the discovered properties, we then parallelize the algorithm using different combinations of memory types and data allocation schemes using a GPU. We evaluate the algorithm based on real-world data and herein demonstrate the performance improvement compared with CPU-based computations.
