“…in a 32-bit integer for each i, j, and k. The CKY parsing can be done by iterative simulation of a combinational logic circuit [8], [9], the BPBC technique can be applied to it.…”
Section: The Idea Of the BPBC Technique Is (mentioning)
confidence: 99%
“…In [16], it was shown that parsing can be accomplished on a one-way linear array of n² finite-state processors in linear time. Since these parallel algorithms need at least n processors, they are unrealistic for large n. Ciressan et al [17], [18] and Bordim et al [8], [9] have presented hardwares for the CKY parsing for context-free grammars and have tested them using FPGAs. In [8], it has been shown that the CKY parsing with 64 non-terminal symbols and 8192 production rules can be done in 162µs for an input string of length 32 using an APEX20K family FPGA.…”
Section: The Idea Of the BPBC Technique Is (mentioning)
confidence: 99%
“…Since these parallel algorithms need at least n processors, they are unrealistic for large n. Ciressan et al [17], [18] and Bordim et al [8], [9] have presented hardwares for the CKY parsing for context-free grammars and have tested them using FPGAs. In [8], it has been shown that the CKY parsing with 64 non-terminal symbols and 8192 production rules can be done in 162µs for an input string of length 32 using an APEX20K family FPGA. Because the circuit can run in about 35MHz, by estimating that the performance of the latest FPGA is about 10 times higher than the previous one, we can expect that the same circuit implemented in the latest FPGA may run in approximately 350MHz.…”
Section: The Idea Of the BPBC Technique Is (mentioning)
SUMMARY: The main contribution of this paper is to present an efficient GPU implementation of bulk computation of the CKY parsing for a context-free grammar, which determines if a context-free grammar derives each of a lot of input strings. The bulk computation is to execute the same algorithm for a lot of inputs in turn or at the same time. The CKY parsing is to determine if a context-free grammar derives a given string. We show that the bulk computation of the CKY parsing can be implemented in the GPU efficiently using Bitwise Parallel Bulk Computation (BPBC) technique. We also show the rule minimization technique and the dynamic scheduling method for further acceleration of the CKY parsing on the GPU. The experimental results using NVIDIA TITAN X GPU show that our implementation of the bitwise-parallel CKY parsing for strings of length 32 takes 395µs per string with 131072 production rules for 512 non-terminal symbols.
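The bitwise-parallel idea described in the summary can be illustrated with a minimal CPU sketch in Python. It is not the paper's GPU implementation: the function name `bulk_cky`, the rule encoding, and the toy balanced-parentheses grammar are all assumptions made for illustration. Bit b of every table entry records whether a non-terminal derives the corresponding substring of input string b, so a single AND/OR pair advances the CKY recurrence for all input strings at once. The paper packs 32 inputs into a 32-bit integer; Python integers are unbounded, so this sketch does not enforce that limit.

```python
def bulk_cky(strings, terminal_rules, binary_rules, start):
    """Bulk CKY membership test for equal-length strings (illustrative sketch).

    terminal_rules: list of (A, a) pairs for rules A -> a
    binary_rules:   list of (A, B, C) triples for rules A -> B C
    Returns one boolean per input string.
    """
    n = len(strings[0])
    assert n >= 1 and all(len(s) == n for s in strings)

    # table[i][l - 1] maps a non-terminal to a bitmask; bit b is set when that
    # non-terminal derives the substring of string b starting at i with length l.
    table = [[{} for _ in range(n)] for _ in range(n)]

    # Substrings of length 1: apply the terminal rules to every input string.
    for i in range(n):
        for head, symbol in terminal_rules:
            mask = 0
            for b, s in enumerate(strings):
                if s[i] == symbol:
                    mask |= 1 << b
            if mask:
                table[i][0][head] = table[i][0].get(head, 0) | mask

    # Longer substrings: one AND and one OR handle all input strings at once.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            cell = table[i][length - 1]
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for head, b_sym, c_sym in binary_rules:
                    mask = left.get(b_sym, 0) & right.get(c_sym, 0)
                    if mask:
                        cell[head] = cell.get(head, 0) | mask

    accept = table[0][n - 1].get(start, 0)
    return [bool((accept >> b) & 1) for b in range(len(strings))]


# Toy grammar in Chomsky normal form for non-empty balanced parentheses
# (hypothetical example, not taken from the paper):
#   S -> S S | L R | L X,  X -> S R,  L -> '(',  R -> ')'
TERMINAL_RULES = [("L", "("), ("R", ")")]
BINARY_RULES = [("S", "S", "S"), ("S", "L", "R"), ("S", "L", "X"), ("X", "S", "R")]

results = bulk_cky(["()()", "(())", "())(", "(((("], TERMINAL_RULES, BINARY_RULES, "S")
print(results)  # [True, True, False, False]
```

Because the per-rule work is a pair of bitwise operations, the cost of the inner loop does not depend on how many input strings share a word, which is the property the bulk computation relies on.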
“…We have used Nallatech Xtreme DSP kit [13], which is a PCI board with Xilinx VirtexII family FPGA XC2V3000-4 [6], and embedded a circuit to perform the local exhaustive search for a window of size . To reduce the amount of used FPGA resource and the delay, we use the instance-specific approach [3,4,11], which embeds a hardware depending on a part of the input instance. The instance-specific approach is applied as follows.…”
Section: Figure 1 AM Screening FM Screening and Cluster-dot FM Scr (mentioning)
confidence: 99%
“…Hence, we are going to define the Gaussian error of as follows. Gaussian error at each pixel location is defined by (3) and the total Gaussian error is defined by (4) Since the Gaussian filter approximates the characteristics of the human visual system, we can think that image reproduces original gray-scale image if is small. [Figure panels: non-cluster-dot screening, 2-cluster-dot screening, 3-cluster-dot screening, 4-cluster-dot screening] The best binary image may have dots with isolated pixels. For example, let be a binary image of size with every pixel having intensity .…”
Section: FM Screening Based On the Human Visual System (mentioning)
Nowadays, general-purpose computing on graphics processing units (GPGPU) performs computations that were formerly handled by the CPU, using hundreds of cores on GPUs. It often improves on the performance of sequential computation when the running program is well-structured and formulated for massive threading. The CYK algorithm is a well-known algorithm for the context-free language membership test and has been used in many applications, including grammar inference, compilers and natural language processing. We revisit the CYK algorithm and its structural properties suitable for parallelization. Based on the discovered properties, we then parallelize the algorithm using different combinations of memory types and data allocation schemes on a GPU. We evaluate the algorithm on real-world data and demonstrate the performance improvement compared with CPU-based computations.