Koichi Akabe scite author profile

Koichi Akabe

5 Publications

1 Citation Statement Received

167 Citation Statements Given

How they've been cited: 104

How they cite others: 167

Affiliations

Nara Institute of Science and Technology

Publications

Pseudogen: A Tool to Automatically Generate Pseudo-Code from Source Code

Fudaba, Oda, Akabe, et al. 2015

Understanding the behavior of source code written in an unfamiliar programming language is difficult. One way to aid understanding of difficult code is to add corresponding pseudo-code, which describes in detail the workings of the code in a natural language such as English. In spite of its usefulness, most source code does not have corresponding pseudo-code because it is tedious to create. This paper demonstrates a tool, Pseudogen, that makes it possible to automatically generate pseudo-code from source code using statistical machine translation (SMT). Pseudogen currently supports generation of English or Japanese pseudo-code from Python source code, and the SMT framework makes it easy for users to create new generators for their preferred source code/pseudo-code pairs.

Engineering faster double‐array Aho–Corasick automata

Kanda, Akabe, Oda 2023

Softw Pract Exp

Multiple pattern matching in strings is a fundamental problem in text processing applications such as regular expressions or tokenization. This article studies efficient implementations of double-array Aho-Corasick automata (DAACs), data structures for quickly performing the multiple pattern matching. The practical performance of DAACs is improved by carefully designing the data structure, and many implementation techniques have been proposed thus far. A problem in DAACs is that comprehensive descriptions and experimental analyses on their ideas are not provided. Engineers face difficulties in implementing an efficient DAAC. In this article, we review implementation techniques for DAACs and provide a comprehensive description of them. We also propose several new techniques for further improvement. We conduct exhaustive experiments through real-world datasets and reveal the best combination of techniques to achieve a higher performance in DAACs. The best combination is different from those used in the most popular libraries of DAACs, which demonstrates that their performance can be further enhanced. On the basis of our experimental analysis, we developed a new Rust library for fast multiple pattern matching using DAACs, named Daachorse, as open-source software at https://github.com/daac-tools/daachorse. Experiments demonstrate that Daachorse outperforms other AC-automaton implementations, indicating its suitability as a fast alternative for multiple pattern matching in many applications.

Information retrieval on oncology knowledge base using recursive paraphrase lattice

Akabe, Takeuchi, Aoki, et al. 2021

Journal of Biomedical Informatics

Engineering faster double-array Aho-Corasick automata

Kanda, Akabe, Oda 2022

Preprint

Multiple pattern matching in strings is a fundamental problem in text processing applications such as regular expressions or tokenization. This paper studies efficient implementations of double-array Aho-Corasick automata (DAACs), data structures for quickly performing the multiple pattern matching. The practical performance of DAACs is improved by carefully designing the data structure, and many implementation techniques have been proposed thus far. A problem in DAACs is that their ideas are not aggregated. Since comprehensive descriptions and experimental analyses are unavailable, engineers face difficulties in implementing an efficient DAAC. In this paper, we review implementation techniques for DAACs and provide a comprehensive description of them. We also propose several new techniques for further improvement. We conduct exhaustive experiments through real-world datasets and reveal the best combination of techniques to achieve a higher performance in DAACs. The best combination is different from those used in the most popular libraries of DAACs, which demonstrates that their performance can be further enhanced. On the basis of our experimental analysis, we developed a new Rust library for fast multiple pattern matching using DAACs, named Daachorse, as open-source software at https://github.com/daac-tools/daachorse. Experiments demonstrate that Daachorse outperforms other AC-automaton implementations, indicating its suitability as a fast alternative for multiple pattern matching in many applications.

Error Selection Methods for Machine Translation Error Analysis

Akabe, Neubig, Sakti, et al. 2016

Journal of Natural Language Processing

Error analysis is used to improve accuracy of machine translation (MT) systems. Various methods of analyzing MT errors have been proposed; however, most of these methods are based on differences between translations and references that are translated independently by human translators, and few methods have been proposed for manual error analysis. This work proposes a method that uses a machine learning framework to identify errors in MT output, and improves efficiency of manual error analysis. Our method builds models that classify low and high quality translations, then identifies features of low quality translations to improve efficiency of the manual analysis. Experiments showed that by using our methods, we could improve the efficiency of MT error analysis.
