SUMMARY

The trie is widely used for dictionary retrieval in natural language processing, and the double array is an excellent data structure for implementing it, combining high retrieval speed with compactness. Keeping the double array in main storage makes retrieval fast, but the array grows as the number of stored keys increases. In this paper, the authors therefore propose a divided double array method: the double array is divided and stored in secondary storage, so that retrieval runs efficiently even with a small main storage area and the total amount of storage is further reduced. A drawback of this approach is that the divided tries must be scanned, which increases the number of accesses to secondary storage. To deal with this problem, the authors also propose a technique that reduces the number of times the divided tries are scanned, preserving the fast access speed of the double array. In experiments on sets of 70,000 to 230,000 keys, retrieving a single key required at most two accesses to secondary storage, and the storage required for the divided tries was approximately 50% smaller than for the undivided trie.
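
For readers unfamiliar with the underlying structure, the following is a minimal sketch of an ordinary, undivided double array held entirely in memory. It is not the authors' divided method and involves no secondary storage. It assumes the conventional BASE/CHECK formulation, in which a transition from state s on character code c leads to t = BASE[s] + c and is valid only if CHECK[t] = s. The function names, the first-fit placement strategy, and the use of "\0" as a key terminator are all illustrative assumptions.

```python
END = "\0"  # assumed terminator appended to every key


def build_double_array(keys):
    """Naive first-fit construction of BASE/CHECK arrays (illustrative only)."""
    # Build an ordinary trie as nested dicts first.
    root = {}
    for key in keys:
        node = root
        for ch in key + END:
            node = node.setdefault(ch, {})

    # Give every character a positive integer code.
    alphabet = sorted({ch for key in keys for ch in key + END})
    code = {ch: i + 1 for i, ch in enumerate(alphabet)}

    # Index 0 is unused; the root is state 1. CHECK == 0 marks a free slot.
    base = [0, 0]
    check = [0, -1]

    def ensure(n):
        while len(base) <= n:
            base.append(0)
            check.append(0)

    stack = [(root, 1)]
    while stack:
        node, s = stack.pop()
        if not node:
            continue  # leaf state: nothing to place
        codes = [code[ch] for ch in node]
        b = 1
        while True:  # first-fit: smallest base where all children fit
            ensure(b + max(codes))
            if all(check[b + c] == 0 for c in codes):
                break
            b += 1
        base[s] = b
        for ch, child in node.items():
            t = b + code[ch]
            check[t] = s  # slot t now belongs to parent state s
            stack.append((child, t))
    return base, check, code


def contains(base, check, code, key):
    """Follow t = BASE[s] + c, accepting a step only if CHECK[t] == s."""
    s = 1
    for ch in key + END:
        c = code.get(ch)
        if c is None:
            return False
        t = base[s] + c
        if t >= len(check) or check[t] != s:
            return False
        s = t
    return True


base, check, code = build_double_array(["bachelor", "badge", "baby", "jar"])
print(contains(base, check, code, "badge"))  # True
print(contains(base, check, code, "bad"))    # False
```

Each retrieval step costs one addition and one comparison regardless of how many children a state has, which is the source of the speed the summary refers to. The two arrays grow with the number of keys, and that growth is the main-storage cost the divided method is intended to reduce.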