2002
DOI: 10.1007/3-540-36390-4_12

Compilation Methods of Minimal Acyclic Finite-State Automata for Large Dictionaries

Abstract: We present a reflection on the evolution of the different methods for constructing minimal deterministic acyclic finite-state automata from a finite set of words. We outline the most important methods, including the traditional ones (which consist of the combination of two phases: insertion of words and minimization of the partial automaton) and the incremental algorithms (which add new words one by one and minimize the resulting automaton on-the-fly, being much faster and having significantly lower …
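The incremental idea mentioned in the abstract (add words one by one, minimize on the fly) can be illustrated with a minimal sketch. It assumes the input is lexicographically sorted and follows the well-known register-based scheme for sorted input; it is not necessarily the exact variant the paper evaluates. The names `State`, `build_minimal_dafsa`, `accepts` and `count_states` are hypothetical and chosen only for this illustration.

```python
class State:
    """One automaton state: outgoing edges plus a final flag."""

    def __init__(self):
        self.edges = {}      # symbol -> State
        self.final = False

    def signature(self):
        # States are equivalent when they agree on finality and on their
        # (symbol, target) pairs. Targets are already canonical states kept
        # alive by the register, so comparing them by identity is safe.
        return (self.final,
                tuple(sorted((c, id(t)) for c, t in self.edges.items())))


def _minimize(unchecked, register, down_to):
    # Replace states on the previous word's path, deepest first, by an
    # equivalent registered state when one exists; otherwise register them.
    while len(unchecked) > down_to:
        parent, symbol, child = unchecked.pop()
        canonical = register.setdefault(child.signature(), child)
        parent.edges[symbol] = canonical


def build_minimal_dafsa(sorted_words):
    """Build a minimal acyclic automaton from sorted words (assumption)."""
    root = State()
    register = {}            # signature -> canonical State
    unchecked = []           # (parent, symbol, child) along the last word
    previous = ""
    for word in sorted_words:
        if word < previous:
            raise ValueError("input must be lexicographically sorted")
        # Length of the prefix shared with the previously added word.
        common = 0
        while (common < min(len(word), len(previous))
               and word[common] == previous[common]):
            common += 1
        # The suffix of the previous word can no longer change: minimize it.
        _minimize(unchecked, register, common)
        node = unchecked[-1][2] if unchecked else root
        for symbol in word[common:]:
            child = State()
            node.edges[symbol] = child
            unchecked.append((node, symbol, child))
            node = child
        node.final = True
        previous = word
    _minimize(unchecked, register, 0)
    return root


def accepts(root, word):
    """Follow one transition per symbol; cost is linear in len(word)."""
    node = root
    for symbol in word:
        node = node.edges.get(symbol)
        if node is None:
            return False
    return node.final


def count_states(root):
    """Count the distinct states reachable from the root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.edges.values())
    return len(seen)
```

For the four words tap, taps, top and tops, the sketch shares the common suffixes, so the resulting automaton has fewer states than the corresponding trie.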

Cited by 15 publications (9 citation statements)
References 4 publications
“…We consider a lexicon for Spanish built from Galena [9], which includes 514,781 different words, to illustrate this aspect. The lexicon is recognized by an fa containing 58,170 states connected by 153,599 transitions, of sufficient size to allow us to consider it as a representative starting point for our purposes.…”
Section: Results (mentioning)
confidence: 99%
“…We generate from G a numbered minimal acyclic finite automaton for the language L(G). In practice, we choose a device [4] generated by Galena [3]. A fa is a 5-tuple A = (Q, Σ, δ, q_0, Q_f) where: Q is the set of states, Σ the set of input symbols, δ is a function of Q × Σ into 2^Q defining the transitions of the automaton, q_0 the initial state and Q_f the set of final states.…”
Section: The Error Repair Model (mentioning)
confidence: 99%
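The 5-tuple definition quoted above maps directly onto a small data structure. The following is a minimal sketch, assuming the transition function is stored as a dictionary from (state, symbol) pairs to sets of successor states; the class name `FiniteAutomaton` and its field names are hypothetical and do not come from the cited work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiniteAutomaton:
    states: frozenset     # Q, the set of states
    alphabet: frozenset   # Σ, the set of input symbols
    delta: dict           # δ: (state, symbol) -> set of successor states
    initial: object       # q_0, the initial state
    finals: frozenset     # Q_f, the set of final states

    def accepts(self, word):
        # Track every state reachable after each symbol, since δ maps
        # into sets of states (the 2^Q in the definition).
        current = {self.initial}
        for symbol in word:
            current = {t for s in current
                       for t in self.delta.get((s, symbol), ())}
            if not current:
                return False
        return bool(current & self.finals)


# A toy automaton recognizing exactly the words "ab" and "ac".
fa = FiniteAutomaton(
    states=frozenset({0, 1, 2}),
    alphabet=frozenset("abc"),
    delta={(0, "a"): {1}, (1, "b"): {2}, (1, "c"): {2}},
    initial=0,
    finals=frozenset({2}),
)
```

In the deterministic case every set in `delta` holds at most one state, and the loop reduces to following a single path.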
“…We choose to work with a lexicon for Galician built from Galena [3], which includes 304.331 different words, to illustrate this aspect. The lexicon is recognized by a fa containing 16.837 states connected by 43.446 transitions, whose entity we consider sufficient for our purposes.…”
Section: The System At Work (mentioning)
confidence: 99%
“…The main reasons for compressing a very large dictionary of words into a finite-state automaton are that its representation of the set of words is compact, and that the process of looking up a word in the dictionary is proportional to the length of the word, and therefore very fast [7]. Of particular interest for natural language processing applications are minimal acyclic finite-state automata, which recognize finite sets of words, and which can be constructed in various ways [15,5]. The aim of the present work was to build a general architecture to handle a large Spanish dictionary of synonyms [2].…”
Section: A Computational View Of Synonymy (mentioning)
confidence: 99%
“…To complete this model, we only need the implementation of the functions Word to Index and Index to Word. Both functions operate over a special type of automata, the numbered minimal acyclic finite-state automata described in [5], allowing us to efficiently perform perfect hashing between numbers and words.…”
Section: General Architecture Of An Electronic Dictionary Of Synonyms (mentioning)
confidence: 99%
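The Word to Index and Index to Word functions mentioned in the last statement can be sketched as follows. This is an illustration under stated assumptions, not the cited implementation: for brevity it numbers an unminimized trie rather than a minimal automaton (the per-state counts behave the same way on a minimal automaton, because equivalent states accept the same number of words), indices are 1-based in sorted order, and all names (`Node`, `build_trie`, `number_states`, `word_to_index`, `index_to_word`) are hypothetical.

```python
class Node:
    def __init__(self):
        self.edges = {}     # symbol -> Node
        self.final = False
        self.count = 0      # number of words accepted starting from here


def build_trie(words):
    root = Node()
    for word in words:
        node = root
        for symbol in word:
            node = node.edges.setdefault(symbol, Node())
        node.final = True
    return root


def number_states(node):
    # A state accepts itself (if final) plus everything below its children.
    node.count = int(node.final) + sum(
        number_states(child) for child in node.edges.values())
    return node.count


def word_to_index(root, word):
    """Return the 1-based rank of word in sorted order, or None if absent."""
    index, node = 0, root
    for symbol in word:
        if symbol not in node.edges:
            return None
        if node.final:
            index += 1      # the shorter word ending here sorts first
        for other in sorted(node.edges):
            if other >= symbol:
                break
            index += node.edges[other].count   # skip smaller branches
        node = node.edges[symbol]
    return index + 1 if node.final else None


def index_to_word(root, index):
    """Return the word with the given 1-based rank, or None if out of range."""
    if not 1 <= index <= root.count:
        return None
    node, symbols = root, []
    while True:
        if node.final:
            if index == 1:
                return "".join(symbols)
            index -= 1
        for symbol in sorted(node.edges):
            child = node.edges[symbol]
            if index <= child.count:
                symbols.append(symbol)
                node = child
                break
            index -= child.count


words = ["casa", "casas", "cosa", "cosas", "sol"]
root = build_trie(words)
number_states(root)
```

Both directions visit one state per symbol of the word, so the two functions together act as a perfect hash between the words and the integers 1 to n.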