“…A popular and successful approach in speech and, more recently, handwriting recognition is to combine linguistic knowledge with the knowledge of a specific feature domain (acoustic features in speech recognition, shape features in handwriting recognition) to form an integrated recognizer (decoder). A common technique is to embed feature models (often HMMs) into a language model by replacing each transition in the language model with the feature model of the corresponding symbol (Bahl & Jelinek, 1983; Hu, Brown & Turin, 1994; Makhoul, Starner, Schwartz & Chou, 1994; Nathan, Beigi, Subrahmonia, Cleary & Maruyama, 1995). There are also less integrated approaches where the language model network is multiplied by a segmentation network where each transition is weighted by a score generated by the corresponding symbol recognizer (Bengio, Cun & Henderson, 1994; Schenkel, Guyon & Henderson, 1995).…”
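
The embedding step described in the excerpt can be illustrated with a minimal sketch. It assumes the language model is given as a list of labelled arcs and that each symbol's feature model is a left-to-right HMM with a fixed number of states. The names `Arc`, `embed_feature_models`, and `states_per_symbol` are hypothetical and do not come from the cited work; transition and emission probabilities are omitted, so only the topology of the expanded network is shown.

```python
from collections import namedtuple

# Hypothetical representation: one labelled arc of the language-model network.
Arc = namedtuple("Arc", "src dst label")


def embed_feature_models(lm_arcs, states_per_symbol):
    """Replace each language-model arc with a left-to-right chain of HMM states.

    lm_arcs: list of Arc(src, dst, label) over language-model nodes.
    states_per_symbol: dict mapping each symbol to its number of HMM states
        (an assumed, simplified stand-in for a trained feature model).
    Returns the edges of the expanded network as (src, dst) pairs. Language-model
    nodes keep their names; HMM states are tuples (arc_index, symbol, state_index).
    """
    edges = []
    for i, arc in enumerate(lm_arcs):
        n = states_per_symbol[arc.label]
        chain = [(i, arc.label, k) for k in range(n)]
        edges.append((arc.src, chain[0]))      # enter the symbol's model
        for a, b in zip(chain, chain[1:]):
            edges.append((a, a))               # self-loop: stay in the state
            edges.append((a, b))               # advance to the next state
        edges.append((chain[-1], chain[-1]))   # self-loop on the final state
        edges.append((chain[-1], arc.dst))     # leave the symbol's model
    return edges


# Toy language model accepting the strings "ab" and "ac".
lm_arcs = [Arc(0, 1, "a"), Arc(1, 2, "b"), Arc(1, 2, "c")]
edges = embed_feature_models(lm_arcs, {"a": 3, "b": 3, "c": 3})
```

Each copy of a symbol model is indexed by the arc it replaces, so the same symbol appearing on two language-model arcs yields two separate state chains. A decoder would then search this single expanded network, which is what makes the recognizer integrated.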