Applying automatic speech recognition technology to air traffic management

Kopald, Hunter; Chanen, Ari; Chen, Shuo; Smith, Elida C.; Tarakan, Robert

doi:10.1109/dasc.2013.6719700

Cited by 14 publications

(7 citation statements)

References 4 publications

“…Whether or not a verbatim transcription of the speech is needed, however, depends on the particular application of the speech recognition system (Kopald, Chanen, et al, 2013). For example, in some cases, the system may only need to identify the presence of a particular word or phrase, whereas in others it may need to recognize more content to decipher the speaker's overall intent.…”

Section: Automatic Speech Recognition and Air Traffic Control (mentioning)

confidence: 99%

“…However, nonstandard phraseology deviations, fast pace (cadence), slurring, and accents in controller speech, as well as acoustic distortions introduced by the ATC environment and voice switching equipment, complicate the speech recognition task. The limited population of speakers and the application of various speech recognition tuning techniques can help mitigate these challenges (Kopald, Chanen, et al, 2013).…”

Section: Automatic Speech Recognition and Air Traffic Control (mentioning)

confidence: 99%

Design and Evaluation of the Closed Runway Operation Prevention Device

Kopald; Chen (2014). Proceedings of the Human Factors and Ergonomics Society Annual

Self Cite

The MITRE Corporation was asked by the Federal Aviation Administration to perform an initial operational feasibility analysis on a speech recognition-based concept called the Closed Runway Operation Prevention Device (CROPD). This paper describes the activities conducted as part of the design and evaluation of the CROPD and outlines how a human-centered perspective of the system and operational environment informs the specifications of user interface design and system functionality.

“…Actually, ASR has been applied to many air traffic works. Kopald et al reviewed the importance of ASR on reducing human errors in air traffic operation [18]. Ferreiros et al studied the speech interface for air traffic control and designed a system for voice guidance in terminals [19].…”

Section: Introduction (mentioning)

confidence: 99%

Real-time Controlling Dynamics Sensing in Air Traffic System

Lin; Tan; Yang; et al. (2019). Sensors

In order to obtain real-time controlling dynamics in air traffic system, a framework is proposed to introduce and process air traffic control (ATC) speech via radiotelephony communication. An automatic speech recognition (ASR) and controlling instruction understanding (CIU)-based pipeline is designed to convert the ATC speech into ATC related elements, i.e., controlling intent and parameters. A correction procedure is also proposed to improve the reliability of the information obtained by the proposed framework. In the ASR model, acoustic model (AM), pronunciation model (PM), and phoneme- and word-based language model (LM) are proposed to unify multilingual ASR into one model. In this work, based on their tasks, the AM and PM are defined as speech recognition and machine translation problems respectively. Two-dimensional convolution and average-pooling layers are designed to solve special challenges of ASR in ATC. An encoder–decoder architecture-based neural network is proposed to translate phoneme labels into word labels, which achieves the purpose of ASR. In the CIU model, a recurrent neural network-based joint model is proposed to detect the controlling intent and label the controlling parameters, in which the two tasks are solved in one network to enhance the performance with each other based on ATC communication rules. The ATC speech is now converted into ATC related elements by the proposed ASR and CIU model. To further improve the accuracy of the sensing framework, a correction procedure is proposed to revise minor mistakes in ASR decoding results based on the flight information, such as flight plan, ADS-B. The proposed models are trained using real operating data and applied to a civil aviation airport in China to evaluate their performance. Experimental results show that the proposed framework can obtain real-time controlling dynamics with high performance, only 4% word-error rate. 
Meanwhile, the decoding efficiency can also meet the requirement of real-time applications, i.e., an average 0.147 real time factor. With the proposed framework and obtained traffic dynamics, current ATC applications can be accomplished with higher accuracy. In addition, the proposed ASR pipeline has high reusability, which allows us to apply it to other controlling scenes and languages with minor changes.

“…ASR in ATC domain has been explored to a limited extent [3,4] and multi-modal speech recognition has been largely explored with visual data [5]. Methods that utilize the radar data to improve ASR through semi-supervised learning have only been recently explored [2,6].…”

Section: Introduction (mentioning)

confidence: 99%

Iterative Learning of Speech Recognition Models for Air Traffic Control

et al. 2018

Automatic Speech Recognition (ASR) has recently proved to be a useful tool to reduce the workload of air traffic controllers leading to significant gains in operational efficiency. Air Traffic Control (ATC) systems in operation rooms around the world generate large amounts of untranscribed speech and radar data each day, which can be utilized to build and improve ASR models. In this paper, we propose an iterative approach that utilizes increasing amounts of untranscribed data to incrementally build the necessary ASR models for an ATC operational area. Our approach uses a semi-supervised learning framework to combine speech and radar data to iteratively update the acoustic model, language model and command prediction model (i.e. prediction of possible commands from radar data for a given air traffic situation) of an ASR system. Starting with seed models built with a limited amount of manually transcribed data, we simulate an operational scenario to adapt and improve the models through semi-supervised learning. Experiments on two independent ATC areas (Vienna and Prague) demonstrate the utility of our proposed methodology that can scale to operational environments with minimal manual effort for learning and adaptation.
