In the air traffic management (ATM) environment, air traffic controllers (ATCos) and flight crews (FCs) communicate via voice to exchange different types of data, such as commands, readbacks (confirmation that a command has been received), and information related to the air traffic environment. Speech recognition can be applied to these voice exchanges to support ATCos in their work: each time a flight identification, or callsign, is mentioned by the controller or the pilot, the flight is recognised through automatic speech recognition (ASR) and the callsign is highlighted on the ATCo screen to increase situational awareness and safety. This paper presents the work being performed within the SESAR2020-funded solution PJ.10-W2-96 ASR on callsign recognition via voice by Enaire, Indra, and Crida, using ASR models developed jointly by EML Speech Technology GmbH (EML) and Crida. The paper describes the ATCo speech environment and presents the main requirements impacting the design, the implementation performed, and the outcomes obtained using real operational communications and real-time simulations. The findings indicate a way forward that incorporates partial recognition of callsigns and enriches the phonetization of company names to improve the recognition rates, which currently stand at 84–87% for controllers and 49–67% for flight crews.