“…Please see Figure for a schematic overview of this framework. Because the HALEF architecture and components have been described in detail in prior publications (Ramanarayanan, Suendermann‐Oeft, Ivanov, & Evanini, ; Suendermann‐Oeft, Ramanarayanan, Teckenbrock, Neutatz, & Schmidt, ), we only briefly mention the various modules of the system here:
- Telephony servers Asterisk (van Meggelen, Smith, & Madsen, ) and FreeSWITCH (Minessale, Schreiber, Collins, & Chandler, ), which are compatible with Session Initiation Protocol (SIP), Public Switched Telephone Network (PSTN), and web Real‐Time Communications (WebRTC) standards and include support for voice and video
- A voice browser, JVoiceXML (Schnelle‐Walka, Radomski, & Mühlhäuser, ), which is compatible with VoiceXML 2.1, can process SIP traffic, and incorporates support for multiple grammar standards, such as Java Speech Grammar Format (JSGF), Advanced Research Projects Agency (ARPA), and Weighted Finite State Transducer (WFST)
- A Media Resource Control Protocol (MRCP) speech server (Prylipko, Schnelle‐Walka, Lord, & Wendemuth, ), Cairo, which allows the voice browser to initiate SIP or Real‐Time Transport Protocol (RTP) connections from/to the telephony server and incorporates two speech recognizers (Sphinx and Kaldi; see respectively Lamere et al., ; Povey et al., ) and two synthesizers (Mary and Festival; see respectively Schröder & Trouvain, ; Taylor, Black, & Caley, )
- An Apache Tomcat‐based web server, which can host dynamic VoiceXML pages, web services, and media libraries containing grammars and audio files
- OpenVXML, a VoiceXML‐based voice application authoring suite that generates dynamic web applications that can be housed on the web server
- A MySQL database server for storing call logs
- A speech transcription, annotation, and rating portal that allows one to listen to and transcribe full‐call recordings, rate them on a variety of dimensions such as caller experience and latency, and perform various semantic annotation tasks required to train ASR and SLU modules
…”
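To make the interplay between the web server and the voice browser more concrete, the following is a minimal sketch, in Python, of how a dynamic VoiceXML 2.1 page referencing a JSGF grammar might be generated. It is not taken from HALEF or OpenVXML: the function name `build_vxml_page`, the form and field identifiers, the grammar path, and the `application/x-jsgf` media type are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C VoiceXML specification.
VXML_NS = "http://www.w3.org/2001/vxml"


def build_vxml_page(prompt_text: str, grammar_url: str,
                    field_name: str = "answer") -> str:
    """Return a minimal VoiceXML 2.1 document as a string.

    Hypothetical helper: a web server could return such a page to a
    voice browser, which would play the prompt and listen for input
    constrained by the referenced grammar.
    """
    # Serialize VoiceXML elements without a namespace prefix.
    ET.register_namespace("", VXML_NS)

    root = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.1"})
    form = ET.SubElement(root, f"{{{VXML_NS}}}form", {"id": "main"})
    field = ET.SubElement(form, f"{{{VXML_NS}}}field", {"name": field_name})

    # Text that the speech synthesizer would render for the caller.
    prompt = ET.SubElement(field, f"{{{VXML_NS}}}prompt")
    prompt.text = prompt_text

    # External grammar; the media type here is an assumption.
    ET.SubElement(
        field,
        f"{{{VXML_NS}}}grammar",
        {"src": grammar_url, "type": "application/x-jsgf"},
    )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Illustrative prompt and grammar path only.
    print(build_vxml_page("Please say yes or no.", "grammars/yesno.jsgf"))
```

In a deployment of the kind described above, a page like this would be served over HTTP by the web server and interpreted by the voice browser, with recognition and synthesis delegated to the speech server; the sketch covers only the page-generation step.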