Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations 2021
DOI: 10.18653/v1/2021.eacl-demos.32

ELITR Multilingual Live Subtitling: Demo and Strategy

Abstract: This paper presents an automatic speech translation system aimed at live subtitling of conference presentations. We describe the overall architecture and key processing components. More importantly, we explain our strategy for building a complex system for end users from numerous individual components, each of which has been tested only in laboratory conditions. The system is a working prototype that is routinely tested in recognizing English, Czech, and German speech and presenting it translated simultaneously …

citations
Cited by 6 publications
(4 citation statements)
references
References 21 publications
“…Integration with ELITR To demonstrate practical usability, we integrate Whisper-Streaming with the ELITR (European Live Translator, Bojar et al, 2020) framework for complex distributed systems for multi-source and multi-target live speech transcription and translation (Bojar et al, 2021a). Within Whisper-Streaming, we implement and release a server that is connected as a worker to Mediator server (Franceschini et al, 2020). Mediator allows a client to request a service of a worker.…”
Section: System Demonstration
mentioning
confidence: 99%
“…So far, the adoption of direct ST architectures to address the automatic subtitling task has only been explored in (Papi et al, 2022a). As a matter of fact, all previous works on the topic (Piperidis et al, 2004;Melero et al, 2006;Matusov et al, 2019;Koponen et al, 2020;Bojar et al, 2021) rely on cascade architectures that usually involve an ASR component to transcribe the input speech, a subtitle segmenter that segments the transcripts into subtitles, a timestamp estimator that predicts the start and times of each subtitle, and an MT model that translates the subtitle transcripts. Cascaded architectures, however, cannot access information contained in the speech, such as prosody, which related works proved to be an important source of information for the segmentation into subtitles (Öktem et al, 2019;Virkar et al, 2021;Tam et al, 2022).…”
Section: Automatic Subtitling
mentioning
confidence: 99%
“…For this reason, the attention and criticism from representatives of people with disabilities focus on quality issues (Richart-Marset & Calamita, 2020), most notably the concern about live subtitling. In the scientific sphere, projects are working to develop new automatic systems based on speech recognition and artificial intelligence, such as ELITR (Bojar et al, 2021) or Deep-Sync (Martín et al, 2021). Quality is also linked to other elements, such as the possibility of personalizing accessibility services so that they adapt to viewers' requirements, which is the object of study of European projects such as EasyTV (Richart-Marset & Calamita, 2020).…”
Section: Legislation, Shortcomings, and Demands
unclassified
“…For its part, the article also corroborates the existing concern about the quality of accessibility services (Richart-Marset & Calamita, 2020), especially in the case of live subtitling (Romero-Fresco, 2020). In the scientific sphere, various projects are working to create new automatic speech recognition and artificial intelligence systems (Bojar et al, 2021; Martín et al, 2021). As for TVE, it began introducing speech recognition software in regional news programs, although the broadcaster notes that these systems are being adopted with caution because of the errors they may generate.…”
Section: Discussion and Conclusions
unclassified