2012 IEEE 14th International Workshop on Multimedia Signal Processing (MMSP), 2012
DOI: 10.1109/mmsp.2012.6343465

Large-scale processing, indexing and search system for Czech audio-visual cultural heritage archives

Abstract: This paper describes a complex system developed for processing, indexing and accessing data collected in large audio and audio-visual archives that form an important part of Czech cultural heritage. The system is currently being applied to the Czech Radio archive, namely to its oral history segment with more than 200,000 individual recordings covering almost ninety years of broadcasting in the Czech Republic and former Czechoslovakia. The ultimate goals are a) to transcribe a significant portion of the archive…

citations
Cited by 5 publications
(2 citation statements)
references
References 11 publications
“…For the transcription task, we have been adapting and enhancing a large-vocabulary continuous speech recognition (LVCSR) system developed previously in our lab. During the first two years of the 4-year project, we have implemented most of the required functionalities and utilized the system to process, transcribe and index more than 75.000 documents broadcast since 1993 to present [2]. That period did not pose a particular challenge for our research as we could employ the existing system trained for contemporary Czech.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…For this purpose, we have adapted our previously developed large-vocabulary continuous speech recognition (LVCSR) system to deal with broadcast recordings in Czech and Slovak and designed modules for speech indexation and search. During the first 18 months of the project, we have processed about 75,000 audio files (with total duration of 30,000 hours) and created a demo version of the web service that allows for smart search in the transcribed data [7].…”
Section: Introduction
Citation type: mentioning
confidence: 99%