2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010
DOI: 10.1109/icassp.2010.5496185

Audio fingerprinting to identify multiple videos of an event

Abstract: The proliferation of consumer recording devices and video sharing websites makes the possibility of having access to multiple recordings of the same occurrence increasingly likely. These co-synchronous recordings can be identified via their audio tracks, despite local noise and channel variations. We explore a robust fingerprinting strategy to do this. Matching pursuit is used to obtain a sparse set of the most prominent elements in a video soundtrack. Pairs of these elements are hashed and stored, to be effic…
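
The pair-hashing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the matching-pursuit step is skipped, and each recording is assumed to be already reduced to a sparse list of integer (time, frequency) elements. The function names, the `max_dt` and `fan_out` parameters, and the offset-voting lookup are assumptions made for illustration, since the abstract is truncated before it describes how stored hashes are matched.

```python
from collections import Counter, defaultdict


def pair_hashes(atoms, max_dt=32, fan_out=3):
    """Yield ((f1, f2, dt), t1) for nearby pairs of elements.

    atoms: iterable of (time, freq) integer tuples (assumed input format).
    max_dt and fan_out are hypothetical tuning parameters.
    """
    atoms = sorted(atoms)
    for i, (t1, f1) in enumerate(atoms):
        paired = 0
        for t2, f2 in atoms[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # later elements are even further away in time
            if dt == 0:
                continue  # skip simultaneous elements
            yield (f1, f2, dt), t1
            paired += 1
            if paired == fan_out:
                break


def build_index(recordings):
    """Map each pair hash to the (recording id, anchor time) entries that produced it."""
    index = defaultdict(list)
    for rec_id, atoms in recordings.items():
        for key, t1 in pair_hashes(atoms):
            index[key].append((rec_id, t1))
    return index


def best_match(index, query_atoms):
    """Vote on (recording id, time offset); return (rec_id, offset, votes) or None."""
    votes = Counter()
    for key, tq in pair_hashes(query_atoms):
        for rec_id, tr in index.get(key, ()):
            votes[(rec_id, tr - tq)] += 1
    if not votes:
        return None
    (rec_id, offset), count = votes.most_common(1)[0]
    return rec_id, offset, count
```

Because each hash key depends only on the two frequencies and the time gap between them, a query excerpt that starts partway through a stored recording still produces the same keys, and the matching entries agree on a single time offset.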

Citations: cited by 35 publications (45 citation statements)
References: 8 publications
“…In [17] the Spatio Temporal Interest Point (STIP) detector was proposed to detect actions in video. In multimodal approaches the audio signal is usually characterized with its Mel Frequency Cepstral Coefficients (MFCC) although more sophisticated techniques can also be used [6]. In our experiment we have only used combinations of SIFT and MFCC features.…”
Section: Related Work (mentioning)
confidence: 99%
“…fingerprinting strategy of [4]. Most recently, Cotton and Ellis [5] explicitly discuss audio fingerprinting for event identification, but do not address synchronization. For a general review of audio fingerprinting methods (not applied to video synchronization), please also see [6].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, these methods are only applicable when visual features are visible in both videos. Consequently, audio synchronization is widely used for outdoor motion capture [9], mash-ups, identifying video of the same event [10], and is available in commercial editing applications [11]. Synchronizing content captured of outdoor events on consumer devices is particularly challenging given that the microphones may be far apart or disjoint in time, and hence only partially share audio environments.…”
Section: Introduction (mentioning)
confidence: 99%