Recognition of Brand and Models of Cell-Phones From Recorded Speech Signals

Hanilçi, Cemal; Ertaş, Figen; Ertaş, T.; Eskidere, Ömer

doi:10.1109/tifs.2011.2178403

Cited by 71 publications

(61 citation statements)

References 30 publications

“…f_k f_s n (1) where k = 1, 2, ..., K is the frequency bin index; f_s is the sampling rate; f_k is the center frequency of bin k, which is exponentially distributed and is defined as…”

Section: Spectral Distribution Features of the CQT Domain (mentioning)

confidence: 99%

Source Cell-Phone Identification in the Presence of Additive Noise from CQT Domain

et al. 2018

Abstract: With the widespread availability of cell-phone recording devices, source cell-phone identification has become a hot topic in multimedia forensics. At present, research on source cell-phone identification has achieved good results in clean conditions, but results in noisy environments are less satisfactory. This paper proposes a novel source cell-phone identification system suitable for both clean and noisy environments, using spectral distribution features of the constant Q transform (CQT) domain and a multi-scene training method. Analysis shows that the identification difficulty lies in distinguishing different models of cell-phones of the same brand, whose small differences are mainly in the middle and low frequency bands. Therefore, this paper extracts spectral distribution features from the CQT domain, which has a higher frequency resolution in the mid-low frequency range. To evaluate the effectiveness of the proposed feature, four classification techniques, Support Vector Machine (SVM), Random Forest (RF), Convolutional Neural Network (CNN) and Recurrent Neural Network-Long Short-Term Memory Neural Network (RNN-BLSTM), are used to identify the source recording device. Experimental results show that the proposed features have superior performance. Compared with Mel frequency cepstral coefficient (MFCC) and linear frequency cepstral coefficient (LFCC) features, they improve identification accuracy for cell-phones within the same brand, whether the test speech consists of clean or noisy speech files. In addition, the CNN classifier performs particularly well. The model is built with the multi-scene training method, which improves its distinguishing ability in noisy environments compared with the single-scenario training method.
The average accuracy of the CNN for clean speech files on the CKC speech database (CKC-SD) and the TIMIT Recaptured Database (TIMIT-RD) increased from 95.47% and 97.89% to 97.08% and 99.29%, respectively. For noisy speech files with seen and unseen noise types, performance improved greatly, and most recognition rates exceeded 90%. Therefore, the source identification system in this paper is robust to noise.
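
The abstract's remark that the CQT domain has finer frequency resolution in the mid-low band can be illustrated with a small sketch. The quoted definition of f_k is cut off in the excerpt, so the code below assumes the common textbook form f_k = f_min · 2^((k-1)/B) with B bins per octave; it is not taken from the cited paper. The function names and the values of f_min, B, K and the sampling rate are hypothetical, chosen only for illustration.

```python
import math


def cqt_center_frequencies(f_min, bins_per_octave, n_bins):
    """Geometrically spaced center frequencies (assumed textbook form).

    Index 0 of the returned list corresponds to bin k = 1.
    """
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def quality_factor(bins_per_octave):
    """Constant ratio Q = f_k / bandwidth_k implied by geometric spacing."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def window_lengths(freqs, sample_rate, bins_per_octave):
    """Analysis window length per bin, N_k = ceil(Q * f_s / f_k)."""
    q = quality_factor(bins_per_octave)
    return [math.ceil(q * sample_rate / f) for f in freqs]


if __name__ == "__main__":
    # Hypothetical settings: 7 octaves at 12 bins per octave, 16 kHz audio.
    freqs = cqt_center_frequencies(f_min=32.7, bins_per_octave=12, n_bins=84)
    q = quality_factor(12)
    lengths = window_lengths(freqs, sample_rate=16000, bins_per_octave=12)
    for k in (0, 41, 83):
        # Bandwidth f_k / Q grows with f_k, so low bins are narrower.
        print(f"bin {k + 1}: f_k = {freqs[k]:.1f} Hz, "
              f"bandwidth = {freqs[k] / q:.2f} Hz, window = {lengths[k]} samples")
```

Because the bandwidth of each bin is proportional to its center frequency, low-frequency bins are narrow (fine resolution, long windows) and high-frequency bins are wide, which is the property the abstract relies on.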

“…Each dataset is separated into 2 parts, as training and testing datasets. The GMM is trained for training speech durations of 30, 60, 90, 120, 150, and 180 s. Testing is carried out using different test utterance lengths (T_u) of 1 s and 3 s. Detailed information can be found in [13,15].…”

Section: Data Collection and Test Setup (mentioning)

confidence: 99%

“…We recently addressed a new problem of recognizing cell phones from recorded speech signals [15]. Vector quantization and SVM-based classification algorithms are used in several experiments, and, as a result, an identification rate of 96.42% is achieved on a set of 14 models of cell phones.…”

Section: Introduction (mentioning)

confidence: 99%

Source microphone identification from speech recordings based on a Gaussian mixture model

Eskidere

2014

Turk J Elec Eng & Comp Sci

Abstract: Microphone identification is a specific type of media forensics that investigates whether it is possible to identify the source microphone from speech recordings. The main aim of this study is to determine which of several feature extraction techniques is best suited to source microphone identification systems. We perform microphone identification experiments with 16 different microphones using 3 datasets. To improve the results on the datasets, we also investigate the important parameters that may affect microphone identification performance. Our experimental results show that the proposed method is comparable to existing studies in terms of closed-set identification rate.
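
As a rough illustration of what a closed-set identification rate measures, the sketch below trains one model per device and assigns each test utterance to the device whose model gives the highest average log-likelihood. It makes several simplifying assumptions that are not from the cited study: each device is modeled by a single diagonal Gaussian (a one-component stand-in for a full GMM), the feature frames are synthetic random vectors rather than real speech features, and all names (`device_A`, `fit_diag_gaussian`, and so on) are hypothetical.

```python
import math
import random


def fit_diag_gaussian(frames):
    """Fit per-dimension mean and variance (one-component diagonal model)."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of an utterance under one model."""
    mean, var = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def identify(frames, models):
    """Closed-set decision: pick the enrolled device with the best score."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))


def identification_rate(models, trials):
    """Fraction of (true_device, frames) trials identified correctly."""
    correct = sum(1 for name, frames in trials if identify(frames, models) == name)
    return correct / len(trials)


def synthetic_frames(rng, offset, n_frames, dim=4):
    """Synthetic stand-in for feature frames from one device."""
    return [[rng.gauss(offset, 1.0) for _ in range(dim)] for _ in range(n_frames)]


if __name__ == "__main__":
    rng = random.Random(0)
    offsets = {"device_A": 0.0, "device_B": 1.5, "device_C": 3.0}
    models = {name: fit_diag_gaussian(synthetic_frames(rng, off, 500))
              for name, off in offsets.items()}
    trials = [(name, synthetic_frames(rng, off, 100))
              for name, off in offsets.items() for _ in range(10)]
    print("closed-set identification rate:", identification_rate(models, trials))
```

In a closed-set setting every test utterance is assumed to come from one of the enrolled devices, so the decision is a simple arg-max over models with no rejection threshold.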

“…A speech signal comprises different forms of information such as the conveyed message; the identity, emotion, age, and gender of the speaker; and information about the recording device [1][2][3][4][5].…”

Section: Introduction (mentioning)

confidence: 99%

Identifying acquisition devices from recorded speech signals using wavelet-based features

Eskidere

2016

Turk J Elec Eng & Comp Sci

Abstract: Speech characteristics have played a critical role in media forensics, particularly in the investigation of evidence. This study proposes two wavelet-based feature extraction methods for the identification of acquisition devices from recorded speech. These methods are discrete wavelet-based coefficients (DWBCs) and wavelet packet-based coefficients, which are mainly based on a multiresolution analysis. These features' ability to capture characteristics of acquisition devices is compared to conventional mel frequency cepstral coefficients and subband-based coefficients. In the experiments, 14 different audio acquisition devices were trained and tested using support vector machines. Experimental results showed that DWBCs can effectively be used in source audio acquisition device identification problems.
