A retrieval algorithm of encrypted speech based on short-term cross-correlation and perceptual hashing

Zhang, Qiu-yu; Zhou, Liang; Zhang, Tao; Zhang, Denghai

doi:10.1007/s11042-019-7180-9

Cited by 22 publications

(46 citation statements)

References 31 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…As can be seen from Table 4, the FAR of Log-Mel Spectrogram/MFCC are lower than the methods [3,5,6,7] under different thresholds. Therefore, the method in this paper has strong anti-collision performance and discrimination, which can meet the needs of retrieval.…”

Section: B Performance Comparison With Existing Perceptual Hashing Mmentioning

confidence: 91%

“…At present, the existing content-based encrypted speech retrieval algorithm [3,4,5,6,7] are all realized by constructing speech perceptual hashing. For example, Wang et al [3] proposed an encrypted speech perceptual hashing retrieval algorithm based on a zero-crossing rate and used Chua's chaotic system to encrypt speech.…”

Section: Related Workmentioning

confidence: 99%

“…A retrieval method of syllable-level perceptual hashing based on the posterior probability feature of syllable segment model is proposed. Zhang et al [7] proposed an encrypted speech retrieval algorithm based on short-term cross-correlation and perceptual hashing, which can directly extract the perceptual hashing sequence from the encrypted sample speech. By analyzing the above, the existing content-based encrypted speech retrieval algorithms are based on existing hand-crafted feature to generate binary hashing codes for speech retrieval.…”

Section: Related Workmentioning

confidence: 99%

“…4964, and the standard deviation  1 =0.0322. Table 4 shows the comparison results of the proposed scheme with the existing perceptual hashing based encrypted speech retrieval methods [3,5,6,7] under different thresholds. The lower FAR of perceptual hashing algorithm, the higher the anti-collision performance and the better the discrimination of the algorithm.…”

Section: B Performance Comparison With Existing Perceptual Hashing Mmentioning

confidence: 99%

“…The traditional encrypted speech retrieval methods are based on speech perceptual hashing technology to extract the perceptual features of speech [3,4,5,6,7]. Speech feature extraction is the basis of the retrieval process, and the performance of feature expression directly affects subsequent retrieval results.…”

Section: Introductionmentioning

confidence: 99%

See 4 more Smart Citations

An Encrypted Speech Retrieval Method Based on Deep Perceptual Hashing and CNN-BiLSTM

Zhang

et al. 2020

IEEE Access

Self Cite

View full text Add to dashboard Cite

ABSTRACRT Since convolutional neural network (CNN) can only extract local features, and long shortterm memory (LSTM) neural network model has a large number of learning calculations, a long processing time and an obvious degree of information loss as the length of speech increases. Utilizing the characteristics of autonomous feature extraction in deep learning, CNN and bidirectional long short-term memory (BiLSTM) network are combined to present an encrypted speech retrieval method based on deep perceptual hashing and CNN-BiLSTM. Firstly, the proposed method extracts the Log-Mel Spectrogram/MFCC features of the original speech and enters the CNN and BiLSTM networks in turn for model training. Secondly, we use the trained fusion network model to learn the deep perceptual feature and generate deep perceptual hashing sequences. Finally, the normalized Hamming distance algorithm is used for matching retrieval. In order to protect the speech security in the cloud, a speech encryption algorithm based on a 4D hyperchaotic system is proposed. The experimental results show that the proposed method has good discrimination, robustness, recall and precision compared with the existing methods, and it has good retrieval efficiency and retrieval accuracy for longer speech. Meanwhile, the proposed speech encryption algorithm has a high key space to resist exhaustive attacks. INDEX TERMS Encrypted speech retrieval, CNN-BiLSTM, Deep perceptual hashing, Speech feature extraction, 4D hyperchaotic system.

show abstract

Section: B Performance Comparison With Existing Perceptual Hashing Mmentioning

confidence: 91%