This paper presents a blind audio watermarking method that uses two different schemes to hide binary bits and auxiliary information within separate ranges of a fast Fourier transform (FFT) sequence. An adaptive vector norm modulation (AVNM) scheme is introduced to achieve a satisfactory balance of imperceptibility, robustness, and payload capacity. An improved spread spectrum (ISS) scheme is developed to produce a sharp correlation peak, which facilitates the detection of synchronization codes in the FFT domain. The combination of robust audio segment extraction and recursive FFT makes it possible to execute these two FFT-based schemes in tandem on a sample-by-sample basis. The experimental results confirm that watermark embedding causes only negligible degradation in perceptual quality. A detectability test demonstrated the effectiveness of the ISS scheme in self-synchronization as well as in hiding auxiliary data. Three versions of AVNM with capacities ranging from 344.53 to 1033.59 bits per second were demonstrated. Compared with six recently developed schemes, AVNM exhibited advantages in terms of negligible quality distortion, flexible payload capacity, and excellent robustness against a variety of common signal processing attacks.
INDEX TERMS: Synchronous blind audio watermarking, fast Fourier transform, adaptive vector norm modulation, improved spread spectrum, robust audio segment extractor.
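The abstract's mention of a recursive FFT executed sample by sample can be illustrated with the textbook sliding DFT recurrence, in which each new spectrum is derived from the previous one instead of being recomputed from scratch. The sketch below is only an assumption about what such a recursion might look like; the function names `dft` and `sliding_dft` are hypothetical and are not taken from the paper, which may use a different formulation.

```python
import cmath

def dft(window):
    """Direct O(N^2) DFT of one window; used for the first frame and as a reference."""
    n = len(window)
    return [sum(window[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def sliding_dft(x, n_fft):
    """Return the DFT of every length-n_fft window of x, advancing one sample at a time.

    Textbook recurrence (not necessarily the paper's):
        X_k(n+1) = (X_k(n) - x[n] + x[n + n_fft]) * exp(j*2*pi*k / n_fft)
    """
    twiddle = [cmath.exp(2j * cmath.pi * k / n_fft) for k in range(n_fft)]
    spectrum = dft(x[:n_fft])          # first window is computed directly
    spectra = [spectrum]
    for n in range(1, len(x) - n_fft + 1):
        # Sample entering the window minus the sample leaving it.
        delta = x[n - 1 + n_fft] - x[n - 1]
        spectrum = [(spectrum[k] + delta) * twiddle[k] for k in range(n_fft)]
        spectra.append(spectrum)
    return spectra
```

Each update costs O(N) per sample rather than the O(N log N) of a fresh FFT, which is what makes a sample-by-sample search for a synchronization code affordable. Rounding errors accumulate over long runs, so a practical implementation would likely recompute the spectrum directly from time to time.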
This paper presents a lifting wavelet transform (LWT)-based framework for multi-purpose blind audio watermarking. The proposed schemes can be used to carry out robust watermarking for intellectual property protection as well as fragile watermarking for tamper detection and signal recovery. Following 3-level LWT decomposition of the host audio, the coefficients in selected subbands are partitioned into frames for watermarking. To expand applicability, the robust watermark, comprising proprietary information, a synchronization code, and frame-related data, is embedded in the approximation subband using perceptual-based rational dither modulation (RDM) and adaptive quantization index modulation (AQIM) at a payload capacity of 1523.9 bits per second. The fragile watermark is a highly compressed version of the audio embedded within the 2nd- and 3rd-level detail subbands using 2^N-ary AQIM. Hashing comparison and source-channel coding make it possible to identify tampered frames and restore affected regions. Experimental results indicate that the embedded robust watermark can withstand commonly encountered attacks and that the fragile watermark is highly effective in tamper detection and self-recovery. More importantly, the incorporation of a frame synchronization mechanism makes the proposed system resistant to cropping and replacement attacks, both of which could not be handled by previous watermarking schemes. The perceptual evaluation revealed that the watermark caused only minor degradation. The proposed watermarking scheme is suitable for a wide range of ownership protection and content authentication applications.
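To make the quantization index modulation idea concrete, the following is a minimal sketch of plain binary QIM with a fixed step size applied to individual coefficients. It is not the paper's perceptual-based RDM or adaptive AQIM, which adjust the step per frame; the function names `qim_embed` and `qim_extract` and the fixed `step` parameter are assumptions made for illustration.

```python
def qim_embed(value, bit, step):
    """Quantize value onto the lattice associated with bit (0 or 1).

    Bit 0 uses multiples of step; bit 1 uses the same lattice shifted by step/2.
    """
    offset = bit * step / 2.0
    return round((value - offset) / step) * step + offset

def qim_extract(value, step):
    """Recover the bit whose lattice lies closest to the received value."""
    distances = [abs(value - qim_embed(value, b, step)) for b in (0, 1)]
    return 0 if distances[0] <= distances[1] else 1

# Usage: hide five bits in five hypothetical subband coefficients.
coefficients = [0.37, -1.24, 2.05, 0.48, -0.61]
bits = [1, 0, 1, 1, 0]
step = 0.1
marked = [qim_embed(c, b, step) for c, b in zip(coefficients, bits)]
recovered = [qim_extract(m, step) for m in marked]
```

Extraction is blind because only the step size is needed, not the original coefficient. In this fixed-step form, embedding moves a coefficient by at most `step / 2`, and the bit survives any perturbation smaller than `step / 4`; a larger step therefore trades imperceptibility for robustness.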