[Signal processing] Audio watermark embedding and extraction based on matlab wavelet transform [including Matlab source code 053]

1. Introduction

Early work used block-DCT watermarking. The scheme used a key to randomly select some blocks of the image and slightly modified a triple of mid-frequency coefficients in the frequency domain to hide a binary sequence. This method is robust to lossy compression and low-pass filtering. Cox et al. proposed a well-known digital watermarking technique based on a global image transform, which applies the discrete cosine transform (DCT) to the entire image. Barni et al. proposed a DCT-based watermarking algorithm that exploits the masking characteristics of the human visual system (HVS). In the embedding stage, the image is DCT-transformed, the DCT coefficients are rearranged into a one-dimensional vector by zig-zag scanning, the first L coefficients of the vector are left unchanged, and the M coefficients after the L-th coefficient are modified to embed the watermark. Based on a qualitative and quantitative analysis of the DC and AC components of the DCT coefficients, Huang Jiwu et al. pointed out that the DC component is better suited to watermark embedding than the AC components and that a watermark embedded in the DC component is more robust, and they proposed an adaptive algorithm that uses the DC component.

Digital watermarking embeds identification information (the digital watermark) into a digital carrier (multimedia, documents, software, and so on), either directly or indirectly by modifying the structure of a specific region. The watermark does not affect the use value of the original carrier and is not easy to detect or modify, but it can be recognized and identified by the producer. Through the information hidden in the carrier, one can confirm the content creator or purchaser, transmit secret information, or judge whether the carrier has been tampered with. Digital watermarking is an effective means of copyright protection and an important branch and research direction of information hiding. This article briefly introduces several important digital watermarking algorithms, proposes an audio watermarking model based on the discrete wavelet transform (DWT), and gives some experimental results.

1 Audio watermark

The main applications of audio digital watermarking are covert communication and copyright protection. Covert communication focuses on concealing the information and on embedding capacity, while copyright protection emphasizes robustness. At present, most watermarking techniques applied to copyright protection of digital audio products are limited to the uncompressed domain, covering both the time domain and the transform domain. Time-domain methods mainly include the LSB algorithm and the echo algorithm, while transform-domain algorithms mainly use the DCT, DFT and DWT.

The more important algorithms for embedding digital watermarks in audio fall into the following categories:

(1) Least significant bit (LSB) embedding is the simplest embedding method. Any form of watermark can be converted into a binary bit stream, and each sample of the audio file is also represented by a binary value. The watermark can therefore be embedded in the audio signal by replacing the least significant bit of each sample with one watermark bit. If the audio signal is regarded as the transmission channel and the watermark as the signal sent through it, the ideal channel capacity is 1 kbps per kHz of sampling rate, that is, the bit rate is numerically equal to the sampling rate. The pseudo-random sequence used in embedding can be produced by a pseudo-random sequence generator. When the generator has a fixed structure, different initial values produce different sequences, so the sender and receiver only need to share one initial value secretly as the key instead of transmitting the whole sequence. To enhance the robustness of the watermark, consider adding the watermark to the high-frequency components of the audio data.

The LSB method is simple and easy to implement, with large data capacity and high security. Its disadvantage is poor robustness against signal processing.
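The bit replacement described above can be sketched in a few lines of MATLAB. This is a minimal illustration, not the article's code: the sample values and the bit stream are made up, and the samples are held as doubles with integer values so that only base functions are needed.

```matlab
% Hypothetical 16-bit PCM samples, stored as doubles holding integer values
samples = [1200 -345 17 -2 0 32767 -32768 511];
bits    = [1 0 1 1 0 0 1 0];             % watermark bit stream, one bit per sample

% Clear the lowest bit of each sample, then write the watermark bit into it
stego = samples - mod(samples, 2) + bits;

% Extraction simply reads the lowest bit back
recovered = mod(stego, 2);

% The embedding changes each sample by at most one quantization step
maxChange = max(abs(stego - samples));
```

Because only the lowest bit carries the watermark, any processing that requantizes or filters the samples destroys it, which matches the poor robustness noted above.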

(2) Spread spectrum encoding. This method encodes the information stream by spreading the coded data over as much of the frequency spectrum as possible. Direct sequence spread spectrum (DSSS) coding is commonly used, usually with m-sequences for encoding and decoding. To exploit the masking effect of the human auditory system (HAS), the sequence used is generally passed through several stages of filtering, and the watermark is detected with a correlation-based hypothesis test. This method is robust to MP3 audio coding, PCM quantization and additive noise. Spread spectrum offers good resistance to interference, strong concealment, low interference with other signals, easy code division multiple access, and digital-analog compatibility.
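As a rough illustration of direct-sequence spreading with an m-sequence and correlation detection, the sketch below embeds a few bits and reads them back. Everything in it is an assumption made for the example: the 5-stage shift register, the embedding strength, the sinusoidal stand-in for the host audio, and the small deterministic disturbance standing in for noise. The detector is non-blind (it subtracts the host first), and the HAS-based filtering mentioned above is left out.

```matlab
% m-sequence from a 5-stage LFSR, recurrence a(n) = a(n-2) xor a(n-5)
reg = [1 0 0 0 0];              % nonzero initial state, plays the role of the key
N   = 31;                       % period 2^5 - 1
pn  = zeros(1, N);
for k = 1:N
    pn(k) = reg(5);                     % output the oldest bit
    fb    = xor(reg(2), reg(5));        % feedback bit
    reg   = [fb, reg(1:4)];             % shift the register
end
chips = 2*pn - 1;               % map {0,1} to {-1,+1}

bits  = [1 0 1 1 0];            % hypothetical watermark bits
alpha = 0.01;                   % embedding strength (assumed)
nb    = numel(bits);
t     = 1:nb*N;
host  = 0.5*sin(2*pi*t/40);     % stand-in host audio

% Spread each bit over N chips and add it to the host
spread = kron(2*bits - 1, chips);
marked = host + alpha*spread;

% Channel: small deterministic disturbance standing in for noise
received = marked + 0.005*cos(2*pi*t/7);

% Non-blind correlation detector: remove the host, correlate per bit
resid    = reshape(received - host, N, nb);   % one column per bit
score    = chips * resid;                     % 1-by-nb correlation values
detected = double(score > 0);
```

The shared secret here is only the initial register state, which is the point made in the LSB discussion about transmitting a seed instead of the whole sequence.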

(3) Phase coding. The human auditory system is insensitive to absolute phase but sensitive to relative phase. Phase coding uses a reference phase that represents the watermark information to replace the absolute phase of the original audio segment, and adjusts the remaining audio segments so that the relative phase stays unchanged. The coding steps are briefly described as follows:
[Image not reproduced: coding steps ① to ⑤]
⑥ Apply the inverse DFT (IDFT) to the modified phase matrix and the original amplitude matrix to generate the watermarked audio signal.

(4) Echo hiding. The watermark data is embedded into the audio signal by introducing an echo. This exploits another characteristic of the HAS, temporal masking: a weak signal that follows a strong signal is masked, and the masking keeps working for about 50 ms to 200 ms after the strong signal disappears, so the echo is not noticed by the human ear.

Because echo hiding embeds the watermark as part of the carrier's acoustic environment rather than as random noise, it is satisfactorily robust to some lossy compression algorithms.
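One common way to realize echo hiding is to encode each bit by the delay of the echo. The sketch below shows only the embedding side under assumed values: the two delays, the echo amplitude, the segment length and the decaying sinusoid used as host audio are all hypothetical and are not taken from the article.

```matlab
n      = 1:400;
x      = sin(2*pi*n/23) .* exp(-n/300);   % stand-in host audio
bits   = [1 0 0 1];                       % hypothetical watermark bits
segLen = 100;                             % samples per bit
d0 = 5;  d1 = 9;                          % echo delays in samples for bit 0 / bit 1 (assumed)
a  = 0.3;                                 % echo amplitude (assumed)

% Delayed copies of the host, zero-padded at the start
e0 = [zeros(1, d0), x(1:end-d0)];
e1 = [zeros(1, d1), x(1:end-d1)];

% Mixer signal: 1 wherever the current segment carries a 1 bit
mix = kron(bits, ones(1, segLen));

% Add the d1 echo in "1" segments and the d0 echo in "0" segments
y = x + a*(mix.*e1 + (1 - mix).*e0);

echoPart = y - x;                         % what the embedding added
```

A detector would estimate which of the two delays is present in each segment, typically from the cepstrum or autocorrelation of the received signal.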

(5) Transform-domain algorithms. Transform-domain algorithms have many advantages that time-domain algorithms lack, the most prominent being robustness. They include the discrete Fourier transform (DFT), the discrete cosine transform (DCT), and the discrete wavelet transform (DWT), which has emerged in recent years. The first two have been studied extensively at home and abroad. The basic idea is to transform the original audio into a frequency domain, taking HAS characteristics into account, and then change the corresponding transform coefficients to embed the watermark. The algorithm in this article is based on the third transform, the DWT, which is briefly introduced here.

The DWT algorithm performs an L-level wavelet decomposition of the original audio with the Daubechies-4 wavelet basis, keeps the detail components of the first L-1 levels unchanged, and processes the L-th level detail components to embed the watermark. A feature of this algorithm is that the watermark signal is placed in the low-frequency part, where the energy of the speech signal is most concentrated.
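To make the split into low-frequency and detail coefficients concrete, here is a one-level decomposition written out by hand. It deliberately uses the Haar wavelet instead of Daubechies-4 so that it runs without the Wavelet Toolbox, and the slowly varying sine is only a stand-in for real audio; treat it as a sketch of the idea, not of the article's decomposition.

```matlab
n = 1:64;
x = sin(2*pi*n/32);                    % slowly varying stand-in signal

odd    = x(1:2:end);
even   = x(2:2:end);
approx = (odd + even)/sqrt(2);         % low-frequency (approximation) coefficients
detail = (odd - even)/sqrt(2);         % high-frequency (detail) coefficients

Ex = sum(x.^2);
Ea = sum(approx.^2);
Ed = sum(detail.^2);
lowShare = Ea/Ex;                      % fraction of the energy in the low band

% Perfect reconstruction from the two bands
xr = zeros(size(x));
xr(1:2:end) = (approx + detail)/sqrt(2);
xr(2:2:end) = (approx - detail)/sqrt(2);
```

For a signal like this almost all of the energy lands in the approximation band, which is the sense in which signal energy is concentrated in the low-frequency part.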

2 Human hearing model

The response of the human auditory system to an input signal depends on frequency, and differences in pitch correspond to changes in frequency. Figure 1 shows the sensitivity of the human ear as a function of frequency. It plots, for each frequency, the lowest sound intensity that the human ear can hear, which is the reciprocal of the sensitivity. Figure 1 shows that the human ear is most sensitive to frequencies around 3 kHz and that its sensitivity falls off for frequencies that are very high (20 kHz) or very low (20 Hz).
[Figure 1: sensitivity of the human ear as a function of frequency]
Given this characteristic, a watermark embedded in suitably chosen high-frequency or low-frequency components of the audio data can reasonably be expected not to damage the quality of the original audio. The experiments below bear this out.
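The shape of the hearing threshold curve can be reproduced numerically. The sketch below uses a widely quoted closed-form approximation of the absolute threshold of hearing (usually attributed to Terhardt); it is not taken from the original article, and the frequency grid is an arbitrary choice.

```matlab
f  = 20:10:20000;                      % frequency grid in Hz
fk = f/1000;                           % the same grid in kHz

% Approximate absolute threshold of hearing in dB SPL
Tq = 3.64*fk.^(-0.8) - 6.5*exp(-0.6*(fk - 3.3).^2) + 1e-3*fk.^4;

[TqMin, idx]   = min(Tq);
fMostSensitive = f(idx);               % frequency where the threshold is lowest
```

The minimum of this curve falls between 3 kHz and 4 kHz, and the threshold rises steeply toward both 20 Hz and 20 kHz, in line with the description of Figure 1.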

3 Algorithm

This algorithm uses the DWT and has three main parts: watermark embedding, watermark detection, and watermark attack. The working principle of the watermark is shown in Figure 2. Detecting the watermark requires the original audio data.
[Figure 2: working principle of the watermark]
3.1 Watermark embedding algorithm

(1) Scramble the watermark image to be embedded. This algorithm simply uses a pseudo-random number algorithm to remove correlation in the data.

(2) Perform a multi-scale one-dimensional decomposition of the original audio data and extract the low-frequency coefficients and the three levels of high-frequency coefficients. For better robustness, this algorithm embeds the watermark data in the third-level high-frequency component of the audio data.

(3) Embed the watermark data according to the formula Vw(i)=V(i)+(α+e)×W(i), where V(i) is the audio data, W(i) is the watermark data, Vw(i) is the audio data after the watermark is embedded, α is the watermark embedding strength, and e is a correction term with a value of 10-20. Experiments show that the embedding works well when α is 0.004.

(4) Apply the inverse DWT (IDWT) to the coefficients carrying the watermark data to obtain the watermarked audio.

3.2 Watermark detection algorithm

(1) Perform a multi-scale one-dimensional decomposition of the watermarked audio data and extract the third-level high-frequency coefficients.

(2) The detection algorithm is the inverse of the embedding algorithm and requires the original audio data. It is expressed as:

W(i)=(Vw(i)-V(i))/(α+e), where the values of α and e are the same as those used in the embedding algorithm.

(3) The W(i) obtained in step (2) is the extracted one-dimensional watermark sequence, which can be reshaped into two dimensions to obtain the watermark image. This is the output of the detection.
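The embedding formula Vw(i)=V(i)+(α+e)×W(i) and the detection formula W(i)=(Vw(i)-V(i))/(α+e) can be checked with a small round trip. The sketch below makes several assumptions that are not in the article: it uses a hand-written three-level Haar transform instead of the wavelet used by the author (to avoid any toolbox dependency), a synthetic host signal, a made-up 8-bit watermark, and it reads the printed value "10-20" for e as 10^-20.

```matlab
alpha = 0.004;                  % embedding strength from the text
e     = 1e-20;                  % ASSUMPTION: the text prints "10-20"; read here as 10^-20
W     = [1 0 1 1 0 1 0 0];      % hypothetical scrambled watermark bits
n     = 1:64;
x     = 0.5*sin(2*pi*n/16) + 0.2*sin(2*pi*n/5);   % stand-in host audio

% --- 3-level Haar analysis of the host ---
a = x;  d = cell(1, 3);
for lev = 1:3
    odd = a(1:2:end);  even = a(2:2:end);
    d{lev} = (odd - even)/sqrt(2);     % detail (high-frequency) coefficients
    a      = (odd + even)/sqrt(2);     % approximation passed to the next level
end
V = d{3};                              % third-level detail coefficients (8 values)

% --- embedding: Vw(i) = V(i) + (alpha + e)*W(i) ---
d{3} = V + (alpha + e)*W;

% --- 3-level Haar synthesis gives the watermarked audio ---
xw = a;
for lev = 3:-1:1
    up = zeros(1, 2*numel(xw));
    up(1:2:end) = (xw + d{lev})/sqrt(2);
    up(2:2:end) = (xw - d{lev})/sqrt(2);
    xw = up;
end

% --- detection: decompose xw and compare with V from the original audio ---
aw = xw;
for lev = 1:3
    odd = aw(1:2:end);  even = aw(2:2:end);
    Vw = (odd - even)/sqrt(2);         % after the loop: third-level detail of xw
    aw = (odd + even)/sqrt(2);
end
West  = (Vw - V)/(alpha + e);          % W(i) = (Vw(i) - V(i))/(alpha + e)
Wbits = double(West > 0.5);            % threshold back to bits
```

Without an attack the recovered values match W up to rounding error. The detector needs V, which is why the scheme requires the original audio.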

The original watermark image and the watermark extracted after embedding are shown in Figure 3 and Figure 4, respectively.
[Figures 3 and 4: original watermark image and extracted watermark]
4 Selected tests

To test the performance of this watermarking system, various types of attacks were applied to the watermarked audio data. Some experimental results follow.

Definition:
[Image not reproduced: definition of Nc]
Nc measures the similarity between the extracted watermark image and the original watermark image. For a watermark extracted directly from watermarked audio that has not been attacked, the similarity is as high as 0.9998.
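The image that defined Nc is missing from this copy. A commonly used definition of normalized correlation divides the sum of pixel-wise products of the two images by the product of their norms; the sketch below assumes that form, which may differ from the author's exact formula. The two small matrices are made-up examples.

```matlab
W  = [1 0 1; 0 1 1; 1 1 0];     % hypothetical original binary watermark
We = [1 0 1; 0 1 0; 1 1 0];     % hypothetical extracted watermark (one pixel lost)

% Normalized correlation (assumed definition): <A,B> / (||A|| * ||B||)
nc = @(A, B) sum(A(:).*B(:)) / (sqrt(sum(A(:).^2)) * sqrt(sum(B(:).^2)));

ncSame = nc(W, W);              % identical images
ncOne  = nc(W, We);             % one wrong pixel
```

With this definition identical images score 1, and the score drops as more pixels of the extracted watermark differ from the original.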

(1) Two-alternative forced-choice experiment. Testers who do not know the exact original audio signal in advance listen to the original audio and the watermarked audio separately and are asked to identify the original. According to the conclusion of L. Boney et al. [5], if the two types of audio are judged to be the original in roughly equal proportions, there is no significant difference in human perception after the watermark is embedded. In the experiment, 8 students from the same laboratory were randomly selected. With watermarks embedded in different wav files and the questions asked in random order, about 53.4% of the answers judged the original audio to be of better quality. This shows that the watermark embedded by this system does not cause a significant change in the quality of the original audio.

(2) Cropping. n/10 of the audio data is cut (n=1, 2, 3...). The original audio data is slightly longer than 40 000 data points, and the cut starts at the 20 000th.

Figures 5, 6 and 7 show that after about one third of the audio has been cut, a fairly clear watermark pattern can still be extracted. If more is cut, the watermark can no longer be detected satisfactorily. However, since cutting one third also destroys a large part of the carrier audio data, this result is acceptable.
[Figures 5 to 7: watermarks extracted after cropping]
(3) MP3 compression. MP3 coding is a commonly used audio processing technique whose goal is to reduce the amount of audio data as far as possible without affecting the quality of the original audio signal. Different bit rates correspond to different MP3 compression ratios. In this experiment, a piece of watermarked audio is compressed at a bit rate of 96 kbps (compression ratio 7.4:1) and then decoded. The detected watermark image is shown in Figure 8.
[Figure 8: watermark detected after MP3 compression]
5 Conclusion

In recent years, the field of audio digital watermarking, especially research on transform-domain audio watermark embedding and detection, has developed rapidly, and the discrete wavelet transform (DWT) has been one of the hot topics in digital watermarking. Experiments show that the algorithm is well concealed, hardly degrades the quality of the original audio, and has some resistance to cropping and other attacks. To further improve the robustness of the algorithm, more HAS characteristics should be exploited and the location and strength of the watermark embedding should be studied further. For the algorithm to be practical, the capacity of the embedded watermark should also be increased. These are the directions for further improvement.

2. Source code
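The listing in this section calls two helpers, arnold and check_arnold, whose source is not included in the post. As a rough idea of what such helpers usually compute, the sketch below applies the standard Arnold cat map to a small square matrix, finds its period by iteration, and undoes the scrambling by completing the period. It is an assumption about the helpers' behavior, not the author's code, and the 8-by-8 test image and the key value are made up.

```matlab
N   = 8;                                % side length of the square image
img = reshape(1:N*N, N, N);             % stand-in image with distinct pixel values
key = 3;                                % number of Arnold iterations (the key)

[yy, xx] = meshgrid(0:N-1, 0:N-1);      % xx(r,c) = r-1, yy(r,c) = c-1
newRow = mod(xx + yy,   N) + 1;         % Arnold map: x' = (x + y)  mod N
newCol = mod(xx + 2*yy, N) + 1;         %             y' = (x + 2y) mod N
dest   = sub2ind([N N], newRow, newCol);   % where each pixel moves to

% Period: number of iterations after which the image returns to itself
period = 0;  tmp = img;
while true
    nxt = zeros(N);  nxt(dest) = tmp;  tmp = nxt;
    period = period + 1;
    if isequal(tmp, img), break; end
end

% Scramble with `key` iterations
scr = img;
for k = 1:key
    nxt = zeros(N);  nxt(dest) = scr;  scr = nxt;
end

% Descramble by running the remaining (period - key) iterations
rec = scr;
for k = 1:(period - key)
    nxt = zeros(N);  nxt(dest) = rec;  rec = nxt;
end
```

Since the map is periodic, the key only makes sense if it is smaller than the period, which is presumably what the range check in the listing guards against.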

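clear;
clc;
key = 35;                                     % number of Arnold iterations, used as the key
Orignalmark = double(imread('suda64.bmp'));   % read the 64*64 watermark image
[wrow, wcol] = size(Orignalmark);
if wrow ~= wcol
    error('wrow~=wcol error');                % the watermark image must be square
end
% --- check whether the key is out of range ---
% (arnold and check_arnold are helper functions that are not shown in this post)
n = check_arnold(wrow);
if (key+1) > n
    error('arnold key error');
end
Arnoldw = arnold(Orignalmark, wrow, key);     % Arnold-scramble the watermark image
[X, fs] = audioread('laile.wav');             % read the audio file (wavread is no longer available in recent MATLAB releases)
figure;
subplot(2,1,1);
plot(X);                                      % plot the audio waveform
title('Original audio signal');
% --- watermark embedding ---
[c, l] = wavedec(X, 2, 'db4');                % 2-level wavelet decomposition of the audio with db4
ca2 = appcoef(c, l, 'db4', 2);                % level-2 low-frequency (approximation) coefficients
cd2 = detcoef(c, l, 2);                       % level-2 high-frequency (detail) coefficients
cd1 = detcoef(c, l, 1);                       % level-1 high-frequency (detail) coefficients
lca = length(ca2);                            % number of low-frequency coefficients
wlength = wrow*wcol;                          % watermark length
blocksize = fix(lca/wlength);                 % size of each block
if blocksize < 1
    error('The audio is too short to carry the watermark.');
end
water_vector = reshape(Arnoldw, 1, wlength);  % flatten the scrambled watermark to one dimension
a = 0.25;                                     % quantization step
j = 1;
for i = 1:wlength
    Block = ca2(j:j+blocksize-1);
    [U, S, V] = svd(double(Block));
    cc = floor(S(1,1)/a);
    if water_vector(i) == 1                   % bit 1: quantize to an odd multiple of a
        if mod(cc,2) == 0
            cc = cc+1;
        end
        S(1,1) = a*cc;
    end
    if water_vector(i) == 0                   % bit 0: quantize to an even multiple of a
        if mod(cc,2) == 1
            cc = cc+1;
        end
        S(1,1) = a*cc;
    end
    Blockw = U*S*V';                          % inverse SVD restores the block
    ca2(j:j+blocksize-1) = Blockw;
    j = j+blocksize;
end
c1 = [ca2', cd2', cd1']';                     % reassemble the coefficient vector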
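The listing stops after embedding, so the extraction side is not shown. The sketch below illustrates how a bit embedded by this parity quantization of the largest singular value might be read back from a single block. It is an assumption, not the author's extraction code: the block values are made up, and rounding is used instead of floor when reading the multiple so that small numerical errors do not flip the parity.

```matlab
a     = 0.25;                           % quantization step from the listing
Block = [0.31; -0.12; 0.48; 0.05];      % hypothetical block of low-frequency coefficients
bits  = [1 0];                          % embed each bit in turn to show both cases
out   = zeros(size(bits));
for k = 1:numel(bits)
    % Embedding, as in the listing
    [U, S, V] = svd(Block);             % for a column vector, S(1,1) is its 2-norm
    cc = floor(S(1,1)/a);
    if mod(cc, 2) ~= bits(k)            % force the parity of cc to match the bit
        cc = cc + 1;
    end
    S(1,1) = a*cc;
    Blockw = U*S*V';

    % Extraction needs only the watermarked block and the step size
    Sw     = svd(Blockw);
    out(k) = mod(round(Sw(1)/a), 2);    % odd multiple -> 1, even multiple -> 0
end
```

Unlike the additive scheme in section 3, this extraction does not need the original audio. It tolerates changes to the singular value of less than half a quantization step.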
 
 
 

3. Running results

[Figure: running results, image not reproduced]

4. Remarks

For the complete code or help with writing, add QQ 2449341593.

Origin blog.csdn.net/TIQCmatlab/article/details/113088823