[Mathematical Modeling] 2023 Shenzhen Cup and Three Eastern Provinces Mathematical Modeling, Problem B: Copyright Protection of Electronic Resources (DCT-Based Dark Watermark Information Embedding Model)

Only part of the text, formulas, and code of this article is shown.

1. Topic

Question B Copyright Protection of Electronic Resources

Copyright (also known as author's rights) includes the rights of publication, authorship, modification, and protection of the integrity of the work, as well as the rights of reproduction, distribution, rental, exhibition, performance, projection, broadcasting, information network dissemination, filming, adaptation, translation, compilation, and the other rights the copyright owner is entitled to.

Today, with the widespread use of computer networks, more and more electronic resources are transferred quickly over the network, and protecting the copyright of electronic resources has become crucial; it is one of the key issues in the field of information security. Digital watermarking is one of the key technologies for solving this problem. However, when visible watermarking is applied to copyright protection of electronic pictures, it often damages the structure of the picture itself, and because the embedded information is visible, it is easy to identify and remove. For this reason, steganography has attracted wide attention and use.

Steganography is generally regarded as an important branch of information hiding; it studies how to conceal the very existence of information. Steganography has a long history, with some cases dating back hundreds of years BC. With the rapid development of computer and Internet technology, research on modern steganography is considered to have started in the 1990s. Because steganography can embed specific information into a carrier without being easily detected, it is widely applicable to copyright protection, data hiding, and other fields.

  • Question 1: For the picture P in Attachment 1, establish a mathematical model for generating a picture SP with the embedded information "Shenzhen Cup Mathematical Modeling Challenge", so that SP is as close as possible to the original picture P to the human eye. Design and implement the algorithm that generates SP; place its source code and the resulting image SP in Appendix A of the entry, and place the source code used to extract the copyright information from SP in Appendix B of the entry.
  • Question 2: Using the model and algorithm from Question 1, is it possible to embed all of the text of the "Copyright Law of the People's Republic of China" (Third Amendment) [1] into the picture in Attachment 1? If not, how much can be embedded at most?
  • Question 3: During transmission, an electronic image may be compressed, stored in a different format, or scaled, rotated, or otherwise geometrically deformed. Is the algorithm from Question 1 still usable then? If not, how can it be improved?
  • Question 4: If you want to protect the copyright of other electronic pictures, what should you pay attention to when using the algorithm from Question 1? Give at most 3 precautions and explain the reasons.

References


[1] http://www.gov.cn/guoqing/2021-10/29/content_5647633.htm

2. Ideas and answers

2.1 Question 1

2.11 LSB method test

Question 1 can be solved with the LSB method.

LSB steganography is a technique that hides information in the least significant bit (LSB) of an image. Each pixel of the image consists of three color channels (red, green, and blue), and each channel occupies 8 bits, i.e. one byte. LSB steganography replaces the lowest bit of each pixel channel with a bit of the information to be hidden, thereby embedding the information. Since the lowest bit has little effect on image quality, the human eye can hardly detect the difference, so this method offers good concealment.
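The bit operation behind this can be illustrated on a single color value (a toy sketch, not the paper's code):

```python
# Toy illustration: embed one message bit into the least significant
# bit of a single 8-bit color value.
def set_lsb(value, bit):
    # Clear the lowest bit, then OR in the message bit
    return (value & 0b11111110) | bit

r = 0b10110101          # original red value: 181
print(set_lsb(r, 0))    # 180: LSB cleared
print(set_lsb(r, 1))    # 181: LSB set
```

In either case the channel value changes by at most 1 out of 255, which is why the change is invisible.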

The stegano library can be called to test how the LSB method works:

  • The library requires the input image to be in PNG format, so the original image is first converted to PNG.
  • The library does not handle Chinese well, so this article first base64-encodes the information to be embedded and then embeds it into the picture;
  • When parsing the information from the picture, the operation is reversed.
...

import base64
from stegano import lsb

# Embed information into the image
def embed_info(image_path, message, output_path):
    # Convert the Chinese text to base64
    message = base64.b64encode(message.encode('utf-8')).decode('ascii')
    secret = lsb.hide(image_path, message)
    secret.save(output_path)

# Extract information from the image
def extract_info(image_path):
    secret_message = lsb.reveal(image_path)
    # Decode the base64 string back to Chinese text
    secret_message = base64.b64decode(secret_message.encode('ascii')).decode('utf-8')
    return secret_message

...

Operation result: (figure omitted)

Original image: (figure omitted)

Image after embedding information: (figure omitted)
The two images have the same size, and no difference can be seen, because the amount of embedded information is tiny compared to the size of the image, only about 0.002% (24 bytes / 1,228,800 bytes).

There are many methods and indicators for comparing the difference between two pictures, such as the mean square error (MSE) and the structural similarity index (SSIM). These methods compute the degree of difference based on the pixel values of the pictures; the smaller the difference, the more similar the pictures.

code:

...

import numpy as np

# Define the MSE function
def mse(imageA, imageB):
    # Compute the mean square error between two images
    # (the two images must have the same dimensions)
    err = np.sum((imageA.astype("float") - imageB.astype("float")) ** 2)
    err /= float(imageA.shape[0] * imageA.shape[1])

    # Return the MSE value; smaller means more similar
    return err

...
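Computing SSIM itself needs an extra library (e.g. scikit-image's `structural_similarity`); as a dependency-free companion to the MSE above, PSNR expresses the same error on a logarithmic scale. A toy sketch on synthetic data, assuming an 8-bit dynamic range:

```python
import numpy as np

# Per-pixel mean square error (averaged over all channels here)
def mse(a, b):
    return np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)

# Peak signal-to-noise ratio in dB; higher means more similar
def psnr(a, b, peak=255.0):
    m = mse(a, b)
    return float('inf') if m == 0 else 10 * np.log10(peak ** 2 / m)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)
stego = img.copy()
stego[..., 0] |= 1               # force every red LSB to 1

print(mse(img, stego) <= 1.0)    # True: at most 1 gray level per pixel
print(psnr(img, stego) > 40)     # True: > 40 dB is visually indistinguishable
```

A PSNR above roughly 40 dB is commonly treated as imperceptible, which matches the LSB result here.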

Results: (figure omitted)

After embedding "Shenzhen Cup Mathematical Modeling Challenge", the difference between the two pictures is extremely small.

When the second paragraph of the problem statement is also embedded into the picture, the difference becomes slightly larger. (figure omitted)

2.12 LSB Method Modeling

The test above shows that it is feasible to embed information into pictures with the LSB method. We therefore establish the mathematical model of the LSB method and implement it directly from its principle.

Since the picture given in the problem is in JPG format, the model and code in this section accept an original picture in JPG format as input.

However, after embedding information into the original JPG image, the result cannot be saved as JPG, because JPG compression is lossy and would disturb the embedded information. The output should therefore be saved losslessly in PNG format, which records the exact RGB value of every pixel.

Mathematical model:

The basic principle of these two functions is to use the Least Significant Bit (LSB) of the image to embed and extract information. This method is a simple steganography technique that embeds information by modifying the least significant bits of an image pixel, since the modification has little or no visual impact on the image and is almost imperceptible.

The following is the mathematical model of these two functions:

  1. Embed information:

    For each character c in the message, convert it to an 8-bit binary representation b(c). Then iterate over each pixel (r, g, b) of the image and replace the least significant bit of each color channel with one bit of b(c). This can be represented by the following formulas:

    r' = r - (r mod 2) + b(c)_i
    g' = g - (g mod 2) + b(c)_{i+1}
    b' = b - (b mod 2) + b(c)_{i+2}

    where r', g', b' are the new pixel values, b(c)_i is bit i of b(c), and mod is the modulo operation.

  2. Extract information:

    Iterate over each pixel (r, g, b) of the image and read the least significant bit of each color channel; every 8 recovered bits are combined into one 8-bit binary value and converted back into a character. This can be represented by the following formulas:

    b(c)_i = r mod 2
    b(c)_{i+1} = g mod 2
    b(c)_{i+2} = b mod 2

    where b(c)_i is bit i of b(c), and mod is the modulo operation. The bits of b(c) are then converted back to the character c.


code:

# -*- coding: UTF-8 -*-
import base64
from PIL import Image

...
# Embed information into the image
def embed_info(image_path, message, output_path):
    # Convert the Chinese text to base64
    message = base64.b64encode(message.encode('utf-8')).decode('ascii')
    # Convert the message to binary bits
    bits = ''.join(format(ord(x), '08b') for x in message)
    info_len = len(bits)
    # Open the image
    img = Image.open(image_path)
    # Get the image width and height
    width, height = img.size
    # Get the pixel data
    pixels = img.load()
    # Initialize the bit index
    index = 0
    # Iterate over every pixel
    for x in range(width):
        for y in range(height):
            # Get the RGB value of the current pixel
            r, g, b = pixels[x, y]
            # If there are still bits left to embed
            if index < info_len:
                # Replace the least significant bit of the red channel with the next information bit
                r = int(format(r, '08b')[:-1] + bits[index], 2)
                # Advance the bit index
                index += 1
            # If there are still bits left to embed
            if index < info_len:
...
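The matching extraction routine is not shown above. A minimal self-contained round trip of the same per-channel LSB idea, on a numpy RGB array, might look like the following (`embed_bits`/`extract_bits` are illustrative names, not from the paper's code, and the 32-bit length header is an added assumption to make the example self-contained):

```python
import base64
import numpy as np

# Embed: base64-encode the text, prefix a 32-bit length header, and
# write each bit into the least significant bit of successive channels.
def embed_bits(img, message):
    data = base64.b64encode(message.encode('utf-8'))
    bits = ''.join(format(byte, '08b') for byte in data)
    bits = format(len(bits), '032b') + bits          # length header
    flat = img.reshape(-1).copy()
    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & 0xFE) | int(bit)        # set the LSB
    return flat.reshape(img.shape)

# Extract: read the header, collect that many LSBs, rebuild the bytes.
def extract_bits(img):
    flat = img.reshape(-1)
    n = int(''.join(str(v & 1) for v in flat[:32]), 2)
    bits = ''.join(str(v & 1) for v in flat[32:32 + n])
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, n, 8))
    return base64.b64decode(data).decode('utf-8')

img = np.zeros((64, 64, 3), dtype=np.uint8)
stego = embed_bits(img, "深圳杯数学建模挑战赛")
print(extract_bits(stego))  # 深圳杯数学建模挑战赛
```

Every channel value changes by at most 1, so the visual argument from the model above carries over unchanged.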

Information extraction check: (figure omitted)

Similarity test: the mean square error is lower than that obtained with the stegano library, and the structural similarity index is slightly higher; the images remain indistinguishable to the naked eye. (figure omitted)

Output picture: (figure omitted)

2.2 Question 2

Yes, all of the text can be embedded.

Reasons:

  • The original image is 1280 × 1896 pixels. Under the LSB model in this paper, each pixel offers 3 bits, so 3 × 1280 × 1896 = 7,280,640 binary bits are available for storing information.
  • In this paper, the "Copyright Law of the People's Republic of China" is saved as a txt file of about 33 KB; after the encoding conversion in the model, it is finally represented by 349,952 binary bits.
  • The number of bits to hide is far smaller than the number of bits the picture can store, so all of the text can be embedded into the picture in Attachment 1.
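The arithmetic above can be sanity-checked in a few lines (the exact 349,952-bit figure depends on the real file size; 33 KB is used here as a round approximation of it):

```python
import math

# Capacity: one LSB per RGB channel of a 1280 x 1896 image
width, height = 1280, 1896
capacity_bits = 3 * width * height
print(capacity_bits)                          # 7280640

# Payload: ~33 KB of text, expanded ~4/3 by base64, 8 bits per byte
law_txt_bytes = 33 * 1024
b64_bytes = 4 * math.ceil(law_txt_bytes / 3)
needed_bits = 8 * b64_bytes
print(needed_bits <= capacity_bits)           # True: it fits easily
```

Even with the base64 overhead, the payload uses only about 5% of the available capacity.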

Information embedding effect: (figure omitted)

Differences before and after: (figure omitted)

2.3 Question 3

The LSB method above is very effective and has minimal visual impact on the image.

However, during transmission an image may be compressed or stored in a different image format, and it may also be scaled, rotated, or otherwise geometrically deformed.

In those cases the previous embedding algorithm can no longer extract the embedded information, because these operations may change the least significant bits of the pixel values and thus destroy it.

Therefore, the algorithm needs to be improved.

2.31 Overview of methods and steps

A technique called "robust" information hiding can be employed. The goal of robust information hiding techniques is to extract hidden information after image processing (such as compression, scaling, rotation, etc.). Here are some possible ways to improve:

  1. Use more sophisticated embedding techniques : for example, information embedding using frequency-domain methods such as discrete cosine transform (DCT) or discrete wavelet transform (DWT). These methods hide information in the frequency domain of the image instead of directly in the spatial domain like LSB. In this way, hidden information can be extracted even after some processing of the image.

  2. Use error-correcting coding : For example, use error-correcting coding such as Hamming codes or Reed-Solomon codes to encode hidden information. In this way, even if a part of the information is destroyed during image processing, the original information can be recovered through error correction coding.

  3. Using watermarking technology : Watermarking is a special information hiding technique whose goal is to hide an identifier in an image, which can be detected even after the image is processed. Watermarking techniques usually use some complex embedding and extraction algorithms to improve robustness.

  4. Use more powerful machine learning methods : for example, use deep learning for information hiding and extraction. Deep learning can learn how to keep information hidden and extracted under different image processing operations, thereby improving robustness.

All of the above methods can improve the robustness of information hiding technology, but it should also be noted that improving the robustness usually sacrifices the capacity of some hidden information. However, when used to protect copyright, there is not much information that needs to be embedded in the picture.
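As a sketch of the error-correcting-coding idea in point 2 (illustrative only, not something used elsewhere in this article), a Hamming(7,4) code encodes 4 data bits into 7 bits and can correct any single flipped bit:

```python
import numpy as np

# Hamming(7,4): G = [I4 | P] generates codewords, H = [P^T | I3] checks them.
G = np.array([[1, 0, 0, 0, 0, 1, 1],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1, 0],
              [0, 0, 0, 1, 1, 1, 1]])
H = np.array([[0, 1, 1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])

def encode(d):
    # 4 data bits -> 7-bit codeword (mod-2 arithmetic)
    return (np.array(d) @ G) % 2

def decode(c):
    # The syndrome, if nonzero, equals the column of H at the error position
    c = np.array(c).copy()
    s = (H @ c) % 2
    if s.any():
        err = int(np.argmax((H.T == s).all(axis=1)))
        c[err] ^= 1
    return list(c[:4])

word = [1, 0, 1, 1]
code = list(encode(word))
code[5] ^= 1                  # simulate one bit destroyed by image processing
print(decode(code) == word)   # True: the flipped bit was corrected
```

Spreading such codewords across the embedded bit stream lets the hidden message survive isolated bit losses, at the cost of 7/4 storage overhead.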

This paper uses watermarking technology for information embedding (dark watermarking, invisible to the naked eye).


The main steps are:

  1. Generate a watermark image (square);
    • calculate an appropriate number of lines based on the length of the text;
    • create a blank image;
    • draw the text on the image;
    • save the picture.
  2. Embed the watermark;
    • …
  3. Extract the watermark.
    • restore the watermarked picture to a state as close as possible to the original picture (for rotation, cropping and scaling);
    • …;
    • perform the inverse Arnold transform on the extracted watermark image (if the embedded watermark image was Arnold-scrambled).

2.32 DCT-based dark watermark information embedding model

2.32 - 1 Watermark image generation

Main code:

import math
from PIL import Image, ImageDraw, ImageFont

def create_watermark(text, font_path, font_size=26, opacity=100):
    # Calculate lines
    n = int(math.sqrt(len(text))) + 1
    lines = [text[i:i + n] for i in range(0, len(text), n)]
    # Create a blank image with white background
    width, height = n * font_size, n * font_size
    img = Image.new('RGBA', (width, height), (255, 255, 255))
    # Load font
    font = ImageFont.truetype(font_path, font_size)
    # Initialize ImageDraw
    draw = ImageDraw.Draw(img)
    # Set text color
    text_color = (0, 0, 0, opacity)
    # Draw text on image
    for i, line in enumerate(lines):
        # Calculate the width of the line
        text_bbox = draw.textbbox((0, 0), line, font)
        line_width = text_bbox[2] - text_bbox[0]
        # Calculate the x coordinate to center the line
        x = (width - line_width) / 2
        draw.text((x, i * font_size), line, font=font, fill=text_color)
    # Save the image
    img.save('watermark.png', 'PNG')

Watermark image: (figure omitted)

2.32 - 2 watermark image Arnold scrambling

Scrambling the watermark distributes its information evenly, reduces possible losses, and at the same time prevents the watermark from being extracted and tampered with by others.

Arnold scrambling is an image encryption technique, a two-dimensional image scrambling transformation proposed by V. I. Arnold. Its basic idea is to treat the image as a function on a two-dimensional integer plane and to scramble the pixel positions of the original image through a fixed geometric transformation, thereby encrypting the image.

The basic formulas of Arnold scrambling are as follows.

For each pixel (x, y) in the image, its new position (x', y') after one Arnold iteration is:

x' = (x + y) mod N
y' = (x + 2y) mod N

where N is the width or height of the image (assuming the image is square) and mod is the modulo operation.

The inverse of Arnold scrambling, i.e. the decryption process, is given by:

x = (2x' - y') mod N
y = (y' - x') mod N

These two sets of formulas are the basis of Arnold scrambling and its inverse; together they implement the image encryption and decryption operations.
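A quick numeric check (illustrative code, not the paper's implementation) confirms that the inverse formulas really undo one scramble iteration:

```python
import numpy as np

# Forward Arnold map: (x, y) -> ((x + y) mod N, (x + 2y) mod N)
def arnold(img):
    n = img.shape[0]
    out = np.empty_like(img)
    for x in range(n):
        for y in range(n):
            out[(x + y) % n, (x + 2 * y) % n] = img[x, y]
    return out

# Inverse Arnold map: (x', y') -> ((2x' - y') mod N, (y' - x') mod N)
def arnold_inverse(img):
    n = img.shape[0]
    out = np.empty_like(img)
    for xp in range(n):
        for yp in range(n):
            out[(2 * xp - yp) % n, (yp - xp) % n] = img[xp, yp]
    return out

img = np.random.default_rng(0).integers(0, 256, (32, 32), dtype=np.uint8)
print(np.array_equal(arnold_inverse(arnold(img)), img))  # True
```

Since the map is a bijection on the N × N grid, iterating it any number of times remains exactly invertible by iterating the inverse the same number of times.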


Effect:
One iteration of Arnold scrambling:

Part of the code:

import numpy as np
from PIL import Image

# Apply Arnold scrambling to the watermark image
# (the watermark is square, so height == width here)
def arnold_scramble(image, iterations):
    # Convert the image to a numpy array
    array = np.array(image)
    # Get the size of the image
    height, width, _ = array.shape
    # Create an empty array to hold the scrambled image
    scrambled_array = np.empty_like(array)
    # Perform the scrambling
    for _ in range(iterations):
        for y in range(height):
            for x in range(width):
                scrambled_array[x, y] = array[(x + y) % height, (x + 2 * y) % width]
        array = scrambled_array.copy()
    # Convert the scrambled array back to an image
    scrambled_image = Image.fromarray(np.clip(scrambled_array, 0, 255).astype('uint8'))

    return scrambled_image

2.32 - 3 Watermark embedding

This paper uses DCT-based dark watermarking: the watermark image is embedded in the frequency domain of the image, rather than in the spatial domain as with the earlier LSB method.

In the result of discrete cosine transform (DCT), the low-frequency part usually contains most of the information of the image, such as color and brightness changes. This is because most regions of the image usually have similar color and brightness, and this information appears as low-frequency components in the frequency domain. Therefore, you can see that there are more bright spots in the upper left corner (low frequency part) of the DCT image.
(figure omitted)

On the contrary, the high-frequency part contains the details and texture information of the image, such as edges and textures. These information appear as high-frequency components in the frequency domain. Therefore, you can see that there are some bright spots in the lower right corner (high frequency part) of the DCT image, but usually less than the low frequency part.

The midrange is somewhere in between, containing some color and brightness variation, as well as some detail and texture information.

In the frequency domain representation of an image, the high frequency part usually contains the least information. This is because…

on the contrary,…

Therefore, the intermediate frequency part is generally considered to be the best place to embed watermark. The mid-frequency part contains some color and brightness changes, as well as some detail and texture information, so embedding a watermark in this part is unlikely to significantly change the visual effect of the image. At the same time, since the information in the intermediate frequency part is less likely to be discarded when the image is compressed or reduced in resolution, the watermark embedded in this part is also more likely to be preserved.

Therefore, this article is to embed the watermark into the intermediate frequency part of the image after DCT transformation.


For reference, the formulas of the discrete cosine transform (DCT) are given below.

For a one-dimensional signal, the DCT formula is as follows.

For a real sequence x(n) of length N, its DCT is X(k), computed as:

X(k) = sum_{n=0}^{N-1} x(n) cos[ (pi/N) (n + 1/2) k ]

where n = 0, 1, ..., N-1 and k = 0, 1, ..., N-1.

For a 2D image, we can think of it as a signal in two directions (x and y), so a separate DCT can be performed for each direction. All signals are first DCT-transformed in one direction (eg, row) and then DCT-transformed in the other direction (eg, column). In this way, we get the two-dimensional DCT transform.

The formula of the two-dimensional DCT is as follows:

F(u, v) = (C(u)C(v)/4) * sum_{x=0}^{M-1} sum_{y=0}^{N-1} f(x, y) cos[ (2x+1)u*pi / 2M ] cos[ (2y+1)v*pi / 2N ]

Among them, x and y are the coordinates of the image, M and N are the width and height of the image, F(u, v) is the value in the frequency domain, f(x, y) is the value in the time domain, u and v are frequency. C(u) and C(v) are normalization coefficients, when u or v is 0, C(u) or C(v) is 1/√2, otherwise it is 1.

These formulas are all based on the cosine function, which is why it is called "discrete cosine transform".

For a one-dimensional signal, the formula for the inverse DCT is as follows:

For a real number sequence X(k) of length N, its inverse DCT transform is x(n), and the calculation formula is:

x(n) = sum_{k=0}^{N-1} X(k) cos[ (pi/N) (n + 1/2) k ], n = 0, 1, ..., N-1

Among them, n=0,1,...,N-1; k=0,1,...,N-1.

For a 2D image, we can think of it as a signal in two directions (u and v), so an inverse DCT can be performed for each direction separately. All signals are first inverse DCT transformed in one direction (eg row) and then inverse DCT transformed in the other direction (eg column). In this way, we get the two-dimensional inverse DCT transform.

The formula of the two-dimensional inverse DCT is as follows:

f(x, y) = sum_{u=0}^{M-1} sum_{v=0}^{N-1} (C(u)C(v)/4) F(u, v) cos[ (2x+1)u*pi / 2M ] cos[ (2y+1)v*pi / 2N ]

Among them, x and y are the coordinates of the image, M and N are the width and height of the image, F(u, v) is the value in the frequency domain, f(x, y) is the value in the time domain, u and v are frequency. C(u) and C(v) are normalization coefficients, when u or v is 0, C(u) or C(v) is 1/√2, otherwise it is 1.

Effect demonstration:

alpha = 0.1: (figure omitted)

alpha = 0.9: (figure omitted)

Part of the code:

import numpy as np
from scipy.fftpack import dct

# Perform a blockwise discrete cosine transform (DCT) on the original image
def perform_dct(original_array):
    height, width, _ = original_array.shape
    dct_blocks = np.empty_like(original_array, dtype=np.float64)
    # Transform each 8x8 block (rows first, then columns)
    for i in range(0, height, 8):
        for j in range(0, width, 8):
            dct_blocks[i:i + 8, j:j + 8] = dct(dct(original_array[i:i + 8, j:j + 8], axis=0, norm='ortho'), axis=1,
                                               norm='ortho')
    return dct_blocks


# Embed the watermark into the DCT blocks
# (one watermark value, scaled by alpha, is added to one coefficient per 8x8 block)
def embed_watermark(dct_blocks, watermark_array, alpha=0.05):
    dct_blocks_with_watermark = dct_blocks.copy()
    dct_blocks_with_watermark[::8, ::8] += alpha * watermark_array
    return dct_blocks_with_watermark
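Under the embedding rule above, extraction reduces to subtracting the original image's DCT coefficients and dividing by alpha; this is a non-blind scheme, since the original image is required. A minimal sketch of that assumed inverse (`extract_watermark` here is an illustration, not the paper's exact code):

```python
import numpy as np

# Hypothetical inverse of the embedding rule above:
# watermark ~= (DCT(watermarked) - DCT(original)) / alpha
def extract_watermark(dct_blocks_with_watermark, original_dct_blocks, alpha=0.05):
    return (dct_blocks_with_watermark[::8, ::8] - original_dct_blocks[::8, ::8]) / alpha

# Round-trip check on synthetic coefficient blocks
rng = np.random.default_rng(1)
original = rng.normal(size=(64, 64, 3))
watermark = rng.normal(size=(8, 8, 3))
embedded = original.copy()
embedded[::8, ::8] += 0.05 * watermark
recovered = extract_watermark(embedded, original, alpha=0.05)
print(np.allclose(recovered, watermark))  # True
```

In practice the recovered array is then clipped, converted back to an image, and inverse-Arnold-transformed, as described in the extraction section below.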

2.32 - 4 Watermark extraction

In the first case, the watermark is extracted from a rotated image. First, edges are extracted with the Canny operator; then the Hough transform detects the geometric structure of the image. A suitable rotation angle is chosen by setting a reasonable threshold, and the image is restored to horizontal. Finally, after cropping away the invalid border, the resulting image can be used for watermark extraction.

In the second case, the watermark information is extracted for the cropped picture, which will... .

In the third case, for the zoomed picture, the watermark can be extracted from the picture obtained through the zoom operation function provided by ....

In the process of extracting the watermark, the picture first needs to be divided into 8×8 blocks, and the watermark is then extracted from these sub-blocks. Since the watermark image was encrypted by scrambling in the earlier step, the inverse Arnold transform is applied here to decrypt and restore the watermark information, finally yielding the extracted watermark image.

Effect: (figure omitted)

Part of the code:

# Note: load_image, image_to_array, extract_watermark, clip_and_convert and
# array_to_image are helper functions defined elsewhere in the full code.
def process_images(image_with_watermark_path, original_image_path, alpha=0.05):
    # Load the images
    image_with_watermark = load_image(image_with_watermark_path)
    original_image = load_image(original_image_path)
    # Convert the images to arrays
    image_with_watermark_array = image_to_array(image_with_watermark)
    original_array = image_to_array(original_image)
    # Perform the DCT on both images
    dct_blocks_with_watermark = perform_dct(image_with_watermark_array)
    original_dct_blocks = perform_dct(original_array)
    # Extract the watermark
    watermark_array = extract_watermark(dct_blocks_with_watermark, original_dct_blocks, alpha)
    # Clip and convert the result
    watermark_array = clip_and_convert(watermark_array)
    # Convert the array back to an image
    watermark_image = array_to_image(watermark_array)
    return watermark_image

2.4 Question 4

LSB (Least Significant Bit) is a common information hiding technique, usually used in digital watermarking and steganography. When using LSB for information embedding, you need to pay attention to the following points:

  1. Choose the appropriate embedding position : LSB embedding usually targets the least significant bit of each pixel, because changes there have the least impact on the image and are hard for human eyes to detect. However, if the image may later be compressed or otherwise processed, such processing may change the least significant bits and destroy the embedded information. If such processing is expected, a different embedding position, such as a higher bit, may be needed.

  2. Securing Embedded Information : While LSB embedding can hide information, if an attacker knows that LSB embedding was used, they may try to extract or corrupt it. Therefore, it may be necessary to use encryption or other forms of protection to ensure the security of embedded information. (For example, encrypt the information to be embedded before embedding)

  3. Avoid over-embedding : Although LSB embedding has less impact on the image, if too much information is embedded, it may cause significant image quality degradation. Therefore, a balance needs to be found between the need to hide information and maintain image quality.

3. References

Much of the content draws on these references; this work can be seen as a reproduction and synthesis of some of their methods. When writing the paper, refer to them for the related concepts and formulas (many of the concepts and formulas are also given in this article).


4. Code preview

(code screenshot omitted)


Origin blog.csdn.net/weixin_43764974/article/details/131994622