Blurring pictures in Word resumes with Python

Blurring photos in Word resumes with Python: an effective way to protect personal privacy

1. Background

Electronic resumes are now a standard part of the recruitment process. The personal information they contain, photos in particular, raises privacy concerns. To protect job seekers' privacy and data security, many recruitment platforms require the photos in resumes to be processed so that they cannot be misused. This post shows how to blur the photos in Word resumes with Python, combining the zipfile module, the PIL library (Pillow), and the io module.

2. Development environment

We will use the following tools and libraries to accomplish this task:

  • Python: a popular, easy-to-use programming language. The image processing used here comes from a third-party library rather than from Python itself.
  • zipfile: a standard-library module for reading and writing ZIP archives. A .docx file is a ZIP archive, so zipfile can open a Word resume and write a modified copy.
  • PIL (Python Imaging Library): a widely used image processing library, available today as the Pillow package (pip install Pillow). It is a third-party package, not part of the standard library.
  • io: a standard-library module for working with streams. io.BytesIO lets us treat bytes in memory as a file.
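
To show how io and PIL fit together, here is a minimal sketch that blurs an image entirely in memory. The helper name blur_image_bytes and the generated test image are made up for illustration; only the Pillow calls themselves are real.

```python
import io

from PIL import Image, ImageFilter


def blur_image_bytes(data, fmt, radius=40):
    """Return `data` (an encoded image) blurred and re-encoded as `fmt`."""
    img = Image.open(io.BytesIO(data))  # decode from memory, no temp file
    # convert() with no arguments expands palette images so they can be filtered
    img = img.convert().filter(ImageFilter.GaussianBlur(radius))
    out = io.BytesIO()
    img.save(out, fmt)
    return out.getvalue()


# Hypothetical test image: a white square on a black background.
sample = Image.new("RGB", (64, 64), (0, 0, 0))
sample.paste((255, 255, 255), (16, 16, 48, 48))
buf = io.BytesIO()
sample.save(buf, "PNG")

blurred = blur_image_bytes(buf.getvalue(), "PNG", radius=4)
result = Image.open(io.BytesIO(blurred))
print(result.size, result.format)
```

Working on bytes rather than on files is what lets the script later in this post modify images without extracting the Word file to disk.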

3. Step overview

  1. Open the Word resume: a .docx file is a ZIP archive, so open it with zipfile and read its entries.
  2. Locate the photo files: find the image entries in the archive by file extension.
  3. Process the images: load each image with PIL and apply a Gaussian blur filter.
  4. Replace the originals: save each blurred image in its original format and use it in place of the original image data.
  5. Repack as a Word file: write all entries to a new archive with zipfile, keeping the original entry names and structure.
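
Steps 1 and 2 can be sketched as follows. Word normally stores embedded pictures under word/media/ inside the archive. The helper name list_media and the tiny stand-in archive are assumptions for illustration, not a real resume.

```python
import io
import zipfile

IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif")


def list_media(docx):
    """Return the names of image entries in a .docx (path or file-like object)."""
    with zipfile.ZipFile(docx) as archive:
        return [n for n in archive.namelist() if n.lower().endswith(IMAGE_EXTS)]


# Hypothetical stand-in for a real resume: a tiny archive with a similar layout.
fake_docx = io.BytesIO()
with zipfile.ZipFile(fake_docx, "w") as z:
    z.writestr("[Content_Types].xml", "<Types/>")
    z.writestr("word/document.xml", "<w:document/>")
    z.writestr("word/media/image1.png", b"not a real image")

print(list_media(fake_docx))
```

Passing the path of a real .docx file to list_media works the same way, since zipfile.ZipFile accepts both paths and file-like objects.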

The next section implements these steps in a single Python script. Applied to a folder of resumes, it blurs every embedded photo, which helps protect the privacy of the people in a resume dataset.

4. Implementation

import io
import os
import zipfile

from PIL import Image, ImageFilter

# Folder containing the original resumes; a raw string keeps the backslashes literal.
SRC_DIR = r'D:\Pycharmproject2023\code_test_project\shan_test\data\word简历'

# Gaussian blur with a radius of 40 pixels.
blur = ImageFilter.GaussianBlur(40)

# Map file extensions to the format names Pillow expects ("jpg" is not one of them).
FORMATS = {".png": "PNG", ".jpg": "JPEG", ".jpeg": "JPEG", ".gif": "GIF"}

def redact_images(filename):
    # The copy is written to the current working directory under the same name,
    # so run the script from a directory other than SRC_DIR.
    with zipfile.ZipFile(os.path.join(SRC_DIR, filename)) as inzip:
        with zipfile.ZipFile(filename, "w") as outzip:
            for info in inzip.infolist():
                content = inzip.read(info)
                ext = os.path.splitext(info.filename)[1].lower()
                if ext in FORMATS:
                    img = Image.open(io.BytesIO(content))
                    # convert() turns palette images into RGB/RGBA so they can be filtered.
                    img = img.convert().filter(blur)
                    outb = io.BytesIO()
                    img.save(outb, FORMATS[ext])
                    content = outb.getvalue()
                # writestr() records the size and CRC of the new content itself.
                outzip.writestr(info, content)

for filename in os.listdir(SRC_DIR):
    if filename.endswith('.docx'):
        redact_images(filename)
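
Step 5 relies on zipfile to keep the archive valid when an entry's content changes. In the standard library, writestr() fills in the size and CRC for the bytes it receives, even when it is handed a ZipInfo copied from another archive. The minimal sketch below checks this on a small in-memory archive; the entry name and byte strings are made up for illustration.

```python
import io
import zipfile

# Build a small in-memory archive to stand in for a .docx file.
src = io.BytesIO()
with zipfile.ZipFile(src, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("word/media/image1.png", b"original bytes")

# Copy every entry to a new archive, changing the content on the way.
dst = io.BytesIO()
with zipfile.ZipFile(src) as inzip, zipfile.ZipFile(dst, "w") as outzip:
    for info in inzip.infolist():
        content = inzip.read(info)
        new_content = content + b" plus extra data"  # stand-in for a blurred image
        outzip.writestr(info, new_content)           # reuse the original ZipInfo

# Re-open the result and check that it is consistent.
with zipfile.ZipFile(dst) as check:
    entry = check.getinfo("word/media/image1.png")
    bad_entry = check.testzip()  # None means every stored CRC matches its data
    stored_size = entry.file_size
    stored = check.read(entry)

print(bad_entry, stored_size)
```

Reusing the ZipInfo also carries over the entry name and compression type, which is how the repacked file keeps the structure Word expects.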

5. Result after blurring
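
One way to see what the blur does without opening Word is to measure it. The sketch below applies the same GaussianBlur(40) filter to a made-up test picture with a single hard edge and counts the gray levels along one row. The picture and variable names are assumptions for illustration.

```python
from PIL import Image, ImageFilter

# Hypothetical test picture: left half black, right half white.
sharp = Image.new("L", (200, 100), 0)
sharp.paste(255, (100, 0, 200, 100))

blurred = sharp.filter(ImageFilter.GaussianBlur(40))

# Sample one row across the middle of each image.
row_before = [sharp.getpixel((x, 50)) for x in range(200)]
row_after = [blurred.getpixel((x, 50)) for x in range(200)]

levels_before = len(set(row_before))
levels_after = len(set(row_after))
print("distinct gray levels before:", levels_before)
print("distinct gray levels after:", levels_after)
```

The sharp picture has only two gray levels, while the blurred one spreads the edge over many intermediate values. On a photo, the same smoothing removes the fine detail that makes a face recognizable.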

Origin blog.csdn.net/weixin_40547993/article/details/131713829