CV's modular image processing pipeline

In this post, we'll learn how to implement a simple modular pipeline for image processing, using OpenCV for image processing and manipulation, and Python generators for the pipeline steps.

An image processing pipeline is a set of tasks performed in a predefined sequence to convert an image into a desired result or extract some interesting features.

Examples of tasks could be:

image transformations such as translation, rotation, resizing, flipping and cropping,
image enhancement,
extraction of the region of interest (ROI),
computation of feature descriptors,
image or object classification,
object detection,
image annotation for machine learning.

The end result might be a new image, or just a JSON file containing some image information.
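To make the idea of generator-based steps concrete, here is a minimal sketch in which each step consumes items from the previous one and yields them onward. The step names (`load`, `add_length`, `keep_long`) and the dictionary format are hypothetical and only stand in for real image operations; they are not taken from the project code.

```python
def load(paths):
    """First step: wrap each path in a data dict (hypothetical format)."""
    for path in paths:
        yield {"path": path}


def add_length(items):
    """Middle step: a stand-in for real processing such as face detection."""
    for item in items:
        item["length"] = len(item["path"])
        yield item


def keep_long(items, min_length):
    """Filter step: pass on only the items that meet a condition."""
    for item in items:
        if item["length"] >= min_length:
            yield item


# Chaining the generators builds the pipeline; nothing runs until we iterate.
pipeline = keep_long(add_length(load(["a.jpg", "holiday.png"])), min_length=6)
results = list(pipeline)
print(results)
```

Because generators are lazy, each item travels through all steps before the next one is loaded, so only one item needs to be in memory at a time.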

Suppose we have a large number of images in a directory and want to detect the faces in them and write each face to a separate file. Additionally, we want a JSON summary file that records where each face was found and in which file.

This is a very simple example that can be summarized with the following code:

import cv2
import os
import json
import numpy as np

def parse_args():
    import argparse

    # Parse command line arguments
    ap = argparse.ArgumentParser(description="Image processing pipeline")
    ap.add_argument("-i", "--input", required=True,
                    help="path to input image files")
    ap.add_argument("-o", "--output", default="output",
                    help="path to output directory")
    ap.add_argument("-os", "--out-summary", default=None,
                    help="output JSON summary file name")
    ap.add_argument("-c", "--classifier", default="models/haarcascade/haarcascade_frontalface_default.xml",
                    help="path to where the face cascade resides")

    return vars(ap.parse_args())

def list_images(path, valid_exts=None):
    image_files = []
    # Loop over the input directory structure
    for (root_dir, dir_names, filenames) in os.walk(path):
        for filename in sorted(filenames):
            # Determine the file extension of the current file
            ext = filename[filename.rfind("."):].lower()
            if valid_exts is None or ext.endswith(valid_exts):
                # Construct the path to the file and collect it
                file = os.path.join(root_dir, filename)
                image_files.append(file)

    return image_files

def main(args):
    os.makedirs(args["output"], exist_ok=True)

    # load the face detector
    detector = cv2.CascadeClassifier(args["classifier"])

    # list images from input directory
    input_image_files = list_images(args["input"], (".jpg", ".png"))

    # Storage for JSON summary
    summary = {}

    # Loop over the image paths
    for image_file in input_image_files:
        # Load the image and convert it to grayscale
        image = cv2.imread(image_file)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

        # Detect faces
        face_rects = detector.detectMultiScale(gray, scaleFactor=1.05, minNeighbors=5,
                                               minSize=(30, 30), flags=cv2.CASCADE_SCALE_IMAGE)
        summary[image_file] = {}
        # Loop over all detected faces
        for i, (x, y, w, h) in enumerate(face_rects):
            face = image[y:y+h, x:x+w]

            # Prepare output directory for faces
            output = os.path.join(*(image_file.split(os.path.sep)[1:]))
            output = os.path.join(args["output"], output)
            os.makedirs(output, exist_ok=True)

            # Save faces
            face_file = os.path.join(output, f"{i:05d}.jpg")
            cv2.imwrite(face_file, face)

            # Store summary data
            summary[image_file][face_file] = np.array([x, y, w, h], dtype=int).tolist()

        # Display summary
        print(f"[INFO] {image_file}: face detections {len(face_rects)}")

    # Save summary data
    if args["out_summary"]:
        summary_file = os.path.join(args["output"], args["out_summary"])
        print(f"[INFO] Saving summary to {summary_file}...")
        with open(summary_file, 'w') as json_file:
            json_file.write(json.dumps(summary))

if __name__ == '__main__':
    args = parse_args()
    main(args)

Simple image processing script for face detection and extraction

The comments in the code are fairly self-explanatory, but let's dig a little deeper. First, we define the command line argument parser (lines 6-20) to accept the following arguments:

--input: The path to the directory containing our images (subdirectories are traversed too). This is the only mandatory parameter.

--output: The output directory to save pipeline results.

--out-summary: If we want a JSON summary, just provide its name (e.g. output.json).

--classifier: The path to the pre-trained Haar cascade used for face detection.
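As a quick illustration of how these options end up in the `args` dictionary, the sketch below rebuilds a reduced version of the parser and feeds it an explicit argument list. Note that argparse turns `--out-summary` into the key `out_summary`, which is why the script reads `args["out_summary"]`. The help texts and the `--classifier` option are left out here for brevity.

```python
import argparse

ap = argparse.ArgumentParser(description="Image processing pipeline")
ap.add_argument("-i", "--input", required=True)
ap.add_argument("-o", "--output", default="output")
ap.add_argument("-os", "--out-summary", default=None)

# Passing a list to parse_args() avoids reading sys.argv,
# which is handy for experiments and tests.
args = vars(ap.parse_args(["--input", "assets/images", "-os", "output.json"]))
print(args)
```

Options that are not given on the command line keep their defaults, so `args["output"]` is `"output"` here.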

Next, we define the list_images function (lines 22-34), which traverses the input directory structure and collects the image paths. For face detection, we use the Viola-Jones algorithm in the form of a Haar cascade classifier (line 40), a fairly old algorithm that is prone to false positives.

Example image from the movie "Friends" with some false positives
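Since the post is heading towards generator-based steps, the directory traversal could also be written lazily. The following is only a sketch of that idea, not the project's code: the name `iter_images` and the default extension tuple are assumptions made for this example.

```python
import os
import tempfile


def iter_images(path, valid_exts=(".jpg", ".png")):
    """Yield image paths one by one instead of building a list first."""
    for root_dir, _dir_names, filenames in os.walk(path):
        for filename in sorted(filenames):
            # splitext returns "" when the file has no extension
            ext = os.path.splitext(filename)[1].lower()
            if ext in valid_exts:
                yield os.path.join(root_dir, filename)


# Small demonstration on a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.JPG", "b.png", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in iter_images(tmp)]

print(found)
```

The caller can start processing the first image before the whole directory tree has been scanned.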

The main processing loop is as follows: we iterate over the image files (line 49), read them one by one (line 51), detect faces (line 55), save them to a prepared directory (lines 59-72) and save a summary report with the face coordinates (lines 78-82).
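One detail of the cropping step is easy to get wrong: `detectMultiScale` returns rectangles as (x, y, w, h), while NumPy images are indexed rows first, so the height belongs to the first slice and the width to the second. The sketch below shows this on a synthetic array; the rectangle values are made up for the example.

```python
import numpy as np

# Synthetic 4-row x 6-column "image"; each pixel stores its own column index
image = np.tile(np.arange(6), (4, 1))

# A hypothetical detection in (x, y, w, h) form
x, y, w, h = 2, 1, 3, 2

# Rows (y, height) come first, columns (x, width) second
face = image[y:y + h, x:x + w]
print(face.shape)
```

The crop has `h` rows and `w` columns. Haar cascade detections are usually close to square, which can hide a mix-up of the two until a rectangle touches the image border.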

Prepare the project environment:

$ git clone https://github.com/jagin/image-processing-pipeline.git
$ cd image-processing-pipeline
$ git checkout 77c19422f0d7a90f1541ff81782948e9a12d2519
$ conda env create -f environment.yml
$ conda activate pipeline

To make sure the code runs as described in this post, check that you have checked out the right commit:
77c19422f0d7a90f1541ff81782948e9a12d2519

Let's run it:

$ python process_images.py --input assets/images -os output.json

We get a nice summary:

[INFO] assets/images/friends/friends_01.jpg: face detections 2
[INFO] assets/images/friends/friends_02.jpg: face detections 3
[INFO] assets/images/friends/friends_03.jpg: face detections 5
[INFO] assets/images/friends/friends_04.jpg: face detections 14
[INFO] assets/images/landscapes/landscape_01.jpg: face detections 0
[INFO] assets/images/landscapes/landscape_02.jpg: face detections 0
[INFO] Saving summary to output/output.json...
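The summary file maps each input image to its face files, and each face file to its [x, y, w, h] rectangle. The sketch below shows how such a file could be consumed. The paths and coordinates are invented for illustration and are not actual output of the script.

```python
import json

# Hypothetical summary content with the same nesting the script produces:
# {input image: {face file: [x, y, w, h]}}
summary_json = json.dumps({
    "assets/images/a.jpg": {
        "output/images/a.jpg/00000.jpg": [10, 20, 50, 50],
        "output/images/a.jpg/00001.jpg": [120, 30, 48, 48],
    },
    "assets/images/b.jpg": {},
})

summary = json.loads(summary_json)

# Count the detections per input image
counts = {image_file: len(faces) for image_file, faces in summary.items()}
print(counts)
```

Images without detections still appear in the summary with an empty dictionary, so they can be told apart from images that were never processed.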

Face images (including the false positives) are stored in a separate directory for each input image.

output
├── images
│   └── friends
│       ├── friends_01.jpg
│       │   ├── 00000.jpg
│       │   └── 00001.jpg
│       ├── friends_02.jpg
│       │   ├── 00000.jpg
│       │   ├── 00001.jpg
│       │   └── 00002.jpg
│       ├── friends_03.jpg
│       │   ├── 00000.jpg
│       │   ├── 00001.jpg
│       │   ├── 00002.jpg
│       │   ├── 00003.jpg
│       │   └── 00004.jpg
│       └── friends_04.jpg
│           ├── 00000.jpg
│           ├── 00001.jpg
│           ├── 00002.jpg
│           ├── 00003.jpg
│           ├── 00004.jpg
│           ├── 00005.jpg
│           ├── 00006.jpg
│           ├── 00007.jpg
│           ├── 00008.jpg
│           ├── 00009.jpg
│           ├── 00010.jpg
│           ├── 00011.jpg
│           ├── 00012.jpg
│           └── 00013.jpg
└── output.json
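The directory layout above follows from how the script builds the output path: it drops the first component of the input path and re-roots the remainder under the output directory. Here is that logic in isolation. The helper name `face_output_dir` is introduced only for this sketch.

```python
import os


def face_output_dir(image_file, output_root="output"):
    """Map an input image path to the directory that holds its faces."""
    # Drop the first path component (e.g. "assets") and re-root the rest
    relative = os.path.join(*image_file.split(os.path.sep)[1:])
    return os.path.join(output_root, relative)


image_file = os.path.join("assets", "images", "friends", "friends_01.jpg")
print(face_output_dir(image_file))
```

The image file name itself becomes a directory name, which is why entries such as friends_01.jpg appear as folders in the tree.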
