In this post, we'll learn how to implement a simple modular pipeline for image processing, using OpenCV for image processing and manipulation, and Python generators for the pipeline steps.
An image processing pipeline is a set of tasks performed in a predefined sequence to convert an image into a desired result or extract some interesting features.
Examples of tasks could be:
- Image transformations such as translation, rotation, resizing, flipping, and cropping
- Image enhancement
- Extracting the region of interest (ROI)
- Computing feature descriptors
- Image or object classification
- Object detection
- Image annotation for machine learning
The end result might be a new image, or just a JSON file containing some image information.
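To make the idea of "tasks performed in a predefined sequence" concrete, here is a minimal sketch of how generator-based steps can be chained. The step names (`list_items`, `add_extension`, `keep_images`) and the dictionary passed between them are hypothetical stand-ins chosen for illustration, and no real image is read; this is not the code developed in this post.

```python
def list_items(names):
    """Source step: yield one data dictionary per (hypothetical) file name."""
    for name in names:
        yield {"image_file": name}


def add_extension(items):
    """Intermediate step: annotate each item with its lowercase file extension."""
    for item in items:
        name = item["image_file"]
        item["ext"] = name[name.rfind("."):].lower()
        yield item


def keep_images(items, valid_exts=(".jpg", ".png")):
    """Filter step: pass through only items with an accepted extension."""
    for item in items:
        if item["ext"] in valid_exts:
            yield item


# Chain the steps; nothing runs until the pipeline is iterated.
pipeline = keep_images(add_extension(list_items(["a.JPG", "notes.txt", "b.png"])))
results = [item["image_file"] for item in pipeline]
print(results)
```

Each step pulls items from the previous one on demand, so only one item is in flight at a time, which is what makes generators attractive when the directory holds many images.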
Suppose we have a large number of images in a directory and want to detect the faces in them, writing each face to a separate file. Additionally, we want a JSON summary file that records where each face was found and in which file. Our face detection process is as follows: read each image, detect the faces in it, save each face to its own file, and record the face coordinates in the summary.
This is a very simple example that can be summarized with the following code:
import cv2
import os
import json
import numpy as np

def parse_args():
    import argparse

    # Parse command line arguments
    ap = argparse.ArgumentParser(description="Image processing pipeline")
    ap.add_argument("-i", "--input", required=True,
                    help="path to input image files")
    ap.add_argument("-o", "--output", default="output",
                    help="path to output directory")
    ap.add_argument("-os", "--out-summary", default=None,
                    help="output JSON summary file name")
    ap.add_argument("-c", "--classifier", default="models/haarcascade/haarcascade_frontalface_default.xml",
                    help="path to where the face cascade resides")

    return vars(ap.parse_args())

def list_images(path, valid_exts=None):
    image_files = []
    # Loop over the input directory structure
    for (root_dir, dir_names, filenames) in os.walk(path):
        for filename in sorted(filenames):
            # Determine the file extension of the current file
            ext = filename[filename.rfind("."):].lower()
            if valid_exts and ext.endswith(valid_exts):
                # Construct the path to the file and add it to the list
                file = os.path.join(root_dir, filename)
                image_files.append(file)

    return image_files

def main(args):
    os.makedirs(args["output"], exist_ok=True)

    # Load the face detector
    detector = cv2.CascadeClassifier(args["classifier"])

    # List images from the input directory
    input_image_files = list_images(args["input"], (".jpg", ".png"))

    # Storage for JSON summary
    summary = {}

    # Loop over the image paths
    for image_file in input_image_files:
        # Load the image and convert it to grayscale
        image = cv2.imread(image_file)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

        # Detect faces
        face_rects = detector.detectMultiScale(gray, scaleFactor=1.05, minNeighbors=5,
                                               minSize=(30, 30), flags=cv2.CASCADE_SCALE_IMAGE)
        summary[image_file] = {}
        # Loop over all detected faces
        for i, (x, y, w, h) in enumerate(face_rects):
            face = image[y:y+h, x:x+w]  # rows span the height, columns span the width

            # Prepare output directory for faces
            output = os.path.join(*(image_file.split(os.path.sep)[1:]))
            output = os.path.join(args["output"], output)
            os.makedirs(output, exist_ok=True)

            # Save faces
            face_file = os.path.join(output, f"{i:05d}.jpg")
            cv2.imwrite(face_file, face)

            # Store summary data
            summary[image_file][face_file] = np.array([x, y, w, h], dtype=int).tolist()

        # Display summary
        print(f"[INFO] {image_file}: face detections {len(face_rects)}")

    # Save summary data
    if args["out_summary"]:
        summary_file = os.path.join(args["output"], args["out_summary"])
        print(f"[INFO] Saving summary to {summary_file}...")
        with open(summary_file, 'w') as json_file:
            json_file.write(json.dumps(summary))

if __name__ == '__main__':
    args = parse_args()
    main(args)
Simple image processing script for face detection and extraction
The comments in the code are fairly explanatory, but let's dig a little deeper. First, we define the command line argument parser (lines 6-20) to accept the following arguments:
--input: The path to the directory containing our images (they may also sit in subdirectories). This is the only mandatory parameter.
--output: The output directory where the pipeline results are saved.
--out-summary: The name of the JSON summary file (e.g. output.json). If it is omitted, no summary is written.
--classifier: The path to the pre-trained Haar cascade for face detection.
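As a quick illustration of how these arguments end up in the dictionary that the script works with, the sketch below rebuilds the same parser and parses an explicit argument list instead of the real command line. The `build_parser` helper is a name introduced only for this example, and the help texts are omitted.

```python
import argparse


def build_parser():
    # Same flags as described above; help texts omitted for brevity.
    ap = argparse.ArgumentParser(description="Image processing pipeline")
    ap.add_argument("-i", "--input", required=True)
    ap.add_argument("-o", "--output", default="output")
    ap.add_argument("-os", "--out-summary", default=None)
    ap.add_argument("-c", "--classifier",
                    default="models/haarcascade/haarcascade_frontalface_default.xml")
    return ap


# Parse an explicit argument list rather than sys.argv, for illustration.
args = vars(build_parser().parse_args(["--input", "assets/images", "-os", "output.json"]))
print(args["input"], args["output"], args["out_summary"])
```

Note that argparse turns the dash in `--out-summary` into an underscore, which is why the script reads the value as `args["out_summary"]`.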
Next, we define the list_images function (lines 22-34), which traverses the input directory structure and collects the image paths. For face detection, we use the Viola-Jones algorithm, known as the Haar cascade (line 40). It is a fairly old algorithm and is prone to false positives.
Example image from the movie "Friends" with some false positives
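The directory traversal done by list_images can be tried in isolation. The sketch below repeats the function and runs it on a throwaway directory; the file names (`a.JPG`, `b.png`, `notes.txt`) are made up for the example and the files are empty.

```python
import os
import tempfile


def list_images(path, valid_exts=None):
    image_files = []
    for root_dir, _dir_names, filenames in os.walk(path):
        for filename in sorted(filenames):
            ext = filename[filename.rfind("."):].lower()
            # str.endswith accepts a tuple and matches any of its entries
            if valid_exts and ext.endswith(valid_exts):
                image_files.append(os.path.join(root_dir, filename))
    return image_files


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny directory tree with two "images" and one unrelated file.
    os.makedirs(os.path.join(tmp, "friends"))
    for parts in (("friends", "a.JPG"), ("friends", "b.png"), ("notes.txt",)):
        open(os.path.join(tmp, *parts), "w").close()
    found = [os.path.relpath(p, tmp) for p in list_images(tmp, (".jpg", ".png"))]
    found_without_filter = list_images(tmp)

print(found)
print(found_without_filter)
```

Two details are worth noticing: the extension check is case-insensitive because the extension is lowercased first, and calling the function without `valid_exts` returns an empty list, since the condition requires a non-empty filter.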
The main processing loop works as follows: we iterate over the image files (line 49), read them one by one (line 51), detect faces (line 55), save each face to a prepared directory (lines 59-72), and save a summary report with the face coordinates (lines 78-82).
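The way the script derives the directory for the face crops is easy to miss, so here is that step on its own. The `face_output_dir` helper is a hypothetical name used only in this sketch; the expression inside it is the one from the script. It assumes a relative input path with at least two components, because the first component is dropped.

```python
import os


def face_output_dir(image_file, output_root):
    # Drop the first path component (e.g. "assets") and re-root under output_root.
    relative = os.path.join(*(image_file.split(os.path.sep)[1:]))
    return os.path.join(output_root, relative)


image_file = os.path.join("assets", "images", "friends", "friends_01.jpg")
out_dir = face_output_dir(image_file, "output")

# Faces are numbered with five digits, starting at zero.
face_file = os.path.join(out_dir, f"{0:05d}.jpg")
print(face_file)
```

So each input image gets a directory named after the image file itself, and the face crops are written inside it.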
Prepare the project environment:
$ git clone https://github.com/jagin/image-processing-pipeline.git
$ cd image-processing-pipeline
$ git checkout 77c19422f0d7a90f1541ff81782948e9a12d2519
$ conda env create -f environment.yml
$ conda activate pipeline
To make sure the code runs as described in this post, check that the git checkout command above points at the right commit:
77c19422f0d7a90f1541ff81782948e9a12d2519
Let's run it:
$ python process_images.py --input assets/images -os output.json
We get a nice summary:
[INFO] assets/images/friends/friends_01.jpg: face detections 2
[INFO] assets/images/friends/friends_02.jpg: face detections 3
[INFO] assets/images/friends/friends_03.jpg: face detections 5
[INFO] assets/images/friends/friends_04.jpg: face detections 14
[INFO] assets/images/landscapes/landscape_01.jpg: face detections 0
[INFO] assets/images/landscapes/landscape_02.jpg: face detections 0
[INFO] Saving summary to output/output.json...
Face images (including false positives) are stored in a separate directory for each input image:
output
├── images
│   └── friends
│       ├── friends_01.jpg
│       │   ├── 00000.jpg
│       │   └── 00001.jpg
│       ├── friends_02.jpg
│       │   ├── 00000.jpg
│       │   ├── 00001.jpg
│       │   └── 00002.jpg
│       ├── friends_03.jpg
│       │   ├── 00000.jpg
│       │   ├── 00001.jpg
│       │   ├── 00002.jpg
│       │   ├── 00003.jpg
│       │   └── 00004.jpg
│       └── friends_04.jpg
│           ├── 00000.jpg
│           ├── 00001.jpg
│           ├── 00002.jpg
│           ├── 00003.jpg
│           ├── 00004.jpg
│           ├── 00005.jpg
│           ├── 00006.jpg
│           ├── 00007.jpg
│           ├── 00008.jpg
│           ├── 00009.jpg
│           ├── 00010.jpg
│           ├── 00011.jpg
│           ├── 00012.jpg
│           └── 00013.jpg
└── output.json
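The summary file maps each input image path to a dictionary of face files, each with its [x, y, w, h] rectangle. As a rough sketch of how it could be consumed, the example below counts faces per image. The JSON content is a hypothetical sample built in memory, and the coordinates are made-up placeholders rather than real detections.

```python
import json

# Hypothetical summary content; the coordinates are made-up placeholders.
summary_json = json.dumps({
    "assets/images/friends/friends_01.jpg": {
        "output/images/friends/friends_01.jpg/00000.jpg": [10, 20, 30, 30],
        "output/images/friends/friends_01.jpg/00001.jpg": [50, 60, 40, 40],
    },
    "assets/images/landscapes/landscape_01.jpg": {},
})

# In practice this string would be read from output/output.json.
summary = json.loads(summary_json)

# Count the detected faces for every input image.
face_counts = {image_file: len(faces) for image_file, faces in summary.items()}
for image_file, count in face_counts.items():
    print(f"{image_file}: {count} face(s)")
```

Images without detections still appear in the summary with an empty dictionary, even though no directory is created for them in the output tree.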