Learning OAK Step by Step, Part 5: Edge Detection with the OAK Camera

Introduction to Edge Detection

Edge detection is a fundamental technique in computer vision and image processing for identifying objects and the boundaries or contours between different regions of an image. Edges are regions with significant grayscale or intensity variation, typically marking boundaries between different objects, textures, or shapes.

Edge detection algorithms find these areas by analyzing changes in pixel gray values or intensity. Common algorithms include the Sobel operator, Canny edge detection, and the Laplacian operator.

Edge detection has a wide range of applications, including object recognition, image segmentation, object measurement, image registration, and image compression. By detecting edges, the shape and structure of objects can be extracted for further analysis and processing.

Here we implement edge detection on three inputs: the left, right, and RGB cameras. A hardware-accelerated 3x3 Sobel filter is used, and its kernels can be switched by pressing keys 1 and 2.
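
Before building the device-side pipeline, it helps to see the same idea on the host. Below is a minimal sketch using OpenCV's Sobel operator; the file name test.jpg is just a placeholder for any local image.

import cv2
import numpy as np

# Read an image as grayscale (replace test.jpg with any local image)
img = cv2.imread("test.jpg", cv2.IMREAD_GRAYSCALE)

# 3x3 Sobel derivatives along x and y
gradX = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)
gradY = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)

# Combine into a gradient magnitude image and scale back to 8-bit
edges = cv2.convertScaleAbs(np.sqrt(gradX ** 2 + gradY ** 2))

cv2.imshow("sobel edges", edges)
cv2.waitKey(0)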

Step 1: Create the file

  • Create a new 5-edge-detector folder
  • Open the folder in VS Code
  • Create a new main.py file

Step 2: Install dependencies

Before installing dependencies, you need to create and activate a virtual environment. I have created a virtual environment named OAKenv; in the terminal, use cd to return to the OAKenv root directory, run OAKenv\Scripts\activate to activate the virtual environment, and then install the dependencies with pip:

pip install numpy opencv-python depthai blobconverter

Step 3: Import required packages

In main.py, import the packages required by the project:

import cv2
import depthai as dai
import numpy as np

The numpy library is newly imported here, so a brief introduction is in order.

NumPy is a Python library for scientific computing and data analysis. It provides a high-performance multidimensional array object (ndarray) and a series of functions for manipulating these arrays. Its main features include the following (a few are demonstrated in the sketch after this list):

  1. Multidimensional arrays: At the heart of NumPy is the ndarray object, which is an array with a fixed size that can hold elements of the same type. This makes NumPy arrays more efficient than Python's native lists and more suitable for large-scale data processing and numerical calculations.

  2. Broadcasting: NumPy allows arrays of different shapes to perform the same operation without copying the data. Broadcasting enables efficient element-wise mathematical operations on arrays.

  3. Mathematical and logical operations: NumPy provides many built-in mathematical functions (such as sin, cos, exp, etc.) and logical operation functions (such as all, any, logical_and, etc.) to easily operate on arrays.

  4. Linear algebra operations: NumPy includes a set of functions for performing linear algebra operations, such as matrix multiplication, solving linear equations, etc.

  5. Array manipulation: NumPy provides a wealth of array manipulation functions, such as sorting, slicing, indexing, reshaping, etc., as well as functions for merging and splitting arrays.

  6. File operations: NumPy can read and write array data to files on disk, supporting multiple file formats (such as text files, CSV files, binary files, etc.).
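
A few of these features in action (a quick illustrative snippet, separate from the project code):

import numpy as np

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # a 2x3 ndarray

print(a * 2)            # broadcasting: the scalar is applied element-wise
print(np.sin(a))        # built-in mathematical function applied to every element
print(a.T @ a)          # linear algebra: 3x3 matrix product
print(a.reshape(3, 2))  # array manipulation: same data, new shape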

Step 4: Create the pipeline

pipeline = dai.Pipeline()
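
A dai.Pipeline is the container for all nodes and the links between them; it is defined on the host and then uploaded to and executed on the OAK device.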

Step 5: Create Nodes

Create the camera nodes

camRgb = pipeline.create(dai.node.ColorCamera)
monoLeft = pipeline.createMonoCamera()
monoRight = pipeline.createMonoCamera()

  1. camRgb = pipeline.create(dai.node.ColorCamera): creates a color camera node (ColorCamera) for capturing RGB images.
  2. monoLeft = pipeline.createMonoCamera(): creates a mono camera node (MonoCamera) for capturing the left monochrome image.
  3. monoRight = pipeline.createMonoCamera(): creates a mono camera node for capturing the right monochrome image.

These three nodes work in parallel in the pipeline, capturing color and monochrome images simultaneously.
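
(On an OAK-D, the left and right cameras are the global-shutter monochrome pair normally used for stereo depth, and the color camera sits between them.)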

Create the edge detection nodes

edgeDetectorLeft = pipeline.createEdgeDetector()
edgeDetectorRight = pipeline.createEdgeDetector()
edgeDetectorRgb = pipeline.createEdgeDetector()

This code creates three edge detection nodes in the pipeline.

  1. edgeDetectorLeft = pipeline.createEdgeDetector(): creates an edge detection node for processing the left monochrome image.
  2. edgeDetectorRight = pipeline.createEdgeDetector(): creates an edge detection node for processing the right monochrome image.
  3. edgeDetectorRgb = pipeline.createEdgeDetector(): creates an edge detection node for processing the color image.

These nodes detect edge features in the left monochrome, right monochrome, and color images respectively.

Create XLink nodes for data interaction

xoutEdgeLeft = pipeline.createXLinkOut()
xoutEdgeRight = pipeline.createXLinkOut()
xoutEdgeRgb = pipeline.createXLinkOut()
xinEdgeCfg = pipeline.createXLinkIn()

xoutEdgeLeft.setStreamName("edge left")
xoutEdgeRight.setStreamName("edge right")
xoutEdgeRgb.setStreamName("edge rgb")
xinEdgeCfg.setStreamName("edge cfg")

This code creates four nodes for exchanging data with the host.

  1. The pipeline.createXLinkOut() method is used three times to create nodes that send data from the device to the host.
  2. The pipeline.createXLinkIn() method creates a node that receives data sent from the host.

A stream name is then set for each output and input node:

  • xoutEdgeLeft.setStreamName("edge left"): Set the data stream name of the output node xoutEdgeLeft to "edge left".
  • xoutEdgeRight.setStreamName("edge right"): Set the data stream name of the output node xoutEdgeRight to "edge right".
  • xoutEdgeRgb.setStreamName("edge rgb"): Set the data stream name of the output node xoutEdgeRgb to "edge rgb".
  • xinEdgeCfg.setStreamName("edge cfg"): Set the data stream name of the input node xinEdgeCfg to "edge cfg".

Through these nodes and stream names, the host can exchange image data and configuration messages with the pipeline running on the device.
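
XLink is the communication link between the OAK device and the host (typically over USB): each XLinkOut node appears on the host side as an output queue, and each XLinkIn node as an input queue through which the host can send messages to the device.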

Step 6: Set related properties

Set the relevant properties of the color camera

camRgb.setBoardSocket(dai.CameraBoardSocket.RGB)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)

  1. camRgb.setBoardSocket(dai.CameraBoardSocket.RGB): sets the board socket of the color camera to RGB, i.e. the color camera is the sensor attached to the RGB socket on the board.

  2. camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P): sets the resolution of the color camera to 1080P, so it will capture frames at 1080P.

Set the relevant properties of the left and right monocular cameras

monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)

For the left monocular camera:

  1. monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P): sets the resolution of the left mono camera to 400P, so it will capture frames at 400P.
  2. monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT): sets the board socket of the left mono camera to LEFT, i.e. the sensor attached to the LEFT socket on the board.

For the right monocular camera:

  1. monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P): sets the resolution of the right mono camera to 400P.
  2. monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT): sets the board socket of the right mono camera to RIGHT.

Set the maximum output frame size of the edge detector

edgeDetectorRgb.setMaxOutputFrameSize(camRgb.getVideoWidth() * camRgb.getVideoHeight())

This line of code sets the maximum output frame size of the edge detector to match the video frame size of the color camera.

camRgb.getVideoWidth() and camRgb.getVideoHeight() return the width and height of the color camera's video frames, respectively. Multiplying the two gives the total number of pixels in one video frame.

edgeDetectorRgb.setMaxOutputFrameSize() sets the maximum output frame size of the edge detector. Setting it to the total pixel count of the color camera's video frames ensures that the edge detector's output buffer matches the incoming video frame size.

This ensures the edge detector can process full color camera frames and produce edge maps of the same size as the original video.
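
At 1080P this works out to 1920 x 1080 = 2,073,600 bytes, since the edge detector emits a single-channel (grayscale) frame with one byte per pixel.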

Step 7: Establish the links

Link the cameras to the edge detectors

monoLeft.out.link(edgeDetectorLeft.inputImage)
monoRight.out.link(edgeDetectorRight.inputImage)
camRgb.video.link(edgeDetectorRgb.inputImage)

This code links each camera to its edge detector, passing the camera's image as input to the corresponding node.

  • monoLeft.out.link(edgeDetectorLeft.inputImage): Connect the output of the left monocular camera to the input of the left edge detector. This means that the left edge detector will receive the image from the left monocular camera as input for edge detection.

  • monoRight.out.link(edgeDetectorRight.inputImage): Connect the output of the right monocular camera to the input of the right edge detector. This means that the right edge detector will receive the image from the right monocular camera as input for edge detection.

  • camRgb.video.link(edgeDetectorRgb.inputImage): Connect the video output of the color camera to the input of the color edge detector. This means that the color edge detector will receive video frames from a color camera as input for edge detection.
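
Note the different output names: the mono cameras expose their frames on out, while the color camera provides several outputs, of which video carries the full video frames used here.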

Link the edge detector outputs to the output nodes

edgeDetectorLeft.outputImage.link(xoutEdgeLeft.input)
edgeDetectorRight.outputImage.link(xoutEdgeRight.input)
edgeDetectorRgb.outputImage.link(xoutEdgeRgb.input)

This code connects the output images of the left, right, and color edge detectors to the corresponding XLinkOut nodes so that the edge detection results can be sent out.

  • edgeDetectorLeft.outputImage.link(xoutEdgeLeft.input): connects the output image of the left edge detector to the input of xoutEdgeLeft to output the left edge detection result.

  • edgeDetectorRight.outputImage.link(xoutEdgeRight.input): connects the output image of the right edge detector to the input of xoutEdgeRight to output the right edge detection result.

  • edgeDetectorRgb.outputImage.link(xoutEdgeRgb.input): connects the output image of the color edge detector to the input of xoutEdgeRgb to output the color edge detection result.

Link xinEdgeCfg to the edge detector configuration inputs

xinEdgeCfg.out.link(edgeDetectorLeft.inputConfig)
xinEdgeCfg.out.link(edgeDetectorRight.inputConfig)
xinEdgeCfg.out.link(edgeDetectorRgb.inputConfig)

This code links the output of xinEdgeCfg to the configuration inputs of the left, right, and color edge detectors so that configuration messages can be passed to them.

  • xinEdgeCfg.out.link(edgeDetectorLeft.inputConfig): links the output of xinEdgeCfg to the configuration input of the left edge detector, which will adjust its settings according to the configuration it receives.

  • xinEdgeCfg.out.link(edgeDetectorRight.inputConfig): does the same for the right edge detector.

  • xinEdgeCfg.out.link(edgeDetectorRgb.inputConfig): does the same for the color edge detector.

Step 8: Connect the device and start the pipeline

with dai.Device(pipeline) as device:
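
This connects to the OAK device, uploads the pipeline, and starts it; all of the following code runs inside this with block, and the device is closed automatically when the block exits.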

Step 9: Create the input and output queues to communicate with the DepthAI device

    edgeLeftQueue = device.getOutputQueue(name="edge left", maxSize=8, blocking=False)
    edgeRightQueue = device.getOutputQueue(name="edge right", maxSize=8, blocking=False)  
    edgeRgbQueue = device.getOutputQueue(name="edge rgb", maxSize=8, blocking=False)
    edgeCfgQueue = device.getInputQueue(name="edge cfg")

    print("Switch between sobel filter kernels using keys '1' and '2'")

This code creates three output queues and one input queue for receiving the edge detector outputs and sending configuration messages.

  • device.getOutputQueue(name="edge left", maxSize=8, blocking=False) and its two siblings create the output queues edgeLeftQueue, edgeRightQueue, and edgeRgbQueue, which receive the outputs of the three edge detectors. Each is given its stream name; maxSize=8 means a queue holds at most 8 messages, and blocking=False means that when a queue is full, the oldest messages are dropped to make room for new ones.

  • device.getInputQueue(name="edge cfg") creates the input queue edgeCfgQueue, used to send configuration messages to the edge detectors.

Step 10: Main loop

    while True:

Get the output of the edge detector

        edgeLeft = edgeLeftQueue.get()
        edgeRight = edgeRightQueue.get()
        edgeRgb = edgeRgbQueue.get()

Get the output of the edge detector from each output queue.

  • edgeLeft = edgeLeftQueue.get(): gets the latest message from edgeLeftQueue, the output queue of the left edge detector, and assigns it to edgeLeft.

  • edgeRight = edgeRightQueue.get(): gets the latest message from edgeRightQueue, the output queue of the right edge detector, and assigns it to edgeRight.

  • edgeRgb = edgeRgbQueue.get(): gets the latest message from edgeRgbQueue, the output queue of the color edge detector, and assigns it to edgeRgb.
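
Note that get() blocks until a message is available; if you prefer a non-blocking poll, tryGet() returns None when the queue is empty.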

Get the frame image data from the edge detector outputs

        edgeLeftFrame = edgeLeft.getFrame()
        edgeRightFrame = edgeRight.getFrame()
        edgeRgbFrame = edgeRgb.getFrame()

This code gets the frame image data from the output of the edge detector.

  • edgeLeftFrame = edgeLeft.getFrame(): gets the frame data from edgeLeft, the output of the left edge detector, and assigns it to edgeLeftFrame.
  • edgeRightFrame = edgeRight.getFrame(): gets the frame data from edgeRight, the output of the right edge detector, and assigns it to edgeRightFrame.
  • edgeRgbFrame = edgeRgb.getFrame(): gets the frame data from edgeRgb, the output of the color edge detector, and assigns it to edgeRgbFrame.

Display the output image of the edge detector

        cv2.imshow("edge left", edgeLeftFrame)
        cv2.imshow("edge right", edgeRightFrame)
        cv2.imshow("edge rgb", edgeRgbFrame)

This code uses OpenCV's imshow function to display the output images of the edge detectors.

  • cv2.imshow("edge left", edgeLeftFrame): Use imshowthe function to display the output image of the left edge detector. edgeLeftStris the window title and edgeLeftFrameis the image data to display.

  • cv2.imshow("edge right", edgeRightFrame): Use imshowthe function to display the output image of the right edge detector. edgeRightStris the window title and edgeRightFrameis the image data to display.

  • cv2.imshow("edge rgb", edgeRgbFrame): Use imshowthe function to display the output image of the colored edge detector. edgeRgbStris the window title and edgeRgbFrameis the image data to display.

With this, the edge detection results appear in three windows for viewing and analysis.

Respond to keyboard input

        key = cv2.waitKey(1)
        if key == ord('q'):
            break

        if key == ord('1'):
            print("Switching sobel filter kernel.")
            cfg = dai.EdgeDetectorConfig()
            sobelHorizontalKernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
            sobelVerticalKernel = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
            cfg.setSobelFilterKernels(sobelHorizontalKernel, sobelVerticalKernel)
            edgeCfgQueue.send(cfg)

        if key == ord('2'):
            print("Switching sobel filter kernel.")
            cfg = dai.EdgeDetectorConfig()
            sobelHorizontalKernel = [[3, 0, -3], [10, 0, -10], [3, 0, -3]]
            sobelVerticalKernel = [[3, 10, 3], [0, 0, 0], [-3, -10, -3]]
            cfg.setSobelFilterKernels(sobelHorizontalKernel, sobelVerticalKernel)
            edgeCfgQueue.send(cfg)

This code responds to keyboard input to switch the filter kernels or exit the program.

  • key = cv2.waitKey(1): waits up to 1 ms for a key press and returns its ASCII code, or -1 if no key was pressed.

  • if key == ord('q'): break: if the 'q' key is pressed, break out of the loop and end the program.

  • if key == ord('1'):: if the '1' key is pressed, switch to the first convolution kernel:

    • Create a dai.EdgeDetectorConfig object named cfg, used to configure the edge detectors.
    • Define sobelHorizontalKernel and sobelVerticalKernel, the horizontal and vertical filter matrices of the first kernel.
    • Use the cfg.setSobelFilterKernels method to store these matrices in the configuration object.
    • Send the configuration object through the edgeCfgQueue input queue.

  • if key == ord('2'):: if the '2' key is pressed, switch to the second convolution kernel; the steps are the same, but with the second pair of filter matrices.

With this, pressing a key switches the convolution kernel or exits the program.
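
For reference, the first kernel pair is the standard 3x3 Sobel operator, while the second uses Scharr coefficients (±3, ±10), which weight the center row and column more heavily. To preview the difference on the host, a rough OpenCV equivalent is sketched below; test.jpg is a placeholder image path.

import cv2
import numpy as np

img = cv2.imread("test.jpg", cv2.IMREAD_GRAYSCALE)

def edgeMagnitude(horizontal, vertical):
    # Apply the two kernels and combine the responses into a gradient magnitude
    gx = cv2.filter2D(img, cv2.CV_32F, np.array(horizontal, dtype=np.float32))
    gy = cv2.filter2D(img, cv2.CV_32F, np.array(vertical, dtype=np.float32))
    return cv2.convertScaleAbs(np.sqrt(gx ** 2 + gy ** 2))

# Kernel 1: standard Sobel
sobel = edgeMagnitude([[1, 0, -1], [2, 0, -2], [1, 0, -1]],
                      [[1, 2, 1], [0, 0, 0], [-1, -2, -1]])

# Kernel 2: Scharr-style coefficients
scharr = edgeMagnitude([[3, 0, -3], [10, 0, -10], [3, 0, -3]],
                       [[3, 10, 3], [0, 0, 0], [-3, -10, -3]])

cv2.imshow("kernel 1 (Sobel)", sobel)
cv2.imshow("kernel 2 (Scharr)", scharr)
cv2.waitKey(0)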

Step 11: Run the program

Enter the following command in the terminal to run the program

python main.py
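
Make sure the OAK camera is connected first; dai.Device(pipeline) will raise an error if it cannot find a device.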

After the program starts, three windows appear showing the edge detection results for the left, right, and RGB streams.

Source: blog.csdn.net/w137160164/article/details/131448965