Table of contents
- Introduction to Edge Detection
- Setup 1: Create the file
- Setup 2: Install dependencies
- Setup 3: Import required packages
- Setup 4: Create the pipeline
- Setup 5: Create Nodes
- Setup 6: Set related properties
- Setup 7: Establish link relationship
- Setup 8: Connect the device and start the pipeline
- Setup 9: Create input queue and output queue to communicate with DepthAI device
- Setup 10: Main loop
- Setup 11: Run the program
Introduction to Edge Detection
Edge detection is a fundamental technique in computer vision and image processing used to identify the boundaries or contours between different regions of an image. Edges are regions with significant grayscale or intensity variation, typically marking the boundaries between objects, textures, or shapes.
Edge detection algorithms locate these regions by analyzing the intensity changes between neighboring pixels. Common edge detection algorithms include the Sobel operator, Canny edge detection, and the Laplacian operator.
Edge detection has a wide range of applications, such as object recognition, image segmentation, object measurement, image registration, and image compression. By detecting the edges in an image, the shape and structure of objects can be extracted for further analysis and processing.
Here we implement edge detection on three different inputs: the left, right, and RGB cameras. A hardware-accelerated 3x3 Sobel filter is used, and its kernels can be switched at runtime by pressing the 1 and 2 keys.
Setup 1: Create the file
- Create a new folder named 5-edge-detector
- Open the folder in VS Code
- Create a new file named main.py
Setup 2: Install dependencies
Before installing dependencies, you need to create and activate a virtual environment. A virtual environment named OAKenv was created earlier; in the terminal, use cd to return to the directory containing OAKenv, then run OAKenv\Scripts\activate to activate it. Then install the dependencies:
pip install numpy opencv-python depthai blobconverter --user
Setup 3: Import required packages
Import the packages required by the project in main.py
import cv2
import depthai as dai
import numpy as np
The cv2, depthai, and numpy libraries are imported here.
NumPy is a Python library for scientific computing and data analysis. It provides a high-performance multidimensional array object (ndarray) and a collection of functions for manipulating these arrays. Its main features include:
- Multidimensional arrays: at the heart of NumPy is the ndarray object, a fixed-size array whose elements all share one type. This makes NumPy arrays more efficient than Python's native lists and better suited to large-scale data processing and numerical computation.
- Broadcasting: NumPy allows arrays of different shapes to take part in the same operation without copying data, enabling efficient element-wise mathematical operations.
- Mathematical and logical operations: NumPy provides many built-in mathematical functions (such as sin, cos, and exp) and logical functions (such as all, any, and logical_and) for operating on arrays.
- Linear algebra: NumPy includes functions for linear algebra operations such as matrix multiplication and solving systems of linear equations.
- Array manipulation: NumPy offers a wealth of array-manipulation functions, such as sorting, slicing, indexing, and reshaping, as well as functions for merging and splitting arrays.
- File operations: NumPy can read and write array data to files on disk, supporting multiple formats (text, CSV, binary, etc.).
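As a quick standalone illustration of broadcasting and element-wise operations (not part of the edge-detection pipeline):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])     # shape (3,)

# Broadcasting: `row` is stretched across both rows of `a` without copying
print(a + row)           # [[10 21 32], [13 24 35]]
print(np.sin(a).shape)   # element-wise math preserves the shape: (2, 3)
print((a > 2).any())     # logical reduction over the whole array: True
```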
Setup 4: Create the pipeline
pipeline = dai.Pipeline()
Setup 5: Create Nodes
Create the camera nodes
camRgb = pipeline.create(dai.node.ColorCamera)
monoLeft = pipeline.createMonoCamera()
monoRight = pipeline.createMonoCamera()
- camRgb = pipeline.create(dai.node.ColorCamera): creates a color camera node (ColorCamera) for capturing RGB images.
- monoLeft = pipeline.createMonoCamera(): creates a mono camera node (MonoCamera) for capturing the left monochrome image.
- monoRight = pipeline.createMonoCamera(): creates a mono camera node (MonoCamera) for capturing the right monochrome image.
These three nodes work in parallel within one pipeline, capturing color and monochrome images simultaneously.
Create an edge detection node
edgeDetectorLeft = pipeline.createEdgeDetector()
edgeDetectorRight = pipeline.createEdgeDetector()
edgeDetectorRgb = pipeline.createEdgeDetector()
This code creates three edge detection nodes in the pipeline.
- edgeDetectorLeft = pipeline.createEdgeDetector(): creates an edge detection node for processing the left monochrome image.
- edgeDetectorRight = pipeline.createEdgeDetector(): creates an edge detection node for processing the right monochrome image.
- edgeDetectorRgb = pipeline.createEdgeDetector(): creates an edge detection node for processing the color image.
These edge detection nodes are applied to the left monochrome, right monochrome, and color images in order to detect edge features in each.
Create a node for XLinkOut data interaction
xoutEdgeLeft = pipeline.createXLinkOut()
xoutEdgeRight = pipeline.createXLinkOut()
xoutEdgeRgb = pipeline.createXLinkOut()
xinEdgeCfg = pipeline.createXLinkIn()
xoutEdgeLeft.setStreamName("edge left")
xoutEdgeRight.setStreamName("edge right")
xoutEdgeRgb.setStreamName("edge rgb")
xinEdgeCfg.setStreamName("edge cfg")
This code creates four nodes for exchanging data with the host.
- pipeline.createXLinkOut() creates three nodes that send data out to the host.
- pipeline.createXLinkIn() creates one node that receives data from the host.
A stream name is then set for each output and input node:
- xoutEdgeLeft.setStreamName("edge left"): sets the stream name of the output node xoutEdgeLeft to "edge left".
- xoutEdgeRight.setStreamName("edge right"): sets the stream name of the output node xoutEdgeRight to "edge right".
- xoutEdgeRgb.setStreamName("edge rgb"): sets the stream name of the output node xoutEdgeRgb to "edge rgb".
- xinEdgeCfg.setStreamName("edge cfg"): sets the stream name of the input node xinEdgeCfg to "edge cfg".
Through these nodes and stream names, the host can exchange image data and configuration with the pipeline running on the device.
Setup 6: Set related properties
Set the relevant properties of the color camera
camRgb.setBoardSocket(dai.CameraBoardSocket.RGB)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
- camRgb.setBoardSocket(dai.CameraBoardSocket.RGB): sets the board socket of the color camera to RGB, i.e. the color camera is connected through the RGB socket on the board.
- camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P): sets the resolution of the color camera to 1080P, so it will capture images at 1080P.
monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)
monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)
Set the relevant properties of the left and right mono cameras.
For the left mono camera:
- monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P): sets the resolution of the left mono camera to 400P.
- monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT): sets the board socket of the left mono camera to LEFT, i.e. it is connected through the left socket on the board.
For the right mono camera:
- monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P): sets the resolution of the right mono camera to 400P.
- monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT): sets the board socket of the right mono camera to RIGHT, i.e. it is connected through the right socket on the board.
Set the maximum output frame size of the edge detector
edgeDetectorRgb.setMaxOutputFrameSize(camRgb.getVideoWidth() * camRgb.getVideoHeight())
This line sets the maximum output frame size of the edge detector to match the video frame size of the color camera.
camRgb.getVideoWidth() and camRgb.getVideoHeight() return the width and height of the color camera's video frames. Multiplying them gives the total number of pixels per frame; since the edge detector outputs a single-channel (grayscale) image, this pixel count is also the frame size in bytes.
edgeDetectorRgb.setMaxOutputFrameSize() sets the maximum output frame size of the edge detector. Setting it to the pixel count of the color camera's video frame ensures the edge detector can process the full frame and produce results the same size as the original video frame.
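For the 1080P color stream configured above, the arithmetic works out as follows (a standalone check, assuming the default 1920x1080 video size for THE_1080_P):

```python
# 1080P video frame dimensions
width, height = 1920, 1080

# Single-channel edge detector output: one byte per pixel,
# so the maximum frame size equals the pixel count
max_frame_size = width * height
print(max_frame_size)  # 2073600 bytes, about 2 MB per frame
```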
Setup 7: Establish link relationship
Establish the link between the camera and the edge detector
monoLeft.out.link(edgeDetectorLeft.inputImage)
monoRight.out.link(edgeDetectorRight.inputImage)
camRgb.video.link(edgeDetectorRgb.inputImage)
This code establishes the links between the cameras and the edge detectors, passing each camera's image stream as input to the corresponding edge detector.
- monoLeft.out.link(edgeDetectorLeft.inputImage): connects the output of the left mono camera to the input of the left edge detector, which will therefore receive the left camera's images for edge detection.
- monoRight.out.link(edgeDetectorRight.inputImage): connects the output of the right mono camera to the input of the right edge detector, which will therefore receive the right camera's images for edge detection.
- camRgb.video.link(edgeDetectorRgb.inputImage): connects the video output of the color camera to the input of the RGB edge detector, which will therefore receive the color video frames for edge detection.
Link the edge detection area to the output port
edgeDetectorLeft.outputImage.link(xoutEdgeLeft.input)
edgeDetectorRight.outputImage.link(xoutEdgeRight.input)
edgeDetectorRgb.outputImage.link(xoutEdgeRgb.input)
This code connects the output images of the left, right, and RGB edge detectors to the corresponding XLinkOut nodes so the edge detection results can be sent to the host.
- edgeDetectorLeft.outputImage.link(xoutEdgeLeft.input): connects the output image of the left edge detector to the input of xoutEdgeLeft, which outputs the left edge detection result.
- edgeDetectorRight.outputImage.link(xoutEdgeRight.input): connects the output image of the right edge detector to the input of xoutEdgeRight, which outputs the right edge detection result.
- edgeDetectorRgb.outputImage.link(xoutEdgeRgb.input): connects the output image of the RGB edge detector to the input of xoutEdgeRgb, which outputs the color edge detection result.
Link xinEdgeCfg with the edge detector configuration input
xinEdgeCfg.out.link(edgeDetectorLeft.inputConfig)
xinEdgeCfg.out.link(edgeDetectorRight.inputConfig)
xinEdgeCfg.out.link(edgeDetectorRgb.inputConfig)
This code links the output of xinEdgeCfg to the configuration inputs of the left, right, and RGB edge detectors, so that a single configuration message sent from the host reaches all three.
- xinEdgeCfg.out.link(edgeDetectorLeft.inputConfig): links xinEdgeCfg to the configuration input of the left edge detector.
- xinEdgeCfg.out.link(edgeDetectorRight.inputConfig): links xinEdgeCfg to the configuration input of the right edge detector.
- xinEdgeCfg.out.link(edgeDetectorRgb.inputConfig): links xinEdgeCfg to the configuration input of the RGB edge detector.
Each detector receives the same configuration messages and adjusts its filter settings accordingly.
Setup 8: Connect the device and start the pipeline
with dai.Device(pipeline) as device:
Setup 9: Create input queue and output queue to communicate with DepthAI device
edgeLeftQueue = device.getOutputQueue(name="edge left", maxSize=8, blocking=False)
edgeRightQueue = device.getOutputQueue(name="edge right", maxSize=8, blocking=False)
edgeRgbQueue = device.getOutputQueue(name="edge rgb", maxSize=8, blocking=False)
edgeCfgQueue = device.getInputQueue(name="edge cfg")
print("Switch between sobel filter kernels using keys '1' and '2'")
This code creates three output queues and one input queue for communicating with the edge detectors on the device.
- device.getOutputQueue(name="edge left", maxSize=8, blocking=False) (and its two siblings) creates the output queues edgeLeftQueue, edgeRightQueue, and edgeRgbQueue, which receive the results of the three edge detectors. The name argument must match the stream name set earlier. maxSize=8 caps the queue at 8 messages, and blocking=False puts the queue in non-blocking mode: when the queue is full, the oldest message is dropped to make room for new ones.
- device.getInputQueue(name="edge cfg") creates the input queue edgeCfgQueue for sending configuration messages to the edge detectors; its name matches the "edge cfg" stream.
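The bounded, non-blocking behavior can be illustrated on the host with a plain Python deque (a sketch of the queueing semantics only, not the DepthAI API):

```python
from collections import deque

# A bounded queue of 8 slots; when full, appending drops the oldest entry,
# mirroring a non-blocking DepthAI output queue created with maxSize=8
queue = deque(maxlen=8)
for frame_id in range(12):  # pretend 12 frames arrive faster than we read
    queue.append(frame_id)

print(list(queue))  # only the 8 newest frames survive: [4, 5, ..., 11]
```

This is why non-blocking mode is a good fit for live preview: the host always sees recent frames instead of stalling the device when it falls behind.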
Setup 10: Main loop
while True:
Get the output of the edge detector
edgeLeft = edgeLeftQueue.get()
edgeRight = edgeRightQueue.get()
edgeRgb = edgeRgbQueue.get()
Get the output of each edge detector from its output queue.
- edgeLeft = edgeLeftQueue.get(): gets the latest message from the left edge detector's output queue and assigns it to edgeLeft.
- edgeRight = edgeRightQueue.get(): gets the latest message from the right edge detector's output queue and assigns it to edgeRight.
- edgeRgb = edgeRgbQueue.get(): gets the latest message from the RGB edge detector's output queue and assigns it to edgeRgb.
Note that get() blocks until a message is available.
Get frame image data from output of edge detector
edgeLeftFrame = edgeLeft.getFrame()
edgeRightFrame = edgeRight.getFrame()
edgeRgbFrame = edgeRgb.getFrame()
This code extracts the frame image data from each edge detector's output message.
- edgeLeftFrame = edgeLeft.getFrame(): gets the frame data from the left edge detector output edgeLeft and assigns it to edgeLeftFrame.
- edgeRightFrame = edgeRight.getFrame(): gets the frame data from the right edge detector output edgeRight and assigns it to edgeRightFrame.
- edgeRgbFrame = edgeRgb.getFrame(): gets the frame data from the RGB edge detector output edgeRgb and assigns it to edgeRgbFrame.
getFrame() returns the image as a NumPy array that OpenCV can display directly.
Display the output image of the edge detector
cv2.imshow("edge left", edgeLeftFrame)
cv2.imshow("edge right", edgeRightFrame)
cv2.imshow("edge rgb", edgeRgbFrame)
This code uses OpenCV's imshow function to display the output images of the edge detectors.
- cv2.imshow("edge left", edgeLeftFrame): displays the output image of the left edge detector in a window titled "edge left", with edgeLeftFrame as the image data.
- cv2.imshow("edge right", edgeRightFrame): displays the output image of the right edge detector in a window titled "edge right", with edgeRightFrame as the image data.
- cv2.imshow("edge rgb", edgeRgbFrame): displays the output image of the RGB edge detector in a window titled "edge rgb", with edgeRgbFrame as the image data.
With these calls, the edge detection results are shown in separate windows for viewing and analysis.
Responding to keyboard input
key = cv2.waitKey(1)
if key == ord('q'):
break
if key == ord('1'):
print("Switching sobel filter kernel.")
cfg = dai.EdgeDetectorConfig()
sobelHorizontalKernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
sobelVerticalKernel = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
cfg.setSobelFilterKernels(sobelHorizontalKernel, sobelVerticalKernel)
edgeCfgQueue.send(cfg)
if key == ord('2'):
print("Switching sobel filter kernel.")
cfg = dai.EdgeDetectorConfig()
sobelHorizontalKernel = [[3, 0, -3], [10, 0, -10], [3, 0, -3]]
sobelVerticalKernel = [[3, 10, 3], [0, 0, 0], [-3, -10, -3]]
cfg.setSobelFilterKernels(sobelHorizontalKernel, sobelVerticalKernel)
edgeCfgQueue.send(cfg)
This code responds to keyboard input to switch the filter configuration or exit the program.
- key = cv2.waitKey(1): waits up to 1 ms for a key press and returns its ASCII code.
- if key == ord('q'): break: if 'q' is pressed, break out of the loop and end the program.
- if key == ord('1'): if '1' is pressed, switch to the first convolution kernel:
  - Create a dai.EdgeDetectorConfig object named cfg for configuring the edge detectors.
  - Define sobelHorizontalKernel and sobelVerticalKernel, the horizontal and vertical filter matrices of the first kernel (the standard 3x3 Sobel kernel).
  - Call cfg.setSobelFilterKernels to store these matrices in the configuration object.
  - Send the configuration object to the device through the edgeCfgQueue queue.
- if key == ord('2'): if '2' is pressed, switch to the second convolution kernel, following the same steps with the second pair of filter matrices (the 3-10-3 kernel of the Scharr operator).
Through these keys, the convolution kernel can be switched at runtime, or the program exited.
Setup 11: Run the program
Enter the following command in the terminal to run the program:
python main.py
After running the program, three windows appear showing the edge detection output of the left, right, and RGB cameras; press 1 or 2 to switch Sobel kernels, and q to quit.