OpenCV Case 06 - Fire escape obstacle detection: comparing OpenCV image matching with deep-learning YOLO detection

Fire exit obstacle detection based on image matching

Technical background

Fire escapes are the passages that firefighters use to rescue and evacuate trapped people when dangers occur. The Fire Protection Law stipulates that no unit or individual may occupy, block or seal fire escapes. In practice, however, due to poor management, garbage, objects, vehicles and other obstacles often appear in fire escapes and block them. When danger occurs, this can cause great harm to people's lives and property. Detecting obstacles in fire escapes is therefore particularly important.

Traditional fire escape obstacle detection relies mainly on manual safety inspections: designated staff visit specific fire escapes at regular intervals to check whether they are blocked. Although this method is simple to implement and requires no complex equipment, it has two drawbacks. First, it cannot detect blockages in a timely manner, because detection is limited by the manual inspection cycle; second, it depends heavily on the professional competence and work attitude of the staff, which makes it highly subjective.

Fire escape obstacle detection belongs to the fields of image processing and intelligent security.
A fixed camera captures a background scene image while the fire passage is clear of obstacles, along with real-time monitoring images of the same scene; the background image serves as the matching template. By matching the template image against the real-time image within a designated area, the system determines whether obstacles are present in that area and raises an alarm if they are. As shown in the picture below, nothing should accumulate inside the red box; if something does, an alarm is raised.
[Figure: monitoring scene with the designated detection area marked by a red box]

The system collects real-time images of the designated monitoring area in the fire passage at fixed intervals and matches them against the template. This ensures that obstacles are detected and alarmed promptly, keeping the passage clear, while also reducing system overhead. In addition, because the obstacle detection is based on feature-based image matching, blockage judgments are more accurate and reliable.

Overall flow chart:

[Figure: overall flow chart of the deployment and detection process]
To realize the above process, several important steps are required (a minimal pipeline sketch follows this list):
Step 1: set up a camera in the passage to be monitored and collect scene images of the passage through the camera;
Step 2: during deployment of the detection system, save background template images to form a background template image set, and define the key detection area of the passage;
Step 3: perform noise-reduction preprocessing on the background template images and on the image to be matched;
Step 4: compute the matching degree between each background template image in the set and the image to be matched;
Step 5: combine these into the matching degree between the image to be matched and the background template image set, compare it against a threshold, and decide whether obstacles are present in the image to be matched; if an obstacle exists, issue an alarm.
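Below is a minimal sketch of Steps 3-5 in Python with OpenCV. The file names, the threshold value, and the `match_score()` helper are all assumptions for illustration; `match_score()` stands in for the Harris-corner matching described in the rest of this article.

```python
import cv2

def preprocess(img):
    """Step 3: simple noise-reduction preprocessing (Gaussian blur is one common choice)."""
    return cv2.GaussianBlur(img, (5, 5), 0)

# Assumed input files: one background template and one freshly captured frame.
background = cv2.imread("background.jpg", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)

background = preprocess(background)
frame = preprocess(frame)

# Steps 4-5: compute a matching degree and compare it against a threshold.
MATCH_THRESHOLD = 0.6                     # assumed value, tune per deployment
score = match_score(background, frame)    # hypothetical helper, detailed below
if score < MATCH_THRESHOLD:
    print("Possible obstacle detected - raise an alarm")
```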

Matching a single background template image against the image to be matched mainly involves the following steps:

First, feature points must be extracted from the designated area of both the background template image and the image to be matched. Considering the lighting conditions in a fire escape scene, Harris corners, which are relatively insensitive to illumination changes, are used as feature points. Harris corners are computed from second-order derivatives of the grayscale image and typically occur where a pixel's neighborhood contains grayscale changes in multiple directions, so they represent locations of strong grayscale variation well; illumination changes, by contrast, usually alter the grayscale values within a small neighborhood only slightly. Harris corners therefore offer a certain degree of stability against lighting changes.
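A minimal sketch of Harris corner extraction with OpenCV is shown below. The parameter values (`maxCorners`, `qualityLevel`, `minDistance`, `k`) and the input file name are illustrative assumptions.

```python
import cv2
import numpy as np

gray = cv2.imread("background.jpg", cv2.IMREAD_GRAYSCALE)

corners = cv2.goodFeaturesToTrack(
    gray,
    maxCorners=500,        # upper bound on returned corners
    qualityLevel=0.01,     # relative quality threshold
    minDistance=5,         # minimum spacing between corners
    useHarrisDetector=True,
    k=0.04,                # Harris detector free parameter
)
# `corners` has shape (N, 1, 2); each row is the (x, y) coordinate of a corner.
corner_set = corners.reshape(-1, 2) if corners is not None else np.empty((0, 2))
print(f"Detected {len(corner_set)} Harris corners")
```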

The neighborhood of a pixel x in the image is the set of adjacent pixels centered on x. According to the boundary point sequence PSeq of the designated area obtained in Step 2, a mask area (the region enclosed by the boundary points) is generated in both the background template image and the image to be matched. Corner sets BackCornerSet (background corner set) and TestCornerSet (test corner set) are then extracted from the mask area of the background template image and the image to be matched, respectively. BackCornerSet contains Nb corners and TestCornerSet contains Nt corners; the number of corners varies considerably with image content, so it is denoted by Nb and Nt. Each corner set stores the coordinates of its corners, which are used to locate each corner in the image.
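A minimal sketch of restricting corner detection to the designated region follows, assuming PSeq is a list of (x, y) boundary points drawn during deployment; the polygon coordinates and file names are placeholders.

```python
import cv2
import numpy as np

def corners_in_region(gray, pseq, max_corners=500):
    """Build a mask from the boundary polygon and extract Harris corners inside it."""
    mask = np.zeros(gray.shape, dtype=np.uint8)
    cv2.fillPoly(mask, [np.array(pseq, dtype=np.int32)], 255)
    pts = cv2.goodFeaturesToTrack(
        gray, max_corners, 0.01, 5,
        mask=mask, useHarrisDetector=True, k=0.04,
    )
    return pts.reshape(-1, 2).astype(int) if pts is not None else np.empty((0, 2), int)

# Illustrative boundary polygon of the designated detection area.
PSeq = [(100, 200), (400, 200), (400, 450), (100, 450)]
back_gray = cv2.imread("background.jpg", cv2.IMREAD_GRAYSCALE)
test_gray = cv2.imread("frame.jpg", cv2.IMREAD_GRAYSCALE)
BackCornerSet = corners_in_region(back_gray, PSeq)
TestCornerSet = corners_in_region(test_gray, PSeq)
```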

Second, a feature descriptor must be extracted for every corner in the background template image and the image to be matched. A feature descriptor describes the attributes of the pixel where a feature point is located. The most basic attribute of a pixel is its gray value, but using the gray value alone as a descriptor is not only too simple, it also ignores the relationship between the pixel and its neighbors, so it cannot effectively represent the feature. Instead, the characteristics of the neighborhood around the feature point are used as the descriptor. This method uses a 15×15 neighborhood centered on the feature point, which covers the feature point and the nearby pixels that influence it most; the vector formed by the first-order gradients of the pixels in this neighborhood serves as the feature descriptor. First-order gradients weaken the influence of illumination, so the descriptor also has a certain stability under lighting changes. After deployment, the camera and scene are fixed, so the background in the designated area normally shows no obvious rotation or scale change, and a first-order-gradient descriptor is sufficient. Using the coordinates in BackCornerSet and TestCornerSet, each corner is located in the background template image and the image to be matched, and the descriptor sets BackDescriptorSet (background feature descriptor set) and TestDescriptorSet (test feature descriptor set) are computed from the corresponding neighborhoods.
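A minimal sketch of this gradient-based descriptor is given below, continuing from the previous sketch (it reuses `back_gray`, `test_gray`, `BackCornerSet`, and `TestCornerSet`): for each corner, the first-order Sobel gradients of its 15×15 neighborhood are flattened into one vector.

```python
import cv2
import numpy as np

def compute_descriptors(gray, corner_set, half=7):
    """Return (kept_corners, descriptors): Sobel gradients of each 15x15 neighborhood."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)   # first-order gradient in x
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)   # first-order gradient in y
    h, w = gray.shape
    kept, descs = [], []
    for x, y in corner_set:
        # Drop corners whose 15x15 window would fall outside the image.
        if x < half or y < half or x >= w - half or y >= h - half:
            continue
        patch_gx = gx[y - half:y + half + 1, x - half:x + half + 1]
        patch_gy = gy[y - half:y + half + 1, x - half:x + half + 1]
        kept.append((x, y))
        descs.append(np.concatenate([patch_gx.ravel(), patch_gy.ravel()]))
    return np.array(kept), np.array(descs, dtype=np.float32)

# Variables from the previous sketch are assumed here.
BackCornerSet, BackDescriptorSet = compute_descriptors(back_gray, BackCornerSet)
TestCornerSet, TestDescriptorSet = compute_descriptors(test_gray, TestCornerSet)
```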

The descriptor sets extracted from the background template image and the image to be matched are then matched. During matching, Euclidean distance is used to measure the similarity between two descriptors. Let A be any corner in BackCornerSet and B be any corner in TestCornerSet. For corner A, compute the similarity between its descriptor and the descriptors of B1, B2, ..., BNt, and select the corner Bj (where j is an integer from 1 to Nt) with the greatest similarity; corner A is then matched one-way to corner Bj. Likewise, for corner B, compute the similarity between its descriptor and the descriptors of A1, A2, ..., ANb, and select the corner Ai (where i is an integer from 1 to Nb) with the greatest similarity; corner B is then matched one-way to corner Ai. If and only if corner A matches one-way to corner B and corner B also matches one-way to corner A, corners A and B form a matching pair. After matching the corners in BackCornerSet and TestCornerSet, a matching set MatchPairs containing Q successful matches is obtained.
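A minimal sketch of this bidirectional (mutual nearest-neighbor) matching, continuing from the previous sketches, uses OpenCV's brute-force matcher with cross-checking, which keeps a pair only when each descriptor is the other's nearest neighbor under Euclidean distance.

```python
import cv2

matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = matcher.match(BackDescriptorSet, TestDescriptorSet)

# MatchPairs: for each match, the corner coordinates in the background image
# and the coordinates of its counterpart in the image to be matched.
MatchPairs = [(BackCornerSet[m.queryIdx], TestCornerSet[m.trainIdx]) for m in matches]
Q = len(MatchPairs)
print(f"{Q} successful matches")
```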

Finally, the matching pairs must be corrected. The set MatchPairs may contain the same corner matched to several corners at once, as well as mismatched corner pairs, so MatchPairs needs to be corrected. Because the background content in the designated area does not change significantly when the camera is fixed, a corner detected in the background template image and its corresponding corner in the image to be matched should have only a small relative displacement. Based on this principle, the relative displacement between the two corners of each matching pair in MatchPairs is computed; if the displacement exceeds the threshold Δs, the pair is considered mismatched. Δs can be set to 5 pixels (to tolerate small image shifts caused by camera shake and similar physical changes).
Deleting the mismatched pairs from MatchPairs completes the correction; the corrected MatchPairs contains Q* matching pairs.
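A minimal sketch of this correction step, continuing from the matching sketch above, discards pairs whose relative displacement exceeds the threshold Δs (5 pixels in this article).

```python
import numpy as np

DELTA_S = 5  # displacement threshold in pixels

corrected = []
for (xa, ya), (xb, yb) in MatchPairs:
    # Relative displacement between the background corner and its match.
    displacement = np.hypot(float(xb) - float(xa), float(yb) - float(ya))
    if displacement <= DELTA_S:
        corrected.append(((xa, ya), (xb, yb)))

MatchPairs = corrected           # the corrected set, containing Q* pairs
Q_star = len(MatchPairs)
print(f"{Q_star} matching pairs remain after correction")
```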

Research on obstacle detection based on deep learning

YOLO is an end-to-end image detection framework. Its core idea is to feed the whole image into the network and obtain the objects' bounding boxes, together with their predicted class labels, directly from the output layer. YOLO uses a grid instead of the traditional sliding window: the image is divided into S × S grid cells, and each cell is responsible for predicting objects whose center falls inside it. Each cell predicts B bounding boxes; every bounding box regresses position information x, y, w, h (coordinates and size) and also outputs a confidence score.
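Below is a minimal sketch of running a pre-trained YOLO detector on a passage image, assuming the ultralytics package and its published yolov8n.pt weights; the image file name is a placeholder, and in practice a model fine-tuned on the obstacles of your own scene would replace the generic weights.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                  # load a small pre-trained model
results = model("channel_frame.jpg")        # run detection on one frame

for box in results[0].boxes:
    cls_name = model.names[int(box.cls)]    # predicted class label
    conf = float(box.conf)                  # confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()   # bounding-box corners
    print(f"{cls_name}: {conf:.2f} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
```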

The effect is shown in the figure below:
[Figure: YOLO detection result on a passage scene]
Deep-learning-based detection is more efficient and accurate than the OpenCV matching approach. However, one drawback of using deep learning for safety-passage detection is that you must know in advance what kinds of obstacles may appear in the passage; in practice this means the model has to be retrained continuously to recognize new obstacles encountered in production. Whether this matters depends on your business scenario: if the set of possible obstacles is fixed, deep-learning detection is certainly the better choice.


Origin blog.csdn.net/hai411741962/article/details/132595507