Thesis Research | Data Annotation Method in Image Segmentation

As exploration in the field of artificial intelligence continues to deepen, high-speed rail and urban transportation are becoming increasingly intelligent, and more and more image recognition algorithms are being applied to assisted navigation and autonomous driving. In real scenarios, high-speed trains face extreme situations during operation, such as mudslide disasters and track derailment, and image recognition is commonly used to detect them. Training image recognition algorithms requires large volumes of high-quality data; semantic segmentation annotations for image segmentation algorithms in particular are extremely expensive to acquire.

Semantic segmentation annotation divides an image into object regions and classifies them, usually using polygon annotation to trace the contour of each object. At present, annotating object contours with polygons is noticeably inefficient, and the cost of manual annotation is too high. To control the cost of data annotation, improving the efficiency of manually annotating object contours is the most direct and effective approach, so designing and implementing an efficient data annotation tool is very important.

Today I would like to share a paper on data annotation methods that I came across.

1 Overview

(Omitted.)

2 Overview of Data Labeling Issues

(Omitted.)

3 Design and implementation of data labeling system

This chapter proposes methods to improve the efficiency of manually annotating object contours, and designs and implements a data annotation system. First, the architecture of the data annotation system is designed; then, to address the difficulty of drawing annotation polygons by hand, a method for quickly annotating object contours with the mouse is proposed and applied in the system; finally, exploiting the continuous-frame nature of video, a method for copying and adjusting annotation polygons as a whole is studied and applied.

3.1 System Architecture

The annotation system adopts a B/S architecture [6]. The front end uses the React framework [7], the back end uses the Flask framework [8], and front-to-back-end communication uses the Axios library and SocketIO [9]; the front and back ends are fully separated, and data is stored in the relational database MySQL [10]. The front end, i.e. the client, mainly consists of a UI layer and a business logic layer. The UI layer is responsible for interface display and data visualization and provides the interactive functions needed for annotation work; the business logic layer implements the business logic of front-end annotation operations, handles data changes, and provides functions for communicating with the back end. The back end, i.e. the server side, mainly consists of a business logic layer and a data access layer. The business logic layer exposes interfaces to the client, performs the business logic processing behind client functions, and integrates the image segmentation algorithm modules; the data access layer controls access to the database and provides add, delete, update, and query methods for the business logic layer. The system architecture design is shown in Figure 1.
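As an illustration of the data access layer's role, the sketch below shows a minimal in-memory repository exposing the add/delete/update/query interface described above. All names here are hypothetical, and the real system stores annotations in MySQL rather than a Python dict.

```python
class AnnotationRepository:
    """Minimal sketch of the data access layer's CRUD interface.
    The real system backs this with MySQL; here an in-memory dict
    stands in so the shape of the interface is visible."""

    def __init__(self):
        self._rows = {}
        self._next_id = 1

    def add(self, polygon, label):
        # insert a new annotation record and return its id
        row_id = self._next_id
        self._rows[row_id] = {"polygon": polygon, "label": label}
        self._next_id += 1
        return row_id

    def delete(self, row_id):
        self._rows.pop(row_id, None)

    def update(self, row_id, **fields):
        self._rows[row_id].update(fields)

    def query(self, row_id):
        return self._rows.get(row_id)
```

The business logic layer would call these methods when the client saves, edits, or deletes an annotation polygon.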

3.2 Design of the mouse quick labeling method

A data annotation system for image segmentation must support marking object contours with polygons. Currently, purely manual polygon annotation requires placing points one by one along the object contour; for complex contours, the difficulty and workload of placing points grow sharply. To make outlining objects easier, this annotation system proposes a way to quickly annotate object contours with the mouse: press and hold the left mouse button while moving along the contour of the object, and the path of the mouse generates the annotation polygon.

Many consecutive points are generated while the mouse moves, and their coordinates are recorded as raw data; together these points form a curve. If all of them were shown on the interface, there would be far too many points describing the contour and far too much data to maintain. Editing such a large number of vertices is impractical, and most of them contribute nothing useful to the annotation. Therefore the coordinate points must be filtered, keeping as few points as possible while still marking a complete and accurate object contour.

To summarize, the design scheme is as follows: press and hold the left mouse button and move the mouse along the contour of the object to start the quick annotation process. During this process, every coordinate point the mouse passes through is collected and stored, in order, in a set. When the contour has been traced, release the left mouse button to end the quick annotation process; all points in the collected ordered set are then filtered according to a tolerance value, yielding a new ordered set of coordinate points. These are the vertices of the polygon produced by quickly tracing the object contour with the mouse, and they are used to draw the annotation polygon on the interface.
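The paper does not name the filtering algorithm it uses. A common choice for thinning an ordered point trace by a tolerance value is Ramer-Douglas-Peucker simplification, sketched below (the function names are my own):

```python
import math

def _perp_dist(pt, a, b):
    # perpendicular distance from pt to the line through a and b
    (x, y), (x1, y1), (x2, y2) = pt, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / math.hypot(dx, dy)

def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: keep the fewest points whose polyline
    stays within `tolerance` of the original mouse trace."""
    if len(points) < 3:
        return list(points)
    # find the point farthest from the chord joining the endpoints
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _perp_dist(points[i], points[0], points[-1])
        if d > dmax:
            dmax, idx = d, i
    # everything close enough to the chord: keep only the endpoints
    if dmax <= tolerance:
        return [points[0], points[-1]]
    # otherwise split at the farthest point and recurse on each half
    left = simplify(points[:idx + 1], tolerance)
    right = simplify(points[idx:], tolerance)
    return left[:-1] + right
```

A larger tolerance yields fewer vertices but a coarser contour, so the tolerance is the knob trading annotation precision against editability.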

To implement the above preliminary scheme in this annotation system, it is refined into the following more concrete design. First, since dragging with the left mouse button normally pans the drawing area, an activation button is provided for the mouse quick annotation function; when it is activated, the system enters the quick annotation state and panning of the drawing area is disabled. Next, while quick annotation is active, drag events for the left mouse button are registered, namely the "drag move" and "drag end" events. In the callback of the "drag move" event, the coordinates of the trigger point are collected and saved into the designated set; in the callback of the "drag end" event, the coordinates in the set are processed to obtain the filtered set, which replaces the original set used to draw the polygon, so that a polygon describing the object contour is displayed on the interface. Finally, when the quick annotation function is no longer needed, clicking the activation button again deactivates it, exits the quick annotation state, and restores panning of the drawing area; the previously registered drag events are removed, returning the system to its default state.
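The activation toggle and the two drag callbacks can be sketched as the small state machine below. The names are hypothetical; in the real system these are registered as front-end drag event handlers, and `simplify_fn` stands for the tolerance-based point filter.

```python
class QuickLabelTool:
    """Sketch of the quick-annotation state machine: toggle enters and
    leaves the quick-label state, drag-move collects raw points, and
    drag-end filters them into the final annotation polygon."""

    def __init__(self, simplify_fn):
        self.active = False        # toggled by the activation button
        self.points = []           # ordered raw mouse coordinates
        self.simplify_fn = simplify_fn
        self.polygons = []         # finished annotation polygons

    def toggle(self):
        # entering quick-label mode disables canvas panning; leaving it
        # removes the drag handlers and restores the default state
        self.active = not self.active
        self.points = []

    def on_drag_move(self, x, y):
        # "drag move" callback: collect trigger-point coordinates in order
        if self.active:
            self.points.append((x, y))

    def on_drag_end(self, tolerance=2.0):
        # "drag end" callback: filter the raw trace and keep the result
        if self.active and self.points:
            self.polygons.append(self.simplify_fn(self.points, tolerance))
            self.points = []
```
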

3.3 Overall Copy/Adjust Label Polygon Method Design

When the data source for annotation is video, the preprocessing module filters out a group of consecutive frame images. Clearly, there is a continuity relationship between these images, but manual annotation does not exploit it: annotating every frame by hand is inefficient. Because consecutive frames are continuous, the objects in two consecutive frames are continuous as well, and the positions and sizes of the corresponding annotation polygons are similar. This suggests an idea: copy the annotation polygon from the previous frame onto the next consecutive frame, then adjust the polygon as a whole to match the object. Based on this idea, a method of copying/adjusting annotation polygons as a whole is proposed.

Based on the above analysis, the overall copy/adjust method is designed in detail and refined into the following three function points:

1) Copy the entire annotation polygon;

2) Move the entire marked polygon;

3) Resize the entire annotation polygon.

For the first function point, "copy the entire annotation polygon", the system provides two copy modes, "copy to the current image" and "copy to the next image", shown in an option menu for the annotator to choose from. A polygon copied onto the current image would initially sit at the same position as the original and overlap it completely, making the two impossible to tell apart when selecting. The copied polygon is therefore offset 5 pixels toward the lower right, which makes it easy for the annotator to distinguish and select, as shown in Figure 2. A polygon copied onto the next image keeps the same position and size as the original.
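The copy operation itself is just a uniform offset of every vertex. A minimal sketch (the function name is my own), with the paper's 5-pixel lower-right offset as the default:

```python
def copy_polygon(vertices, dx=5, dy=5):
    """Copy an annotation polygon, offset toward the lower right so the
    copy is distinguishable from the original (5 px, per the paper).
    Pass dx=dy=0 when copying to the next frame, where position and
    size are kept unchanged."""
    return [(x + dx, y + dy) for (x, y) in vertices]
```
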

For the second function point, "move the entire annotation polygon", the system registers drag events on the whole polygon, namely the "drag move" and "drag end" events. In the "drag move" callback, the coordinates of the dragged polygon's vertices are computed in real time and applied to the displayed polygon, so the polygon follows the drag as it moves. In the "drag end" callback, the current vertex coordinates are written back to the data model, keeping the polygon coordinates in the data model consistent with those displayed on the interface.

To avoid conflicts with panning of the annotation area, a dedicated "move whole polygon" state is added, activated with the shortcut key M, and the drag events are registered on the polygon as described above. In this state, individual vertices of the polygon cannot be edited, and the polygon's overall size, width, and height cannot be adjusted. When the mouse hovers over an annotation polygon, it is filled semi-transparently with its class color; this highlight tells the annotator that the polygon can be moved as a whole. When the mouse leaves the polygon, the semi-transparent fill disappears and the fill becomes transparent again.

When an annotation polygon shows the semi-transparent fill, it can be dragged. Press and hold the left mouse button on the polygon and move the mouse; the polygon follows the mouse position in real time. Release the left mouse button to finish the move, and the current position becomes the polygon's new position. As shown in Figure 3, the dashed polygon is the position before the move and the solid polygon is the position after it.

Taking the movement of the vertex (x0, y0) in Figure 3 as an example, given the horizontal and vertical drag distances Δx and Δy, the new coordinates are (x1, y1) = (x0 + Δx, y0 + Δy).

For the third function point, "resize the entire annotation polygon", the system provides adjustment handles that the annotator drags to resize the whole polygon. The implementation is similar to that of moving the whole polygon: drag events are registered on the adjustment handles. The details are not repeated here; see the description of registering drag events on the whole polygon.

A "resize whole polygon" state is added, activated with the shortcut key R. In this state, individual vertices cannot be edited and the polygon cannot be moved as a whole. Clicking the polygon to be adjusted with the left mouse button activates the whole-polygon resize function: its bounding box P(x, y, w, h) and the adjustment handles are drawn, as shown in Figure 4, indicating that the polygon can now be resized as a whole.

Drag events are registered on the adjustment handles. A handle on an edge of the bounding box is a one-way handle, and a handle on a corner of the bounding box is a two-way handle. The handles in the middle of the top and bottom edges support only vertical dragging, to adjust the polygon's height; the handles in the middle of the left and right edges support only horizontal dragging, to adjust its width. The two-way handles on the corners support dragging in all directions, adjusting height and width simultaneously. Dragging the upper-left handle is taken as an example to derive the new coordinates of the polygon's vertices.

Dragging the upper-left handle of the bounding box resizes the whole polygon, as shown in Figure 5. Suppose the handle is dragged left by Δx and up by Δy; the adjusted bounding box is then P′(x′, y′, w′, h′) = (x − Δx, y − Δy, w + Δx, h + Δy). Taking the polygon vertex (x0, y0) as an example, the adjusted vertex is (x0′, y0′), where x0′ = x0 − Δx and y0′ is calculated as shown in formula (1).
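Formula (1) is not reproduced in this excerpt, but the adjusted box P′ = (x − Δx, y − Δy, w + Δx, h + Δy) implies a proportional rescale of every vertex about the fixed bottom-right corner (x + w, y + h). The sketch below (the function name is my own) implements that rescale; for a vertex on the left edge of the box it reduces to x0′ = x0 − Δx, consistent with the text.

```python
def resize_from_top_left(vertices, bbox, dx, dy):
    """Rescale a polygon when the upper-left handle of its bounding box
    (x, y, w, h) is dragged left by dx and up by dy. The bottom-right
    corner (x + w, y + h) stays fixed, and every vertex scales
    proportionally toward it, matching the adjusted box
    (x - dx, y - dy, w + dx, h + dy)."""
    x, y, w, h = bbox
    sx = (w + dx) / w              # horizontal scale factor
    sy = (h + dy) / h              # vertical scale factor
    fx, fy = x + w, y + h          # fixed anchor: bottom-right corner
    return [(fx - sx * (fx - vx), fy - sy * (fy - vy)) for vx, vy in vertices]
```

The one-way edge handles are the special cases dx = 0 or dy = 0, and the other corners use the analogous rescale about their opposite corner.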

4 System Application

This paper applies the system to high-speed rail driving scenes, where the usual scene is a normal track. Rails, guardrails, trains, mountains, and other objects are annotated and assigned class labels, producing an annotated data set for training image recognition algorithms. Figure 6 shows the system annotating a high-speed rail driving scene.

Image recognition [11] for high-speed rail driving scenes has practical significance when extreme natural disasters or failures occur, such as mudslides and track derailment. If these situations can be identified in advance, measures can be taken in time to reduce casualties and damage to facilities.

5 System experiment and result analysis

5.1 Mouse Quick Labeling Experiment

In the experiment, 30 annotators were selected to mark the contours of the same batch of designated objects in two ways: point-by-point annotation and mouse quick annotation.

In the first pass, the 30 annotators used point-by-point annotation to label 54 designated objects in 30 images. In the second pass, they used mouse quick annotation on the same objects as in the first pass.

To compare the two passes more intuitively, the time spent by each annotator under the two annotation methods is shown as a histogram in Figure 7. To show the efficiency gain of the mouse quick method, each annotator's percentage improvement is calculated and shown as a scatter plot in Figure 8.

Figure 7 shows that, with all other annotation conditions equal, every annotator spent less time with mouse quick annotation than with point-by-point annotation. Figure 8 shows that 76.7% of the annotators improved their efficiency by more than 10%, that 100% of them improved, and that efficiency increased by 11.6% on average, indicating that the method effectively improves annotation efficiency.

5.2 Overall Copy/Adjust Label Polygon Experiment

In the experiment, 30 annotators were selected to annotate the same vehicle, which has a continuous motion trajectory across 30 designated images, using both point-by-point annotation and overall copy/adjust annotation.

In the first pass, the 30 annotators used point-by-point annotation to label the target object in the 30 images. In the second pass, they were allowed to use the overall copy/adjust method to assist point annotation on the same targets.

To compare the two passes more intuitively, the time spent by each annotator under the two methods is shown as a histogram in Figure 9. To show the efficiency gain of the overall copy/adjust method, each annotator's percentage improvement is calculated and shown as a scatter plot in Figure 10.

Figure 9 shows that, with all other annotation conditions equal, every annotator spent less time when assisted by the overall copy/adjust method than with point-by-point annotation alone. Figure 10 shows that 90% of the annotators improved their efficiency by more than 28%, with an average improvement of 29.7%, indicating that the method effectively improves annotation efficiency.

6 Summary

This paper focuses on the low efficiency of manually annotating object contours. With the goal of improving that efficiency, it proposes two annotation methods, mouse quick annotation and overall copy/adjust of annotation polygons, and applies them in a data annotation system. The conclusions are as follows.

The effect of the mouse quick annotation method on annotation efficiency was studied, and experiments show that it speeds up manual annotation and thus improves annotation efficiency: compared with point-by-point annotation, it improves efficiency by 11.6% on average.

The effect of the overall copy/adjust annotation polygon method on annotation efficiency was studied, and experiments show that it improves annotators' efficiency: compared with point-by-point annotation, it shortens annotation time by 29.7% on average.

7 References

(Omitted.)

Interested readers can download the full paper from CNKI (Zhiwang).


Article source: Zhang Yue, Wang Xiaoyi, Li Juan. Data annotation system for image segmentation [J]. Railway Communication Signal Engineering Technology (RSCE), 2022, 19(11).


Regarding semantic segmentation, I would like to recommend a domestic machine vision platform, Coovally. It covers the complete AI modeling process, AI project management, and AI system deployment management, shortening development cycles from months to days and accelerating the development, integration, testing, and verification of AI vision solutions. It helps enterprises strengthen their AI software stack so that advanced AI systems can be adopted faster and at lower cost, and it packages a company's own AI capabilities for business staff to use, "teaching people how to fish". Coovally currently covers many application areas, including manufacturing quality inspection, geological disaster monitoring, power equipment monitoring, diagnosis of special diseases in medicine, smart transportation, and smart parks.

Coovally's data annotation solution frees up an enterprise's hands: tasks such as detecting mudslide disasters and track derailments can be handed over to Coovally.

Origin blog.csdn.net/Bella_zhang0701/article/details/128114625