Annotation types for different CV tasks

  In computer vision (CV) tasks, image annotation gives the computer the labeled examples it needs to understand an image. From the known labels, the model learns patterns that it can then apply to recognize new data.

CV annotation falls into the following types:

  • Bounding Box Annotation
  • Polygon Annotation
  • Keypoint Annotation
  • Line Annotation
  • Cuboid Annotation (3D)
  • Semantic Segmentation

Bounding Box Annotation

Bounding boxes are the most common type of image annotation. As the name implies, the annotator draws a box around each target object according to the task's requirements. The resulting boxes can be used to train object detection models.
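As a rough illustration of what a box annotation looks like as data, here is a minimal Python sketch. The record layout, the field names `label` and `bbox_xywh`, and the sample coordinates are assumptions made for this example, not something the article specifies. The `iou` helper computes intersection over union, a measure commonly used to compare a predicted box with an annotated one; the article itself does not mention it.

```python
def xywh_to_xyxy(box):
    """Convert (x, y, width, height) to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle, if the boxes overlap at all.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical annotation record; the field names are illustrative only.
annotation = {"label": "car", "bbox_xywh": (10, 20, 30, 40)}
ground_truth = xywh_to_xyxy(annotation["bbox_xywh"])
predicted = (15, 25, 45, 65)  # made-up model output for the same object
print(ground_truth, iou(ground_truth, predicted))
```

Storing corner coordinates makes overlap checks straightforward, while the width/height form is often more convenient for annotators; which one a dataset uses is a convention, so it is worth checking before mixing sources.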

Polygon Annotation

Polygon masks are mainly used to label objects with irregular shapes. Annotators must trace the boundary of each object with high precision so that the annotation captures the object's shape and size. A bounding box, by contrast, inevitably encloses unneeded area around the target, which can hurt model training in some tasks. Because polygon annotation is more precise, it can yield more accurate localization.
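To make the "unneeded area" point concrete, the sketch below compares the area of a polygon with the area of the tightest axis-aligned box around it. It uses the shoelace formula for the polygon area; the function names and the sample triangle are assumptions chosen for illustration.

```python
def polygon_area(points):
    """Area of a simple polygon given as [(x, y), ...], via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def enclosing_box_area(points):
    """Area of the tightest axis-aligned box that contains all the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Hypothetical triangular object outline (pixel coordinates).
outline = [(0, 0), (10, 0), (0, 10)]
object_area = polygon_area(outline)
box_area = enclosing_box_area(outline)
# Fraction of the box that is background rather than object.
print(1 - object_area / box_area)
```

For this triangle, half of the enclosing box is background. The more irregular the object, the larger that share tends to be, which is the case where a polygon pays off.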

Keypoint Annotation

Keypoint (landmark) annotation is mainly suited to vision tasks that involve shape changes or small objects, because it shows how individual points on the target object move. It can support gesture and facial recognition, and it can also be used to locate human body parts and accurately estimate their poses.
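One way to see how labeled points lead to a pose estimate is to compute the angle at a joint from three keypoints. This is only a sketch: the keypoint names (`shoulder`, `elbow`, `wrist`) and their coordinates are made up for the example, and real datasets define their own keypoint sets.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("keypoints must not coincide with the joint")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


# Hypothetical keypoint annotation for one arm (pixel coordinates).
keypoints = {"shoulder": (0, 0), "elbow": (10, 0), "wrist": (10, 10)}
angle = joint_angle(keypoints["shoulder"], keypoints["elbow"], keypoints["wrist"])
print(angle)
```

Tracking how such an angle changes from frame to frame is one simple way to describe the motion of a body part.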

Line Annotation

Line annotation marks lane lines by drawing lines along them, and it is used to train vehicle perception models for lane detection. Unlike a bounding box, a line avoids enclosing large amounts of empty space and extra noise.
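The sketch below assumes a lane line is stored as a polyline, that is, an ordered list of (x, y) points; this storage format is an assumption, since the article does not say how lines are recorded. It computes the length of the line and, for contrast, the area a bounding box would need to cover the same lane.

```python
import math


def polyline_length(points):
    """Total length of a polyline given as an ordered list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def enclosing_box_area(points):
    """Area of the tightest axis-aligned box that contains all the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Hypothetical lane line running diagonally across the image.
lane = [(100, 700), (300, 450), (480, 250), (600, 120)]
print(polyline_length(lane))     # length of the annotated line, in pixels
print(enclosing_box_area(lane))  # area a box would enclose for the same lane
```

A thin diagonal line occupies very few pixels, yet its enclosing box spans a large part of the image, most of it road surface that is not lane marking.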

Cuboid Annotation

3D cuboid annotation is used in vision tasks to estimate the depth of target objects, such as vehicles, buildings, or even humans, and from that their total volume. It is mainly used in construction and in autonomous vehicle systems.
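As a simplified illustration, the sketch below treats a cuboid annotation as a center point plus length, width, and height, and derives the volume and the eight corner points from it. It assumes an axis-aligned box with no rotation, and the parameter names and sample values are hypothetical; real 3D annotation formats usually also store an orientation.

```python
from itertools import product


def cuboid_volume(size):
    """Volume of a cuboid given as (length, width, height)."""
    length, width, height = size
    return length * width * height


def cuboid_corners(center, size):
    """Eight corners of an axis-aligned cuboid (assumption: no rotation)."""
    cx, cy, cz = center
    length, width, height = size
    return [
        (cx + sx * length / 2, cy + sy * width / 2, cz + sz * height / 2)
        for sx, sy, sz in product((-1, 1), repeat=3)
    ]


# Hypothetical vehicle annotation, in metres.
center = (12.0, 3.0, 0.75)
size = (4.0, 2.0, 1.5)
print(cuboid_volume(size))
print(cuboid_corners(center, size))
```

The third dimension is what separates this from a 2D box: with depth recorded, the object's extent in space, and therefore its volume, can be computed directly.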

Semantic Segmentation

In semantic segmentation, or pixel-level annotation, pixels with similar attributes are grouped together under a shared label. It suits vision tasks that detect and localize specific objects at the pixel level. Unlike polygon segmentation, which marks only specific objects (or regions) of interest, semantic segmentation labels every pixel and so gives a complete description of the scene in the image.
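A semantic segmentation label can be pictured as a grid the same size as the image, with one class id per pixel. The sketch below assumes that representation; the class ids, class names, and the tiny 3x4 mask are invented for the example.

```python
from collections import Counter

# Hypothetical class ids; real datasets define their own mapping.
CLASS_NAMES = {0: "road", 1: "sky", 2: "car"}


def class_pixel_counts(mask):
    """Count how many pixels carry each class id in a 2D mask."""
    return Counter(value for row in mask for value in row)


def class_fractions(mask):
    """Share of the image covered by each class id."""
    counts = class_pixel_counts(mask)
    total = sum(counts.values())
    return {class_id: n / total for class_id, n in counts.items()}


# Tiny 3x4 mask: every pixel has a label, so no part of the scene is left out.
mask = [
    [1, 1, 1, 1],
    [0, 2, 2, 0],
    [0, 0, 0, 0],
]
for class_id, fraction in sorted(class_fractions(mask).items()):
    print(CLASS_NAMES[class_id], fraction)
```

Because every pixel has a class, the fractions always sum to 1. A polygon annotation of only the car would leave the rest of the image unlabeled.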

Origin blog.csdn.net/weixin_45074568/article/details/125125877