Analyzing and using the YOLOv3 object recognition model on photos in an iOS project

The YOLOv3 model referred to in this article is the Core ML object recognition model provided on the official Apple Developer website. It can recognize 80 kinds of objects and reports the position and size of each recognized object in the image.

We can download the model directly from the official website:

Machine Learning - Models - Apple Developer

Then drag the model directly into the project (Xcode 14.3 is used here). Xcode automatically generates the corresponding class YOLOv3 from the model; this generated class cannot be modified. In the project, we can view the model's information:

(Screenshot: the YOLOv3 model page in Xcode)

On this page, we can click Model Class to view the automatically generated YOLOv3 class. The page also has several tabs; the key ones are:

General: Descriptive information about the model. The important part is the Class Label, which lists the names of the 80 recognizable objects.

Preview: Lets you try out the model's predictions directly. If you are interested, you can drag and drop pictures onto this tab to preview the results.

Prediction: Describes the input and output information of the model.

Next, we can use this model to make predictions in the project. The demo code is as follows (the YOLOv3 class does not need to be imported):

import CoreML
import UIKit

do {
    // Create the model from the class Xcode generated for YOLOv3.
    // The initializer throws on failure, so no nil check is needed.
    let config = MLModelConfiguration()
    let model = try YOLOv3(configuration: config)

    // For convenience, the photo is taken straight from the asset catalog.
    if let cgImage = UIImage(named: "IMG_0096")?.cgImage {
        let input = try YOLOv3Input(imageWith: cgImage)
        let outPut = try model.prediction(input: input)

        print("Recognition succeeded")
        print(outPut.coordinates.count)
    } else {
        print("Failed to read the image")
    }
} catch {
    // Model initialization, input creation, or prediction failed.
    print(error)
}

There are three classes involved in the code:

YOLOv3: The model class. An instance of it can be thought of as the model itself.

YOLOv3Input: An instance of this class is the input object.

YOLOv3Output: An instance of this class is the output object holding the recognition result.

In the code, the outPut object contains all the recognition data. Its coordinates property holds the position and size of each recognized object, and its confidence property holds the probability values for each recognized object.

coordinates: Each element is an array of 4 double values, which in order represent the relative center coordinates, width, and height of a recognized object in the picture:

  • x: the distance from the left edge of the image to the center point of the recognized object, as a ratio of the image width;
  • y: the distance from the top edge of the image to the center point of the recognized object, as a ratio of the image height;
  • w: the ratio of the width of the recognized object to the width of the picture;
  • h: the ratio of the height of the recognized object to the height of the picture;
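
To draw a box on the photo, these four ratios have to be turned back into pixel values. The following is a minimal sketch of that arithmetic, assuming a top-left origin as in UIKit. The names NormalizedBox, PixelBox, and pixelBox(for:imageWidth:imageHeight:) are made up for this illustration and are not part of the generated YOLOv3 class.

```swift
// One element of `coordinates` in normalized form:
// (x, y) is the box center; all four values are ratios of the image size.
struct NormalizedBox {
    let x: Double
    let y: Double
    let w: Double
    let h: Double
}

// The same box in pixels, measured from the top-left corner of the image.
struct PixelBox: Equatable {
    let left: Double
    let top: Double
    let width: Double
    let height: Double
}

// Hypothetical helper: scale the ratios by the image size, then move the
// reference point from the box center to its top-left corner.
func pixelBox(for box: NormalizedBox, imageWidth: Double, imageHeight: Double) -> PixelBox {
    let width = box.w * imageWidth
    let height = box.h * imageHeight
    let left = box.x * imageWidth - width / 2
    let top = box.y * imageHeight - height / 2
    return PixelBox(left: left, top: top, width: width, height: height)
}

// A box centered in an 800 x 600 image, a quarter as wide and half as tall.
let sampleBox = NormalizedBox(x: 0.5, y: 0.5, w: 0.25, h: 0.5)
let sampleRect = pixelBox(for: sampleBox, imageWidth: 800, imageHeight: 600)
print(sampleRect)
```

In this sample the box comes out 200 pixels wide and 300 pixels high, with its top-left corner at (300, 150).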

confidence: Each element is an array of 80 double values, which in order represent the probability that the recognized object belongs to each of the 80 object categories.
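
In practice, the category of a recognized object is usually taken to be the one with the highest of these 80 values. The sketch below shows that selection on a plain Swift array. The function name bestClass(in:) is hypothetical, and the three-entry labels and scores arrays are shortened sample data typed in by hand, not values read from the model.

```swift
// Hypothetical helper: return the index and value of the largest score in
// one element of `confidence`, or nil when the array is empty.
func bestClass(in scores: [Double]) -> (index: Int, score: Double)? {
    guard var best = scores.indices.first else { return nil }
    for i in scores.indices where scores[i] > scores[best] {
        best = i
    }
    return (best, scores[best])
}

// Sample data for illustration only: a real element has 80 values, and the
// label list would have to be supplied by hand in the same order.
let labels = ["person", "bicycle", "car"]
let scores = [0.02, 0.91, 0.07]

if let best = bestClass(in: scores) {
    print("\(labels[best.index]): \(best.score)")
}
```

The returned index can also be kept as is when only the category number is needed.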

Despite a long search, I could not find a way to obtain the 80 class names through the API; they can only be seen on the model page. The corresponding property does show up in the console, but it cannot be accessed through the YOLOv3 object. The 80 class names as printed in the console:

(Screenshot: the 80 class names printed in the console)

If I find a convenient way to obtain the Class Label in the future, I will add it here.

 


Origin blog.csdn.net/qq_31672459/article/details/130870081