"Whale Face Recognition" has been launched, and the University of Hawaii used 50,000 images to train the recognition model, with an average accuracy of 0.869

Content overview: Face recognition can pin down a person's identity. Extended to cetaceans, the same idea becomes "dorsal fin recognition", which uses image recognition to identify cetaceans by their dorsal fins. Traditional image recognition relies on convolutional neural network (CNN) models, requires large numbers of training images, and can typically handle only a single species. Recently, researchers at the University of Hawaii trained a multi-species image recognition model that performs well on cetaceans.

Keywords: image recognition, cetaceans, ArcFace

Author|daserney

Edit|Slowly, Sanyang

This article was first published on the HyperAI Super Neural WeChat public platform ~

Cetaceans are flagship and indicator species of the marine ecosystem, and are highly valuable for research on protecting the marine environment. Traditional animal identification requires photographing animals in the field to record when and where individuals appear, a multi-step and laborious process. Within it, image matching (identifying the same individual across different images) is especially time-consuming.

A 2014 study by Tyne et al. estimated that, during a year-long capture-recapture survey of spinner dolphins (Stenella longirostris), image matching took up more than 1,100 hours of human labor, nearly one-third of the entire project budget.

Recently, researchers including Philip T. Patton from the University of Hawai'i used more than 50,000 photos (covering 24 cetacean species across 39 catalogues) to train a multi-species image recognition model built on an ArcFace classification head, a technique borrowed from human face recognition. The model achieved a mean average precision (MAP) of 0.869 on the test set, and 10 catalogues scored a MAP greater than 0.95.

The research has been published in the journal "Methods in Ecology and Evolution" under the title "A deep learning approach to photo–identification demonstrates high performance on two dozen cetacean species".

The research results have been published in "Methods in Ecology and Evolution"

Paper address:

https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.14167

 

Dataset: 25 species, 39 catalogues

Data introduction 

Happywhale and Kaggle collaborated with researchers around the world to assemble a large-scale, multi-species dataset of cetacean images. The dataset was collected for a Kaggle competition that asked teams to identify individual cetaceans from images of their dorsal fins or lateral views. It contains 41 catalogues covering 25 species; each catalogue covers a single species, but some species appear in more than one catalogue.

The study removed two competition catalogues: one had only 26 low-quality images for training and testing, and the other lacked a test set. The final dataset contains 50,796 training images and 27,944 test images; the training images cover 15,546 identities. Of these identities, 9,240 (59%) had only one training image and 14,210 (91%) had fewer than five.

Dataset and code address:

GitHub - knshnb/kaggle-happywhale-1st-place

Training data

To deal with cluttered image backgrounds, some competitors trained cropping models that automatically detect cetaceans in an image and draw bounding boxes around them. As shown in the figure below, this step uses four cetacean detectors built with different algorithms, including YOLOv5 and Detic. The diversity of detectors makes the model more robust and effectively augments the training data.

Figure 1: Images from 9 catalogues in the competition set, with bounding boxes generated by 4 cetacean detectors

Each bounding box is used to generate a crop with a fixed probability: 0.60 for red, 0.15 for olive green, 0.15 for orange, and 0.05 for blue. After cropping, the researchers resized each image to 1024 x 1024 pixels for compatibility with the EfficientNet-B7 backbone.
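The crop-sampling step described above can be sketched as weighted random selection over the four detectors' boxes. This is an illustrative reconstruction, not the competition code; the detector names and the box format are assumptions:

```python
import random

# Stated sampling probabilities for the four detectors' bounding boxes
# (names are illustrative placeholders, keyed by the colours in Figure 1).
DETECTOR_PROBS = [
    ("red", 0.60),
    ("olive", 0.15),
    ("orange", 0.15),
    ("blue", 0.05),
]

def sample_crop_box(boxes: dict, rng: random.Random) -> tuple:
    """Pick one detector's (left, top, right, bottom) box at random.

    `boxes` maps a detector name to its bounding box for the current image;
    the chosen box would then be cropped and resized to 1024 x 1024.
    """
    names = [name for name, _ in DETECTOR_PROBS]
    weights = [p for _, p in DETECTOR_PROBS]
    chosen = rng.choices(names, weights=weights, k=1)[0]
    return boxes[chosen]
```

Sampling rather than always taking the best box exposes the model to slightly different framings of the same animal, which acts as a mild form of augmentation.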

After resizing, data augmentation techniques such as affine transformations, resized crops, grayscale conversion, and Gaussian blur were applied to avoid severe overfitting.

Data augmentation refers to the transformation or expansion of the original data during the training process to increase the diversity and quantity of training samples, thereby improving the generalization ability and robustness of the model.
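As a minimal, dependency-free sketch of two of the augmentations mentioned above (horizontal flip and grayscale conversion), applied at random during training; real pipelines typically use a library such as torchvision or albumentations, which also provide affine transforms and Gaussian blur:

```python
import random

# An image here is a list of rows, each row a list of (r, g, b) pixels.

def horizontal_flip(image):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in image]

def to_grayscale(image):
    """Replace each RGB pixel with its luma, repeated across channels."""
    def luma(pixel):
        r, g, b = pixel
        return int(0.299 * r + 0.587 * g + 0.114 * b)
    return [[(luma(p),) * 3 for p in row] for row in image]

def random_augment(image, rng: random.Random):
    """Apply each augmentation independently with probability 0.5."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    if rng.random() < 0.5:
        image = to_grayscale(image)
    return image
```

Because the transforms are re-sampled every epoch, the network rarely sees the exact same pixels twice, which is what increases sample diversity.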

 

Model Training: A Two-Pronged Approach to Species and Individual Recognition

The figure below shows the model's training pipeline. As shown in the orange part of the figure, the researchers divided the image recognition model into three parts: backbone, neck, and head.

Figure 2: Multi-species image recognition model training pipeline

The top row of the figure shows the preprocessing steps (using an image of the common dolphin Delphinus delphis as an example): crops are generated by the 4 object detection models, and two sample images are produced by the data augmentation step.

The bottom row shows the training steps of the image classification network, from backbone to neck to head.

Images first enter the network's backbone. A decade of research has produced dozens of popular backbones, including ResNet, DenseNet, Xception, and MobileNet. EfficientNet-B7 proved to perform best in this cetacean application.

The backbone takes an image and processes it through a series of convolutional and pooling layers to produce a condensed 3D representation of the image. The neck reduces this output to a one-dimensional vector, also known as a feature vector.
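The neck step above can be illustrated with global average pooling, one common way to collapse a backbone's 3D feature map (channels x height x width) into a 1D feature vector. This is a pure-Python sketch of the idea; the paper's exact neck architecture may differ:

```python
def global_average_pool(feature_map):
    """Collapse a channels x height x width activation map to a 1D vector.

    Each channel's H x W activations are averaged into a single number,
    so a (C, H, W) map becomes a length-C feature vector.
    """
    vector = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        vector.append(sum(values) / len(values))
    return vector
```

In a real framework this is a single pooling layer (e.g. adaptive average pooling followed by a flatten), but the effect is the same: spatial detail is discarded and only per-channel evidence remains in the embedding.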

The two heads convert feature vectors into class probabilities, Pr(species) and Pr(individual), for species identification and individual identification respectively. These classification heads are sub-centre ArcFace heads with dynamic margins, which generalise well to multi-species image recognition.
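The core ArcFace idea can be sketched as follows: logits are cosine similarities between the embedding and per-class centres, and the target class has an angular margin added before scaling, forcing tighter clusters. This is a simplified illustration; the paper's heads additionally use several sub-centres per class (shown in `subcenter_cosine`) and per-class "dynamic" margins that depend on class frequency, and all names here are assumptions:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def arcface_logit(embedding, class_center, is_target, margin=0.3, scale=30.0):
    """ArcFace: add an angular margin to the target class, then scale."""
    cos_theta = max(-1.0, min(1.0, cosine(embedding, class_center)))
    theta = math.acos(cos_theta)
    if is_target:
        theta += margin  # penalise the target angle -> tighter clusters
    return scale * math.cos(theta)

def subcenter_cosine(embedding, subcenters):
    """Sub-centre ArcFace: score a class by its best-matching centre."""
    return max(cosine(embedding, c) for c in subcenters)
```

Sub-centres let a class absorb intra-class variation (e.g. left- vs right-side views of the same fin) without dragging a single centre into a poor compromise.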

 

Experimental results: mean average precision of 0.869

The model made predictions on 21,192 test images (39 catalogues covering 24 species), achieving a mean average precision (MAP) of 0.869. As shown in the figure below, average precision varies across species and is largely independent of the number of training or test images.
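The Happywhale Kaggle competition scored each image by up to five ranked guesses, averaged over images (MAP@5): a correct guess at rank k contributes 1/k. A minimal sketch of such a metric, assuming this is how the headline MAP is computed:

```python
def average_precision_at_k(truth, predictions, k=5):
    """Return 1/rank of the first correct guess within the top k, else 0."""
    for rank, guess in enumerate(predictions[:k], start=1):
        if guess == truth:
            return 1.0 / rank
    return 0.0

def mean_average_precision(truths, prediction_lists, k=5):
    """Average the per-image scores over the whole test set."""
    scores = [average_precision_at_k(t, p, k)
              for t, p in zip(truths, prediction_lists)]
    return sum(scores) / len(scores)
```

So a model that always ranks the right individual first scores 1.0, while one that consistently ranks it second scores 0.5.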

Figure 3: Average precision on the test set

The top panel shows the number of images for each species, broken down by usage (training or testing). Species with multiple catalogues are marked with an x.

The figure shows that the model identified toothed whales better than baleen whales; only two baleen whale species scored above average.

Model performance also varied within multi-catalogue species. For example, the two catalogues of the common minke whale (Balaenoptera acutorostrata) scored MAPs of 0.79 and 0.60. Other species, such as the beluga whale (Delphinapterus leucas) and the killer whale, also showed large performance differences across catalogues.

Although the researchers found no single factor that explains these catalogue-level differences, they observed that qualitative attributes of the images, such as blur, distinctiveness, marking confusion, distance, contrast, and splashing, may affect an image's precision score.

Figure 4: Variables that could drive catalogue-level performance differences

Each point in the figure represents a catalogue in the competition dataset; "pixels" refers to mean image width and mean bounding box width, and "distinct IDs" to the number of distinct individuals in the training set. There is no clear correlation between catalogue-level MAP and mean image width, mean bounding box width, number of training images, number of distinct individuals, or number of training images per individual.

In summary, when the model was used for prediction, 10 catalogues covering 7 species achieved average precision above 0.95, outperforming traditional prediction models and further showing that the model can correctly identify individuals. The researchers also summarised seven take-aways for cetacean photo-identification research:

  1. Dorsal fin identification performed best.
  2. Catalogues with fewer distinctive individual features performed poorly.
  3. Image quality matters.
  4. Identifying animals by color can be difficult.
  5. Species whose features differ greatly from the training set scored worse.
  6. Preprocessing remains a hurdle.
  7. Variation in animal markings may affect model performance.

 

Happywhale: A citizen science platform for cetacean research

Happywhale, the citizen science platform mentioned in the dataset introduction above, shares cetacean images with the goal of unlocking massive datasets, enabling fast photo-ID matching, and engaging the public in science.

Happywhale official website address:

Happywhale

Happywhale was founded in August 2015. Its co-founder, Ted Cheeseman, is a naturalist who grew up in Monterey Bay, California. He has loved whale watching since childhood, has travelled many times on expeditions to Antarctica and South Georgia Island, and has more than 20 years of experience in Antarctic exploration and polar tourism management.

Happywhale co-founder Ted Cheeseman

In 2015, Ted left Cheesemans' Ecology Safaris (an eco-travel agency founded in 1980 by his parents, who are also naturalists), where he had worked for 21 years, to devote himself to the Happywhale project: collecting scientific data to better understand and protect whales.

In just a few years, Happywhale.com has become one of the largest contributors to cetacean research, offering many insights into cetacean migration patterns on top of its sheer volume of identification images.

Reference link:

[1] https://baijiahao.baidu.com/s?id=1703893583395168492

[2] https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0086132

[3] https://phys.org/news/2023-07-individual-whale-dolphin-id-facial.html#google_vignette

[4] https://happywhale.com/about



Origin blog.csdn.net/HyperAI/article/details/132314748