Dog job-hunting notes: AI interviews, human assistance. A US research institute uses data on 628 Labradors to improve the selection efficiency of scent detection dogs

Contents at a Glance: Dogs have a keen sense of smell and make excellent assistants for difficult tasks. However, working dogs must pass strict screening and training, and the elimination rate is extremely high. Supervised machine learning on task data has been used to predict human job performance, but comparable studies in dogs have been lacking.

Key words: working dog, supervised machine learning, random forest

Author|daserney

Editor|Sanyang

This article was first published on the HyperAI Super Neural WeChat public platform ~

Dogs are a common sight on park lawns and in the corners of streets and alleys. Besides keeping people company and bringing joy and comfort, many special dogs quietly do important work in service of human society. They are called working dogs.

There are many types of working dogs, including military and police dogs, search and rescue dogs, and service dogs, and each category is divided into many different specializations. Among them, scent detection dogs use their exceptionally keen sense of smell to detect specific substances such as explosives and drugs. Their olfactory ability plays an irreplaceable role in keeping society safe.

Most untrained working dogs sell for $40,000-$80,000, which can double when training costs are factored in. However, the overall training success rate of working dogs is less than 50%, and there is an urgent need to develop more effective selection and training methods.

Recently, researchers including Alexander W. Eyre of The Abigail Wexner Research Institute at Nationwide Children's Hospital and Isain Zapata of Rocky Vista University used data on 628 Labrador Retrievers from the US Transportation Security Administration's olfactory detection program to compare three models for predicting whether a dog would pass pre-training selection and enter formal training. They also identified behavioral traits that affect the performance of olfactory detection dogs.

The research has been published in the journal Scientific Reports, titled "Machine learning prediction and classification of behavioral selection in a canine olfactory detection program".

Paper address:

Machine learning prediction and classification of behavioral selection in a canine olfactory detection program | Scientific Reports

Experimental method

Data introduction: AT + Env tests predict dog performance

Data for the study came from a 2002-2013 olfactory detection dog breeding and training program conducted by the US Transportation Security Administration (TSA). The dataset contains behavioral scores for 628 Labrador Retrievers, each of which took two tests every 3 months during a 15-month foster care period.

Test 1: Airport Terminal (AT) test. The AT test took place in an empty simulated airport terminal. Dogs were led through the terminal, searched for scented towels hidden in randomly placed containers, and interacted with toys. The test gauges a dog's training potential by measuring how well it finds the scented towels and how it interacts with staff, towels, and toys.

Test 2: Environmental (Env) test, carried out at different locations around the base. Under the guidance of a staff member, the dog walked around, attempted a search, and interacted with toys and staff in noisy, crowded surroundings. Test locations included a busy gift exchange (BX), a noisy, dark, enclosed wood shop (Woodshop), a cargo area with moving traffic and noise (Airport Cargo), and various airport terminals (Airport Terminal). This test complements the AT test, during which no other people are present to distract the dog.

Table 1: Dog traits and scoring descriptions

AT = Airport Terminal Testing, E = Environmental Testing, B = Both.

 

Using 3 predictive models and 2 feature screening methods 

The study used 3 different supervised machine learning algorithms to predict, from performance on the behavioral tests, whether a dog would pass pre-training selection. The algorithms were random forest, support vector machine, and logistic regression.
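The comparison of three classifiers can be sketched with scikit-learn. This is only an illustration under stated assumptions: the data is synthetic (generated by `make_classification`, not the TSA records), and the hyperparameters, the scaling step, and the 5-fold split are arbitrary choices rather than the settings used in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for behavioral scores: 300 "dogs", 10 traits,
# roughly 80% labeled 1 ("passed"). Not real data.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, n_redundant=0,
    weights=[0.2, 0.8], random_state=0,
)

# The three model families named in the article; settings are assumptions.
models = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "support vector machine": make_pipeline(StandardScaler(), SVC()),
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    # Score every model on the same folds with the same two metrics.
    scores = cross_validate(model, X, y, cv=cv, scoring=("accuracy", "roc_auc"))
    results[name] = (
        scores["test_accuracy"].mean(),
        scores["test_roc_auc"].mean(),
    )
    print(f"{name}: accuracy={results[name][0]:.2f}, AUC={results[name][1]:.2f}")
```

Using the same stratified folds for every model keeps the comparison fair when one class (passing dogs) is much larger than the other.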

The study also used principal component analysis (PCA) and cross-validated recursive feature elimination (RFECV) to identify important behavioral features that influence canine performance in smell detection.

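Among them, PCA is a statistical technique that reduces the dimensionality of data by projecting it onto a few directions (principal components) that capture most of the variance; the variables with the largest loadings on those components are the most influential. RFECV is a feature selection method that recursively eliminates the least important features and uses cross-validation to choose how many to keep.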
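A minimal NumPy sketch of how PCA loadings point to influential variables. The three traits and their values are invented for illustration (two driven by a shared hidden factor, one independent) and do not correspond to the traits in Table 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dogs = 200

# Hypothetical traits: A and B share a hidden factor, C is independent noise.
hidden = rng.normal(size=n_dogs)
scores = np.column_stack([
    hidden + 0.3 * rng.normal(size=n_dogs),  # trait_a
    hidden + 0.3 * rng.normal(size=n_dogs),  # trait_b
    rng.normal(size=n_dogs),                 # trait_c
])
trait_names = ["trait_a", "trait_b", "trait_c"]

# Standardize so that each trait contributes on the same scale.
standardized = (scores - scores.mean(axis=0)) / scores.std(axis=0)

# Principal components are the eigenvectors of the covariance matrix.
covariance = np.cov(standardized, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # ascending order
order = np.argsort(eigenvalues)[::-1]                   # largest variance first

explained = eigenvalues[order] / eigenvalues.sum()
loadings_pc1 = eigenvectors[:, order[0]]

print("explained variance ratio:", np.round(explained, 2))
for name, loading in zip(trait_names, loadings_pc1):
    # The sign of a loading is arbitrary; its magnitude is what matters.
    print(f"{name}: |loading on PC1| = {abs(loading):.2f}")
```

In this toy setup the two correlated traits dominate the first component, which is the sense in which a trait is called "most influential" in a PCA plot.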

Experimental results

Predicting pass rates: AT test results are better

As shown in Table 2 (A) below, in the AT test the predictive power of the models generally improved over time. On the 12-month test data, the random forest model performed best, with an accuracy of 87% and an AUC (area under the curve) of 0.68. The logistic regression model performed slightly worse, but still did well overall. The results of the support vector machine model were relatively unstable, mainly because of its poor recall on dogs that failed.
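Accuracy and AUC measure different things, which is why a high accuracy can sit next to a modest AUC. The sketch below computes AUC as the share of (passing dog, failing dog) pairs in which the passing dog receives the higher score. The ten labels and scores are invented for illustration and are not from the paper.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical cohort: 8 dogs that passed (1) and 2 that failed (0),
# with a model's predicted probability of passing for each dog.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.8, 0.7, 0.7, 0.6, 0.6, 0.55, 0.75, 0.65]

# Thresholding at 0.5 predicts "pass" for everyone.
predicted = [1 if s >= 0.5 else 0 for s in scores]
accuracy = sum(p == t for p, t in zip(predicted, y_true)) / len(y_true)
auc = roc_auc(y_true, scores)

print(f"accuracy = {accuracy:.2f}, AUC = {auc:.2f}")
```

Here accuracy is 0.80 simply because most dogs pass, while the AUC of 0.50 shows that the scores do not separate failing dogs from passing ones at all.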

Table 2: 3 Model Performance-A

As shown in Table 2 (B) below, the prediction results for the Env test were not satisfactory. This may be because, on average, a smaller share of dogs took part in the Env test than in the AT test (56% vs. 73%). Overall, the logistic regression model performed better. At all 4 time points, the SVM's F1 score for failed dogs was extremely low.

All 3 models reached their highest accuracy (0.82-0.84) at 3 months, along with high F1 scores (0.90-0.91) for predicting passing dogs. However, none of them predicted failed dogs well at 3 months (F1 ≤ 0.10).
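The combination of high accuracy, high F1 for passing dogs, and very low F1 for failed dogs is typical of imbalanced classes. The sketch below computes per-class F1 on an invented cohort of 100 dogs. The counts were chosen only so that the metrics land in a similar range to those quoted above; they are not the paper's confusion matrix.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical cohort: 84 dogs passed, 16 failed.
# The model calls almost every dog a "pass" and catches only 1 failure.
y_true = ["pass"] * 84 + ["fail"] * 16
y_pred = ["pass"] * 81 + ["fail"] * 3 + ["pass"] * 15 + ["fail"] * 1

print(f"accuracy  = {accuracy(y_true, y_pred):.2f}")
print(f"F1 (pass) = {f1_for_class(y_true, y_pred, 'pass'):.2f}")
print(f"F1 (fail) = {f1_for_class(y_true, y_pred, 'fail'):.2f}")
```

With these counts the accuracy is 0.82 and the F1 for passing dogs is 0.90, yet the F1 for failed dogs is only 0.10, because 15 of the 16 failures are missed.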

Table 2: 3 Model Performance-B

A: Airport Terminal (AT) test

B: Environmental (Env) test

M03, M06, M09, and M12 indicate that the test time is the 3rd, 6th, 9th, and 12th month, respectively.

In the table, the value before the slash (/) is for dogs that passed pre-training selection, and the value after the slash is for dogs that did not.

Influential traits: possession traits, confidence, and H2 have a greater influence

The researchers used principal component analysis (PCA) and cross-validated recursive feature elimination (RFECV) to determine which features were most important for prediction at different time points. The figure below shows the results of PCA on the AT test as well as the Env test.

Figure 1: Principal Component Analysis Results

a: Airport Terminal (AT) test

b: Environmental (Env) test

Abbreviations of horizontal axis features correspond to those in Table 1.

As shown in Figure 1a above, in the AT test the most influential feature in the 3- and 6-month data was H1/2 (Hidden 1/2), while in the 9- and 12-month data Physical Possession (PP) had the greatest impact. Figure 1b shows that Independent Possession (IP) of the toy had the greatest effect at all time points in the Env test.

Recursive feature elimination with cross-validation (RFECV) is a feature selection technique that repeatedly removes the least important feature variables and uses cross-validated scores to find the combination of variables that maximizes model performance. In this study, RFECV was combined with Random Forest.
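A minimal sketch of RFECV wrapped around a random forest, using scikit-learn's `RFECV` class. The data is synthetic, the `trait_*` names are placeholders, and the forest size, fold count, and scoring metric are assumptions rather than the configuration used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in: 200 "dogs", 8 traits, of which 3 carry signal.
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=3, n_redundant=0,
    shuffle=False, random_state=0,
)
feature_names = [f"trait_{i}" for i in range(X.shape[1])]  # placeholder names

selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    step=1,  # drop one feature per elimination round
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="accuracy",
)
selector.fit(X, y)

# support_ marks the features kept at the best cross-validated size.
selected = [name for name, keep in zip(feature_names, selector.support_) if keep]
print("number of features kept:", selector.n_features_)
print("selected features:", selected)
```

Repeating such a run over many resamples and counting how often each feature is kept is one way to arrive at per-feature occurrence percentages like those reported in Table 3.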

Table 3: Cross Validated Recursive Feature Elimination (RFECV) Results

a: Airport Terminal (AT) test

b: Environmental (Env) test

Numerical values indicate the percentage occurrence of each feature, ranging from 0 to 100.

Feature abbreviations correspond to Table 1.

As shown in Table 3a above, all possession traits (MP, PP, IP) and H2 were the most important in the AT test.

Table 3b above shows that in the Env test, Confidence (Conf) was most important at months 3 and 6 (100% and 88.7%), Independent Possession (IP) was most important at month 9 (93.3%), and Physical Possession (PP) was most important at month 12 (80.7%).

Taken together, the findings suggest that traits such as H2, IP, and Conf may have a greater influence. However, because the data set is small and covers a limited variety of traits, the models had some difficulty distinguishing dogs that passed pre-training selection from those that failed because of behavioral problems. Nonetheless, the predictive approach is expected to improve and expand as additional behavioral traits, medical information, and other types of longitudinal data are introduced.

A scientific research institution focusing on working dog research

The Penn Vet Working Dog Center, the institution of study author Elizabeth Hare, is a pioneer in the working dog field. It advances the research and application of the latest scientific discoveries and veterinary expertise to optimize the performance of scent detection dogs. Inspired by the outstanding work of search and rescue dogs after the 9/11 attacks, the center was created on September 11, 2012 as the National Center for Search and Rescue Dog Research and Development.

Institution address:

Penn Vet | Penn Vet Working Dog Center

The Penn Vet Working Dog Center is committed to working with dogs to protect the health and safety of humans, animals, and the environment. By collecting and analyzing genetic, behavioral, and physical health data and combining them with the latest scientific research, it aims to improve the working efficiency and well-being of working dogs. Its work includes not only developing and implementing programs for the development and training of working dogs, but also testing and disseminating research findings to better meet future challenges.
