An optimization strategy for fruit detection to overcome unstructured background challenges in field orchard environments

Summary

Due to the ongoing global food and environmental crises, the demand for smart farming is increasing. Focusing on fruit detection, the rapid development of object detection technology has made high-accuracy fruit detection systems possible. However, detecting fruits with high accuracy remains particularly challenging in unstructured orchard environments. Such environments involve varying lighting conditions and levels of occlusion, which can be mitigated by appropriate strategies. To the best of our knowledge, this is the first review of optimization strategies for fruit detection. This review aims to explore methods for improving fruit detection in complex environments. First, we describe the types of complex backgrounds commonly found in outdoor orchard environments. We then group the improvements into two categories: optimizations applied before and after image sampling. Next, we compare test results before and after applying these improvements. Finally, future development trends for fruit detection optimization techniques in complex backgrounds are discussed. We hope that this review will inspire researchers to design their own optimization strategies and help in exploring lower-cost and more robust fruit detection systems.

Complex contextual factors and negative influences

In this paper, the backgrounds presented to the visual sensor in different agricultural application scenarios are divided into two categories according to their complexity: clean backgrounds and complex backgrounds. For an RGB camera, for example, a clean background is a solid-color plane whose color deviates significantly from the object to be inspected, or one containing only a few interfering factors that have minimal effect on inspection performance and are easily eliminated, such as the texture of the mechanical structure or regular lines on a conveyor belt. These situations are typical of settings such as supermarket self-checkouts and automated fruit and vegetable sorting, as shown in Figure 1.
[Figure 1: examples of clean background scenarios (e.g., automated sorting) and, in the lower right corner, complex orchard backgrounds with occluded fruits of different ripeness]
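As a rough illustration of why clean backgrounds are comparatively easy, the sketch below segments fruit from a solid-color background with a plain HSV color threshold. It is not the method of any cited work; the file name and the hue range (a red fruit against a non-red background) are assumptions.

```python
# Minimal sketch: colour-threshold segmentation on a clean, solid-colour background.
# The image path and the "red fruit" hue range are illustrative assumptions.
import cv2
import numpy as np

img = cv2.imread("sorting_line.jpg")              # image from a sorting-line camera
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Keep pixels whose hue/saturation/value fall in the assumed red-fruit range.
mask = cv2.inRange(hsv, np.array([0, 80, 80]), np.array([10, 255, 255]))

# Connected regions of the mask approximate the fruit silhouettes.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f"candidate fruit regions: {len(contours)}")
```

In a field orchard, the same threshold would also fire on sunlit leaves, soil, and sky, which is precisely why the complex background factors discussed below call for dedicated optimization strategies.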
In contrast, complex backgrounds are common for field equipment that must be deployed in outdoor growing environments, such as outdoor fruit-picking robots and automatic fruit yield estimation systems. Unlike clean backgrounds, the factors leading to complex backgrounds are more diverse and usually have a significant impact on fruit detection performance. Therefore, many studies have proposed optimization strategies to overcome or mitigate these drawbacks and have evaluated the feasibility of fruit detection methods in real agricultural scenarios.

We believe that a detailed description of these complex background factors is warranted before a comprehensive review of the optimization strategies can be undertaken.

Lighting conditions

Unstructured lighting conditions are a common complex background factor that can affect fruit detection performance in field environments. These conditions change dynamically over time and with the weather, for example as the direction of sunlight shifts during the day and as light intensity varies between sunny and cloudy days. These factors appear in fruit images as follows: (1) the brightness of the fruit image changes with light intensity; (2) direct sunlight produces supersaturated (overexposed) areas on the fruit surface; (3) because of light occlusion, the fruit surface is divided into shaded and non-shaded parts.
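The three effects above can be made concrete with simple image statistics. The sketch below is an illustration, not a method from the review; the file name and the thresholds are assumptions. It estimates overall brightness, the fraction of overexposed pixels, and the fraction of heavily shaded pixels from the HSV value channel of a single image.

```python
# Minimal sketch: quantifying the three lighting effects on one RGB image.
# The image path and the thresholds (250 for overexposure, 60 for shade) are assumptions.
import cv2
import numpy as np

img = cv2.imread("orchard_scene.jpg")               # BGR image from an RGB camera
v = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[:, :, 2].astype(np.float32)

mean_brightness = v.mean()               # (1) overall brightness tracks light intensity
overexposed_ratio = (v >= 250).mean()    # (2) supersaturated areas under direct sunlight
shaded_ratio = (v <= 60).mean()          # (3) shaded parts caused by light occlusion

print(f"mean brightness: {mean_brightness:.1f}")
print(f"overexposed: {overexposed_ratio:.1%}, shaded: {shaded_ratio:.1%}")
```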

While the human visual system perceives the colors of object surfaces as largely invariant under changing light and imaging conditions, imaging devices lack this color constancy (Tian et al., 2019a, 2019b). Lighting conditions can therefore significantly affect fruit detection performance (Zhou et al., 2022). Wang et al. (2020) pointed out that some traditional machine learning-based segmentation algorithms are strongly affected by lighting in natural environments. Furthermore, non-optimized deep learning models exhibit misidentifications under uneven lighting. Yan et al. (2021) compared an improved model optimized for complex backgrounds with four other mainstream object detection models and found that the non-optimized models may produce false or missed identifications under cloudy conditions. The detection performance of the same model also varies under different lighting conditions. For example, the green citrus detection method proposed by Wang et al. (2018a, 2018b) is more prone to false positives under front-lighting than under back-lighting, whereas missed detections show the opposite trend. In a kiwifruit detection task, the improved YOLOv3-tiny model proposed by Fu et al. (2021) performed better in the afternoon than in the morning.

Fruit occlusion phenomenon

Occlusion is considered one of the most important complex background factors in object detection. Although the human visual system cannot see the full image of an object hidden behind a scene occluder, it can infer what the object is from a small visible region. This capability is expensive for visual inspection systems, however, because better feature extraction and information inference usually imply a larger and more complex model architecture, and the design of the model architecture is often limited by the hardware of the deployment platform. Object occlusion therefore poses a major challenge to detection performance.

In general, occlusion can be divided into two categories: inter-class occlusion and intra-class occlusion. The former occurs when an object is occluded by objects of different categories, while the latter occurs when an object is occluded by objects of the same category (Wang et al., 2018a, 2018b). In fruit detection, occlusion typically takes two forms: (1) the fruit is covered by branches and leaves of the fruit plant; (2) fruits on the same plant cover, overlap, and adhere to each other, as shown in the lower right corner of Fig. 1.
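One simple way to quantify intra-class occlusion between two detected fruits is the intersection over union (IoU) of their bounding boxes. The sketch below is purely illustrative (the box coordinates are assumptions); a high IoU indicates heavily overlapping fruits, which standard non-maximum suppression may wrongly merge into a single detection.

```python
# Minimal sketch: IoU between two fruit bounding boxes given as (x1, y1, x2, y2).
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection rectangle
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# Two overlapping apples (hypothetical coordinates).
print(iou((10, 10, 60, 60), (40, 30, 90, 80)))   # ~0.14
```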

Gongal et al. (2016) pointed out that a camera cannot capture most of the apples (approximately 60% of the fruit) from only one side of the tree canopy, because leaves, branches, and other apples occlude each other in the orchard, which has a significant negative impact on fruit counting accuracy. Yu et al. (2019) used the Mask R-CNN model to detect and segment strawberries growing in an unstructured environment and to locate their picking points; the overall precision and recall were 95.78% and 95.41%, respectively. The detection errors were attributed to images in the strawberry dataset whose features were not prominent enough, owing to lighting effects, occlusion, or the shooting angle, which led to misjudgments. The study by Jia et al. (2020) on the detection and segmentation of green apples shows that the proposed recognition method performs best on unoccluded fruits, while performance on occluded fruits drops significantly and performance on overlapping fruits drops even further.
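For reference, the precision and recall figures quoted above are computed from detection counts as in the sketch below; the counts used here are arbitrary illustrations, not values from any cited study.

```python
# Minimal sketch: precision and recall from true positives (TP), false positives (FP),
# and false negatives (FN). The counts are purely illustrative.
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)   # fraction of detections that are real fruits
    recall = tp / (tp + fn)      # fraction of real fruits that were detected
    return precision, recall

p, r = precision_recall(tp=180, fp=8, fn=12)
print(f"precision: {p:.2%}, recall: {r:.2%}")   # 95.74%, 93.75%
```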

Fruits of different ripeness

Orchards typically harvest fruit as it approaches maturity to ensure it is sold at the optimal time for consumption. At that point, the same fruit tree may bear fruit at different stages of maturity. If the appearance of the fruit (mainly its color) differs clearly with the degree of ripeness (see also the lower right corner of Fig. 1), it will interfere with harvesting by robots that rely on accurate and reliable fruit detection. In addition, if fruit detection techniques are to be used to monitor the growth stages of fruits or to provide stage-by-stage yield estimates, the robustness of detection models to fruits of different maturity must be improved.

Tian et al. (2019a, 2019b) proposed a model for apple detection at different growth stages in a real orchard environment. The researchers divided apple growth into three stages based on the periods in which the apples' color characteristics change significantly: young apples, expanding apples, and mature apples. The experimental results show that detection performance is best for ripe apples and worst for young apples. Similar results were obtained for litchi (Wang et al., 2021a, 2021b, 2021c) and cherry (Gai et al., 2021). The three fruits share a common trait: during fruit development, the color difference between the fruit and the surrounding leaves tends to become more pronounced.
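That observation can be checked with a simple statistic: the gap between the mean hue of fruit pixels and the mean hue of the surrounding leaf pixels. The sketch below is an illustration only; the file names are hypothetical and the fruit/leaf masks are assumed to be available (e.g., from manual annotation).

```python
# Minimal sketch: mean-hue gap between fruit and leaf pixels at a given growth stage.
# File names are hypothetical; masks are assumed to come from manual annotation.
import cv2
import numpy as np

def fruit_leaf_hue_gap(image_path, fruit_mask_path, leaf_mask_path):
    """Absolute difference in mean hue between fruit pixels and leaf pixels."""
    hue = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2HSV)[:, :, 0].astype(np.float32)
    fruit = cv2.imread(fruit_mask_path, cv2.IMREAD_GRAYSCALE) > 0
    leaf = cv2.imread(leaf_mask_path, cv2.IMREAD_GRAYSCALE) > 0
    return abs(hue[fruit].mean() - hue[leaf].mean())

# A larger gap for the ripe image than for the young one would be consistent
# with the reported ranking of detection performance across growth stages.
print(fruit_leaf_hue_gap("young.jpg", "young_fruit_mask.png", "young_leaf_mask.png"))
print(fruit_leaf_hue_gap("ripe.jpg", "ripe_fruit_mask.png", "ripe_leaf_mask.png"))
```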

Reference

Optimization strategies of fruit detection to overcome the challenge of unstructured background in field orchard environment: a review
