End-to-end deep learning for directly estimating grape yield from ground-based imagery

Bibliographic Details
Title: End-to-end deep learning for directly estimating grape yield from ground-based imagery
Authors: Alexander G. Olenskyj, Brent S. Sams, Zhenghao Fei, Vishal Singh, Pranav V. Raja, Gail M. Bornhorst, J. Mason Earles
Source: Computers and Electronics in Agriculture. 198:107081
Publication Information: Elsevier BV, 2022.
Publication Year: 2022
Subject Terms: FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Forestry, Horticulture, Agronomy and Crop Science, Computer Science Applications
Description: Yield estimation is a powerful tool in vineyard management, as it allows growers to fine-tune practices to optimize yield and quality. However, yield estimation is currently performed using manual sampling, which is time-consuming and imprecise. This study demonstrates the application of proximal imaging combined with deep learning for yield estimation in vineyards. Continuous data collection using a vehicle-mounted sensing kit, combined with collection of ground-truth yield data at harvest using a commercial yield monitor, allowed for the generation of a large dataset of 23,581 yield points and 107,933 images. Moreover, the study was conducted in a mechanically managed commercial vineyard, representing a challenging environment for image analysis but a common set of conditions in the California Central Valley. Three model architectures were tested: object detection, CNN regression, and transformer models. The object detection model was trained on hand-labeled images to localize grape bunches, and either bunch count or pixel area was summed to correlate with grape yield. Conversely, the regression models were trained end-to-end to predict grape yield from image data without the need for hand labeling. Results demonstrated that the transformer model and the object detection model with pixel-area processing performed comparably, with mean absolute percent errors of 18% and 18.5%, respectively, on a representative holdout dataset. Saliency mapping was used to demonstrate that the attention of the CNN model was localized near the predicted locations of grape bunches, as well as on the top of the grapevine canopy. Overall, the study showed the applicability of proximal imaging and deep learning for prediction of grapevine yield on a large scale. Additionally, the end-to-end modeling approach performed comparably to the object detection approach while eliminating the need for hand labeling.
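To illustrate the end-to-end regression idea described in the abstract (an image mapped directly to a yield value, with no hand-labeled bunch annotations), the following is a minimal sketch only. The backbone choice (ResNet-18), input size, loss, and all other parameters are assumptions for illustration, not the architecture or training setup used in the paper.

```python
# Illustrative sketch: CNN regression from a ground-based image directly to a yield value.
# ResNet-18 backbone, L1 (MAE-style) loss, and tensor shapes are assumptions, not the paper's setup.
import torch
import torch.nn as nn
from torchvision import models


class YieldRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)              # any image backbone could be substituted
        backbone.fc = nn.Linear(backbone.fc.in_features, 1)   # replace classifier head with a single yield output
        self.backbone = backbone

    def forward(self, x):
        # x: batch of RGB images, shape (N, 3, H, W); returns (N,) yield estimates
        return self.backbone(x).squeeze(-1)


if __name__ == "__main__":
    model = YieldRegressor()
    images = torch.randn(4, 3, 224, 224)           # placeholder image batch
    targets = torch.tensor([2.1, 3.4, 1.8, 2.9])   # placeholder yield values per image segment
    loss = nn.functional.l1_loss(model(images), targets)  # absolute-error objective, akin to reporting MAE
    loss.backward()
    print(float(loss))
```

Trained this way against yield-monitor ground truth, the model needs no bunch-level labels, which is the practical advantage the abstract contrasts with the hand-labeled object detection pipeline.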
ISSN: 0168-1699
DOI: 10.1016/j.compag.2022.107081
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5c8b9ea7f11de7fafcd48c2724f728fb
https://doi.org/10.1016/j.compag.2022.107081
Rights: OPEN
Accession Number: edsair.doi.dedup.....5c8b9ea7f11de7fafcd48c2724f728fb
Database: OpenAIRE