Institutional Repository
Technical University of Crete

Automatic summarization of phytopathologies using multimodal large language models

Fragkogiannis Georgios


URI: http://purl.tuc.gr/dl/dias/7885A5EB-837D-4C75-8116-6EC554AABAC2
Year: 2025
Type of Item: Diploma Work
Bibliographic Citation: Georgios Fragkogiannis, "Automatic summarization of phytopathologies using multimodal large language models", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2025. https://doi.org/10.26233/heallink.tuc.105007

Summary

Modern agriculture must meet a growing demand for food while remaining sustainable and efficient. Meeting this challenge calls for advanced technologies such as computer vision and large language models, especially in areas where timely and accurate diagnosis of phytopathologies can significantly reduce both production losses and pesticide use. This diploma thesis develops a system that recognizes plant diseases from images of their leaves. The proposed architecture consists of three stages. First, a series of vision models detects the leaf, identifies regions showing signs of infection, and classifies the likely disease. Next, a tool-calling system lets a language model select and invoke the appropriate tools, enriching the diagnostic process. Finally, the system composes a complete, well-documented answer that presents both the recognized disease and the proposed ways of treating it. In the experiments, YOLO models were used for detection and classification; they were trained on purpose-built datasets derived from established public datasets in smart agriculture, such as PlantVillage and PlantDoc. In addition, Retrieval-Augmented Generation (RAG) systems were developed to retrieve information on methods for managing the diseases, and vision-language models were fine-tuned to improve accuracy in recognizing phytopathologies.
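
The three-stage flow described in the summary can be illustrated with a minimal Python sketch. It is not the thesis implementation: every function name, the `Diagnosis` record, the keyword-based `select_tools` (a stand-in for the language model's tool choice), and the placeholder knowledge base (a stand-in for RAG retrieval) are assumptions introduced here for illustration. The vision steps return fixed dummy values where the real system would run its trained YOLO models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Diagnosis:
    """Accumulates the outputs of each stage for one leaf image."""
    image_path: str
    leaf_box: Optional[Box] = None
    lesion_boxes: List[Box] = field(default_factory=list)
    disease: Optional[str] = None
    treatment_notes: List[str] = field(default_factory=list)


# Stage 1 stand-ins: hypothetical stubs returning dummy values in place
# of real detection and classification models.
def detect_leaf(d: Diagnosis) -> None:
    d.leaf_box = (0, 0, 100, 100)


def find_lesions(d: Diagnosis) -> None:
    d.lesion_boxes = [(10, 10, 30, 30)]


def classify_disease(d: Diagnosis) -> None:
    d.disease = "example_disease"


# Placeholder knowledge base; a real system would query a retrieval index.
PLACEHOLDER_NOTES: Dict[str, List[str]] = {
    "example_disease": ["placeholder management note"],
}


def retrieve_treatment(d: Diagnosis) -> None:
    d.treatment_notes = list(PLACEHOLDER_NOTES.get(d.disease or "", []))


# Stage 2: a registry of tools that the selector can activate by name.
TOOLS: Dict[str, Callable[[Diagnosis], None]] = {
    "detect_leaf": detect_leaf,
    "find_lesions": find_lesions,
    "classify_disease": classify_disease,
    "retrieve_treatment": retrieve_treatment,
}


def select_tools(question: str) -> List[str]:
    """Keyword rule standing in for the language model's tool selection."""
    plan = ["detect_leaf", "find_lesions", "classify_disease"]
    if any(word in question.lower() for word in ("treat", "manage", "control")):
        plan.append("retrieve_treatment")
    return plan


# Stage 3: compose a textual answer from the collected evidence.
def compose_answer(d: Diagnosis) -> str:
    lines = [
        f"Detected disease: {d.disease}",
        f"Lesion regions found: {len(d.lesion_boxes)}",
    ]
    if d.treatment_notes:
        lines.append("Suggested management: " + "; ".join(d.treatment_notes))
    return "\n".join(lines)


def diagnose(image_path: str, question: str) -> str:
    d = Diagnosis(image_path=image_path)
    for name in select_tools(question):
        TOOLS[name](d)  # each tool updates the shared Diagnosis record
    return compose_answer(d)


if __name__ == "__main__":
    print(diagnose("leaf.jpg", "What is this and how do I treat it?"))
```

Keeping the tools behind a name-to-function registry means the selector only has to emit tool names, which is the general shape of a tool-calling interface; swapping the keyword rule for a language model would leave the rest of the sketch unchanged.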
