Evaluating Large Language Models for Agricultural Injury Surveillance

Authors

  • Serap Gorucu, University of Florida
  • Jacob Muller
  • Daniel Petti
  • Changying Li
  • Matthew Pilz
  • Bryan Weichelt

Keywords:

Injury Surveillance, Large Language Models, Language Processing, Automation, Agriculture

Abstract

Collecting and disseminating data for agricultural injury surveillance typically depends on manual input and human review, which makes the process both time-consuming and labor-intensive. A prime example is AgInjuryNews (AIN), a public platform that compiles injury reports from news articles and investigations. Because the content is unstructured, AIN currently relies on human reviewers to extract relevant information. The rise of Large Language Models (LLMs) offers a promising avenue for automating this step. This study explored the potential of LLMs to assist in the reviewer role at AIN. Three models were evaluated for their accuracy in extracting incident- and victim-related details: OpenAI’s ChatGPT 3.5 and 4, and a fine-tuned version of Llama 2. Each model was tasked with identifying specific data points, such as drug or alcohol involvement, time of incident, and victim demographics, from a random sample of news articles previously reviewed by AIN staff. The fine-tuned Llama 2 was the top performer, with an average accuracy of 93% and perfect scores in some categories. Although none of the models was perfect overall, the results highlight the feasibility of integrating LLMs to streamline workflows, reduce resource demands, and make data collection and analysis more efficient.
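
The comparison described in the abstract, model output scored against fields already coded by human reviewers, can be illustrated with a minimal sketch. It assumes exact-match scoring after light text normalization, which the abstract does not confirm. The field names (`alcohol_or_drugs`, `time_of_incident`, `victim_age`) and the two toy records are hypothetical placeholders, not AIN's schema or the study's data.

```python
from typing import Dict, List

Record = Dict[str, str]


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not count as errors."""
    return " ".join(value.strip().lower().split())


def per_field_accuracy(
    predicted: List[Record], reference: List[Record], fields: List[str]
) -> Dict[str, float]:
    """Return, for each field, the share of articles where the model matches the human-coded value."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must cover the same articles")
    if not reference:
        raise ValueError("at least one article is required")
    scores: Dict[str, float] = {}
    for field in fields:
        matches = sum(
            normalize(p.get(field, "")) == normalize(r.get(field, ""))
            for p, r in zip(predicted, reference)
        )
        scores[field] = matches / len(reference)
    return scores


def average_accuracy(scores: Dict[str, float]) -> float:
    """Unweighted mean of the per-field accuracies."""
    return sum(scores.values()) / len(scores)


# Hypothetical field names and toy records, for illustration only.
FIELDS = ["alcohol_or_drugs", "time_of_incident", "victim_age"]

reference = [  # values as coded by human reviewers
    {"alcohol_or_drugs": "no", "time_of_incident": "afternoon", "victim_age": "54"},
    {"alcohol_or_drugs": "unknown", "time_of_incident": "morning", "victim_age": "7"},
]
predicted = [  # values as extracted by a model
    {"alcohol_or_drugs": "No", "time_of_incident": "afternoon", "victim_age": "54"},
    {"alcohol_or_drugs": "unknown", "time_of_incident": "evening", "victim_age": "7"},
]

scores = per_field_accuracy(predicted, reference, FIELDS)
overall = average_accuracy(scores)
```

Reporting the per-field scores alongside the average shows which categories a model handles reliably and which still need human review. Free-text fields would likely need a more tolerant comparison than exact match.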

Published

31-12-2025

How to Cite

Gorucu, S., Muller, J., Petti, D., Li, C., Pilz, M., & Weichelt, B. (2025). Evaluating Large Language Models for Agricultural Injury Surveillance. I. International Digital Agriculture Congress. Retrieved from https://www.indac.com.tr/index.php/TURSTEP/article/view/472