AI shows promise in evaluating complex forensic evidence
05 June 2025

An international research collaboration involving scholars from several countries has revealed that artificial intelligence (AI) systems, particularly those enhanced with specialised forensic knowledge, can substantially improve the evaluation of forensic evidence in legal contexts. The study used large language models (LLMs) as research “participants”, repeatedly presenting them with the experimental materials and resetting their memory between trials.
The study, published in the Journal of Psychology and AI, was led by researchers Francesco Pompedda (INVEST Research Flagship Centre, University of Turku, Finland) and Pekka Santtila (NYU Shanghai, China), alongside an international team.
This research exactly replicated a previous human-participant study by Garrett et al. (2020) examining how mock jurors evaluate firearm examiner testimony. Using advanced LLMs as participants sheds light on how AI systems process complex legal information compared with human decision-makers. The research arrives at a critical juncture, as courts worldwide grapple with the so-called 'CSI effect,' wherein jurors often overestimate the reliability of forensic evidence.
Key findings include:
- Knowledge-enhanced AI consistently provided more cautious and scientifically grounded evaluations of forensic evidence compared to standard AI models.
- AI models responded meaningfully to cross-examinations, lowering assessments of guilt and scientific credibility when forensic evidence was appropriately challenged.
- Unlike human jurors, AI systems did not convict in cases with inconclusive evidence, demonstrating more rigid adherence to legal standards of reasonable doubt.
Professor Pekka Santtila, Professor of Psychology at NYU Shanghai and corresponding author, commented:
“This study highlights a promising future role for AI in supporting legal decision-making, particularly in evaluating complex scientific evidence where human biases and misunderstandings frequently occur. By equipping AI systems with expert forensic knowledge, we can significantly enhance their ability to critically assess forensic claims—potentially addressing longstanding issues in legal decision-making.”
Santtila further elaborated on the findings:
“We formally tested and confirmed that standard AI models lacked the detailed forensic knowledge provided to enhanced models, emphasizing the need to equip AI deliberately for accurate forensic assessments.”
Dr Thomas Nyman, a co-author of the study, said: “This is a novel study that opens up important questions about how AI might eventually support decision-making in legal proceedings. Given the UK's ongoing efforts to prevent miscarriages of justice, this work provides a valuable starting point for exploring how technology might help address the human biases that can contribute to wrongful convictions in our courts.”
The study underscores the practical implications of integrating knowledge-enhanced AI into legal proceedings, potentially aiding jurors, judges, and legal professionals in interpreting forensic evidence more accurately, reducing wrongful convictions, and enhancing overall justice outcomes.
Pompedda, F., Santtila, P., Di Maso, E., Nyman, T. J., Sun, Y., & Zappala, A. (2025). Evaluating firearm examiner testimony using large language models: a comparison of standard and knowledge-enhanced AI systems. Journal of Psychology and AI, 1(1).