Don’t Let AI Do the Policing: Findings from a Multi-Stakeholder Workshop on Evaluating Large Language Models for Policing
In November 2025, PROBabLE Futures, the National Police Chiefs’ Council, and the Home Office ran a 1.5-day workshop with stakeholders from academia, government, industry, and policing to discuss what is needed in an LLM evaluation framework for policing use cases, and to test actual systems in practice through a hackathon. LLM evaluation here refers to the process of assuring that LLMs are fit for purpose and testing how well they work within policing workflows to support responsible adoption. The aim of the workshop was to gather the information required for practical guidance that helps police forces evaluate LLM systems, drawing on the insights and recommendations of stakeholders. This guidance will inform the rigorous testing of LLMs used in policing, in the interest of justice. Rigorous testing is required for evidence-based decisions about AI adoption.
Nneoma Ogbonna, Mackenzie Jorgensen, Muffy Calder, and Marion Oswald
PROBabLE Futures Policy Brief: Don’t...