PROBabLE Futures Policy Brief: Don’t Let AI Do the Policing: Findings from a Multi-Stakeholder Workshop on Evaluating Large Language Models for Policing

In November 2025, PROBabLE Futures, the National Police Chiefs’ Council, and the Home Office ran a 1.5-day workshop with stakeholders from academia, government, industry, and policing to discuss what an LLM evaluation framework for policing use cases needs, and to test actual systems in practice through a hackathon. Here, LLM evaluation refers to the process of assuring that LLMs are fit for purpose and testing how well they work within policing workflows, in support of responsible adoption. The workshop aimed to gather the information required for practical guidance that helps police forces evaluate LLM systems, drawing on the insights and recommendations of stakeholders. This guidance will inform the rigorous testing of LLMs used in policing, in the interest of justice; such testing is required for evidence-based decisions about AI adoption.

Authors

Nneoma Ogbonna, Mackenzie Jorgensen, Muffy Calder, and Marion Oswald
