By Marion Oswald, Mackenzie Jorgensen, Kyriakos N. Kotsoglou, Temitope Lawal
PROBabLE Futures
The headlines seem to say it all – Shaun Thompson, a youth worker stopped and questioned by police after being misidentified by live facial recognition (LFR), has lost his judicial review case against the Metropolitan Police. He was supported in his case by Big Brother Watch. The Met’s commissioner, Sir Mark Rowley, characterises this as an ‘important victory for public safety’.
In the judgment, the High Court Justices pointed out that they were only concerned with the legal issues that Mr Thompson had asked them to decide. These issues were limited to the legality of the Met’s LFR policy adopted in September 2024, and whether its terms had the ‘quality of law.’ Mr Thompson claimed that the policy left too much discretion to the police to decide how to deploy LFR. The court disagreed, ruling that the Met’s policy had enough detail on the ‘who, when and where’ questions about deployment of LFR by the police, and so met the standard of legality. This is what underpins the Commissioner’s claim of victory.
The Court was at pains to distinguish the present case from R (Bridges) v Chief Constable of South Wales Police, where the Court of Appeal found the legal framework governing facial recognition wanting. In Bridges, the Court found that it was not clear who could be placed on a watchlist, nor were there any criteria for determining where LFR could be deployed. By contrast, the High Court Justices in the Thompson case concluded that the September 2024 Met policy was a ‘long way from Bridges’ and that ‘[n]either those pitfalls, nor anything like them, exist in the Policy before us.’
Now let’s look at the judgment a little more closely. In the judicial equivalent of ‘just saying’, Justices Holgate and Farbey drew attention to five issues that had not been properly pleaded, i.e. not formulated with sufficient precision in Mr Thompson’s case. This meant that they could not reach any conclusions about these issues (but perhaps they were hinting that somebody should!):
‘[T]he risk and potential scope for discrimination on grounds of race was no more than faintly asserted’ by counsel for Mr Thompson. There is, however, substantial evidence showing that LFR tools perform less effectively on black faces. Where, then, is the systematic and transparent data that would enable the public, a claimant, a court, a regulator, or even the Government to assess misidentification rates among ethnic minorities, the demographics of those most exposed to LFR surveillance, and the overall effectiveness of such deployments? The routine deletion of watchlists by the police after an LFR operation means that there can be no systematic and independent assessment of how the lists have been compiled, and importantly whether the LFR operation has missed anyone.
The Equality and Human Rights Commission reported in their evidence that ‘the average rate of positive identifications relative to watchlist size was 0.06%. The highest rate was 0.16% and the lowest was 0%.’ What does this say about the effectiveness and fairness of this policing tactic? Would equivalent police time and resources have yielded greater returns if deployed differently?
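To give a sense of scale, the short sketch below converts those percentages into implied numbers of positive identifications. It is a rough illustration only: it assumes the rate is simply positive identifications divided by watchlist size, and it borrows the watchlist sizes of 10,000 to 17,000 from the Met's recent deployments mentioned elsewhere in this piece, which may not be the sizes behind the Commission's figures. The function name and variables are ours, chosen for illustration.

```python
# Illustrative arithmetic only. Assumption: "rate relative to watchlist size"
# means positive identifications / watchlist size. The watchlist sizes are the
# 10,000-17,000 range cited in this article, not the EHRC's own data.

def implied_identifications(watchlist_size: int, rate_percent: float) -> float:
    """Number of positive identifications implied by a percentage rate."""
    return watchlist_size * rate_percent / 100.0

# Rates reported in the EHRC evidence (percent of watchlist size).
RATES_PERCENT = {"lowest": 0.0, "average": 0.06, "highest": 0.16}

# Assumed watchlist sizes (lower and upper end of the range cited here).
WATCHLIST_SIZES = (10_000, 17_000)

for label, rate in RATES_PERCENT.items():
    for size in WATCHLIST_SIZES:
        count = implied_identifications(size, rate)
        print(f"{label:>7} rate ({rate:.2f}%) on a watchlist of {size:,}: "
              f"about {count:.1f} positive identifications")
```

On these assumptions, the average rate corresponds to roughly 6 to 10 positive identifications for a watchlist of 10,000 to 17,000 names.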
In order to satisfy their equality duties, the police rely on a 2023 evaluation by the National Physical Laboratory (NPL) carried out in a test environment, which reports that using a match similarity threshold of 0.6, there were no statistically significant differences in the LFR tool’s performance across races and genders tested. Of course, that does not mean that the tool performs perfectly; differences in error rates can still cause unfairness in practice even if not statistically significant. Lowering this match threshold, moreover, was shown to introduce racial bias.
Yet, in the Thompson case ‘there was no challenge to the NPL methodology or to its conclusions under test conditions.’ Which raises the question: why not, bearing in mind the test limitations that the NPL itself acknowledges? We require more systematic, iterative and exacting scrutiny of these tools, investigating the sources of uncertainties within them, including threshold calibration and determination, image quality, demographic variation and environmental conditions. This entails rigorous assessment under real-world conditions, the incorporation of feedback loops, and the development of standardised evaluation procedures (including metrics and frameworks), alongside robust quality standards governing the construction of test datasets.
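The distinction between ‘not statistically significant’ and ‘no difference’ can be illustrated with a small sketch. The numbers below are entirely hypothetical and are not taken from the NPL evaluation; the test shown (a pooled two-proportion z-test) is a standard textbook method that we have chosen for illustration, not necessarily the one the NPL used.

```python
import math

def two_proportion_z_test(errors_a: int, n_a: int, errors_b: int, n_b: int):
    """Pooled two-proportion z-test (two-sided).

    Returns (rate_a, rate_b, z, p_value). Uses the normal approximation,
    which is only rough when error counts are small, as they are here.
    """
    rate_a = errors_a / n_a
    rate_b = errors_b / n_b
    pooled = (errors_a + errors_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value for a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# HYPOTHETICAL test results: false matches out of 1,000 trials per group.
rate_a, rate_b, z, p_value = two_proportion_z_test(2, 1000, 6, 1000)

print(f"Group A error rate: {rate_a:.3%}")
print(f"Group B error rate: {rate_b:.3%}")
print(f"Ratio of error rates: {rate_b / rate_a:.1f}x")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
print("Significant at 5%?", p_value < 0.05)
```

In this invented example, one group's error rate is three times the other's, yet the difference does not reach the conventional 5% significance level because so few errors are observed. A test of this size cannot rule out a disparity that would matter in practice once a tool is scanning many thousands of faces.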
The Met prepares large watchlists (10,000-17,000 names), as demonstrated by data from its recent deployments. It was argued on behalf of Mr Thompson that identical watchlists had been deployed in different locations ‘in the hope that someone on a watchlist would come past’, suggesting an arbitrary approach. Although the Court could not draw any conclusions, the Justices did agree ‘that these very large numbers mean that the inclusion of two identically-sized watchlists is not easily explicable as coincidence and that, if two watchlists are precisely the same size, the same watchlist may have been used twice for different locations’, a point that would be relevant to whether the Met was following its own policy. Furthermore, what proportion of the population for the deployment location do the numbers on the watchlist represent, and can these numbers be rationalised as reflecting a targeted or focused approach?
The Court made the point that LFR is about locating individuals; so if a watchlist is capable of identifying some people (i.e. those with a connection to the deployment location), it may also include others where the Policy does not require any connection to the location (such individuals, however, are unlikely to be encountered in that deployment context). This has no effect on the scale of intrusion on innocent people, but – the Court emphasised – ‘[w]hether this is an effective style of policing is another question which we have not been asked to consider and is not in itself a question for this court.’ We might be more comfortable with the use of a broader (‘catch all’ style) watchlist at locations such as border controls in airports or seaports; but we would ordinarily expect a more precision-oriented approach in city centres or residential streets.
This case was about the contents of a policy, not about its operational application. Indeed, following the NPL’s evaluation, the Met amended an older policy that was in place when Mr Thompson was misidentified, and paid him an undisclosed settlement sum. The case was not about the merits or demerits of the use of LFR by the police – ‘Judicial review is simply the means of ensuring that public bodies act within the limits of their legal powers, and in accordance with the Human Rights Act 1998, as well as any relevant procedures and legal principles governing the exercise of their decision-making functions.’ It is for others – Parliament, regulators and public authorities – to decide on how the police carry out their functions, including the use of LFR.
So, who in the future will independently scrutinise the application of the LFR policy in practice, rather than just the policy itself? Who will verify that the match threshold settings remain appropriate in operational settings? Who will assess whether the legal requirement of reasonable suspicion for stops and arrests is properly maintained where an LFR match is part of the decision-making process? As is well understood, reality on the ground is often very different to theory. Mr Thompson is unlikely to be the last individual misidentified by LFR; as deployment expands, similar experiences are likely to recur, disproportionately affecting members of minoritised groups.
Final thoughts
Court cases are ill-suited for deciding important issues of social policy; the courts cannot be expected to engage in detailed scrutiny of operational policing. But framing this case as a ‘victory’ or a ‘fight’ is neither helpful nor appropriate. Policing by consent is a fundamental principle in the UK; setting up the LFR debate as a conflict only encourages both ‘sides’ to entrench their positions, displacing deliberation with claims of ‘winning’ rather than encouraging transparent evaluation and evidence-led assessment. The policy adjustments already made by the police ought to be acknowledged. It should not, however, be regarded as a ‘win’ for LFR that its further expansion proceeds before the Court’s ‘just saying’ concerns are properly addressed.
What is required, more than ever, is an independent regulator with teeth: one with meaningful enforcement powers, capable of intervening in practice rather than merely reviewing policy on paper, and of supplying the degree of institutional objectivity that the present framework lacks. Just saying…