Tottenham Report: Confirmation bias in the jury box – and on the Gambling Commission

Wednesday, September 10, 2025 11:00 PM
Andrew Tottenham — Managing Director, Tottenham & Co

I was watching a television programme called “The Jury”, a fascinating social experiment designed to see how reliable juries really are.

The premise is simple, but ingenious: Take a real murder trial, use actors and a word-for-word script from the court, and run the trial before two juries. Neither jury knows that the other exists and is watching the same trial and the exact same evidence unfold. Cameras are installed in both jury boxes and deliberation rooms.

What becomes immediately obvious is that most of the jurors don’t really understand what’s expected of them. Instead of listening carefully to the prosecution and defence cases, weighing up the evidence, and then reaching a verdict, many members of both juries make up their minds almost instantly. From then on, they simply fit the evidence into the story they’ve already decided is true. It’s a textbook case of “confirmation bias”.

The stronger personalities treat deliberation like a contest: Winning means browbeating others into changing their minds. Success is defined not by carefully examining the evidence, but by persuading the room.

Having served on a jury myself — albeit in a minor case — I can say that this rings true. People like to think they’re open-minded, but most of us start with a hunch, then backfill the reasoning.

So what does this have to do with gambling? Quite a lot, as it turns out.

The Gambling Commission recently released the results of a much-anticipated review of the Gambling Survey for Great Britain (GSGB). This is the survey now used to measure gambling participation and harm across England, Scotland, and Wales. It replaced the long-running Health Survey for England (HSE), which for years had included questions about gambling.

Why the change? The Commission argued that the GSGB was better tailored to gambling behaviours, could be more specific and comprehensive, covered Scotland and Wales as well as England, and was ultimately more accurate (and cheaper). But the shift has been controversial, not least because the GSGB reports higher rates of gambling harm than the HSE ever did.

Critics, myself included, wondered: Is this because the GSGB is genuinely a better tool, or because of the way it recruits participants and asks questions?

To address this criticism, the Commission turned to a series of experiments led by Professor Patrick Sturgis of the London School of Economics. The idea was to test whether different aspects of survey design could explain the discrepancies.

Sturgis and his colleagues conducted three experiments using volunteers from a NatCen research panel. It’s worth noting from the outset that this panel was self-selected, not randomly recruited, which already raises questions about representativeness.

In the first experiment, some participants were invited to complete a survey advertised as being about “gambling,” while others were told it was about “health and recreation.”

The result? The “gambling” survey attracted significantly more gamblers and produced higher rates of reported gambling harm. In other words, the very act of labelling the survey skewed who took part and what they reported.

This is a classic case of topic salience bias. Just as a survey called “Fast Cars” would over-recruit petrolheads, a survey called “Gambling” pulls in gamblers, and quite possibly the more highly engaged ones at that.
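To see how much this can distort a headline figure, here is a back-of-the-envelope simulation. Every number in it is invented purely for illustration; only the mechanism, gamblers being more likely than non-gamblers to respond to a survey badged “gambling”, comes from the experiment.

```python
import random

# A minimal sketch of topic salience bias. All numbers are invented.
random.seed(1)

TRUE_PREVALENCE = 0.30        # assumed: 30% of the population gambles
P_RESPOND_NONGAMBLER = 0.10   # assumed response propensity, non-gamblers
P_RESPOND_GAMBLER = 0.20      # assumed: "gambling" label doubles gamblers' propensity

# Simulate the population, then let each person decide whether to respond.
population = [random.random() < TRUE_PREVALENCE for _ in range(100_000)]
respondents = [
    is_gambler
    for is_gambler in population
    if random.random() < (P_RESPOND_GAMBLER if is_gambler else P_RESPOND_NONGAMBLER)
]

print(f"True prevalence:  {TRUE_PREVALENCE:.0%}")
print(f"Survey estimate:  {sum(respondents) / len(respondents):.0%}")  # roughly 46%
```

With these invented propensities, a true rate of 30% shows up as roughly 46% among respondents, even though not a single person misreports anything.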

The second experiment compared responses between an online self-completion survey and a telephone interview with a live interviewer.

Here, respondents in the online survey reported more gambling than those speaking directly to an interviewer. The interpretation is that people may downplay their gambling habits when they have to own up to them out loud to another human being.

But there’s a catch: the HSE — the survey the GSGB replaced — didn’t use telephone interviews. Instead, it relied on face-to-face interviews with self-completion booklets, where the interviewer didn’t see the answers. So the experiment doesn’t really tell us much about the HSE at all.

Perhaps most tellingly, only 1% of online respondents said they would have reported “less gambling” if interviewed by phone. Hardly compelling evidence of widespread social desirability bias.

The final experiment looked at whether providing a longer list of gambling activities changed reported participation rates or scores on the Problem Gambling Severity Index (PGSI).

The result? Hardly any difference. This undermines one of the Gambling Commission’s earlier claims — that longer lists in the GSGB partly explain its higher figures compared to the HSE.
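As an aside for readers unfamiliar with the index: the PGSI is a nine-item screen in which each answer scores 0 to 3, and the 0–27 total maps onto conventional risk bands. A minimal sketch of that standard scoring rule, in Python for concreteness:

```python
# Standard PGSI scoring: nine items, each answered 0-3
# (never / sometimes / most of the time / almost always).
# The 0-27 total maps onto the conventional risk bands below.

def pgsi_category(item_scores: list[int]) -> str:
    """Classify a respondent from their nine PGSI item scores."""
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("PGSI expects nine item scores, each 0-3")
    total = sum(item_scores)
    if total == 0:
        return "non-problem gambling"
    if total <= 2:
        return "low-risk gambling"
    if total <= 7:
        return "moderate-risk gambling"
    return "problem gambling"   # total of 8 or more

print(pgsi_category([1, 0, 0, 1, 0, 0, 0, 0, 0]))  # low-risk gambling
```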

So what did these experiments teach us?

They showed convincingly that topic salience matters: Calling something a gambling survey increases reported gambling rates.

They suggested, but didn’t prove, that interviewer presence may affect responses, though the link to the HSE is tenuous.

They showed that longer activity lists don’t make much difference.

In short, interesting findings, but they don’t resolve the central question of whether the GSGB is genuinely more reliable than the HSE.

Even Professor Sturgis himself was cautious in his interpretation. He speculated that higher scores in self-administered surveys may indicate greater honesty, but admitted this was just that — speculation. He also downplayed alternative explanations, such as inattentive responding or differences in how questions were interpreted.

Yet despite these limitations, the Gambling Commission’s conclusion was triumphant: The experiments “build our confidence in the outputs of the GSGB” and will “improve guidance for users.”

How so? The Commission leaned heavily on Experiment 2, treating it as evidence that self-completion methods (used in the GSGB) encourage “more honest responses” than interviewer-administered surveys. The awkward fact that the HSE used self-completion for the key gambling harm questions was quietly brushed aside.

Meanwhile, the strongest and clearest finding, the topic salience effect from Experiment 1, was largely downplayed. After all, if you admit that advertising the GSGB as a “gambling survey” over-recruits gamblers, you might have to concede that its higher harm estimates are inflated.

In other words, just like the jurors in “The Jury,” the Commission had already reached its conclusion. The experiments were then slotted neatly into the narrative. Confirmation bias at work.

This isn’t just an academic spat. The numbers produced by the GSGB feed directly into public debate, media reporting, and ultimately government policy on gambling regulation. If the GSGB systematically overestimates harm, the risk is that policy becomes skewed.

On the other hand, if the GSGB is genuinely more accurate than the HSE, downplaying its findings could leave real harms underestimated and under-addressed. Either way, the stakes are high.

That’s why methodological transparency matters so much. We need to be honest, not only about what surveys find, but also about what they can’t tell us. When limitations are glossed over and uncertainties minimised, we risk building policy on shaky ground.

Watching “The Jury”, I was struck by how quickly people settle on a narrative and then interpret the evidence through that lens. The Gambling Commission’s handling of the GSGB review feels uncomfortably similar.

The experiments by Sturgis and his team shed some light, but not enough to deliver a final verdict. Yet the Commission seems eager to treat them as proof that the GSGB is the superior instrument.

Perhaps the fairest conclusion, then, is the one the programme title itself suggests: The jury is still out.

Until we have genuinely independent, transparent testing, using representative samples and designs that map directly onto the surveys in question, we’re left with more questions than answers. And when it comes to shaping gambling policy, that should give us all pause.