If you’ve ever sat through a heated debate about polygraph validity – and I know many of you have – you’ll recognise the usual battleground. On one side, the frequentist p-values, significance thresholds, and confidence intervals. On the other, the Bayesian cry: “But what about the prior probability of deception?”
And right in the middle? Paralysis.
I’ve just finished reading a paper that might offer a way out. It’s by Leonhard Held and colleagues, published in Research Synthesis Methods (2022), and it’s about Reverse-Bayes methods. I’ll be honest: I picked it up expecting dense, impenetrable maths. What I found was a genuinely practical toolkit – one that speaks directly to the problems we face in polygraph testing.
Let me explain why you should care.
The Problem We All Know Too Well
In polygraph examination, we’re constantly wrestling with two uncomfortable facts:
- Base rates matter, but we rarely know them. The prior probability that a suspect is deceptive varies wildly across settings – counter-intelligence, criminal investigation, employment screening. Without a prior, pure Bayesian inference stalls.
- NHST (null-hypothesis significance testing) gives us p-values, but as the paper notes bluntly, these are “not fit for purpose.” A p = 0.04 doesn’t tell you the probability that the examinee is lying. Yet many practitioners (and lawyers) act as if it does.
The result? Endless circular arguments about whether polygraph “works” – arguments that never resolve because each side starts from different hidden assumptions.
What is Reverse-Bayes, in Plain English?
Conventional Bayes says: Prior + Data = Posterior. That’s “forward” Bayes. You need a prior to get started.
Reverse-Bayes flips the equation: Posterior − Data = Prior.
In other words: You decide what level of credibility you want to achieve (say, 95% certainty that an effect exists), and the method tells you what prior belief you would have needed to get there.
Then you ask the killer question: Is that prior reasonable, given what we already know?
This is genius because it doesn’t force you to specify a prior upfront – a task that often feels arbitrary in polygraph work. Instead, it forces a conversation about what assumptions would have to be true for your conclusion to be credible.
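To make that concrete with a toy calculation of my own (the Bayes factor of 10 below is invented for illustration, not a polygraph figure), the bookkeeping is easiest in odds form: forward Bayes multiplies prior odds by the Bayes factor, so reverse-Bayes simply divides back out.

```python
def required_prior_prob(target_posterior: float, bayes_factor: float) -> float:
    """Reverse-Bayes in odds form: given the posterior probability you want
    to end up with and the Bayes factor the data supply, return the prior
    probability you would have needed to start from."""
    posterior_odds = target_posterior / (1 - target_posterior)
    prior_odds = posterior_odds / bayes_factor  # forward Bayes, inverted
    return prior_odds / (1 + prior_odds)

# To reach 95% posterior certainty from data with a Bayes factor of 10,
# the prior probability would already have to be about 0.655:
print(round(required_prior_prob(0.95, 10), 3))  # 0.655
```

The output is the number you then interrogate: is a two-in-three prior belief in deception defensible before the examinee is even wired up? That question, not the arithmetic, is where the method earns its keep.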
How This Applies to Polygraph Testing
Let me give you three concrete examples from my own practice.
1. The Intrinsic Credibility of a “Deceptive” Result
Imagine a single-issue polygraph examination where the physiological score produces a p-value of, say, 0.02. Conventional thinking says “significant at the 5% level.” But what does that actually buy you?
The paper introduces a concept called intrinsic credibility – a way to assess whether a finding is strong enough to stand on its own, without appealing to external prior evidence. For a novel, “out of the blue” effect, they show that a p-value needs to be ≤ 0.0056 to be intrinsically credible at the 95% level.
Yes, you read that right. 0.0056 – not 0.05.
For polygraph, this is sobering. Most laboratory validation studies produce p-values well above that threshold. And field studies? With all their confounds? Even more so. The implication is not that polygraph is useless – but that we should stop claiming single examinations produce “scientific proof” at conventional significance levels. They don’t.
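That 0.0056 is not a magic constant; it falls out of the definition. Intrinsic credibility at the 95% level requires the z-value to clear √2 times the usual 1.96 cutoff, and the corresponding two-sided p-value can be checked in a few lines of standard-library Python:

```python
from math import sqrt, erf

def norm_sf(z: float) -> float:
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * (1 - erf(z / sqrt(2)))

z_conventional = 1.96                      # two-sided 5% significance
z_intrinsic = sqrt(2) * z_conventional     # ~2.77, the stricter bar
p_threshold = 2 * norm_sf(z_intrinsic)     # two-sided p-value at that bar

print(round(p_threshold, 4))  # 0.0056
```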
2. The Sceptical Prior: Challenging Your Own Findings
Suppose you’ve conducted a meta-analysis of polygraph accuracy studies and found a pooled odds ratio of 5.0 (meaning the odds of being classified deceptive are five times higher for deceptive examinees than for truthful ones). That looks impressive.
But a sceptic could say: “Your prior belief in polygraph validity was too generous.”
Reverse-Bayes lets you quantify that scepticism. It asks: How sceptical would you have to be – what prior distribution – to make that odds ratio no longer credible? The answer comes as a scepticism limit – a critical prior interval. If external evidence (e.g., from covertly confirmed cases) falls inside that interval, then even a statistically significant finding might be fragile.
In polygraph, this is gold. It gives us a way to audit our own conclusions. We can say: “For this finding to be non-credible, you would have to believe, before seeing the data, that the true odds ratio lies between X and Y. Is that belief reasonable?” If not, the finding stands.
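Here is a sketch of what that audit can look like, assuming the usual normal approximation on the log odds-ratio scale. The odds ratio of 5.0 and the standard error of 0.30 are invented numbers, not data from any real meta-analysis. The sufficiently sceptical prior is the zero-mean normal prior just wide enough that the posterior 95% interval still touches "no effect":

```python
from math import log, exp, sqrt

def sceptical_prior_sd(theta_hat: float, se: float, z: float = 1.96) -> float:
    """Zero-mean normal prior sd tau at which the posterior 95% lower
    limit for theta (a log odds ratio) hits exactly zero.  Solves
        theta_hat * tau / sqrt(tau**2 + se**2) = z * se
    for tau; requires a significant result (theta_hat > z * se)."""
    assert theta_hat > z * se, "result must be significant to begin with"
    return sqrt(z**2 * se**4 / (theta_hat**2 - z**2 * se**2))

theta_hat = log(5.0)  # hypothetical pooled log odds ratio
se = 0.30             # hypothetical standard error
tau = sceptical_prior_sd(theta_hat, se)

# The scepticism limit, expressed as a 95% prior interval for the OR:
lo, hi = exp(-1.96 * tau), exp(1.96 * tau)
print(round(lo, 2), round(hi, 2))  # 0.79 1.26
```

In this invented case, a sceptic would have to be confident, before seeing any data, that the true odds ratio lies between roughly 0.79 and 1.26, essentially that the polygraph is uninformative. If the external evidence cannot support a prior that tight, the finding stands.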
3. The Advocacy Prior: Rescuing Non-Significant Results
The flip side: what about a study that shows a positive effect but fails to reach significance (a common story in small-sample polygraph research)? The paper’s advocacy prior tells you what prior belief would be needed to make that result credible.
In the example they give (a small COVID-19 corticosteroid trial), the advocacy range was enormous – implying the trial provided very weak evidence. But in other contexts, the range might be narrow. For polygraph, this could help decide whether a non-significant finding is genuinely uninformative or simply underpowered.
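For flavour only, here is a deliberately simplified version of that calculation. I assume an optimistic normal prior with the same precision as the data, which is my own simplification rather than the paper's construction, and ask how large its mean must be before the result becomes credible:

```python
from math import sqrt

def advocacy_prior_mean(theta_hat: float, se: float, z: float = 1.96) -> float:
    """Assuming an optimistic normal prior with the same precision as the
    data, N(mu, se**2), return the smallest prior mean mu at which the
    posterior 95% lower limit for theta becomes positive.
    Posterior: mean (theta_hat + mu) / 2, sd se / sqrt(2)."""
    return sqrt(2) * z * se - theta_hat

# Hypothetical small study: positive but non-significant log odds ratio
theta_hat, se = 0.40, 0.35  # z-value ~1.14, p > 0.05
mu = advocacy_prior_mean(theta_hat, se)
print(round(mu, 2))  # 0.57
```

Even under this generous setup, the advocate must have believed in an effect larger than the one actually observed (a prior mean of 0.57 against an estimate of 0.40), which is the signature of weak evidence.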
The False Positive Risk: What P-Values Actually Mean
One section of the paper should be required reading for every polygraph examiner who has ever testified in court.
They analyse the false positive risk (FPR) – the probability that a “significant” result is actually a false positive. Many people mistakenly think that p = 0.05 means a 5% chance of error. The paper shows, using Reverse-Bayes, that to achieve a 5% FPR with p = 0.05, the prior probability of the null hypothesis (i.e., that the examinee is truthful) must be no more than 11–28%, depending on the assumptions.
In plain English: unless you already believe deception is highly likely (a prior probability of roughly 72–89%, the complement of that 11–28%), a p = 0.05 from a polygraph examination does not give you 95% confidence in a deceptive outcome.
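The 11% end of that range can be reproduced with the Sellke–Bayarri–Berger minimum Bayes factor bound, BF01 ≥ −e·p·ln(p) for p < 1/e; other bounds and power assumptions give the higher figures, which is where the spread in the paper's range comes from:

```python
from math import e, log

def max_null_prior_for_fpr(p_value: float, fpr_target: float) -> float:
    """Largest prior probability of the null under which an observed
    p-value still yields a false positive risk at or below fpr_target.
    Uses the minimum Bayes factor bound  BF01 >= -e * p * ln(p)
    (valid for p < 1/e); the true evidence for the null can only be
    stronger, so real data can only tighten this ceiling."""
    min_bf01 = -e * p_value * log(p_value)      # lower bound on BF for H0
    post_odds = fpr_target / (1 - fpr_target)   # target posterior odds of H0
    prior_odds = post_odds / min_bf01           # the reverse-Bayes step
    return prior_odds / (1 + prior_odds)

print(round(max_null_prior_for_fpr(0.05, 0.05), 3))  # 0.114
```

So even under this most generous reading of p = 0.05, the prior probability that the examinee is truthful must already be below about 11% for the examination to deliver a 5% false positive risk.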
That’s a bombshell for courtroom testimony.
What This Means for Investigative Psychology
I’m not arguing that Reverse-Bayes is a magic bullet. The methods require some comfort with statistical reasoning, and the paper acknowledges that professional statisticians may debate the finer points. But for practitioners like us, the mindset is transformative.
Instead of pretending we have no prior beliefs, or pretending that p-values answer the wrong question, we can:
- Quantify scepticism in terms of hypothetical prior studies (“your challenge to this finding is equivalent to believing in a prior study with N = X events”)
- Set credibility thresholds that reflect real-world consequences (e.g., “I will only regard a polygraph result as actionable if the required prior probability of deception is less than 20%”)
- Assess replication success when multiple polygraph examinations or studies are compared
The authors conclude that Reverse-Bayes “can play a key role in the scientific enterprise of the 21st century.” For polygraph – a field still fighting for scientific legitimacy – that role might be decisive.
A Challenge to My Colleagues
Next time you present polygraph accuracy data, try this:
- Calculate the p-value or confidence interval for your effect.
- Check the credibility ratio: divide the upper 95% CI limit by the lower (for ratio measures such as odds ratios, take the ratio of the logs of the limits). If it exceeds 5.8, the finding lacks intrinsic credibility.
- Ask: What prior probability of deception would be needed for this result to achieve a 5% false positive risk?
- Be honest about whether that prior is plausible.
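The credibility-ratio check above is mechanical enough to script. The confidence intervals below are invented; the exact threshold is (√2 + 1)² ≈ 5.83, and for ratio measures such as odds ratios the check applies to the CI limits on the log scale:

```python
from math import log, sqrt

def intrinsically_credible(ci_low: float, ci_high: float) -> bool:
    """Credibility-ratio check for a significant ratio measure (both
    95% CI limits above 1): on the log scale the ratio of the upper to
    the lower limit must stay below (sqrt(2) + 1)**2, about 5.83."""
    assert ci_low > 1.0, "check assumes a significant positive ratio"
    return log(ci_high) / log(ci_low) < (sqrt(2) + 1) ** 2

# Hypothetical 95% CIs for a polygraph-accuracy odds ratio:
print(intrinsically_credible(1.2, 6.0))  # False: significant, yet not credible
print(intrinsically_credible(2.0, 8.0))  # True
```

The first interval is the uncomfortable case: clearly "significant" by the conventional standard, yet too fragile to stand on its own.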
I suspect many of our field’s cherished findings will not survive this scrutiny. That’s uncomfortable. But it’s also how science progresses.
We owe it to the courts, to the examinees, and to ourselves to stop pretending that polygraph evidence is stronger than it is – and to start using statistical tools that match the complexity of the problem.
Reverse-Bayes won’t make polygraph perfect. But it might make us more honest.
Dr. Keith Ashcroft is an investigative psychologist specialising in credibility assessment and evidentiary statistics. He consults to law enforcement and security agencies on the interpretation of psychophysiological data. The views expressed are his own.
Further reading: Held, L., Matthews, R., Ott, M., & Pawel, S. (2022). Reverse-Bayes methods for evidence assessment and research synthesis. Research Synthesis Methods, 13(3), 295-314.