
Bayesian AI!


I believe that AI and Bayesian reasoning might be the chocolate and peanut-butter we’ve been looking for.


One of the major challenges we currently face is the Rage Machine, in which we are exposed to a fire-hose of misinformation, disinformation, and malinformation. Trying to understand and navigate this environment was hard enough ten years ago, but has now been turbo-charged by AI systems which are being used to flood the zone in order to drive a political agenda.


When we use AI systems like ChatGPT, Claude, or Grok, it’s important to recognize that these systems “learn” during training by nudging the probabilities that a given word will come next, and then generate responses by “rolling dice” to select which word to use. They are not “alive” and they are not “intelligent”. That feeling we get that they are so “human” in their responses comes from two things: their responses are built on training data generated by humans, and the “agent detection” wetware in our brains is primed to see minds everywhere.
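To make the “rolling dice” part concrete, here is a toy sketch in Python. The three-word vocabulary and its probabilities are invented purely for illustration; a real LLM works with tens of thousands of tokens and probabilities produced by a neural network.

```python
import random

# Toy next-token distribution, purely illustrative: the model assigns a
# probability to each candidate word, then samples one ("rolls dice").
next_token_probs = {
    "cat": 0.55,
    "dog": 0.30,
    "wombat": 0.15,
}

def sample_token(probs, rng):
    """Pick one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random()
print(sample_token(next_token_probs, rng))  # usually "cat", sometimes not
```

Run it a few times and you get different words, weighted by the probabilities – which is exactly why the same prompt can yield different answers.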


When they do give accurate responses, it is mainly because of the accuracy of the data used for training. If that data is pulled from relatively reliable services – i.e., ones with a lot of curation to ensure accuracy – you will get better results, but it’s not enough. To get a little closer to accuracy, AI systems include “guardrails” of various types, sometimes including specialized AI systems that act as gatekeepers.


To be effective, the guardrails need to be able to evaluate the accuracy of statements, and need to be able to perform some basic “reasoning”, which is dependent on the facts available.


But how do we evaluate those facts, especially when so much of the data which might be used for training is of such variable quality?


Take Wikipedia. Enormous time and effort are devoted to giving the system as many checks and balances as possible, but humans are humans, and not all of the editors are operating in good faith. Overall, I think they do a fantastic job, but it’s still necessary to cross-check and dig into sources to ensure the accuracy of what you find. In fact, this is true of every source out there.


Consider the wombat. To the best of my knowledge, the wombat is not front-and-centre in any social or political movements, so I would assume that the information in Wikipedia is generally accurate. Not guaranteed, not 100%, but I would consider it fine for an initial source.


Now consider the abortion debate. My assumption is that statements taken from this article are due a higher degree of scrutiny, since the issue is tightly tied to various social and political movements. The simple fact that the issue is controversial increases the chance of bias, and increases the likelihood that vandals may try to sabotage the page.


A useful heuristic might be to give a higher initial confidence level to pages dealing with non-controversial topics, unless you have reason to believe there is an issue, in which case the page should be reviewed in more detail. Conversely, if a page deals with highly controversial topics, a lower initial confidence level should be given, unless some cross-check or deeper investigation is performed.
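That heuristic could be sketched in Python as a mapping from a controversy score to a starting confidence. The score, the thresholds, and the confidence values below are all made up; the point is only that the starting level is a prior to be revised by later checks, not a verdict.

```python
# Hedged sketch of the heuristic above: assign an initial confidence
# ("prior") based on how controversial a topic is. All numbers are
# illustrative only.
def initial_confidence(controversy_score):
    """Map a 0.0-1.0 controversy score to a starting confidence level."""
    if controversy_score < 0.2:      # the wombat case
        return 0.9
    elif controversy_score < 0.6:    # mildly contested topics
        return 0.7
    else:                            # the abortion-debate case
        return 0.4

print(initial_confidence(0.1))  # 0.9
print(initial_confidence(0.8))  # 0.4
```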


Or, put more simply, our evaluation tools need to use some form of Bayesian reasoning. In order to do this, we would need to have some mechanism for measuring the level of confidence for a given data source, or a given fact.
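The core of that reasoning is Bayes’ rule: start from a prior confidence that a claim is true, then update it when a source reports the claim, weighted by how reliable that source has proven to be. A minimal Python sketch, with hypothetical numbers:

```python
def bayes_update(prior, p_report_given_true, p_report_given_false):
    """Posterior probability a claim is true, given that a source reported it.

    Bayes' rule:
        P(true | report) = P(report | true) * P(true) /
            (P(report | true) * P(true) + P(report | false) * P(false))
    """
    numerator = p_report_given_true * prior
    denominator = numerator + p_report_given_false * (1.0 - prior)
    return numerator / denominator

# Hypothetical numbers: a claim we initially rate 50/50, reported by a
# source that reports true claims 90% of the time and false ones 20%.
posterior = bayes_update(0.5, 0.9, 0.2)
print(round(posterior, 3))  # 0.818
```

A single report from a decent source moves us from 50% to about 82% – and the same arithmetic, run in reverse, keeps a sloppy source from moving us much at all.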


For something like a Large Language Model (LLM), we would need a way to specify the confidence level associated with a given data source, possibly with some way to break it down, along with some mechanism for adjusting the confidence levels at regular intervals, depending on updated evaluations.
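One hypothetical shape for such a mechanism: a per-source record with an overall confidence level and an optional per-topic breakdown, re-evaluated at regular intervals. The names and numbers below are invented for illustration, not real assessments of any source.

```python
from dataclasses import dataclass, field

@dataclass
class SourceConfidence:
    """Hypothetical per-source confidence record for training data."""
    name: str
    overall: float                                 # 0.0-1.0 confidence
    by_topic: dict = field(default_factory=dict)   # optional breakdown

    def confidence_for(self, topic):
        """Topic-specific confidence, falling back to the overall level."""
        return self.by_topic.get(topic, self.overall)

# Illustrative only: controversial topics get a lower per-topic level.
wikipedia = SourceConfidence(
    name="Wikipedia",
    overall=0.85,
    by_topic={"wombats": 0.95, "abortion": 0.60},
)
print(wikipedia.confidence_for("wombats"))    # 0.95
print(wikipedia.confidence_for("astronomy"))  # 0.85 (fallback)
```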


Easy, right?


Sadly, no. First, it would require significant effort to define how this process would work, then more effort to set up the infrastructure and initial estimates, and still more to establish the processes by which updates would be made. And then, some process would be required to “re-train” the LLM to incorporate changes to the confidence levels, which might possibly be harder than all of the other things.


While I think we should do something like this, I also think it’s very difficult and that we probably won’t do it – at least not universally.


Another possibility is to build some form of Bayesian reasoning into the guardrails, which often include smaller, specialized AI systems. These systems could be specific to certain areas, could have heuristics built in to determine default levels of confidence, and could be adjusted far more easily.


A system like this could be used to improve accuracy by evaluating sources and adjusting those evaluations over time. This has the potential to improve accuracy, but is dependent on having solid processes for evaluating accuracy and quality.
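One standard way to adjust those evaluations over time is to track each source’s record as a Beta distribution: every fact that passes or fails a cross-check nudges the reliability estimate. A sketch, assuming we already have some cross-checking process that yields pass/fail outcomes:

```python
class SourceReliability:
    """Track a source's reliability as a Beta distribution over outcomes."""

    def __init__(self, successes=1, failures=1):
        # Beta(1, 1) prior: no strong opinion until evidence arrives.
        self.successes = successes
        self.failures = failures

    def record(self, passed_cross_check):
        """Update the tally with one cross-check result."""
        if passed_cross_check:
            self.successes += 1
        else:
            self.failures += 1

    @property
    def estimate(self):
        """Posterior mean of the Beta distribution."""
        return self.successes / (self.successes + self.failures)

source = SourceReliability()
for outcome in [True, True, True, False, True]:  # simulated cross-checks
    source.record(outcome)
print(round(source.estimate, 2))  # 0.71
```

The appeal of this scheme is that it never needs retraining – the estimate just drifts with the evidence.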


Simple to say, but hard to do. Such a system would need ways to cross-check facts for accuracy and to identify contradictions. Fortunately, though, reality is consistent with itself, while pseudoscience and misinformation are generally not.
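As an illustration of contradiction-spotting, here is a deliberately naive sketch that reduces claims to (subject, predicate, value) triples and flags pairs where two sources disagree. Real claim extraction and matching is far harder than this; the triples below are my own toy encoding, not anything either publisher actually states in this form.

```python
# Naive sketch: two sources contradict each other when they assign
# different values to the same (subject, predicate) pair.
def find_contradictions(claims_a, claims_b):
    """Return (subject, predicate) pairs where the two sources disagree."""
    index = {(s, p): v for (s, p, v) in claims_a}
    return [
        (s, p)
        for (s, p, v) in claims_b
        if (s, p) in index and index[(s, p)] != v
    ]

# Toy encodings of two hypothetical source positions.
source_a = [("acupuncture", "efficacy", "unclear")]
source_b = [("acupuncture", "efficacy", "none")]
print(find_contradictions(source_a, source_b))  # [('acupuncture', 'efficacy')]
```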


Consider acupuncture. While the Mayo Clinic is highly regarded, its article on acupuncture describes the practice vaguely and does NOT make clear that it is pseudoscientific. In contrast, the article at Science-Based Medicine notes that:


“Acupuncture has recently been transplanted to the West, riding the wave of tolerance for unscientific treatment practices marketed as ‘complementary and alternative medicine.’ While advocates have been successful at pushing acupuncture into the culture, the scientific medical community has still not accepted the practice as a legitimate scientific practice.”
Acupuncture, via Science-Based Medicine


It then clearly lays out four points, whose headers give the game away:


  1. Acupuncture is a pre-scientific superstition

  2. Acupuncture lacks a plausible mechanism

  3. Claims for efficacy are often based upon a bait-and-switch deception

  4. Clinical trials show that acupuncture does not work


And, rather encouragingly, the Wikipedia article quotes Science-Based Medicine and contributors Dr Steven Novella and Dr David Gorski, and includes the clear statement:


“Acupuncture is a pseudoscience; the theories and practices of TCM are not based on scientific knowledge, and it has been characterized as quackery.”


This is good, and any reasonably objective human researcher will quickly figure out that acupuncture has no scientific basis, and that the Mayo Clinic article is suspect. But does that mean that anything published by the Mayo Clinic is now suspect and anything published by Science-Based Medicine is flawless?


Obviously not. This is where Bayesian AI gets complicated. We need a system which has some way (however limited) of applying some degree of reasoning, and I think we needed it years ago.


At the very least, we should get started on it.


But don’t get too discouraged – it’s hard for humans as well.


Cheers!


© 2026 by RG


TIL Technology by RG is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise specified. 

Please feel free to share, but provide attribution.
