Could Your AI Assistant Lie to You? The Truth Could Surprise You
MemoryMatters #47
AI hallucinations stand out as one of the biggest challenges facing artificial intelligence today. In a Telus International survey, 61% of respondents said they worry about AI hallucinations and the spread of false information. GPT-4 has improved here - it's 40% less likely to make things up than earlier versions - but these made-up responses still create major risks.
What happens during an AI hallucination? Drawing on patterns in its training data, the AI produces content that sounds real but is completely false. This isn't the AI trying to trick anyone - it's a basic limitation of how these systems work. The Mata v. Avianca case shows how serious this can get: a lawyer cited fabricated legal cases that ChatGPT had generated and got in trouble with the court.
In this issue, we look at why AI systems hallucinate and at the two main types of hallucination: intrinsic and extrinsic.
What is an AI hallucination?
AI hallucination happens when AI systems create content that looks real and fits the context but turns out to be made up or wrong. These AI-produced falsehoods don't work like human hallucinations. The system simply can't tell real facts from false information [1].
How AI generates plausible but false content
The way AI hallucinations work comes down to how large language models (LLMs) like ChatGPT or Gemini handle and create text. These systems are basically advanced pattern-matching algorithms that learn from huge amounts of text data.
The AI first looks at patterns in its training data to predict what should come next in context. The prediction process works on probability rather than facts. The AI picks words and phrases based on what's statistically likely to appear next, not what's actually true [2].
These models also have "source amnesia" - they lose track of where their training data came from while creating content [2]. This explains a study that analyzed ChatGPT's research proposals. The study found that from 178 references it created, 69 weren't valid Digital Object Identifiers and 28 references didn't even exist [2].
The AI builds on its own mistakes once it starts giving wrong information. People call this the "snowball effect of hallucination" [2]. The model cares more about making sense than being accurate, which makes these made-up responses sound very believable.
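To make this concrete, here is a deliberately tiny Python sketch of probability-driven generation. The "model" is just a hand-written table of invented next-word probabilities - nothing like a real LLM's internals - but it shows the two behaviors described above: each word is chosen because it is statistically likely, not because it is true, and every choice becomes context for the next one, so an early wrong turn snowballs.

```python
import random

# Toy "language model": for each context word, a distribution over next words.
# These probabilities are invented for illustration - no real model works from
# a table like this, but the selection logic is the same in spirit.
NEXT_WORD_PROBS = {
    "The":       {"telescope": 0.6, "vaccine": 0.4},
    "telescope": {"captured": 0.7, "rejected": 0.3},
    "vaccine":   {"was": 1.0},
    "captured":  {"the": 1.0},
    "rejected":  {"the": 1.0},
    "was":       {"approved": 0.5, "rejected": 0.5},
    "the":       {"first": 1.0},
    "first":     {"images": 1.0},
    "approved":  {},
    "images":    {},
}

def generate(start: str, max_words: int = 6) -> list[str]:
    """Pick each next word by sampling the probability table.

    Nothing here checks whether the emerging sentence is true; every choice
    is conditioned only on the previous word, including words the model
    itself just produced (the "snowball" effect).
    """
    words = [start]
    for _ in range(max_words):
        options = NEXT_WORD_PROBS.get(words[-1], {})
        if not options:
            break
        next_word = random.choices(list(options), weights=options.values())[0]
        words.append(next_word)
    return words

if __name__ == "__main__":
    print(" ".join(generate("The")))  # e.g. "The telescope rejected the first images"
```

A perfectly fluent but false sentence like the example output is exactly what a hallucination looks like from the inside: statistically plausible, factually unchecked.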
Why hallucinations are not intentional lies
AI hallucinations aren't like human lies. Machines don't have consciousness, intentions, or any sense of right and wrong - things you need to actually lie. The errors come from basic limits in how AI systems work [3].
Hallucinations happen because:
The model doesn't have enough data to answer correctly but still has to give some response
The encoding/decoding between text and internal representations creates errors
The math behind text generation cares more about sounding right than being truthful
People sometimes fill in story gaps with made-up details without meaning to lie [2]. But there's a big difference - humans can say when they're not sure about something. AI systems sound equally confident about everything they say [4].
One AI researcher puts it well: "There is no difference to a language model between something that is true and something that's not" [4]. This makes AI hallucinations both fascinating and potentially risky.
Types of AI hallucinations
Researchers have categorized AI hallucinations into two distinct types: intrinsic and extrinsic. This classification helps developers and users spot and tackle different forms of AI-generated falsehood.
Intrinsic hallucinations: Misrepresenting known facts
AI systems produce intrinsic hallucinations when their output contradicts information in the prompt or conversation history. These errors happen because the system misinterprets the data it was given. For example, an AI might claim the FDA rejected an Ebola vaccine in 2019 when the source article it was summarizing clearly states the vaccine was approved.
Google's Bard chatbot (now called Gemini) showed this type of hallucination in February 2023. The AI wrongly stated that the James Webb Space Telescope captured the first images of a planet outside our solar system. Scientists had actually taken the first exoplanet images in 2004, way before JWST's 2021 launch [5]. This mistake cost Google dearly - up to $100 billion in market value vanished the next day [5].
Extrinsic hallucinations: Fabricating new information
The AI creates extrinsic hallucinations by generating new information that cannot be verified against the source content or conversation history. These fictional additions go beyond anything contained in the provided context.
A prominent case involved a New York attorney who used ChatGPT to prepare a legal motion. The AI created completely fabricated judicial opinions and legal citations [6]. The lawyer later faced sanctions and fines, admitting he "did not comprehend that ChatGPT could fabricate cases" [6].
Why extrinsic hallucinations are harder to detect
Spotting extrinsic hallucinations presents unique challenges. These fabrications often sound believable and stay internally consistent even while presenting fiction, so verifying them requires external knowledge or research.
Large language models tend to create extrinsic hallucinations because of their probabilistic nature. This happens more often with topics that don't appear much in their training data. These models would rather give fluent, coherent responses than admit uncertainty, so they fill knowledge gaps with convincing but invented details [7].
Extrinsic fabrications blend naturally with accurate content, unlike intrinsic hallucinations that clearly contradict given information. This makes them particularly tough to identify without fact-checking against outside sources.
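To get a feel for why the two types differ in practice, the rough sketch below compares a generated summary against its source text and flags numbers and capitalized names the source never mentions. It is a crude heuristic with invented example sentences, not a real fact-checker (production systems typically use trained entailment or fact-verification models), and as the output shows, it catches an unsupported detail but misses a subtler intrinsic contradiction - extrinsic fabrications need outside research either way.

```python
import re

def ungrounded_tokens(source: str, generated: str) -> set[str]:
    """Return numbers and capitalized names in `generated` that never
    appear in `source` - a rough proxy for unsupported details.

    This only catches surface mismatches; it cannot tell whether an
    invented-but-plausible fact is true (the extrinsic case), which is
    exactly why external fact-checking is still needed.
    """
    pattern = r"\b(?:[A-Z][a-zA-Z]+|\d{4}|\d+(?:\.\d+)?)\b"
    source_tokens = set(re.findall(pattern, source))
    generated_tokens = set(re.findall(pattern, generated))
    return generated_tokens - source_tokens

source = "The FDA approved an Ebola vaccine in 2019."
summary = "The FDA rejected an Ebola vaccine in 2019, citing a 2017 ruling."

print(ungrounded_tokens(source, summary))
# {'2017'} - the fabricated year is flagged, but "rejected" (an intrinsic
# contradiction of "approved") slips through this simple surface check.
```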
What causes AI hallucinations?
Knowing why AI hallucinations happen helps us spot and prevent them. These made-up outputs come from several technical limitations rather than any deliberate deception.
Training data quality and bias
The core of any AI system - its training data - contains both facts and errors. Large language models (LLMs) copy the patterns they see without knowing what's true or false [8]. Mistakes, biases, and misinformation in the data show up in what these models create.
These models face coverage gaps in specific or unusual topics, even with huge training datasets. LLMs try to "fill in the blanks" with text that sounds right but isn't real [9].
Model architecture and decoding errors
Generative AI models work like advanced autocomplete tools that predict what comes next based on patterns—not fact-checkers [8]. These models can create wrong content by mixing patterns in weird ways, even with perfect training data [8].
Technical problems in text generation add to these issues. Decoders sometimes attend to the wrong parts of the input, and the transformer-based attention mechanism in LLMs can only hold so much context, which makes them "forget" earlier parts of longer responses [9].
Prompt ambiguity and exposure bias
Unclear prompts often make AI hallucinate as it tries to make sense of fuzzy requests [10]. On top of that, LLMs suffer from "exposure bias" - they are trained on human-written text but, at generation time, have to build on their own output [9].
Training conditions and real-world use don't match up: models generate text one piece at a time without revising earlier output, so small mistakes grow into confident but wrong answers [11].
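The exposure-bias mismatch is easiest to see side by side. The toy sketch below uses a made-up two-word lookup table as a stand-in for a real model, purely to illustrate the shape of the problem: during training the model always sees the correct, human-written prefix, while at generation time each new token is appended to its own earlier guesses and nothing is ever revised.

```python
# Toy illustration of the training/inference mismatch ("exposure bias").
# `toy_next_token` stands in for a real model's prediction step; the word
# table below is invented purely for illustration.

def toy_next_token(prefix: list[str]) -> str:
    """Pretend model: knows a few two-word contexts, guesses otherwise."""
    known = {("the", "james"): "webb", ("james", "webb"): "space"}
    return known.get(tuple(prefix[-2:]), "<guess>")

# Training (teacher forcing): every step is conditioned on the correct,
# human-written prefix, so one bad prediction never contaminates the next.
ground_truth = ["the", "james", "webb", "space", "telescope"]
for i in range(2, len(ground_truth)):
    print("train step sees:", ground_truth[:i], "-> predicts:", toy_next_token(ground_truth[:i]))

# Inference (free running): each prediction is appended to the context and
# never revised. Once the first "<guess>" appears, every later step is
# conditioned on that mistake, so the errors compound.
generated = ["the", "james"]
for _ in range(4):
    generated.append(toy_next_token(generated))
print("generated:", generated)
# generated: ['the', 'james', 'webb', 'space', '<guess>', '<guess>']
```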
How to prevent AI hallucinations
Preventing AI hallucinations takes specific strategies that make models more reliable and accurate. Studies show these prevention methods can substantially reduce false outputs - one study found that Retrieval-Augmented Generation (RAG) improved accuracy by 39.7% across multiple models [14].
Use of Retrieval-Augmented Generation (RAG)
RAG systems help stop hallucinations by grounding AI responses in verified information. The system retrieves relevant data from external sources before generating text [15]. The user's question is converted into a vector representation and matched against documents in a vector database, and the retrieved information is added to the prompt. This lets the model base its response on real facts instead of made-up information [16]. A medical study showed that RAG helped LLMs reach 94% accuracy, making it one of the most effective ways to prevent hallucinations [14].
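Here is a minimal sketch of that flow, assuming the sentence-transformers library for embeddings and a plain Python list standing in for the vector database. The document snippets and model name are placeholders, and the final call to an LLM is left out, since any chat API could slot in at that point.

```python
# Minimal RAG sketch: embed documents, retrieve the closest ones for a
# question, and build a grounded prompt.
# Assumes `pip install sentence-transformers numpy`.
import numpy as np
from sentence_transformers import SentenceTransformer

# Tiny stand-in for a real knowledge base / vector database.
documents = [
    "The first images of an exoplanet were captured in 2004.",
    "The James Webb Space Telescope launched in December 2021.",
    "RAG retrieves external documents before the model answers.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents whose embeddings are closest to the question."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q_vec          # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

def build_prompt(question: str) -> str:
    """Attach retrieved facts so the model answers from them, not from memory."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question))
    return (
        "Answer using only the context below. If the context does not "
        "contain the answer, say you don't know.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

print(build_prompt("Which telescope took the first exoplanet images?"))
# The resulting prompt would then be sent to whatever LLM you use.
```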
Improving prompt clarity and context
Clear and specific prompts make hallucinations less likely. Here's what you should do with prompts:
Start with "According to..." to link responses to reliable sources [17]
Use chain-of-verification prompting to check facts step by step [17]
Give complete context to avoid confusion [1]
Tell the model it's better to say nothing than give wrong information [1]
These methods keep the AI focused on relevant information and stop it from making up answers when it's not sure.
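As a concrete example of those habits, the short snippet below assembles a prompt that anchors the answer to a named source, asks for a step-by-step verification pass, and explicitly allows the model to say it doesn't know. The source name and question are placeholders to adapt, not a prescription.

```python
# Sketch of a hallucination-resistant prompt, combining the habits above.
# The source name and question are placeholders - swap in your own.

def careful_prompt(question: str, source: str = "the 2023 annual report") -> str:
    return (
        f"According to {source}, {question}\n"
        "Answer in two parts:\n"
        "1. Your answer, citing the exact passage you relied on.\n"
        "2. A verification pass: list each factual claim in your answer and "
        "state whether the source actually supports it.\n"
        "If the source does not contain the answer, reply exactly: "
        "\"I don't know based on the provided source.\""
    )

print(careful_prompt("what was the total revenue?"))
```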
Cross-checking outputs with trusted sources
Fact-checking remains vital. After you get AI-generated content:
Look up key points in credible sources to find what's true [21]
Check if suspicious claims make sense [21]
Use multiple independent sources to verify information [22]
Listen to your gut if something doesn't seem right [23]
Human oversight remains the last defense against convincing AI hallucinations.
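Part of that cross-checking can be automated. Fabricated references, like the invalid DOIs mentioned earlier, often point to identifiers that simply are not registered. The small script below is a standard-library sketch that queries doi.org's public handle lookup and flags DOIs the resolver has never heard of; a DOI that does resolve still is not proof that the paper supports the claim.

```python
# Quick sanity check for AI-supplied references: ask doi.org's public handle
# lookup whether each DOI is registered at all. A DOI that resolves does not
# prove the paper says what the AI claims - this is only a first-pass filter.
import urllib.error
import urllib.request

def doi_is_registered(doi: str) -> bool:
    """Return True if doi.org knows the DOI, False if it reports 'not found'."""
    url = f"https://doi.org/api/handles/{doi}"
    try:
        with urllib.request.urlopen(url, timeout=10):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:          # doi.org has no record of this handle
            return False
        raise                        # other errors: don't guess, surface them

if __name__ == "__main__":
    for doi in ["10.1000/182", "10.9999/not-a-real-doi"]:  # example values
        status = "registered" if doi_is_registered(doi) else "NOT FOUND"
        print(f"{doi}: {status}")
```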
CTA - How confident are you in the accuracy of the AI tools you use every day—and have you ever caught one making something up?
Closure Report
AI hallucinations pose a major challenge in our ever-changing world of artificial intelligence. These fabrications can contradict given information or create new "facts" out of thin air. They don't come from any bad intent but emerge from basic limitations in large language models. These systems work on statistics and focus on creating believable responses rather than accurate ones. This leads to outputs that sound right but are completely false.
Users of AI systems need to stay watchful, and learning detection and prevention strategies is crucial to using AI responsibly. Recent research shows that Retrieval-Augmented Generation works well: it increases accuracy by almost 40% across models of all sizes by grounding responses in verified external data. Clear and specific prompts also reduce hallucinations substantially, especially when they instruct the model to admit uncertainty rather than invent an answer.
Human oversight stands as our strongest defense against AI's made-up responses. Technology alone can't replace critical thinking and fact-checking. These systems will keep creating believable-sounding content that isn't true when they reach their knowledge limits or receive unclear instructions.
References
[1] - https://insight.factset.com/ai-strategies-series-7-ways-to-overcome-hallucinations
[2] - https://pmc.ncbi.nlm.nih.gov/articles/PMC11681264/
[3] - https://www.seriousinsights.net/ai-hallucinations-bias-and-lies/
[4] - https://www.scientificamerican.com/article/can-one-chatbot-catch-anothers-lies/
[5] - https://originality.ai/blog/ai-hallucination-factual-error-problems
[6] - https://builtin.com/artificial-intelligence/ai-hallucination
[7] - https://time.com/6989928/ai-artificial-intelligence-hallucinations-prevent/
[8] - https://mitsloanedtech.mit.edu/ai/basics/addressing-ai-hallucinations-and-bias/
[9] - https://www.kapa.ai/blog/ai-hallucination
[10] - https://documentation.suse.com/suse-ai/1.0/html/AI-preventing-hallucinations/index.html
[11] - https://link.springer.com/article/10.1007/s00521-025-11162-0
[12] - https://www.ibm.com/think/topics/ai-hallucinations
[13] - https://www.ada.cx/blog/ai-hallucination-examples-when-artificial-intelligence-gets-it-wrong/
[14] - https://pubmed.ncbi.nlm.nih.gov/39521391/
[15] - https://aws.amazon.com/what-is/retrieval-augmented-generation/
[16] - https://arxiv.org/abs/2402.19473
[17] - https://www.godofprompt.ai/blog/9-prompt-engineering-methods-to-reduce-hallucinations-proven-tips?srsltid=AfmBOopxntJMm3g3zmGnX-RXgHaOmzfz_57mUA2prJVtqTKS2z3PvXrj
[18] - https://www.gdit.com/perspectives/latest/reducing-generative-ai-hallucinations-by-fine-tuning-large-language-models/
[19] - https://medium.com/data-science/safeguarding-llms-with-guardrails-4f5d9f57cff2
[20] - https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-are-ai-guardrails
[21] - https://www.techtarget.com/whatis/feature/Steps-in-fact-checking-AI-generated-content
[22] - https://libguides.stkate.edu/generativeai/evaluatingAI
[23] - https://www.prsa.org/article/4-steps-to-take-to-ensure-the-accuracy-of-your-ai-content
Linked to ObjectiveMind.ai