Testing conducted by OpenAI reveals that its o3 and o4-mini models produce incorrect responses in 33% and 48% of cases, respectively, on one of the company's own benchmarks. This tendency, referred to as hallucination, raises questions about the reliability of AI systems, especially in critical fields such as medicine and finance. Eleanor Watson, an AI ethics expert, highlights the risks posed by these subtle but potentially serious errors.
Hallucination is not merely a flaw but an inherent characteristic of language models. Sohrob Kazerounian, an AI researcher, explains that this ability to generate original content is precisely what enables AI to create rather than simply reproduce. Without it, AI systems would be limited to pre-existing responses, with no possibility of innovation (see the section at the end of this article).
However, the problem becomes more complex with the most advanced models. Hallucinations become less obvious and harder to detect, embedded within plausible narratives. Watson explains that this can erode trust in AI systems, especially when users take the information at face value.
Solutions to mitigate this phenomenon include using external sources to verify generated information. Watson also stresses the importance of structuring the models' reasoning and teaching them to recognize their own uncertainty. These approaches, though imperfect, could improve the reliability of responses.
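To make the idea of teaching a model to recognize its own uncertainty more concrete, here is a minimal sketch of one common approach, agreement-based self-consistency checking: the same question is asked several times, and low agreement across the sampled answers is treated as a warning sign. The ask_model function is a hypothetical placeholder for whatever LLM interface is in use, and the sample count and threshold are illustrative assumptions rather than values from the article.

```python
from collections import Counter
from typing import Callable, List, Tuple


def estimate_confidence(
    ask_model: Callable[[str], str],  # hypothetical placeholder for an LLM call
    question: str,
    n_samples: int = 5,
) -> Tuple[str, float]:
    """Query the model several times and measure how often its answers agree.

    Low agreement across samples is a rough signal of possible hallucination,
    so the answer can be routed to a human or to an external verification step.
    """
    answers: List[str] = [ask_model(question).strip().lower() for _ in range(n_samples)]
    most_common_answer, count = Counter(answers).most_common(1)[0]
    return most_common_answer, count / n_samples


# Illustrative usage with a trivial stand-in "model" that always gives the same answer.
if __name__ == "__main__":
    answer, agreement = estimate_confidence(lambda q: "1990", "When was the Hubble telescope launched?")
    if agreement < 0.6:  # illustrative threshold, not a tuned value
        print("Low confidence: verify this answer against an external source.")
```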
Finally, Kazerounian reminds us that, given these limitations, a healthy dose of skepticism remains necessary. Just as with information provided by humans, data produced by language models should be verified, especially where accuracy is crucial.
Why do AIs hallucinate?
Hallucination in AI models stems from their ability to generate original content. Unlike traditional systems that simply retrieve existing data, advanced language models compose new responses, which can lead them to invent information.
This capability is essential for creative tasks, such as writing or design, where innovation is valued. However, it becomes problematic when AI is used to provide factual information, where accuracy is required.
Researchers are working on methods to reduce these hallucinations without stifling the models' creativity. These include checking generated facts against external databases and adding internal verification mechanisms.
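As an illustration of the external-database idea, the sketch below shows the basic control flow of post-generation fact checking: each claim the model produces is compared against trusted reference passages before being accepted. The simple word-overlap test and the 0.7 threshold are simplifying assumptions for the sake of the example; production systems typically combine retrieval with an entailment or verification model, but the overall loop of generate, look up, then accept or flag is the same.

```python
import string
from typing import Iterable


def _content_words(text: str) -> set:
    """Lowercase content words, ignoring punctuation and very short words."""
    words = (word.strip(string.punctuation) for word in text.lower().split())
    return {word for word in words if len(word) > 3}


def is_supported(claim: str, reference_passages: Iterable[str], threshold: float = 0.7) -> bool:
    """Rough lexical check: does any trusted passage cover most of the claim's
    content words? The threshold is an illustrative assumption."""
    claim_words = _content_words(claim)
    if not claim_words:
        return False
    for passage in reference_passages:
        overlap = len(claim_words & _content_words(passage)) / len(claim_words)
        if overlap >= threshold:
            return True
    return False


# Illustrative usage: claims that no trusted passage supports are flagged for
# review instead of being presented to the user as fact.
claims = ["The Eiffel Tower is located in Paris."]
passages = ["The Eiffel Tower is a wrought-iron tower located in Paris, France."]
flagged = [claim for claim in claims if not is_supported(claim, passages)]
print(flagged)  # [] -> the claim is supported by the reference passage
```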
Despite these efforts, hallucination remains a major issue, especially with the rapid evolution of AI models' capabilities.