Beyond the Mirage: Strategies to Combat AI Hallucinations in Large Language Models
While large language models (LLMs) have shown remarkable potential, a persistent problem remains: hallucination. According to Heinz College at Carnegie Mellon University, “Everything that LLMs generate is plausible…and that’s exactly what it’s designed to do, is generate plausible things, rather than factually correct things, because it doesn’t know the difference between factually correct and plausible.”
Large language models rely on statistical patterns learned from their training data; they do not understand the subject matter.
As the Heinz College commentary aptly puts it, “generative AI is a math problem. But left unchecked, it could be a real problem.”
Just how pervasive is the propensity of LLMs to hallucinate? Despite Sam Altman’s assertion that hallucinations are a feature rather than a bug of large language models, hallucination is a well-documented phenomenon, and several studies have quantified its prevalence. A study by Maynez et al. focused on hallucinations in the context of text summarization, finding that 30% of generated summaries contained factual inaccuracies. A similar survey by Perez et al., examining multiple state-of-the-art LLMs, concluded that hallucinations appeared in 20-30% of generated responses across various tasks.
So how can we combat the propensity of LLMs to hallucinate? Mitigating the adverse impact of hallucinations is a key objective, since hallucination not only undermines “the performance of models but also introduces critical safety risks, ultimately eroding the trust of end users.”
Reinforcement Learning from Human Feedback (RLHF)
One of the most widely adopted methods to mitigate LLM hallucinations is reinforcement learning from human feedback (RLHF). This technique uses human feedback to fine-tune LLMs, guiding them toward more reliable and accurate responses.
In RLHF, human evaluators score responses generated by the model based on correctness, relevance, and other quality metrics. The model is then trained to maximize reward for outputs that align with human-approved responses, effectively reducing the likelihood of hallucination.
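To make the mechanics concrete, the sketch below shows the reward-modeling step at the heart of RLHF, using PyTorch for illustration: a small scorer is trained on pairs of responses that human evaluators have ranked, so that preferred responses receive higher rewards. This is a minimal sketch, not a production recipe; the class name, dimensions, and random embeddings are assumptions standing in for real model representations and real preference data.

```python
# Minimal sketch of the reward-modeling step in RLHF (illustrative only).
# Assumes each model response is already encoded as a fixed-size embedding;
# in practice these come from the LLM or a separate encoder.
import torch
import torch.nn as nn

class RewardModel(nn.Module):  # hypothetical name, not a library API
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # scalar "quality" score per response
        )

    def forward(self, response_embedding: torch.Tensor) -> torch.Tensor:
        return self.scorer(response_embedding).squeeze(-1)

reward_model = RewardModel()
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# Toy batch of preference pairs: human evaluators preferred `chosen` over `rejected`.
# Random vectors stand in for embeddings of two candidate responses to the same prompt.
chosen = torch.randn(32, 128)
rejected = torch.randn(32, 128)

# Pairwise ranking loss: push scores of human-preferred responses above the others.
loss = -torch.nn.functional.logsigmoid(
    reward_model(chosen) - reward_model(rejected)
).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In a full RLHF pipeline, the trained reward model then supplies the reward signal for a policy-optimization step (commonly PPO) that fine-tunes the LLM itself toward human-approved behavior.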
A recent study by OpenAI found that RLHF reduced hallucination rates; however, RLHF can be labor-intensive, as it requires significant human involvement.
Reinforcement Learning from AI Feedback (RLAIF)
A potentially more effective methodology is reinforcement learning from AI feedback (RLAIF), in which AI models provide feedback to other AI models during the reinforcement learning process. A study comparing the results of RLHF and RLAIF showed significant improvements in LLM outputs using the RLAIF method. The study suggests that RLAIF holds promise for significantly scaling up LLM training with relatively minimal sacrifice in performance, particularly as AI evaluators improve. Ideal applications of RLAIF include text summarization, dialogue generation, question generation, and content generation.
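In practice, RLAIF replaces the human labeler with an evaluator model at the preference-labeling stage. The sketch below illustrates that substitution; ai_prefers is a hypothetical stand-in for a call to an evaluator LLM, and the placeholder heuristic inside it is purely illustrative.

```python
# Minimal sketch of RLAIF preference labeling (illustrative only).
from typing import List, Tuple

def ai_prefers(question: str, answer_a: str, answer_b: str) -> bool:
    """Return True if the evaluator model judges answer_a more accurate.

    In a real pipeline this would prompt a strong evaluator LLM with a rubric
    (e.g. "Which answer is better supported by the facts?") and parse its verdict.
    """
    return len(answer_a) <= len(answer_b)  # placeholder heuristic, not a real judge

def build_preference_dataset(
    samples: List[Tuple[str, str, str]]
) -> List[Tuple[str, str, str]]:
    """Turn (question, answer_a, answer_b) triples into (question, chosen, rejected) pairs."""
    dataset = []
    for question, answer_a, answer_b in samples:
        if ai_prefers(question, answer_a, answer_b):
            dataset.append((question, answer_a, answer_b))
        else:
            dataset.append((question, answer_b, answer_a))
    return dataset
```

The resulting (chosen, rejected) pairs then feed the same reward-model training used in RLHF, with no human labeling in the loop.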
Retrieval-Augmented Generation (RAG)
Another effective approach to combating hallucinations is retrieval-augmented generation (RAG), in which an external database or knowledge base is used to ground responses in verifiable facts as part of the text generation process.
By accessing reliable sources, RAG models reduce their reliance on training data alone, which may contain outdated or insufficient information.
This approach has shown promising results, especially for tasks requiring current or domain-specific knowledge.
This method is particularly useful in domains where factual correctness is essential, such as healthcare, law, and customer service.
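A minimal sketch of the RAG flow is shown below: retrieve the passages most relevant to a question, then build a prompt that instructs the model to answer only from those sources. The keyword-overlap retriever, the toy knowledge base, and the prompt wording are all illustrative stand-ins for a real vector store, document collection, and production prompt.

```python
# Minimal sketch of a retrieval-augmented generation (RAG) prompt flow (illustrative).
from typing import List

KNOWLEDGE_BASE = [
    "Retrieval-augmented generation grounds model answers in retrieved documents.",
    "RLHF fine-tunes language models using human preference feedback.",
    "Fact-checking pipelines validate model outputs before they reach end users.",
]

def retrieve(query: str, corpus: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by naive word overlap with the query (real systems use embeddings)."""
    query_terms = set(query.lower().split())
    ranked = sorted(corpus, key=lambda doc: -len(query_terms & set(doc.lower().split())))
    return ranked[:top_k]

def build_prompt(query: str) -> str:
    """Prepend retrieved passages so the model answers from sources, not memory alone."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, KNOWLEDGE_BASE))
    return (
        "Answer using only the sources below. If they are insufficient, say so.\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

print(build_prompt("How does retrieval-augmented generation reduce hallucinations?"))
```

The assembled prompt would then be passed to whichever LLM the application uses; the key design choice is that the model is asked to answer from the retrieved sources rather than from its parametric memory alone.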
However, RAG is not perfect. As Dr. Paul Hunter, Chief Data Scientist at EDT, puts it: “RAG only eliminates one of the two causes of LLM hallucinations, and it adds a new potential point of failure. The RAG approach can’t do anything about LLMs’ tendency to make up plausible but false claims, so it tries to limit the hallucinations by only providing reliable source material.”
Fact-Consistency Loss Functions
Another innovative method for reducing hallucinations involves the use of fact-consistency loss functions. This technique incorporates a secondary evaluation mechanism during the training phase, where generated responses are cross-checked with factual data, and penalties are applied for inconsistencies.
Studies indicate that this loss function has the potential to reduce the frequency of hallucinated facts in generated summaries by around 30%. However, creating comprehensive fact-consistency mechanisms requires high-quality data sources and sophisticated evaluation metrics to accurately assess factuality, which remains a challenge for practical implementation.
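For concreteness, a fact-consistency objective might combine the usual language-modeling loss with a penalty from a factuality scorer, roughly along the lines of the sketch below (PyTorch, with an assumed weighting and scorer interface; this is not a published recipe).

```python
# Illustrative sketch of a fact-consistency loss term (names and weighting are assumptions).
import torch
import torch.nn as nn

LAMBDA_FACT = 0.5  # assumed trade-off between fluency and factuality objectives

def combined_loss(
    lm_logits: torch.Tensor,           # (batch, seq_len, vocab) token predictions
    target_ids: torch.Tensor,          # (batch, seq_len) gold token ids
    consistency_scores: torch.Tensor,  # (batch,) in [0, 1] from a factuality scorer
) -> torch.Tensor:
    # Standard next-token cross-entropy keeps the model fluent.
    lm_loss = nn.functional.cross_entropy(
        lm_logits.reshape(-1, lm_logits.size(-1)), target_ids.reshape(-1)
    )
    # Penalize outputs the scorer judges inconsistent with the source facts.
    # For the penalty to shape the model, the scorer must be differentiable or
    # applied through a reinforcement-style update; this sketch glosses over that.
    fact_penalty = (1.0 - consistency_scores).mean()
    return lm_loss + LAMBDA_FACT * fact_penalty

# Toy usage with random tensors standing in for real model outputs.
logits = torch.randn(4, 16, 1000)
targets = torch.randint(0, 1000, (4, 16))
scores = torch.rand(4)
print(combined_loss(logits, targets, scores))
```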
Prompt Engineering and Controlled Generation Techniques
Prompt engineering has emerged as a relatively straightforward yet effective method to guide LLMs toward accurate outputs. By structuring prompts carefully, it’s possible to reduce ambiguity and minimize the chance of hallucination. For example, prompts that explicitly ask the model to “explain if unsure” or to “cite sources where possible” can result in more cautious and verifiable responses. Studies found that prompt engineering improved factual accuracy in conversational agents by 15-20%.
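A simple template applying these instructions might look like the following; the exact wording is an example rather than a prescribed standard.

```python
# Illustrative "cautious" prompt template; the phrasing is an assumption, not a benchmark-tested prompt.
CAUTIOUS_PROMPT = """You are a careful assistant.
- If you are not certain of a fact, say "I am not certain" rather than guessing.
- Cite a source for every factual claim where possible.
- Do not invent names, dates, statistics, or citations.

Question: {question}
"""

print(CAUTIOUS_PROMPT.format(question="When was the company founded?"))
```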
Post-Processing and Fact-Checking Pipelines
Post-processing methods involve integrating a fact-checking pipeline to validate model outputs before they reach end-users. By detecting and flagging potential inaccuracies, post-processing helps prevent hallucinated content from being directly served to users.
Studies show that a fact-checking pipeline applied to LLM outputs reduced the incidence of factual errors by around 40%. However, fact-checking pipelines are only as reliable as the databases they rely on, meaning they may struggle with emerging information or niche domains where extensive data may not be available.
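The sketch below illustrates the shape of such a pipeline: split an answer into claims, check each claim against a trusted reference set, and flag anything unsupported before it reaches the user. The lexical support check and the sample facts are deliberately crude stand-ins for a real claim-verification model and knowledge base.

```python
# Illustrative post-processing fact-check sketch (not a production verifier).
from typing import List, Tuple

TRUSTED_FACTS = [
    "RLHF uses human preference data to fine-tune language models.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
]

def is_supported(claim: str, facts: List[str], threshold: float = 0.5) -> bool:
    """Crude lexical support check; production systems use NLI or retrieval-based verifiers."""
    claim_terms = set(claim.lower().split())
    return any(
        len(claim_terms & set(fact.lower().split())) / max(len(claim_terms), 1) >= threshold
        for fact in facts
    )

def review_output(answer: str) -> List[Tuple[str, bool]]:
    """Split an LLM answer into sentences and mark each as supported or flagged."""
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    return [(sentence, is_supported(sentence, TRUSTED_FACTS)) for sentence in sentences]

for sentence, ok in review_output(
    "RLHF uses human preference data to fine-tune language models. "
    "The first fact-checking pipeline was invented in 1905."
):
    print(("OK   " if ok else "FLAG ") + sentence)
```

Flagged sentences can then be suppressed, rewritten, or routed to a human reviewer, depending on the application's risk tolerance.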
Perhaps one of the most promising methodologies for taming the propensity of LLMs to hallucinate is a novel approach called KnowHalu, developed by a team of researchers from UIUC, UC Berkeley, and JPMorgan Chase AI. KnowHalu employs a “two-phase approach: non-fabrication hallucination checking and multi-form knowledge-based factual checking.” Non-fabrication hallucination checking removes factually accurate but ultimately irrelevant information from responses. Multi-form knowledge-based factual checking includes “reasoning and query decomposition, knowledge retrieval, knowledge optimization, judgment generation, and aggregation that ensures that the information generated by the LLMs is relevant and factually correct.”
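Conceptually, the two phases could be organized along the lines of the sketch below. This is a structural illustration only: the function names, the naive relevance and support checks, and the data layout are assumptions, not the KnowHalu authors’ implementation.

```python
# Conceptual two-phase checking flow in the spirit of KnowHalu (structure only; illustrative).
from dataclasses import dataclass
from typing import List

@dataclass
class Verdict:
    claim: str
    relevant: bool   # phase one: does the claim actually answer the question?
    supported: bool  # phase two: is the claim backed by retrieved knowledge?

def phase_one_relevance(question: str, claim: str) -> bool:
    """Non-fabrication check: drop content that is accurate but does not answer the question."""
    return bool(set(question.lower().split()) & set(claim.lower().split()))  # placeholder check

def phase_two_factual(claim: str, knowledge: List[str]) -> bool:
    """Factual check standing in for query decomposition, retrieval, and judgment aggregation."""
    return any(claim.lower() in fact.lower() or fact.lower() in claim.lower() for fact in knowledge)

def check_answer(question: str, claims: List[str], knowledge: List[str]) -> List[Verdict]:
    verdicts = []
    for claim in claims:
        relevant = phase_one_relevance(question, claim)
        supported = phase_two_factual(claim, knowledge) if relevant else False
        verdicts.append(Verdict(claim, relevant, supported))
    return verdicts
```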
Understanding and preventing AI hallucinations isn’t just critical to improving customer outcomes, operational excellence, and cost efficiency – it’s necessary to pave the way towards a sustainable and ecological future for AI.
As the technology’s demonstrable value cements its long-term presence in our increasingly algorithmic world, overcoming this hurdle is no longer optional.
Andrew Pery is an AI Ethics Evangelist at global intelligent automation company ABBYY. He is a Certified Data Privacy Professional and holds a Master of Law degree with Distinction from Northwestern University Pritzker School of Law.
Pery has more than 25 years of experience spearheading technology management programs for leading global companies. His expertise is in intelligent document processing automation and process intelligence, with an emphasis on AI technologies, application software, data privacy and AI ethics.