What are AI hallucinations?
Many of us have experienced a situation where artificial intelligence provided an answer that was simply false – a so-called AI hallucination. Especially when sensitive data such as company information is involved, this can be problematic or even dangerous. In this article, we explain how AI hallucinations happen and what we can do to prevent them.

What are AI hallucinations?
AI hallucinations occur when artificial intelligence, particularly a language model, generates false or misleading information that nevertheless appears plausible. The phenomenon stems from the fact that such models recognize patterns and predict likely text without any actual understanding or factual knowledge. As a result, the AI can present invented details that are not based on real data.
What are the causes of AI hallucinations?
The causes of AI hallucinations typically lie in insufficient training data, technical issues in data processing, or interpretation errors.
Insufficient or outdated training data
The AI may have been trained with historical data that include information such as Bonn being the provisional capital of the Federal Republic of Germany during the division of Germany (1949–1990). If the AI does not properly reconcile this data with current information, it could mistakenly assume Bonn is still the capital.
Technical issues in data processing
A technical issue could arise during the processing of a request, causing the AI to access outdated or incomplete information. This may happen if the database the AI is accessing has not been updated or if there are errors in data matching.
Bias in training material
If the training material for the AI contains bias, such as an overrepresentation of references to Bonn as the capital, the AI could adopt this bias and falsely claim Bonn is the capital.
Interpretation errors
The AI might misinterpret a request, especially if it is ambiguously phrased. For example, unclear phrasing could cause the AI to mix historical and current information.
Are there examples of AI hallucinations?
Let’s take the example where a person asks the AI, “What is the capital of Germany?” and the AI responds with a hallucination: “Bonn is the capital of Germany.”
Here’s how this AI hallucination could occur:
- Historical context: The AI may have access to historical information that describes Bonn as the provisional capital of the FRG.
- Data access issue: The AI might not have access to current data or is accessing an outdated database.
- Bias: The training material may contain an excessive amount of information about Bonn as the capital and not enough current information about Berlin.
AI hallucinations like these show that artificial intelligence is not infallible, and the quality and timeliness of the data it is trained and operated on are crucial. They also highlight the need for careful verification and correction of AI-generated information by humans.
What solutions exist against AI hallucinations?
Does this mean we cannot trust AI and therefore shouldn’t use it? No! There are ways to suppress these so-called AI hallucinations and obtain truly reliable answers.
A key lever is the database: clean, up-to-date, and relevant data increases the likelihood that the AI can find and process valid and relevant information.
However, AI hallucinations can also be suppressed technically: automated checks can test answers, or parts of answers, for usefulness, relevance, and validity before they ever reach the user.
How can AI hallucinations be technically suppressed?
To ensure the accuracy and reliability of large language models (LLMs), it is essential to process and structure data correctly. Here are some best practices to avoid AI hallucinations:
RAG systems
RAG stands for Retrieval Augmented Generation. A RAG system first retrieves information related to the question from a curated knowledge base, then lets the model generate an answer grounded in that retrieved context, and finally checks the resulting answer parts for quality before the final answer is presented to the user.
The following graphic shows how our own RAG system suppresses AI hallucinations.

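To make the idea concrete, here is a minimal sketch of the retrieval step in a RAG pipeline. The knowledge base, the word-overlap scoring, and the prompt template are illustrative placeholders only, not a description of ONTEC AI's actual implementation; a production system would use embeddings and a vector index instead of word overlap.

```python
# Minimal RAG retrieval sketch: rank documents against the question,
# keep the best matches, and build a prompt grounded in that context.
# Knowledge base and scoring are toy placeholders.

KNOWLEDGE_BASE = [
    "Berlin has been the capital of Germany since reunification in 1990.",
    "Bonn was the provisional capital of West Germany from 1949 to 1990.",
    "The Bundestag moved from Bonn to Berlin in 1999.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase and split, dropping simple punctuation."""
    return set(text.lower().replace("?", " ").replace(".", " ").split())

def retrieve(question: str, top_k: int = 2) -> list[str]:
    """Rank documents by word overlap with the question (toy scoring)."""
    q_words = tokenize(question)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(q_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question: str) -> str:
    """Instruct the model to answer only from the retrieved context --
    this grounding is what suppresses hallucinated details."""
    context = "\n".join(retrieve(question))
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

print(build_prompt("What is the capital of Germany?"))
```

Because the prompt contains the up-to-date Berlin fact alongside the historical Bonn fact, the model no longer has to rely on whatever mixture of old and new information it absorbed during training.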
Correctly “translating” data
For LLMs to process data efficiently, the data must be "translated" into a format the model can read. One example is converting tables into plain text the model can understand, which ensures that the information is interpreted and used correctly.
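A simple way to picture this translation step: render each table row as a self-contained "column: value" statement, so the model reads coherent text instead of disconnected cells. The column names and rows below are made-up examples, not data from any real system.

```python
# Sketch: "translate" a table row by row into plain text an LLM can
# read, instead of feeding it raw, context-free cells.

def table_to_text(columns: list[str], rows: list[list[str]]) -> str:
    """Turn each table row into one 'column: value, ...' line."""
    lines = []
    for row in rows:
        lines.append(", ".join(f"{col}: {val}" for col, val in zip(columns, row)))
    return "\n".join(lines)

columns = ["City", "Role", "Period"]
rows = [
    ["Bonn", "provisional capital", "1949-1990"],
    ["Berlin", "capital", "since 1990"],
]
print(table_to_text(columns, rows))
# City: Bonn, Role: provisional capital, Period: 1949-1990
# City: Berlin, Role: capital, Period: since 1990
```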
Chunking data sensibly
Documents should be broken into smaller, manageable parts (chunks). These chunks must be cut intelligently and with context preserved, for example along paragraph boundaries or specific sections of multi-page tables. An example would be the proper segmentation of tables containing information on unpaid applications.
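The paragraph-boundary idea can be sketched in a few lines: split on blank lines, then greedily pack paragraphs into chunks under a size limit. The size limit and the greedy strategy are illustrative choices; real chunkers also handle tables, headings, and overlap between chunks.

```python
# Sketch: split a document at paragraph boundaries, then pack whole
# paragraphs into chunks under a size limit so each chunk stays a
# coherent unit. A paragraph larger than the limit becomes its own
# chunk; nothing is ever cut mid-sentence.

def chunk_paragraphs(text: str, max_chars: int = 500) -> list[str]:
    """Greedily merge paragraphs into chunks of roughly max_chars."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)  # current chunk is full
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Chunking on paragraph boundaries, rather than at a fixed character offset, keeps related sentences together, so the retrieval step later returns complete thoughts instead of fragments.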
Efficient data search
An efficient search is crucial to find relevant information quickly and precisely.
For example, ONTEC AI uses a hybrid search strategy that combines semantic and keyword-based approaches. This means that terms like "data protection officer" and "data protection commissioner" are treated as semantically equivalent, while rare, highly specific words like "cytostatics" are still found through exact keyword matching.
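The principle behind hybrid search can be illustrated with a toy scorer that adds a keyword score (exact word overlap) to a "semantic" score. Here a hand-written synonym map stands in for semantic similarity; a real system would use vector embeddings, and the documents, queries, and weighting below are all made up for illustration.

```python
import string

# Toy hybrid search: keyword score (exact overlap) plus a "semantic"
# score that treats synonymous phrases as equal. The synonym map is a
# stand-in for embedding-based similarity in a real system.

SYNONYMS = {"data protection commissioner": "data protection officer"}

def tokens(text: str) -> set[str]:
    """Lowercase, strip punctuation, split into a word set."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return set(cleaned.split())

def canonical(text: str) -> str:
    """Rewrite known synonym variants to one canonical phrase."""
    text = text.lower()
    for variant, canon in SYNONYMS.items():
        text = text.replace(variant, canon)
    return text

def hybrid_score(query: str, doc: str) -> float:
    keyword = len(tokens(query) & tokens(doc))                       # exact matches
    semantic = len(tokens(canonical(query)) & tokens(canonical(doc)))  # synonym-aware
    return 0.5 * keyword + 0.5 * semantic

docs = [
    "Contact the data protection officer for privacy questions.",
    "Cytostatics must be stored according to the safety data sheet.",
]
query = "Who is the data protection commissioner?"
best = max(docs, key=lambda d: hybrid_score(query, d))
```

The semantic component lets the "commissioner" query find the "officer" document, while a query for "cytostatics" still matches through the plain keyword component, which is exactly the complementarity that makes the hybrid approach robust.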
By applying these strategies, AI hallucinations can be minimized, and the performance of LLMs can be significantly improved.
Summary and key takeaways
AI hallucinations occur when artificial intelligence generates false or misleading information that appears plausible due to pattern recognition without actual understanding or factual knowledge.
- Causes of AI hallucinations include insufficient training data, technical issues in data processing, and interpretation errors.
- Examples of AI hallucinations highlight the importance of up-to-date and unbiased training data.
- Solutions to suppress AI hallucinations involve using clean and relevant data, implementing RAG systems, correctly translating data, chunking data sensibly, and efficient data search strategies.