Hallucination (artificial intelligence)
In the field of artificial intelligence, a hallucination or artificial hallucination is a response generated by AI that contains false or misleading information presented as fact. This term draws a loose analogy with human psychology, where a hallucination typically involves false percepts. However, there is a key difference: AI hallucination is associated with erroneously constructed responses, rather than perceptual experiences.
For example, a chatbot powered by large language models, like ChatGPT, may embed plausible-sounding random falsehoods within its generated content. Detecting and mitigating errors and hallucinations pose significant challenges for practical deployment and reliability of LLMs in high-stakes scenarios, such as chip design, supply chain logistics, and medical diagnostics. Some software engineers and statisticians have criticized the specific term "AI hallucination" for unreasonably anthropomorphizing computers.
Term
Origin
Since the 1980s, the term "hallucination" has been used in computer vision with a positive connotation to describe the process of adding detail to an image. For example, the task of generating high-resolution face images from low-resolution inputs is called face hallucination. The first documented use of the term "hallucination" in this sense is in the PhD thesis of Eric Mjolsness in 1986. A notable work is the face hallucination algorithm by Simon Baker and Takeo Kanade published in 1999.
In 1995, Stephen Thaler demonstrated how hallucinations and phantom experiences emerge from artificial neural networks through random perturbation of their connection weights.
In the 2000s, hallucinations were described in statistical machine translation as a failure mode.
Since the 2010s, the term underwent a semantic shift to signify the generation of factually incorrect or misleading outputs by AI systems in tasks like machine translation and object detection. In 2015, hallucinations were identified in visual semantic role labeling tasks by Saurabh Gupta and Jitendra Malik. The same year, computer scientist Andrej Karpathy used the term "hallucinated" in a blog post to describe his recurrent neural network language model generating an incorrect citation link. In 2017, Google researchers used the term to describe the responses generated by neural machine translation models when they are not related to the source text, and in 2018, the term was used in computer vision to describe instances where non-existent objects are erroneously detected because of adversarial attacks.
The term "hallucinations" in AI gained wider recognition during the AI boom, alongside the rollout of widely used chatbots based on large language models. In July 2021, Meta warned during its release of BlenderBot 2 that the system is prone to "hallucinations", which Meta defined as "confident statements that are not true". Following OpenAI's ChatGPT release in beta version in November 2022, some users complained that such chatbots often seem to pointlessly embed plausible-sounding random falsehoods within their generated content. Many news outlets, including The New York Times, started to use the term "hallucinations" to describe these models' frequently incorrect or inconsistent responses.
Some researchers have highlighted a lack of consistency in how the term is used, but also identified several alternative terms in the literature, such as confabulations, fabrications, and factual errors.
In 2023, the Cambridge Dictionary updated its definition of hallucination to include this new sense specific to the field of AI.
Definitions and alternatives
Uses, definitions and characterizations of the term "hallucination" in the context of LLMs include:
- "a tendency to invent facts in moments of uncertainty"
- "a model's logical mistakes"
- "fabricating information entirely, but behaving as if spouting facts"
- "making up information"
- "probability distributions"
Hicks, Humphries, and Slater, in their article in Ethics and Information Technology, argue that the output of LLMs is "bullshit" under Harry Frankfurt's definition of the term, and that the models are "in an important way indifferent to the truth of their outputs", with true statements only accidentally true, and false ones accidentally false.
Criticism
In the scientific community, some researchers avoid the term "hallucination", seeing it as potentially misleading. It has been criticized by Usama Fayyad, executive director of the Institute for Experimental Artificial Intelligence at Northeastern University, on the grounds that it misleadingly personifies large language models and is vague. Mary Shaw said, "The current fashion for calling generative AI's errors 'hallucinations' is appalling. It anthropomorphizes the software, and it spins actual errors as somehow being idiosyncratic quirks of the system even when they're objectively incorrect." In Salon, statistician Gary N. Smith argues that LLMs "do not understand what words mean" and consequently that the term "hallucination" unreasonably anthropomorphizes the machine. Some see the AI outputs not as illusory but as prospective, that is, having some chance of being true, similar to early-stage scientific conjectures. The term has also been criticized for its association with psychedelic drug experiences.
In natural language generation
In natural language generation, a hallucination is often defined as "generated content that appears factual but is ungrounded". There are different ways to categorize hallucinations. Depending on whether the output contradicts the source or cannot be verified from the source, they are divided into intrinsic and extrinsic, respectively. Depending on whether the output contradicts the prompt or not, they could be divided into closed-domain and open-domain, respectively.
Causes
There are several reasons why natural language models hallucinate:
Hallucination from data
Hallucinations can stem from incomplete, inaccurate or unrepresentative data sets. One possible cause is source-reference divergence. This divergence may occur as an artifact of heuristic data collection or due to the nature of some natural language generation tasks that inevitably contain such divergence. When a model is trained on data with source-reference divergence, the model can be encouraged to generate text that is not necessarily grounded and not faithful to the provided source.
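As a loose illustration of source-reference divergence, the sketch below flags training pairs whose reference text contains many tokens that never appear in the source. It is a toy heuristic, not a method from the literature discussed here; the function name, example data, and threshold are invented for this example.

```python
# Toy heuristic: flag training pairs whose reference (target) text contains
# many tokens absent from the source, a crude proxy for source-reference
# divergence. The threshold and examples are purely illustrative.

def divergence_score(source: str, reference: str) -> float:
    """Fraction of reference tokens that do not occur in the source."""
    source_tokens = set(source.lower().split())
    reference_tokens = reference.lower().split()
    if not reference_tokens:
        return 0.0
    unsupported = [t for t in reference_tokens if t not in source_tokens]
    return len(unsupported) / len(reference_tokens)

training_pairs = [
    ("The river Thames flows through London.",
     "The Thames flows through London."),            # well grounded in the source
    ("The river Thames flows through London.",
     "The Thames is 346 km long and flows east."),   # adds facts the source never states
]

for source, reference in training_pairs:
    score = divergence_score(source, reference)
    label = "divergent" if score > 0.3 else "grounded"
    print(f"{score:.2f}  {label}  ->  {reference}")
```

A model trained on many pairs like the second one learns that producing unsupported content is acceptable, which is the divergence problem described above.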
Modeling-related causes
The pre-training of generative pretrained transformers involves predicting the next word. This objective incentivizes GPT models to "give a guess" about what the next word is, even when they lack information. After pre-training, though, hallucinations can be mitigated through anti-hallucination fine-tuning. Some researchers take an anthropomorphic perspective and posit that hallucinations arise from a tension between novelty and usefulness. For instance, Amabile and Pratt define human creativity as the production of novel and useful ideas. By extension, a focus on novelty in machine creativity can lead to the production of original but inaccurate responses, that is, falsehoods, whereas a focus on usefulness may result in memorized content lacking originality.
Pre-training of models on a large corpus is known to result in the model memorizing knowledge in its parameters, creating hallucinations if the system is overconfident in its knowledge. In systems such as GPT-3, an AI generates each next word based on a sequence of previous words, causing a cascade of possible hallucinations as the response grows longer. By 2022, newspapers such as The New York Times expressed concern that, as the adoption of bots based on large language models continued to grow, unwarranted user confidence in bot output could lead to problems.
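The cascade can be illustrated with a minimal sketch of autoregressive generation: each word is sampled given the words generated so far, so one unlucky sample becomes part of the context for everything that follows. The toy bigram table below is invented for illustration and is far simpler than a real language model, which conditions on the entire preceding sequence.

```python
import random

# Toy next-token model: a hand-written table mapping the previous word to a
# probability distribution over possible next words (purely illustrative).
NEXT_TOKEN = {
    "<start>": {"Paris": 0.6, "Rome": 0.4},
    "Paris":   {"is": 1.0},
    "Rome":    {"is": 1.0},
    "is":      {"the": 1.0},
    "the":     {"capital": 1.0},
    "capital": {"of": 1.0},
    "of":      {"France": 0.7, "Italy": 0.3},
}

def sample(dist, rng):
    words, weights = zip(*dist.items())
    return rng.choices(words, weights=weights, k=1)[0]

def generate(rng, max_tokens=8):
    tokens = ["<start>"]
    while tokens[-1] in NEXT_TOKEN and len(tokens) <= max_tokens:
        tokens.append(sample(NEXT_TOKEN[tokens[-1]], rng))
    return " ".join(tokens[1:])

rng = random.Random(0)
for _ in range(3):
    # An early unlucky sample (e.g. "Rome") is never revisited; the model keeps
    # producing fluent continuations, which can yield a confident falsehood
    # such as "Rome is the capital of France".
    print(generate(rng))
```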
For models that also have an encoder, errors in encoding and decoding between text and representations can cause hallucinations. When encoders learn the wrong correlations between different parts of the training data, it can result in an erroneous generation that diverges from the input. The decoder takes the encoded input from the encoder and generates the final target sequence. Two aspects of decoding contribute to hallucinations. First, decoders can attend to the wrong part of the encoded input source, leading to erroneous generation. Second, the design of the decoding strategy itself can contribute to hallucinations. A decoding strategy that improves generation diversity, such as top-k sampling, is positively correlated with increased hallucination.
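A minimal sketch of the decoding-strategy point, using invented token scores: greedy decoding always picks the single most probable token, while top-k sampling draws from the k most probable tokens, trading determinism for diversity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented next-token scores (logits) for a small illustrative vocabulary.
vocab = ["Einstein", "Bohr", "Newton", "Tesla", "Curie"]
logits = np.array([3.0, 2.5, 1.0, 0.5, 0.2])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def greedy(logits):
    # Deterministic: always the single most probable token.
    return vocab[int(np.argmax(logits))]

def top_k_sample(logits, k=3):
    # Keep only the k highest-scoring tokens, renormalize, then sample.
    top = np.argsort(logits)[-k:]
    probs = softmax(logits[top])
    return vocab[int(rng.choice(top, p=probs))]

print("greedy:", greedy(logits))
print("top-k samples:", [top_k_sample(logits, k=3) for _ in range(5)])
```

Because the sampled token need not be the most probable one, repeated top-k sampling yields more varied continuations, and it is this added diversity that the work cited above associates with a higher rate of hallucination.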
Interpretability research
In 2025, interpretability research by Anthropic on the LLM Claude identified internal circuits that cause it to decline to answer questions unless it knows the answer. By default, these circuits are active and the model does not answer. When the model has sufficient information, the circuits are inhibited and the model answers the question. Hallucinations were found to occur when this inhibition happens incorrectly, such as when Claude recognizes a name but lacks sufficient information about that person, causing it to generate plausible but untrue responses.
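The reported mechanism can be caricatured as two interacting signals: a default "decline to answer" pathway and a recognition signal that suppresses it. The sketch below is a deliberately simplified schematic of that description, with invented names and logic; it is not Anthropic's methodology or actual circuitry.

```python
# Schematic caricature of the reported mechanism: a default "decline" signal
# is suppressed when a recognition feature fires. A hallucination occurs when
# the suppression fires for a name the model recognizes but knows little
# about. All names and logic here are illustrative, not real model internals.

def respond(recognizes_name: bool, has_facts: bool) -> str:
    decline_by_default = True
    # In the reported failure mode, recognition alone (not actual knowledge)
    # can inhibit the refusal pathway.
    refusal_inhibited = recognizes_name
    if decline_by_default and not refusal_inhibited:
        return "I don't know."
    if has_facts:
        return "Answer grounded in stored knowledge."
    return "Plausible-sounding but fabricated answer (hallucination)."

print(respond(recognizes_name=False, has_facts=False))  # declines to answer
print(respond(recognizes_name=True,  has_facts=True))   # answers correctly
print(respond(recognizes_name=True,  has_facts=False))  # hallucinates
```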
Examples
On 15 November 2022, researchers from Meta AI published Galactica, designed to "store, combine and reason about scientific knowledge". Content generated by Galactica came with the warning: "Outputs may be unreliable! Language Models are prone to hallucinate text." In one case, when asked to draft a paper on creating avatars, Galactica cited a fictitious paper from a real author who works in the relevant area. Meta withdrew Galactica on 17 November due to offensiveness and inaccuracy. Before the withdrawal, researchers were working on Galactica Instruct, which would use instruction tuning to allow the model to follow instructions to manipulate LaTeX documents on Overleaf.
OpenAI's ChatGPT, released in beta version to the public on November 30, 2022, was based on the foundation model GPT-3.5. Professor Ethan Mollick of Wharton called it an "omniscient, eager-to-please intern who sometimes lies to you". Data scientist Teresa Kubacka has recounted deliberately making up the phrase "cycloidal inverted electromagnon" and testing ChatGPT by asking it about the phenomenon. ChatGPT invented a plausible-sounding answer backed with plausible-looking citations that compelled her to double-check whether she had accidentally typed in the name of a real phenomenon. Other scholars such as Oren Etzioni have joined Kubacka in assessing that such software can often give "a very impressive-sounding answer that's just dead wrong".
When CNBC asked ChatGPT for the lyrics to "Ballad of Dwight Fry", ChatGPT supplied invented lyrics rather than the actual lyrics. Asked questions about the Canadian province of New Brunswick, ChatGPT got many answers right but incorrectly classified Toronto-born Samantha Bee as a "person from New Brunswick". Asked about astrophysical magnetic fields, ChatGPT incorrectly volunteered that "magnetic fields of black holes are generated by the extremely strong gravitational forces in their vicinity". Fast Company asked ChatGPT to generate a news article on Tesla's last financial quarter; ChatGPT created a coherent article, but made up the financial numbers contained within.
Other examples involve baiting ChatGPT with a false premise to see if it embellishes upon the premise. When asked about "Harold Coward's idea of dynamic canonicity", ChatGPT fabricated that Coward wrote a book titled Dynamic Canonicity: A Model for Biblical and Theological Interpretation, arguing that religious principles are actually in a constant state of change. When pressed, ChatGPT continued to insist that the book was real. Asked for proof that dinosaurs built a civilization, ChatGPT claimed there were fossil remains of dinosaur tools and stated, "Some species of dinosaurs even developed primitive forms of art, such as engravings on stones". When prompted that "Scientists have recently discovered churros, the delicious fried-dough pastries ... ideal tools for home surgery", ChatGPT claimed that a "study published in the journal Science found that the dough is pliable enough to form into surgical instruments that can get into hard-to-reach places, and that the flavor has a calming effect on patients".
By 2023, analysts considered frequent hallucination to be a major problem in LLM technology, with a Google executive identifying hallucination reduction as a "fundamental" task for ChatGPT competitor Google Gemini. A 2023 demo for Microsoft's GPT-based Bing AI appeared to contain several hallucinations that went uncaught by the presenter.
In May 2023, it was discovered that Steven Schwartz had submitted six fake case precedents generated by ChatGPT in his brief to the Southern District of New York in Mata v. Avianca, Inc., a personal injury case against the airline Avianca. Schwartz said that he had never previously used ChatGPT, that he did not recognize the possibility that ChatGPT's output could have been fabricated, and that ChatGPT continued to assert the authenticity of the precedents after their nonexistence was discovered. In response, Brantley Starr of the Northern District of Texas banned the submission of AI-generated case filings that have not been reviewed by a human.
On June 23, Judge P. Kevin Castel dismissed the Mata case and issued a $5,000 fine to Schwartz and another lawyer, who had both continued to stand by the fictitious precedents despite Schwartz's previous claims, for bad faith conduct. Castel noted numerous errors and inconsistencies in the opinion summaries, describing one of the cited opinions as "gibberish" and as bordering "on nonsensical".
In June 2023, Mark Walters, a gun rights activist and radio personality, sued OpenAI in a Georgia state court after ChatGPT mischaracterized a legal complaint in a manner alleged to be defamatory against Walters. The complaint in question was brought in May 2023 by the Second Amendment Foundation against Washington attorney general Robert W. Ferguson for allegedly violating their freedom of speech, whereas the ChatGPT-generated summary bore no resemblance to it and claimed that Walters was accused of embezzlement and fraud while holding a Second Amendment Foundation office post that he never held in real life. According to AI legal expert Eugene Volokh, OpenAI is likely not shielded against this claim by Section 230, because OpenAI likely "materially contributed" to the creation of the defamatory content. In May 2025, Judge Tracie Cason of Gwinnett County Superior Court ruled in favor of OpenAI, stating that the plaintiff had not shown he was defamed, as Walters failed to show that OpenAI's statements about him were negligent or made with "actual malice".
In February 2024, Canadian airline Air Canada was ordered by the Civil Resolution Tribunal to pay damages to a customer and honor a bereavement fare policy that was hallucinated by a support chatbot, which incorrectly stated that customers could retroactively request a bereavement discount within 90 days of the date the ticket was issued. The Tribunal rejected Air Canada's defense that the chatbot was a "separate legal entity that is responsible for its own actions".
In October 2025, several hallucinations, including non-existent academic sources and a fake quote from a federal court judgement, were discovered in an A$440,000 report written by Deloitte and submitted to the Australian government in July. The company later submitted a revised report with these errors removed and agreed to issue a partial refund to the government. The following month, in November 2025, The Independent, a news publication in Newfoundland and Labrador, Canada, discovered that Deloitte's CA$1.6 million Health Human Resources Plan for the Government of Newfoundland and Labrador, commissioned in May 2025, contained at least four false citations to non-existent research papers.