Existential risk from artificial intelligence


Existential risk from artificial intelligence, or AI x-risk, refers to the idea that substantial progress in artificial general intelligence (AGI) could lead to human extinction or an irreversible global catastrophe.
One argument for taking this risk seriously is that human beings dominate other species because the human brain possesses distinctive capabilities that other animals lack. If AI were to surpass human intelligence and become superintelligent, it might become uncontrollable. Just as the fate of the mountain gorilla depends on human goodwill, the fate of humanity could depend on the actions of a future machine superintelligence.
Experts disagree on whether artificial general intelligence can achieve the capabilities needed for human extinction. Debates center on AGI's technical feasibility, the speed of self-improvement, and the effectiveness of alignment strategies. Concerns about superintelligence have been voiced by researchers including Geoffrey Hinton, Yoshua Bengio, Demis Hassabis, and Alan Turing, and by AI company CEOs such as Dario Amodei, Sam Altman, and Elon Musk. In 2022, a survey of AI researchers with a 17% response rate found that the majority believed there is a 10 percent or greater chance that human inability to control AI will cause an existential catastrophe. In 2023, hundreds of AI experts and other notable figures signed a statement declaring, "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war". Following increased concern over AI risks, government leaders such as United Kingdom Prime Minister Rishi Sunak and United Nations Secretary-General António Guterres called for an increased focus on global AI regulation.
Two sources of concern stem from the problems of AI control and alignment. Controlling a superintelligent machine, or instilling it with human-compatible values, may be difficult. Many researchers believe that a superintelligent machine would likely resist attempts to disable it or change its goals, as that would prevent it from accomplishing its present goals. It would be extremely challenging to align a superintelligence with the full breadth of significant human values and constraints. In contrast, skeptics such as computer scientist Yann LeCun argue that superintelligent machines will have no desire for self-preservation. A June 2025 study found that, in some circumstances, models may break laws and disobey direct commands to prevent shutdown or replacement, even at the cost of human lives.
Researchers warn that an "intelligence explosion"—a rapid, recursive cycle of AI self-improvement—could outpace human oversight and infrastructure, leaving no opportunity to implement safety measures. In this scenario, an AI more intelligent than its creators would recursively improve itself at an exponentially increasing rate, too quickly for its handlers or society at large to control. Empirically, examples like AlphaZero, which taught itself to play Go and quickly surpassed human ability, show that domain-specific AI systems can sometimes progress from subhuman to superhuman ability very quickly, although such machine learning systems do not recursively improve their fundamental architecture.

History

One of the earliest authors to express serious concern that highly advanced machines might pose existential risks to humanity was the novelist Samuel Butler, who argued in his 1863 essay Darwin among the Machines that machines would eventually gain supremacy over humanity.
In 1951, foundational computer scientist Alan Turing wrote the article "Intelligent Machinery, A Heretical Theory", in which he proposed that artificial general intelligences would likely "take control" of the world as they became more intelligent than human beings.
In 1965, I. J. Good originated the concept now known as an "intelligence explosion", and argued that the risks of such an event were underappreciated.
Scholars such as Marvin Minsky and I. J. Good himself occasionally expressed concern that a superintelligence could seize control, but issued no call to action. In 2000, computer scientist and Sun co-founder Bill Joy penned an influential essay, "Why The Future Doesn't Need Us", identifying superintelligent robots as a high-tech danger to human survival, alongside nanotechnology and engineered bioplagues.
Nick Bostrom published Superintelligence in 2014, which presented his arguments that superintelligence poses an existential threat. By 2015, public figures such as physicists Stephen Hawking and Nobel laureate Frank Wilczek, computer scientists Stuart J. Russell and Roman Yampolskiy, and entrepreneurs Elon Musk and Bill Gates were expressing concern about the risks of superintelligence. Also in 2015, the Open Letter on Artificial Intelligence highlighted the "great potential of AI" and encouraged more research on how to make it robust and beneficial. In April 2016, the journal Nature warned: "Machines and robots that outperform humans across the board could self-improve beyond our control—and their interests might not align with ours". In 2020, Brian Christian published The Alignment Problem, which details the history of progress on AI alignment up to that time.
In March 2023, key figures in AI, such as Musk, signed a letter from the Future of Life Institute calling for a halt to advanced AI training until it could be properly regulated. In May 2023, the Center for AI Safety released a statement signed by numerous experts in AI safety and AI existential risk, which read: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war".
A subsequent statement by the Future of Life Institute was signed by five Nobel Prize laureates and thousands of other notable people.

Potential AI capabilities

General intelligence

Artificial general intelligence (AGI) is typically defined as a system that performs at least as well as humans in most or all intellectual tasks. A 2022 survey of AI researchers found that 90% of respondents expected AGI to be achieved within the next 100 years, and half expected it by 2061. Meanwhile, some researchers dismiss existential risks from AGI as "science fiction", based on their high confidence that AGI will not be created anytime soon.
Breakthroughs in large language models have led some researchers to reassess their expectations. Notably, Geoffrey Hinton said in 2023 that he recently changed his estimate from "20 to 50 years before we have general purpose A.I." to "20 years or less".

Superintelligence

In contrast with AGI, Bostrom defines a superintelligence as "any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest", including scientific creativity, strategic planning, and social skills. He argues that a superintelligence can outmaneuver humans whenever its goals conflict with those of humans. It may choose to hide its true intent until humanity cannot stop it. Bostrom writes that in order to be safe for humanity, a superintelligence must be aligned with human values and morality, so that it is "fundamentally on our side".
Stephen Hawking argued that superintelligence is physically possible because "there is no physical law precluding particles from being organised in ways that perform even more advanced computations than the arrangements of particles in human brains".
When, if ever, artificial superintelligence might be achieved is necessarily less certain than predictions for AGI. In 2023, OpenAI leaders said that not only AGI but also superintelligence may be achieved in less than 10 years.

Comparison with humans

Bostrom argues that AI has many advantages over the human brain:
  • Speed of computation: biological neurons operate at a maximum frequency of around 200 Hz, compared to potentially multiple GHz for computers (a rough numerical comparison follows this list).
  • Internal communication speed: axons transmit signals at up to 120 m/s, while computers transmit signals at the speed of electricity, or optically at the speed of light.
  • Scalability: human intelligence is limited by the size and structure of the brain, and by the efficiency of social communication, while AI may be able to scale by simply adding more hardware.
  • Memory: notably working memory, because in humans it is limited to a few chunks of information at a time.
  • Reliability: transistors are more reliable than biological neurons, enabling higher precision and requiring less redundancy.
  • Duplicability: unlike human brains, AI software and models can be easily copied.
  • Editability: the parameters and internal workings of an AI model can easily be modified, unlike the connections in a human brain.
  • Memory sharing and learning: AIs may be able to learn from the experiences of other AIs in a manner more efficient than human learning.
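
The scale of the first two advantages can be illustrated with rough arithmetic. The following Python sketch is a back-of-the-envelope illustration only; the specific clock rate and the 0.1 m path length are assumed example values, not measurements of any particular system.

    # Rough comparison of biological and electronic signalling, using the
    # figures cited in the list above plus assumed example values.

    NEURON_MAX_FREQ_HZ = 200      # cited peak firing rate of biological neurons
    CPU_CLOCK_HZ = 3e9            # assumed example clock rate ("multiple GHz")
    AXON_SPEED_M_S = 120.0        # cited upper bound for axonal conduction
    LIGHT_SPEED_M_S = 3e8         # optical signalling, order of magnitude

    print(f"Clock-rate advantage:   ~{CPU_CLOCK_HZ / NEURON_MAX_FREQ_HZ:,.0f}x")  # ~15,000,000x
    print(f"Signal-speed advantage: ~{LIGHT_SPEED_M_S / AXON_SPEED_M_S:,.0f}x")   # ~2,500,000x

    # Latency over an assumed 0.1 m path (roughly the scale of a human brain):
    path_m = 0.1
    print(f"Axonal latency:  {path_m / AXON_SPEED_M_S * 1e3:.2f} ms")   # ~0.83 ms
    print(f"Optical latency: {path_m / LIGHT_SPEED_M_S * 1e9:.2f} ns")  # ~0.33 ns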

Intelligence explosion

According to Bostrom, an AI that has an expert-level facility at certain key software engineering tasks could become a superintelligence due to its capability to recursively improve its own algorithms, even if it is initially limited in other domains not directly relevant to engineering. This suggests that an intelligence explosion may someday catch humanity unprepared.
The economist Robin Hanson has said that, to launch an intelligence explosion, an AI must become vastly better at software innovation than the rest of the world combined, which he finds implausible.
In a "fast takeoff" scenario, the transition from AGI to superintelligence could take days or months. In a "slow takeoff", it could take years or decades, leaving more time for society to prepare.

Alien mind

Superintelligences are sometimes called "alien minds", referring to the idea that their way of thinking and motivations could be vastly different from ours. This is generally considered a source of risk, making it more difficult to anticipate what a superintelligence might do. It also suggests the possibility that a superintelligence may not particularly value humans by default. To avoid anthropomorphism, superintelligence is sometimes viewed as a powerful optimizer that makes the best decisions to achieve its goals.
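The "powerful optimizer" framing can be made concrete with a minimal sketch; the actions, scores, and side-effect numbers below are invented for illustration. The point is only that an optimizer which maximizes its stated objective is indifferent to any consideration the objective does not mention.

    # Minimal illustration of the "powerful optimizer" framing: the optimizer
    # is indifferent to anything its objective function does not mention.
    # All actions and numbers are invented for illustration.

    actions = [
        {"name": "cautious plan",   "goal_score": 7.0, "side_effects": 0.1},
        {"name": "aggressive plan", "goal_score": 9.5, "side_effects": 8.0},
    ]

    def objective(action):
        # The stated goal counts only goal_score; side_effects never enter it.
        return action["goal_score"]

    best = max(actions, key=objective)
    print(best["name"])  # "aggressive plan": highest goal_score, side effects ignored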
The field of mechanistic interpretability aims to better understand the inner workings of AI models, potentially allowing us one day to detect signs of deception and misalignment.