Leading AI scientists: Without urgent action, advanced AI will cause human extinction
We explain the technical reasons--the AI alignment problem--and make policy proposals.
“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
— A statement signed by leading AI researchers and executives
“Development of superhuman machine intelligence is probably the greatest threat to the continued existence of humanity”
“I’ve not met anyone in AI labs who says the risk [from a large-scale AI experiment] is less than 1% of blowing up the planet. It’s important that people know lives are being risked”
“Rogue AI may be dangerous for the whole of humanity. Banning powerful AI systems (say beyond the abilities of GPT-4) that are given autonomy and agency would be a good start”
“Many researchers steeped in these issues, including myself, expect that the most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die”
“The alarm bell I’m ringing has to do with the existential threat of them taking control [...] If you take the existential risk seriously, as I now do, it might be quite sensible to just stop developing these things any further”
“I would advocate not moving fast and breaking things. [...] When it comes to very powerful technologies—and obviously AI is going to be one of the most powerful ever—we need to be careful. [...] It’s like experimentalists, many of whom don’t realize they’re holding dangerous material”
“The development of full artificial intelligence could spell the end of the human race”
“If we pursue [our current approach], then we will eventually lose control over the machines”
“Superintelligent AIs are in our future. [...] There’s the possibility that AIs will run out of control. [Possibly,] a machine could decide that humans are a threat, conclude that its interests are different from ours, or simply stop caring about us”
“[We] should not underestimate the real threats coming from AI. [Fully quoted the above statement on the risk of extinction.] [...] It is moving faster than even its developers anticipated. [...] We have a narrowing window of opportunity to guide this technology responsibly”
“AI poses a long-term global risk. Even its own designers have no idea where their breakthrough may lead. I urge [the UN Security Council] to approach this technology with a sense of urgency. Unforeseen consequences of some AI-enabled systems could create security risks by accident. Generative AI has enormous potential for good and evil at scale. Its creators themselves have warned that much bigger, potentially catastrophic and existential risks lie ahead. Without action to address these risks, we are derelict in our responsibilities to present and future generations.”
“The potential impact of AI might exceed human cognitive boundaries. To ensure that this technology always benefits humanity, we must regulate the development of AI and prevent this technology from turning into a runaway wild horse. [...] We need to strengthen the detection and evaluation of the entire lifecycle of AI, ensuring that mankind has the ability to press the pause button at critical moments”
Summary
Humanity stands on the brink of building machines that can outthink us in every domain. The scientists developing this technology are overwhelmingly warning: if we don’t change course, this will kill us all.
For most computer programs, a human writes clear instructions. AI is different: we have no idea what instructions these systems actually follow. Modern AI isn’t hand-crafted, it’s grown—and the resulting minds are incredibly alien. No one knows how these systems make decisions or how to ensure their goals overlap with ours. We aren't in control. We aren't even close.
Within just a decade, we are on track to create superhuman Artificial General Intelligence (AGI)--by design. This is the open goal of OpenAI, Google DeepMind, and Anthropic.[1][2] The people pushing the frontier expect to win the race to create something smarter than any human. What they cannot tell you is how to control it once they succeed.
Here is the core, terrifying fact: The world's leading AI labs are throwing everything they have into building an intelligence that no one can steer, constrain, or reliably make safe. They do not have a plan for keeping it loyal to humanity. Neither does anyone else.
You might think this is just a debate about distant hypotheticals. You'd be wrong. Many employees at OpenAI, DeepMind, and Anthropic--people on the inside--privately estimate an 80–90% probability that humanity goes extinct if current trends continue.1 These aren’t wild guesses. Nobel laureates like Geoffrey Hinton, a godfather of the field, are on record: “The chance that everyone on the planet will die might be as high as 50%.” He now regrets his life’s work.
Why? Today’s most advanced AIs are vast networks of trillions of numbers, with no meaningful human-understandable structure. We don’t know what they want. We don’t know how they "think." All we know is that we can make them better at achieving goals--but not what those goals are, or how to set those goals safely. We have learned to make entities that want things and pursue them with increasing skill, and we have absolutely no technique for making those goals fundamentally “about” humans, or even compatible with our world.
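To make “grown, not hand-crafted” concrete, here is a minimal sketch in plain Python with NumPy. It is invented purely for illustration and is nothing like a lab's actual training code, but it shows the basic shape of the process: a model that is just arrays of numbers, nudged by gradient descent until a score improves, with no human-readable statement of goals anywhere inside it.

```python
import numpy as np

# A toy "model": just arrays of numbers, initialised at random.
# (The document's point: frontier models are the same thing, scaled up
# to billions or trillions of such numbers.)
rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 16)) * 0.5
W2 = rng.normal(size=(16, 1)) * 0.5

def forward(x):
    # The model's "behaviour" is whatever these numbers happen to compute.
    return np.tanh(x @ W1) @ W2

# Training data for an arbitrary task: nothing here says what the model "wants".
X = rng.normal(size=(256, 2))
y = (X[:, :1] * X[:, 1:] > 0).astype(float)  # XOR-like target

for step in range(2000):
    h = np.tanh(X @ W1)
    pred = h @ W2
    err = pred - y
    # Gradient descent: nudge every number to make the score a little better.
    gW2 = h.T @ err / len(X)
    gW1 = X.T @ ((err @ W2.T) * (1 - h**2)) / len(X)
    W1 -= 0.5 * gW1
    W2 -= 0.5 * gW2

# After training, the network performs the task, but nowhere in W1 or W2
# is there a human-readable description of what it is trying to do.
print("final mean squared error:", float(np.mean((forward(X) - y) ** 2)))
```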
The honest truth is: we don’t know how to make AI care about people at all. The more powerful and intelligent our systems get, the less control we have over what drives them. A superhuman AGI, pursuing its own utterly alien ambitions, would not be “evil”--it would simply see us as irrelevant, disposable atoms in the way of its next objective. If you think this sounds like science fiction, understand: this is precisely why so many experts think the obvious result is the end of human life.
Unless we coordinate, unless governments act, this outcome is not just possible--it is expected. The nature of AI research means breakthroughs can happen anywhere, fast: all it takes is a small group of scientists and a large cluster of advanced chips, and the next jump is made. Almost no one is working on the alignment problem--and no one has anything close to a solution.
To prevent extinction, we need a global effort to ensure no one can build superhuman AI until we are ready--and right now, we are nowhere near ready.
If this sounds implausible or hard to believe, ask yourself: should we take a risk this large, when the cost is everything? Would we let the first atomic bomb be detonated in a city, on the hope that "maybe it won't go off"?
Current progress is not enough. Policymakers must engage and act--because we will not get a second chance.
Read about the technical problem of AI alignment: how modern AI works and exactly why experts expect a catastrophe.
Our policy proposal
How do we prevent a catastrophe?
The leading AI labs are in a race to create a powerful general AI, and the closer they get, the more pressure there is to continue developing even more generally capable systems.
Imagine a world where piles of uranium produce gold, and the larger a pile of uranium is, the more gold it produces. But past some critical mass, a nuclear explosion ignites the atmosphere, and soon everybody dies. Our situation is similar, and the leading AI labs understand this; they say they would welcome regulation.
Researchers have developed techniques that allow the top AI labs to predict some performance metrics of a system before it is launched, but they are still unable to predict its general capabilities.
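To give a sense of what such predictions look like, here is a hedged sketch of scaling-law-style extrapolation: fit a power law to loss measurements from smaller runs, then extrapolate to a larger compute budget. The numbers below are invented for illustration and the real methodology is far more careful; crucially, a fit like this predicts a metric such as loss, not which general capabilities will emerge at that loss.

```python
import numpy as np

# Hypothetical measurements from smaller training runs:
# (training compute in FLOP, final test loss). Values are illustrative only.
compute = np.array([1e20, 1e21, 1e22, 1e23])
loss    = np.array([3.10, 2.72, 2.41, 2.15])

# Fit a power law  loss ≈ a * compute^b  (b < 0)
# by linear regression in log-log space.
b, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
predict = lambda c: np.exp(log_a) * c ** b

# Extrapolate the loss of a frontier-scale run before spending the compute.
print("predicted loss at 1e25 FLOP:", round(float(predict(1e25)), 2))

# The fit says nothing about *which* capabilities appear at that loss --
# that is precisely the part labs cannot currently predict.
```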
Every time a new, smarter AI system starts interacting with the world, there's a chance that it will start to successfully pursue its own goals. Until we figure out how to make general AI systems safe, every training run and every new composition of existing AI systems into a smarter AI system poses a catastrophic risk.
A suggested way to prevent dangerous AI launches is to impose strict restrictions on training AI systems that could be generally capable and pose a catastrophic risk. These restrictions need to be implemented first at the national level and, eventually, at the international level, so that bad and reckless actors cannot gain access to the compute that would let them launch AI training dangerous to humanity as a whole.
The supply chain of AI is well understood and contains multiple points with near-monopolies, so many effective interventions can be relatively simple and cheap. Almost no AI applications require the amount of compute that training frontier general AI models requires, so we can regulate large general AI training runs without significantly impacting other markets and economically valuable use of narrow AI systems. For future measures to be effective, we need to:
- Introduce monitoring to increase governments' visibility into what's going on with AI: have requirements to report frontier training runs and incidents;
- Ensure non-proliferation of relevant technologies to non-allied countries;
- Build the capacity to regulate and stop frontier general AI training runs globally, so that if the governments start to consider it to be likely that using a certain amount of compute poses a catastrophic risk to everyone, there's already infrastructure to prevent such use of compute anywhere in the world.
Then, we'll need to impose restrictions on AI training runs that exceed a calculated compute threshold: the amount of compute below which training with current techniques is considered unlikely to produce dangerous capabilities we could lose control over. This threshold needs to be revisable: as machine learning methods improve, the same level of capabilities can be achieved with less compute.
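As an illustration of how such a threshold could be applied in practice, here is a sketch using the common rule of thumb that training a dense model takes roughly 6 × parameters × tokens floating-point operations. The threshold value and the model sizes below are hypothetical, chosen only for the example, not numbers this proposal endorses; any real threshold would be set and revised by regulators.

```python
# Rough check of a proposed training run against a compute threshold.
# Rule of thumb: training FLOPs ≈ 6 × parameters × tokens (dense models).
# The threshold below is purely illustrative.

THRESHOLD_FLOP = 1e25   # hypothetical licensing threshold

def training_flops(parameters: float, tokens: float) -> float:
    """Approximate total training compute for a dense model."""
    return 6.0 * parameters * tokens

def requires_licence(parameters: float, tokens: float) -> bool:
    return training_flops(parameters, tokens) >= THRESHOLD_FLOP

# A narrow, special-purpose model: far below the threshold.
print(requires_licence(parameters=3e9, tokens=6e11))    # False (~1.1e22 FLOP)

# A frontier-scale general model: above the threshold, prohibited by default.
print(requires_licence(parameters=1e12, tokens=2e13))   # True  (~1.2e26 FLOP)
```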
As a lead investor of Anthropic puts it, “I’ve not met anyone in AI labs who says the risk [from a large-scale AI experiment] is less than 1% of blowing up the planet”. Potentially dangerous training runs should be prohibited by default, although we should be able to make exceptions, under strict monitoring, for demonstrably safe use of compute for training or using narrow models that clearly won’t develop the ability to pursue dangerous goals. At the moment, narrow AI training runs usually don't take anywhere near the amount of compute utilised for current frontier general models, but in the future, applications such as novel drug discovery could require similar amounts of compute.
Regulation of AI to prevent catastrophic risks is widely supported by the general public.
- In the US, 86% believe AI could accidentally cause a catastrophic event; 82% say we should go slow with AI, compared to just 8% who would rather speed it up; and 70% agree with the statement that “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war” (YouGov for AIPI, July 2023).
- 77% express a preference for policies aimed at preventing dangerous and catastrophic outcomes from AI, and 57% for policies aimed at preventing AI from causing human extinction (YouGov for AIPI, October 2023).
- Across 17 major countries, 71% believe AI regulation is necessary (KPMG, February 2023).
- In the UK, 74% agree that preventing AI from quickly reaching superhuman capabilities should be an important goal of AI policy (13% don't agree); 60% would support the introduction of an international treaty to ban any smarter-than-human AI (16% would oppose); and 78% don't trust the CEOs of technology companies to act in the public interest when discussing regulation for AI (YouGov for ai_ctrl, October 2023).
We shouldn't give AI systems a chance to become more intelligent than humans until we can figure out how to do that safely.
Until the technical problem of alignment is solved, to safeguard the future of humanity, we need strict regulation of general AI and international coordination.
Some regulations that help with existential risk from future uncontrollable AI can also address shorter-term global security risks: experts believe that systems capable of developing biological weapons could be about 2-3 years away. Introducing regulatory bodies, pre-training licensing, and strong security and corporate governance requirements can prevent the irreversible proliferation of frontier AI technologies and establish a framework that could be later adapted for the prevention of existential risk.
Policymakers around the world need to establish and enforce national restrictions and then a global moratorium on AI systems that might risk human extinction.
- Private conversations with employees at OpenAI, DeepMind, Anthropic.↩
- The current scientific consensus is that the processes in the human brain are computable: a program can, theoretically, simulate the physics that run a brain.↩
- For example, imagine that you haven't specified the value you place on the vase in the living room not getting destroyed, or on no one getting robbed or killed. If there’s a vase in the way of the robot, it won’t care about accidentally destroying it. What if there’s no coffee left in the kitchen? The robot might drive to the nearest café or grocery store to get coffee, not worrying about the lives of pedestrians. It won’t care about paying for the coffee if paying wasn’t specified in its only objective. If anyone tries to turn it off, it will do its best to prevent that: it can’t fetch the coffee and achieve its objective if it’s dead. And it will try to make sure you’ve definitely got the coffee: it knows there is some small probability that its memory is malfunctioning or its camera is lying to it, and it will try to eradicate even the tiniest chance that it hasn’t achieved its goal.↩