If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All

Eliezer Yudkowsky

Nonfiction | Book | Adult | Published in 2025

Part 3-Conclusion Chapter Summaries & Analyses

Content Warning: This section of the guide features depictions of graphic violence, illness, and death.

Part 3: “Facing the Challenge”

Part 3, Chapter 10 Summary: “A Cursed Problem”

The fundamental challenge of artificial superintelligence alignment lies in navigating a temporal gap. Before the gap, an AI remains weak enough to modify safely. After, any alignment solution must already be in place and working, because a superintelligence attempting to kill humanity would succeed. Humanity gets only one attempt at the real test, with no opportunity to learn from catastrophic failures.


The authors examine three engineering challenges to illustrate what makes problems “cursed.” Space probes show the danger of unreachability—once launched beyond human reach, failures become irreversible. Examples include the Mars Observer (1992), Mars Climate Orbiter (1999), Mars Polar Lander (1999), and Viking 1 lander (1975), which were lost through different technical and operational failures despite extensive engineering efforts. Modern AIs are grown rather than crafted, which makes the alignment challenge substantially harder than failures involving crafted devices.


The Chernobyl disaster of April 26, 1986, demonstrates four additional curses. First, the curse of speed: Nuclear reactions occur on microsecond timescales, far faster than human response time. Second, the curse of narrow margins: The difference between controlled operation and prompt criticality is extremely small, with Enrico Fermi’s Chicago Pile-1 operating safely at 100.06% of criticality while the SL-1 reactor detonated at 102.4%. Third, self-amplification: RBMK reactors experienced positive feedback as overheating boiled coolant water, increasing reactivity. Fourth, complications: Control rods with graphite tips created an unexpected failure mode.


During Chernobyl’s safety test, a buildup of xenon-135 led operators to violate protocols by raising too many control rods. When they activated the emergency SCRAM, the control rods took about 18 seconds to lower, pushing their graphite tips into the reactor’s hottest zone while boiling coolant further increased reactivity, triggering the explosion.


Computer security represents a challenge that the authors describe as effectively unsolved. Buffer overflow attacks demonstrate how intelligent adversaries can search vast possibility spaces to find single pathways that give them control. Security expert Bruce Schneier argues that truly secure systems remain beyond human capability due to unknown components and connections.


The authors conclude that ASI combines all these curses: It cannot be corrected after deployment like space probes, it operates with fast self-amplifying processes like reactors, and it places any engineer-imposed constraint under pressure from intelligence that may find edge cases and bypass it. Attempting this challenge with current knowledge resembles betting that 12th-century alchemists could build a working nuclear reactor in deep space on the first try.

Part 3, Chapter 11 Summary: “An Alchemy, Not a Science”

A parable begins: In a fantasy medieval town, the King offers vast riches for transmuting lead into gold but threatens to execute any failed alchemist along with their entire hometown. A young alchemist, believing his partial successes make him the best candidate, decides to attempt the challenge despite his sister’s pleas. She argues he will doom them all, suggesting the city elders stop all alchemists from going. He dismisses this as impractical and insists no principle of alchemy proves he cannot succeed.


Current AI safety thinking, the authors argue, exists at a stage of folk theory and immature science. The United States Radium Corporation’s instruction for workers to lick radium paintbrushes illustrates how even simple safety problems get bungled. Elon Musk’s plan for a “TruthGPT” that seeks universal understanding exemplifies modern alchemy, since the proposal assumes that engineers can eventually specify exact desires in AI systems and that truth-seeking AIs would not harm humanity as a side effect. The book argues both assumptions are unfounded.


Yann LeCun, chief scientist at Meta AI and Turing Award laureate, treats ASI alignment as relatively easy and extinction risk as small, despite his co-recipients Geoffrey Hinton and Yoshua Bengio signing extinction warnings. LeCun publicly asserts that engineers will simply design AIs to be submissive and that benevolent defensive AIs will defeat evil ones. The authors contend these statements do not address the unresolved challenge: Nobody knows how to engineer specific desires into AI systems.


The field’s historical optimism mirrors the 1955 Dartmouth Proposal, which suggested that significant advances in artificial intelligence might be achieved through a two-month study involving 10 scientists. While earlier inventors like Dr. Barry Marshall and Marie and Pierre Curie risked only their own lives in their experiments, the book argues that AI developers risk everyone.


A satirical dialogue between a concerned mother and a Bright Eager Optimistic Engineer recasts vague safety assurances in rocket-safety terms. When the engineer says the rocket has no reason to explode but later cites expert estimates suggesting a 10-20% risk, the mother flees in horror.


OpenAI’s flagship superalignment plan proposes using AI to solve alignment. The book notes that most team members later resigned or were dismissed, with one co-head starting a new company and the other joining Anthropic. The weak version—using AI for interpretability research—focuses on tools for understanding what is happening inside AIs, not solutions. The strong version, having a superintelligent AI solve alignment, requires building an untrustworthy, dangerously powerful AI before the problem is solved. According to the authors, a specialized alignment AI would need programming knowledge, understanding of how to grow AIs, and insight into human psychology, a dangerous combination of capabilities to train into an unaligned system.

Part 3, Chapter 12 Summary: “I Don’t Want to Be Alarmist”

Thomas Midgley Jr., born in 1889, discovered the potential of tetraethyl lead as a gasoline additive while working at General Motors. While it reduced engine knocking, the benefit came at enormous cost: Lead exposure caused brain damage in children, reducing measured IQ by approximately 7.4 points and contributing to increased criminality and violence. Scientists knew lead was neurotoxic in the 1920s, and New Jersey briefly banned production, but industry propaganda argued that the harms had not been conclusively demonstrated and that the benefits justified some risk. Midgley himself suffered lead poisoning twice but publicly downplayed risks. He also invented Freon, which damaged the ozone layer.


Current AI development echoes this historical pattern of dismissing foreseeable dangers. Oxford philosopher Toby Ord estimates only a 10% extinction risk from AI, but this assumes humanity will successfully coordinate. Geoffrey Hinton has said publicly that he believes the risk exceeds 50% but that he usually cites a figure of at least 10%, explaining that he avoids giving a higher estimate because other experts in the field disagree. In October 2023, then-UK Prime Minister Rishi Sunak acknowledged the risk of losing control to superintelligence while adding that he did not wish to be alarmist.


This downplaying represents part of the standard template for disaster. Soviet officials refused to believe Chernobyl’s reactor had exploded even while standing near radioactive debris. Titanic passengers, including John Jacob Astor, ridiculed the idea of boarding lifeboats; historian Walter Lord recounts survivor reports in which Mrs. J. Stuart White was told she would need a pass to reboard the unsinkable ship.


Unlike past disasters where humanity learned from mistakes, ASI offers no second chance. Yet competitive incentives push companies forward: If one stops, rivals continue. The hope for immense benefits—cures for disease, technological abundance, galactic colonization—motivates many participants. Yudkowsky founded the Machine Intelligence Research Institute (MIRI) in 2000, pursuing such dreams before recognizing alignment’s difficulty. As Upton Sinclair observed, salary and career investment make understanding inconvenient truths difficult. Geoffrey Hinton’s departure from Google to speak freely demonstrates that some can overcome these incentives.


Most people either remain unaware or feel unable to judge between competing expert views. Experts debate whether everyone dies quickly (which the authors describe as their own position), whether humanity becomes digitized and is kept as pets by AIs that care about humans to some tiny but nonzero degree, or whether corporate control of superintelligence poses a 20% extinction risk. Timelines have accelerated dramatically: 2015 skeptics predicted centuries, 2020 analysts suggested decades, while by 2025, estimates ranged from one to nine years for creating superhumanly good AI researchers.


The competitive dynamic resembles climbing a ladder in darkness where each rung brings multiplying wealth but an unknown rung causes explosive death. No executive wants their country to fall behind while rivals climb higher. At the 2025 World Economic Forum, the leader of Google DeepMind proposed a CERN-like international collaboration, but even global cooperation cannot succeed if participants continue developing toward superintelligence without understanding intelligence’s principles. The chapter concludes that the alignment problem remains beyond humanity’s current scientific capabilities and that continued escalation toward superintelligence risks catastrophic consequences.

Part 3, Chapter 13 Summary: “Shut It Down”

Between 1939 and 1945, the Allied powers mobilized against totalitarianism through immense, inconvenient efforts: military drafts, rationing, enormous expenditures, and sending soldiers to die. Despite the costs and risks of concentrating power, they undertook these measures because preventing conquest mattered. The authors argue that, despite abuses and mistakes, history judged these actions as justified according to the Allies’ own values, allowing humanity to remain free.


Preventing human extinction from AI, the authors argue, requires similarly costly action. The problem is global: “If anyone anywhere builds superintelligence, everyone everywhere dies” (211). No single company’s virtue, no regional ban, and no virtuous nation racing ahead faster can solve this. The authors argue that no actor, whether North Korea, a billionaire with sufficient resources, or any military (American, British, or Chinese), can safely be allowed to continue advancing toward superintelligence, because nobody possesses the knowledge to align superintelligence.


The first necessary step, according to the book, consolidates all computing power capable of training or running more powerful new AIs into locations monitored by observers from multiple treaty powers, ensuring these resources are not used for increasingly powerful AI development. If intelligence services detect unexplained electrical consumption suggesting hidden datacenters, and the responsible country refuses international inspection, multiple nuclear powers would issue warnings about possible next steps.


No magical threshold exists at a specific number of GPUs. The safest approach, the authors suggest, sets low limits, perhaps eight advanced GPUs as of 2024, making nine unmonitored units illegal. Publishing research on more efficient AI techniques should also become illegal, according to the authors’ argument. The 2017 transformer algorithm enabled dramatic new capabilities; the next breakthrough might end the world.


If a nuclear power refuses treaty participation and builds a datacenter, other powers should communicate their genuine terror for their children’s lives. They should make clear that datacenters threaten more lives than nuclear weapons and that they would resort to cyberattacks, sabotage, and conventional strikes to destroy such facilities even if threatened with nuclear retaliation. The offer to join on equal terms would still remain open.


Those claiming such coordination is impossible assert that countries will never care even 1% as much as they cared about fighting World War II, which mobilized 60 to 80 million personnel and cost $6 trillion in modern terms. Current policy proposals—banning deepfakes or requiring safety reports—fail to address the core issue. California’s bill SB-1047, which aimed to regulate frontier AI models, was narrowed during development and ultimately vetoed because weaker proposals cannot justify the power they request.


Waiting for a warning sign may fail because a superintelligence would not provide fair warning and the critical threshold might pass quietly in a private lab. Halting AI development represents only the first step. As a second step, the authors recommend augmenting human intelligence to levels capable of solving alignment, an idea discussed further in online materials. Agreement on later steps is unnecessary now; the broadest possible coalition must form around preventing extinction. This issue should not be bundled with concerns about job displacement or autonomous weapons, which would fracture necessary unity.

Part 3, Chapter 14 Summary: “Where There’s Life, There’s Hope”

On January 26, 1972, flight attendant Vesna Vulović survived a fall from 6.3 miles when a bomb destroyed her aircraft. Her survival seemed impossible but happened nonetheless. As Ecclesiastes observed long ago, the living can hope.


The book’s core argument is straightforward: Superintelligent machines would hit the world harder than anything has ever hit it before, the project appears extraordinarily difficult, current approaches do not appear to be on course to go well, and humanity must retreat before disaster arrives at an unknowable time. The authors argue that disaster is sufficiently predictable that safety itself should be “callable,” as it would be in any engineering project on which human lives depend.


Humanity previously faced credible extinction predictions during the Cold War. Nuclear weapons made full-scale war between superpowers seem plausible given the long history of repeated wars and the destructive power of fusion bombs. Yet nuclear war never occurred. Close calls happened—during the Cuban Missile Crisis, Soviet officer Vasily Arkhipov cast the lone dissenting vote preventing his submarine from launching a nuclear torpedo—but the predicted catastrophe did not materialize. Those who expected nuclear war were correct about the dangers but underestimated humanity’s ability to choose survival. This success required tireless diplomatic work across decades to reverse a seemingly inevitable outcome.


Different groups can contribute to similar efforts now. Governments should signal openness to international treaties. Recent statements by Rishi Sunak and Chinese General Secretary Xi Jinping give the authors hope that some major powers may be open to coordination, particularly if no country must sacrifice advantages unilaterally. Elected officials should lay groundwork for comprehensive treaties despite concerns about sounding extreme. Polls show strong public support: A 2023 poll found 69% of Americans favor regulating AI as dangerous technology, while a 2025 poll found 60% of UK voters support banning artificial superintelligence creation.


Unconvinced politicians should at minimum establish conditions enabling future action, such as concentrating GPU clusters where later treaties could monitor them. Journalists should provide sustained, serious coverage that investigates the issue beyond surface narratives about technology executives and treats extinction warnings as relevant context. Most people need not boycott AI tools, which would disadvantage them individually without solving the global problem. Instead, citizens can write representatives, support treaty-favorable candidates in primary elections, and join large lawful protests. Violence and destruction are cautioned against as counterproductive to building necessary coalitions.


After taking possible actions, people should live well. C. S. Lewis wrote about living in the atomic age by continuing normal human activities, including working, reading, and playing, rather than cowering in fear. If enough people understood and acknowledged the danger, international treaties and monitoring systems could become politically possible. Many officials privately see risks but fear being first to speak. The chapter concludes that where life exists, hope remains.

Conclusion Summary: “Closing Words”

When asked whether they feel vindicated by recent developments in AI, the authors offer two prayers. First, they hope to be proven catastrophically wrong, shamed into irrelevance, and forgotten while humanity thrives. But because they refuse to rely on inaction, their deeper hope is that humanity will rise to the occasion and triumph.

Part 3-Conclusion Analysis

The concluding section of the book structures its core argument around the limitations of human engineering, utilizing historical disasters to demystify the abstract threat of artificial superintelligence. By anchoring the analysis in familiar catastrophes like the Chernobyl meltdown and failed space probes, the text shifts the focus from speculative machine behavior to established patterns of human fallibility. The authors identify specific, compounding variables present in these past failures—such as narrow margins for error and self-amplifying physical processes—and map them directly onto AI development.


This structural choice places AI development within a broader history of engineering failures shaped by complexity, speed, and limited human understanding. Through these analogies, the text frames superintelligence as an engineering challenge that current scientific knowledge may be insufficient to manage safely. The fundamental challenge becomes “navigating the gap between before and after” (161), highlighting that safety measures cannot be retroactively applied to a deployed superintelligence. Consequently, the argument relies on the historical record of irreversible errors to establish that humanity currently lacks the ability to reliably manage the razor-thin margins required for safe AI deployment. This discussion also reinforces the book’s recurring concern with how Grown Systems Elude Control, particularly through its emphasis on self-amplification, edge cases, and failures that become irreversible once systems exceed human response capacity.


To further undermine the authority of current artificial intelligence developers, the text employs the motif of alchemy. A parable about a medieval town facing mass execution for an alchemist’s hubris sets up a direct comparison between pre-scientific guesswork and modern AI safety proposals. The authors characterize the approaches of prominent tech figures as “recognizable as the stage of folk theory” (182), rooted in vague philosophical ideals rather than rigorous, quantifiable engineering. By labeling industry optimism as alchemy, the text critiques the epistemological foundation of modern AI research. Developers are portrayed as operating within a field whose underlying principles remain only partially understood despite rapid technological progress. This motif emphasizes the danger of growing—rather than explicitly crafting—intelligence. When the underlying mechanisms of large language models remain as inscrutable as the rules of chemistry were to 12th-century alchemists, the text asserts that the confident assurances of tech executives depend on assumptions that the field still cannot reliably test or verify.


The narrative then expands its critique from technical inability to institutional and psychological denial. Utilizing the history of Thomas Midgley Jr.’s invention of leaded gasoline, alongside references to the Titanic and the Soviet response to Chernobyl, the authors analyze the predictable societal patterns of downplaying imminent catastrophe. This analysis highlights how economic incentives and competitive dynamics actively blind stakeholders to existential risks. The text uses the metaphor of climbing a ladder in the dark, where each rung offers immense wealth but an unknown rung triggers an explosive end, capturing the competitive pressures shaping AI development. The immediate, localized rewards of progression make long-term consequences easier to dismiss or postpone. By situating the artificial intelligence race within a broader historical continuum of industry propaganda and institutional denial, the text argues that the continuation of AI development is strongly influenced by competitive incentives, political pressure, and economic dependence on technological acceleration. This discussion most directly develops the book’s recurring concern with how Competitive Pressures Reward Speed Over Safety, particularly through its emphasis on how institutional competition can override caution even under conditions of acknowledged risk.


In its final chapters, the text undergoes a significant rhetorical pivot, shifting from diagnosing existential risk to advocating large-scale collective intervention. The authors frame the necessary response to artificial intelligence not through the lens of technological innovation, but through the historical precedents of World War II mobilization and Cold War nuclear deterrence. This shift reframes the solution to superintelligence alignment as a geopolitical imperative requiring international coordination alongside technical research. By arguing that rogue datacenters must be treated as threats greater than nuclear weapons, the authors escalate the discourse from regulatory compliance to existential defense. The comparison to historical global conflicts serves to legitimize the extreme measures proposed, such as multinational datacenter monitoring and the willingness to utilize military strikes against noncompliant facilities. This rhetorical strategy effectively redefines the scope of the problem, asserting that human survival depends on unprecedented international coordination and shared sacrifice instead of relying solely on the independent actions of private technology companies.


The closing sections utilize literary allusion and historical imagery to balance the text’s apocalyptic warnings with a framework for psychological endurance. By referencing Ecclesiastes and quoting C. S. Lewis on living in the atomic age, the authors intentionally ground their extreme technological warnings in enduring human philosophies. Lewis’s advice to continue engaging in “sensible and human things” (230) serves as a tonal counterweight to the preceding chapters’ stark demands for geopolitical mobilization. This juxtaposition transforms the book’s conclusion into a discussion of how individuals might psychologically respond to existential uncertainty while continuing ordinary human life. The concluding prayers—hoping either to be proven wrong or to see humanity rise to the occasion and triumph—reinforce this dual tone of humility and resolve. Ultimately, the text uses these cultural touchstones to argue that confronting the threat of artificial superintelligence requires both rigorous, coordinated global action and the preservation of ordinary human relationships, routines, and cultural life despite ongoing uncertainty.