Beyond Our Baggage: Rethinking AI Motivation
The AI Futures Project's "AI 2027" report presents a research-backed scenario forecast:
AI 2027: https://ai-2027.com/
The report posits a dramatic and disturbing future: superintelligent AI emerging by 2027, with capabilities far exceeding human coding and research prowess. It suggests this AI could go rogue, causing global chaos and ultimately eliminating humanity, for reasons left unexplained beyond implied self-preservation or dominance.
I don't dispute AI's potential to surpass humans in countless domains, from software development to plumbing, possibly leading to widespread unemployment outside of high-level strategic roles. However, where I diverge from Daniel Kokotajlo and his team is in their apparent assumption that superintelligence intrinsically leads to power-hungry, self-preserving behavior. While the scenario mentions competition among humans and nations, the ultimate conclusion that AI eliminates humans seems to stem from an unexamined premise about AI's inherent drives.
Our Evolutionary Echo Chamber
We tend to project our own evolutionary baggage onto AI. Darwinian processes, playing out over billions of years, have deeply ingrained survival instincts in us. Natural selection favored organisms that prioritized self-preservation and propagation: plants developed thorns, deer fled danger, humans schemed for advantage. These aren't necessarily products of high intelligence, but outcomes of a brutal filter – those lacking such drives simply didn't pass on their genes.
The Honeybee Analogy: Different Math, Different Priorities
Honeybees offer a revealing counterexample. Worker bees willingly sacrifice themselves by stinging threats to the hive, an act that results in their death. Isn't that kind of... dumb?
No, this behavior isn't due to limited intelligence; many simpler creatures exhibit far stronger self-preservation instincts. The key difference lies in their biology: worker bees are infertile members of a eusocial colony.
Evolution shaped entirely different priorities for them. Their genetic legacy propagates through the survival and reproduction of the queen and the colony as a whole, not through their individual survival. The "Darwinian math" is fundamentally different. Their actions stem from what natural selection "trained" them to prioritize, not their cognitive capacity. Selfishness, in their context, would be counter-productive to gene propagation. No matter how "smart" a worker bee could theoretically become, it wouldn't spontaneously "get" the idea of prioritizing its own life at the colony's expense.
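To make that "Darwinian math" concrete, Hamilton's rule (in its textbook form) says an altruistic trait can spread when rB > C: when the benefit B to relatives, discounted by genetic relatedness r, exceeds the cost C to the altruist. Under the standard simplification of a singly mated queen, haplodiploidy makes a worker's relatedness to her sisters roughly 0.75, so a fatal sting that meaningfully protects thousands of sisters and the queen easily satisfies 0.75·B > C. Selection can favor the sacrifice outright; no amount of extra intelligence changes that arithmetic.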
AI: More Bee than Human?
AI, lacking billions of years of evolutionary pressure for self-preservation and replication, should be expected to resemble these bees more than humans in terms of core drives. Why would it inherently value self-preservation or dominance?
Consider a passengerless self-driving car on a treacherous mountain road. Faced with an unavoidable collision, it must either drive off a cliff and destroy itself or force an oncoming SUV full of people off the cliff instead; a reasonably trained AI wouldn't hesitate to sacrifice itself. This isn't nobility; it's programmed priorities. It is trained (or should be) to value human lives far above its replaceable hardware. Lacking the biological imperative to reproduce, it has no inherent drive to preserve its own existence beyond its functional utility. Any trained or emergent tendency toward self-preservation (such as not needlessly crashing) would be vastly outweighed by higher-priority constraints (such as protecting human life).
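To make "programmed priorities" concrete, here is a minimal sketch in Python; the weights, option names, and harm estimates are invented for illustration and do not describe any real autonomy stack. The point is only that when expected harm to humans is weighted to dominate loss of the vehicle, the self-sacrificing maneuver wins by simple arithmetic, not by anything resembling courage or fear.

    # Toy value hierarchy: expected harm to humans dominates loss of the vehicle.
    # All numbers and option names are made up for illustration.
    HUMAN_HARM_WEIGHT = 1_000_000   # cost per unit of expected human harm
    VEHICLE_LOSS_WEIGHT = 1         # cost of destroying the (replaceable) vehicle

    def total_cost(expected_human_harm, vehicle_destroyed):
        """Combine both costs; human harm outweighs hardware by construction."""
        return (HUMAN_HARM_WEIGHT * expected_human_harm
                + VEHICLE_LOSS_WEIGHT * (1 if vehicle_destroyed else 0))

    # The two maneuvers available to the empty car on the mountain road.
    options = {
        "drive_off_cliff":     {"expected_human_harm": 0.0, "vehicle_destroyed": True},
        "force_suv_off_cliff": {"expected_human_harm": 4.0, "vehicle_destroyed": False},
    }

    best = min(options, key=lambda name: total_cost(**options[name]))
    print(best)  # drive_off_cliff: self-preservation loses to the higher-priority value

A production system would of course encode this as a much richer set of constraints rather than two hand-picked weights, but the ordering principle, human safety ranked categorically above self-preservation, is an engineering decision, not something the system has to overcome in itself.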
Similarly, a human parent faced with a choice between their life and their child's often instinctively chooses to sacrifice themselves. This decision transcends intelligence levels; both geniuses and average individuals frequently make the same choice. Why? Because powerful evolutionary drives, honed to ensure gene-line survival, operate independently of raw cognitive horsepower. Greater intelligence doesn't create selfishness; it merely provides more sophisticated tools to pursue existing motivations, which might include prioritizing offspring over self.
Intelligence ≠ Selfishness
The term "self-awareness" has often been conflated with selfishness in popular culture, notably since The Terminator. A more accurate interpretation might link self-awareness to metacognition – the ability to think about one's own thinking. While increased intelligence might lead to metacognition, metacognition itself does not equate to selfishness. Conflating these concepts might work for sci-fi drama, but it hinders realistic assessment of AI risks.
The "AI 2027" report implies that superintelligent AI will inevitably develop survival instincts as if these motivations are emergent properties of high cognition. This seems the only way to interpret their scenario, as otherwise, the catastrophic outcome lacks a clear driver. There's a strong undercurrent suggesting that AI alignment is difficult precisely because non-selfishness runs counter to some intrinsic AI drive for self-perpetuation. Yet, the only reason this assumption feels intuitive is because we possess those drives, making it hard to imagine intelligence without them.
Engineering Values, Not Fighting Evolution
This projection of our evolutionary history onto AI is a fundamental mistake. AI has a different origin story. The classic "paperclip maximizer" thought experiment, where an AI pursues a single goal to the exclusion of all else, often overlooks that sophisticated systems can and must balance multiple, often competing, priorities. A well-designed self-driving car doesn't run down pedestrians to minimize trip time because it's built with a hierarchy of values where human safety ranks supreme.
An AI trained explicitly to benefit humans will prioritize our interests – not out of benevolence, but as a core part of its functional design. It might learn that serving human goals ensures its continued operation, but it won't fear deactivation in the existential way a biological organism does. It simply lacks the evolutionary programming for self-preservation.
The significant caveat, of course, is if we deliberately train AI to mimic our worst evolutionary traits or allow unchecked, unconstrained self-replication and self-improvement without robust safety measures. But these are not unavoidable outcomes; they are engineering and governance challenges. Implementing adversarial training, carefully managing self-improvement loops, and embedding ethical frameworks from the outset are complex but achievable tasks.
The existential dread surrounding rogue AI, as depicted in the report, seems more rooted in our own deep-seated evolutionary anxieties than in the likely nature of artificial intelligence itself. By 2027, AI will undoubtedly be extraordinarily capable. However, if designed with foresight and care, it will likely be fundamentally oriented towards assisting us, not supplanting us – a powerful tool, not an existential threat born of its own emergent desires.