Why Generalizable AI Remains a Myth and What True Intelligence Really Means

True generalization in AI remains elusive, often relying on developer input rather than independent system learning. Human intelligence adapts creatively across domains, a feat current AI cannot match.

Published on: May 27, 2025

The Fiction of Generalizable AI: How to Game the System

Progress toward genuine generalization in AI remains essentially nonexistent. It might be time to rethink the very idea of the “I” in artificial intelligence.

Previously, we examined the notion that a neural network can start as a blank slate and become intelligent solely through exposure to enough data. Now, let's explore the consequences of this belief. One major outcome of embracing machine learning, and the “blank slate” assumption that comes with it, is that it opens the door for researchers to game the system. AI researcher François Chollet points out that success often comes down to “buying” the right data and features to solve a problem.

This issue is well-known on data science competition platforms like Kaggle. Teams frequently build models fine-tuned to win contests but that perform poorly outside those specific settings, a classic case of overfitting.
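To make that failure mode concrete, here is a minimal sketch, entirely my own illustration rather than code from any competition, of how picking whatever scores best on a single public leaderboard split yields an optimistic number that evaporates on fresh data. The coin-flip task and split sizes are arbitrary assumptions chosen only to isolate the effect:

    import numpy as np

    rng = np.random.default_rng(0)
    n_public, n_private = 200, 2000

    # Hidden labels for both leaderboard splits; the task is pure noise,
    # so no submission can have genuine skill.
    y_public = rng.integers(0, 2, n_public)
    y_private = rng.integers(0, 2, n_private)

    best_public, paired_private = -1.0, None
    for _ in range(1000):  # a thousand essentially random submissions
        guess_public = rng.integers(0, 2, n_public)
        guess_private = rng.integers(0, 2, n_private)
        score = (guess_public == y_public).mean()
        if score > best_public:  # keep whichever looks best on the public split
            best_public = score
            paired_private = (guess_private == y_private).mean()

    print(f"best public score: {best_public:.3f}")     # noticeably above 0.5
    print(f"its private score: {paired_private:.3f}")  # back near chance

The selected submission looks skillful only because it was chosen for its score on that one split; nothing about it transfers.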

This cycle is captured by the so-called AI Effect: every time AI succeeds, skeptics dismiss it as “just another trick,” not real intelligence. But this critique misunderstands the core difference between machines and minds. While AI can be engineered for narrow tasks, humans acquire skills across multiple domains through general intelligence. Garry Kasparov, for example, didn't “hack” chess when he faced Deep Blue in 1997; he had built his expertise through broad cognitive abilities.

Two Types of Generalization

System-centric generalization measures an AI system's ability to handle unseen data similar to what it was trained on. This is the classic approach in statistical learning: train on dataset A, test on dataset B. Success means high accuracy or low error on new, but related, data.

However, this ignores any prior knowledge baked into the system through architecture, feature engineering, or data preprocessing. These developer choices help the system succeed but aren't accounted for in the formal generalization metric.
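For reference, here is a minimal sketch of the system-centric setup in scikit-learn, using its bundled digits data; the dataset and model are my own illustrative choices, not anything from the article. The single reported number measures performance on unseen-but-similar data, while every developer decision in the pipeline sits outside it:

    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Train on one split, score on a held-out split from the same distribution:
    # this is the number usually reported as "generalization".
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    # Every choice here (the scaling step, the model family, its settings) is
    # prior knowledge supplied by the developer, invisible to the metric below.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)

    print("held-out accuracy:", model.score(X_test, y_test))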

Developer-aware generalization changes the perspective. Here, neither system nor developer has seen the problem before. This approach treats the combined developer-system effort as the real agent facing a novel task. It asks: Can this setup solve genuinely new problems without prior preparation?

While developer-aware generalization collapses to system-centric when the developer and system act as one, the distinction matters. Most AI today generalizes only because developers anticipate the task and prepare accordingly—not because the system independently discovers new capabilities.

Degrees of Generalization: Local, Broad, and Extreme

Understanding generalization requires distinguishing its degrees.

  • Local generalization deals with data from a known distribution within a narrow task. For example, an image classifier distinguishing unseen cat images from dog images. Chollet describes this as “adaptation to known unknowns” within a defined task. This kind of generalization—essentially robustness—has been the AI field’s main focus since its inception.
  • Broad generalization means handling a wide range of related tasks without human intervention. It involves “unknown unknowns” like Level 5 autonomous driving or a robot performing household chores. The famous “coffee test,” walking into an unfamiliar kitchen and figuring out how to make coffee, is a classic example. We don’t have AI systems that can do this today, and it’s unclear whether current progress escapes the pull of local generalization despite widespread hype.
  • Extreme generalization refers to open-ended systems capable of dealing with completely novel situations that share only abstract relationships with previous tasks. This is human intelligence. We adapt across domains in ways no AI currently matches.

Much of the hype about “artificial general intelligence” ignores these distinctions, falsely treating intelligence as a linear problem solvable by scaling data and compute.

The ARC Challenge: Cutting Through the Hype

Chollet developed the Abstraction and Reasoning Corpus (ARC) to test developer-aware generalization, assuming only minimal cognitive priors like “objects don’t just disappear.” The results? Even the best systems today fail on the latest ARC challenges. Real progress toward genuine generalization remains absent.
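To make the test format concrete: the public ARC dataset presents each task as a few input/output grid demonstrations plus one or more test inputs, and the solver must infer the transformation from those demonstrations alone. The toy task and the “recolor 1 to 2” rule below are my own invention in that format, not an actual ARC item:

    # A toy task shaped like an ARC item: two demonstration pairs, one test input.
    toy_task = {
        "train": [
            {"input": [[0, 1], [1, 0]], "output": [[0, 2], [2, 0]]},
            {"input": [[1, 1, 0]],      "output": [[2, 2, 0]]},
        ],
        "test": [
            {"input": [[0, 0, 1], [1, 0, 1]]},  # expected: [[0, 0, 2], [2, 0, 2]]
        ],
    }

    def apply_inferred_rule(grid):
        # A human infers "recolor every 1 to 2" from the two tiny demonstrations;
        # this function hard-codes that inference only to show what a correct
        # answer looks like. A system limited to local generalization has far
        # too little data here to fit anything.
        return [[2 if cell == 1 else cell for cell in row] for row in grid]

    print(apply_inferred_rule(toy_task["test"][0]["input"]))

The point of the benchmark is that each task demands a fresh inference like this one, with no opportunity to prepare for it in advance.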

This calls for cutting the hype and focusing on the real work. More fundamentally, it may require reexamining what “I” means in AI. Is extreme generalization—the kind natural intelligence shows—even achievable by machines? And if it is, would it truly benefit human society by augmenting intelligence, or merely automate and flatten it? These questions go beyond technology and touch on values.

Human-Centric Extreme Generalization: What Minds Actually Do

Extreme generalization means adapting to unknown unknowns across unknown tasks and domains. This is uniquely human. We can play chess, rethink astronomy, modify physics theories, and predict phenomena like dark matter.

We also excel at everyday tasks: fixing a faucet, driving, ordering books, chatting about the weather, or reading social cues. This kind of intelligence remains unmatched by any current AI system.

It’s unclear what the true limits of this intelligence are. What is clear is that no AI system has moved beyond local generalization to demonstrate the open-ended creativity and adaptability that humans possess—the very traits that led us to invent computers and define AI as a field.

Toward a Human-Centric Theory of Intelligence

Future discussions should explore frameworks like “known knowns,” “known unknowns,” and “unknown unknowns” not just for AI but for human cognition itself. This approach sets the stage for a human-centric understanding of intelligence. We need to move past today’s dominant ideas about cognition and aim to understand intelligence as it operates in context.

For those curious about the foundational ideas challenging AI’s current hype, reviewing structured AI courses can provide clarity on what present AI systems can—and can’t—do.

Understanding these distinctions matters for anyone working in IT and development, especially when evaluating AI’s promises and limitations. Real intelligence isn’t a point on a linear scale that more data or bigger models will eventually reach. It’s a complex, deeply human capability we have yet to replicate.