Anthropic CEO Admits We Have No Idea How AI Works

The CEO of one of the world's leading artificial intelligence labs just said the quiet part out loud — that nobody really knows how AI works. In an essay published to his personal website, Anthropic CEO Dario Amodei announced plans to create a robust "MRI on AI" within the next decade not only to figure out what makes the technology tick, but also to head off any unforeseen dangers associated with its (currently) unknowable nature. "When a generative AI system does something, like summarize a financial document, we have no idea, at a specific or precise level, why it makes […]

May 4, 2025 - 15:39
 0
Anthropic CEO Admits We Have No Idea How AI Works
The CEO of Anthropic, one of the world's leading AI labs, just said the quiet part out loud — that nobody really knows how AI works.

The CEO of one of the world's leading artificial intelligence labs just said the quiet part out loud: that nobody really knows how AI works.

In an essay published to his personal website, Anthropic CEO Dario Amodei announced plans to create a robust "MRI on AI" within the next decade. The goal is not only to figure out what makes the technology tick, but also to head off any unforeseen dangers associated with what he says remains its currently enigmatic nature.

"When a generative AI system does something, like summarize a financial document, we have no idea, at a specific or precise level, why it makes the choices it does — why it chooses certain words over others, or why it occasionally makes a mistake despite usually being accurate," the Anthropic CEO admitted.

On its face, it's surprising to folks outside of AI world to learn that the people building these ever-advancing technologies "do not understand how our own AI creations work," he continued — and anyone alarmed by that ignorance is "right to be concerned."

But on another level, maybe it isn't; all the image and text generators that have exploded in popularity over the last few years work under the same principle of feeding in a gigantic pile of data and letting statistical systems mine it for patterns that can be reproduced. The whole thing is driven by ingested human creative works, not from first principles of machine intelligence.

"This lack of understanding," Amodei wrote, "is essentially unprecedented in the history of technology."

In Amodei's telling, that ignorance about how AI works and what unforeseen risks it may pose is a driving factor behind Anthropic.

In late 2020, the CEO and his sister Daniela left OpenAI amid concerns about the Sam Altman-run company's safety practices — and in particular, that it was casting aside those concerns in pursuit of profit. The Amoideis and five other ex-OpenAI-ers founded Anthropic the next year to work on building safer AI — and part of that work seems to have been focused on figuring out the technology's nuts and bolts.

In recent months, Amodei wrote, Anthropic has begun to focus not only on helping "steer" AI — and its possible forthcoming progeny, artificial general intelligence — in ways that would benefit humanity, but also with the "tantalizing possibility" that researchers can finally figure out AI interpretability, or the "inner workings" of these systems, "before models reach an overwhelming level of power."

"Recently, [Anthropic] did an experiment where we had a 'red team' deliberately introduce an alignment issue into a model (say, a tendency for the model to exploit a loophole in a task) and gave various 'blue teams' the task of figuring out what was wrong with it," the CEO explained. "Multiple blue teams succeeded; of particular relevance here, some of them productively applied interpretability tools during the investigation."

While there will be a lot more work to be done to scale these "tools," which weren't explicitly detailed in Amodei's essay, it's still fascinating that folks at OpenAI's biggest competitor are not only working on making AI more advanced, but also tasking themselves with figuring out why and how it works.

"Powerful AI will shape humanity’s destiny," Amodei concluded, "and we deserve to understand our own creations before they radically transform our economy, our lives, and our future."

More on Amodei: Anthropic CEO Suggests That AI Deserves Workers' Rights

The post Anthropic CEO Admits We Have No Idea How AI Works appeared first on Futurism.