Google's New AI Can Talk to Dolphins!

Apr 20, 2025 - 07:32

It sounds like the premise of a sci-fi movie written by an overzealous marine biologist—an AI that can understand and talk to dolphins using clicks, whistles, and squeals. But this is no fantasy. This is DolphinGemma, Google's latest AI model designed in partnership with the Wild Dolphin Project. And it’s not just recording dolphin sounds—it’s learning to interpret them and even talk back.

If successful, DolphinGemma could be the Rosetta Stone of animal communication, ushering in a new era of understanding the intelligence of non-human species. Let’s dive deep (pun absolutely intended) into this mind-bending technological frontier.


The Problem: Dolphins Speak—But What Are They Saying?

Dolphins have long fascinated scientists with their astonishing cognitive abilities. Some regions of their brains—especially those governing emotion and social processing—are larger than the corresponding regions in humans, and dolphins are capable of remarkable feats of communication. Each dolphin has a unique "signature whistle," akin to a name, which it uses in social contexts.

But unlike human speech, dolphin communication is acoustically and spatially complex. It occurs underwater, in three dimensions, using high-pitched whistles, click trains, and burst pulses. Their sounds are rapid, layered, and often happen simultaneously, making it virtually impossible to decode using conventional tools. The challenge has always been: How do we make sense of this acoustic chaos?


The Breakthrough: Meet DolphinGemma

Google’s answer is DolphinGemma, an AI model built on the same architecture as its flagship Gemini models but trained on one of the world’s largest datasets of wild dolphin vocalizations. This data comes from the Wild Dolphin Project, which has spent the past 40 years recording and documenting dolphin behavior in the Bahamas.

The project doesn’t just catalog audio—it connects each sound to context. Was the dolphin reuniting with a calf? Chasing a shark? Playing with seaweed? Flirting? These rich, labeled data points gave DolphinGemma what no other machine learning model ever had: a socially informed soundscape of an alien intelligence.


How It Works: Pattern Recognition at Its Finest

DolphinGemma uses a powerful audio encoder called SoundStream, a Google-developed neural codec that breaks down complex audio into learnable representations. Think of it as the "tokenizer" of dolphin communication—just as language models learn word patterns, DolphinGemma learns sonic patterns.

It maps out which sounds occur together, in what order, and under what circumstances—mirroring the way large language models like ChatGPT predict the next word. Only here, it’s not predicting “pizza” after “cheese and tomato”; it’s predicting a high-frequency whistle burst after a dive with juveniles.
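As a loose illustration—not Google's actual pipeline—the core idea is next-token prediction over discrete sound tokens. In this toy sketch, a bigram counter over made-up token IDs stands in for the codec-plus-model stack; the token values and sequence are invented for demonstration:

```python
from collections import Counter, defaultdict

def train_bigram(token_seq):
    """Count which token follows which -- a toy stand-in for a model
    learning transition patterns in tokenized audio."""
    follows = defaultdict(Counter)
    for cur, nxt in zip(token_seq, token_seq[1:]):
        follows[cur][nxt] += 1
    return follows

def predict_next(follows, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    if token not in follows:
        return None
    return follows[token].most_common(1)[0][0]

# Made-up token IDs representing quantized frames of dolphin audio.
sequence = [3, 7, 7, 2, 3, 7, 7, 2, 3, 7]
model = train_bigram(sequence)
print(predict_next(model, 3))  # the token that most often follows 3
```

A real system replaces the counter with a transformer and the invented IDs with codec outputs, but the objective—predict what sound comes next, given what came before—is the same.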

Most astonishingly, DolphinGemma can now generate dolphin-like sounds that fit into these observed conversational structures, mimicking the musicality, rhythm, and structure of real dolphin exchanges.


Two-Way Chat: The CHAT System

Understanding is one thing. Talking back is another. That's where CHAT (Cetacean Hearing Augmentation Telemetry) comes in—a wearable underwater computer system built around a Google Pixel 9, capable of running AI inference in real time beneath the waves.

The goal? Create a vocabulary. Researchers play a new whistle and associate it with an object, like a scarf. When repeated, dolphins begin to echo the same whistle—essentially "asking" for the scarf. It’s the linguistic equivalent of a toddler learning to say “milk.”

Now imagine scaling this: hundreds of whistles, each tied to specific concepts, objects, or actions. We're not far from rudimentary two-way communication between humans and dolphins.
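The vocabulary idea can be sketched minimally as a lookup table: match a detected whistle's features against stored synthetic whistles, each tied to an object. Everything here—the feature choice, the fingerprints, the threshold—is invented for illustration, not the CHAT system's actual code:

```python
import math

# Hypothetical whistle "fingerprints": (mean frequency in kHz, duration in s).
VOCABULARY = {
    "scarf":    (12.0, 0.40),
    "seagrass": (9.5, 0.65),
    "rope":     (14.2, 0.30),
}

def match_whistle(features, vocab, threshold=1.0):
    """Return the object whose stored whistle is nearest to the detected
    features, or None if no entry is close enough."""
    best, best_dist = None, float("inf")
    for obj, ref in vocab.items():
        dist = math.dist(features, ref)
        if dist < best_dist:
            best, best_dist = obj, dist
    return best if best_dist <= threshold else None

# A dolphin's mimicked whistle, slightly off from the reference.
print(match_whistle((11.8, 0.45), VOCABULARY))  # matches "scarf"
```

Scaling this to hundreds of whistles is mostly a matter of growing the table and making the matching robust to noise—conceptually simple, which is what makes the approach promising.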


The Hardware: AI in a Phone-Sized Package

What makes this even more remarkable is the portability. Upcoming versions of DolphinGemma are designed to run on a Pixel 9 inside a waterproof housing. That's right—real-time dolphin-sound analysis on a smartphone. No bulky servers, no research-vessel supercomputers. Just AI, a diver, and some highly curious dolphins.


From Dolphins to Other Species

Google isn’t stopping at dolphins. The company plans to open-source DolphinGemma this summer, making the model architecture available to researchers studying other vocal animals, such as:

  • Elephants, whose low-frequency rumbles can travel 10+ km
  • Whales, whose songs evolve culturally over generations
  • Great apes, who use gestures and vocalizations in rich social contexts

The Interspecies Internet Project and researchers like Dr. Katharine Payne are already exploring similar AI-aided decoding of these communication systems.


The Ethical Dimension: Should We Talk to Dolphins?

Here’s where things get murky.

What happens when a dolphin asks why the ocean is filled with plastic? Or why its calf was taken to a marine park? The ability to converse with another intelligent species opens not just scientific possibilities, but also ethical and philosophical ones.

Dolphins aren’t pets or performers—they’re beings with social lives, emotions, and potentially cultures of their own. Some scientists believe dolphins may share information through echolocation in ways we can’t even comprehend—transmitting "acoustic pictures" instead of words. AI may be able to detect patterns in these signals that human perception misses.

Are we prepared for what they might say?


What This Means for the Future of AI

DolphinGemma represents a paradigm shift in AI utility. This isn’t about answering your emails or generating blog posts. This is about AI being used to bridge evolutionary gaps between entirely different forms of intelligence.

We often talk about "AGI"—artificial general intelligence—but maybe the real revolution will come not from building human-like machines, but understanding non-human intelligences. From oceans to forests, AI may become the universal translator we never knew we needed.


The Road Ahead

We’re still in the early stages—equivalent to teaching a toddler “ball” and “apple.” But let’s not forget: the first phone call was just “Mr. Watson, come here.” The first text was “Merry Christmas.” Look how far we've come since.

Two decades from now, DolphinGemma and CHAT might be remembered not as novelties, but as the first meaningful step toward interspecies understanding. The moment we realized we weren’t alone in our intelligence on Earth—but merely one of many voices, waiting to be heard.