3 Reasons ElevenLabs Won’t Revolutionize Podcasting

Credit: DALL-E

The news that ElevenLabs has now joined the unicorn club of AI startups — a $1 billion or more valuation — after securing an $80 million funding round might make you think we’re on the cusp of an AI voice revolution, and that AI-generated podcasts will soon become a significant, if not dominant part of media consumption.

For those unaware, ElevenLabs is a company known for its voice synthesis technology, which can generate highly realistic, natural-sounding voices. The synthesized voices can convey emotions, intonations, and even accents.

As impressive as the technology from ElevenLabs and other startups like Wondercraft is, I’m skeptical that it will forever change the podcasting world. And that’s for three reasons:

People want conversations, not speeches. There are different kinds of podcasts, to be sure, but by and large the ones that rise to the top are conversations (think Pod Save America or Bill Maher’s Real Time), which are by their nature impossible to clone or prepare in advance. Even with podcasts that are largely driven by a single host (everyone from Joe Rogan on down), the whole point is to connect with that host on a personal level, something that a cloned voice almost by definition can’t capture.

It’s the research, stupid. Fully scripted series, such as the true-crime podcast Serial, seem like good candidates for the AI treatment. However, even if you put aside that the personal connection with the host is still relevant for this type of podcast, the reality is that if your podcast has the resources to invest in researching, writing, and perfecting a long-form script, recording and editing that script is relatively trivial. Yes, GenAI can help with the scripting, but being able to do fancy things with cloned voices isn’t going to level up the production much. You may as well just record it “manually.”

Media wants video, too. Today’s media environment is forcing content creators to repurpose content for multiple channels, meaning your podcast isn’t just a podcast anymore. It’s a video podcast, too. Sure, there are services like HeyGen that let you clone video footage of yourself to create an AI talking head, but this is by nature much harder to do convincingly. An outfit like Channel1 can do it, but for everyone else, perfecting a simulation is probably more of a time and resource drain than just recording the video.

Certainly, as ElevenLabs and others improve their technology, these AI clones, both audio and video, will get even better at simulating humans at low cost. However, I’m not convinced audiences will respond to cloned voices and hosts the same way they respond to the real thing. Think of your favorite podcaster: Would you be satisfied with an episode where the voice you heard wasn’t them speaking, but a cloned voice reading a script? Even if the reading is perfect, complete with verbal tics and realistic inflections, I’d argue you’d feel cheated knowing they didn’t actually speak those words themselves.

That’s not to say ElevenLabs isn’t impressive or that its tech makes no difference to podcasting. The ability to either clone or create voices, generate audio content, and translate it across dozens of languages in a matter of minutes is extraordinary. Smaller, news-driven sites that have little to no resources for podcasting can now launch an audio strategy without needing to invest in equipment, time, and expertise. A news podcast based on cloned audio and video may never reach the audience a personality-driven podcast could, but for many outlets, something is better than nothing.
