The AI Voice Trap

Credit: DALL-E

Do you remember a few years ago when many news sites offered spoken versions of their articles, usually via a button on the web page? They fell out of favor for a host of reasons, but a big one was that the quality was often pretty lousy. Now the article-as-audio idea may be coming back, thanks to generative AI.

Voice cloning software, which convincingly replicates voices, is rapidly evolving from a standalone service to a built-in feature. AI startup ElevenLabs made a name for itself as the go-to voice cloning tool last year. Recently OpenAI announced its own version, though it also said it was holding back release because of concerns about misuse. And now TikTok appears to be getting into the fray.

Over the weekend, someone mined the code of TikTok for Android and claims to have discovered references to both a “TikTok Voice Library” and a feature called “Create your voice with AI,” neither of which exists yet. It's hard to say how this will land with TikTok users, since the app is almost synonymous with the halting inflections of “Jesse,” the default voice, but it demonstrates just how commodified voice cloning is becoming.

The potential uses of this technology for news sites are clear: they can easily offer high-quality spoken versions of articles and might even allow readers to choose between synthetic celebrity voices (à la Artifact) or the writer’s own voice. That potential was clearly not lost on The New York Times, which recently announced that, following a testing period, it would soon offer spoken versions of all its articles both in its apps and on its website.

You might think the Times doing this will signal a move for other publications to follow suit, and some might. However, the Times is the exception rather than the rule among newsrooms: It’s one of the few mainstream publications that’s on strong financial footing, and it has pre-existing robust audio offerings, with several popular podcasts and a dedicated app, NYT Audio.

In other words, the Times has trained its audience to expect and desire more ways to experience the brand aurally, and it has enough control over its user experience to lean into that if it wants to. A media brand whose main (or sole) format is articles and with a smaller product operation would have a relatively tough time convincing its users to care about audio versions of its articles.

To be clear, I’m talking about offering general readers the option to listen to articles like they would a podcast, not audio meant for visually impaired readers. From my experience considering this feature at CoinDesk and elsewhere, I’ve found that readers with an accessibility issue will use their own devices and software to listen to articles, and whether or not the publication offers its own audio versions is of little relevance to them. That is another reason those old “listen to this article” buttons mostly went away.

The whole idea of turning articles into podcasts with voice cloning is illustrative of the difference between AI content features and AI content strategy. With today’s AI tools, any newsroom could begin offering articles as audio. But is the audience demanding it? Is it serving a larger audio strategy? Does it make sense to prioritize it over the dozens of other things that might deepen loyalty with existing readers or bring in new ones?

AI tools are truly wondrous in what they enable smaller operations to do: things they never would have been able to even consider in previous eras. What they don’t do, however, is provide guidance on whether you should do that thing in the first place.

Hype is a hard thing to resist. Figuring out where AI can make the most difference in your newsroom or content team means ignoring flavors of the day and being deliberate about what you focus on. If you or your organization would benefit from advice on how to best leverage AI, I strongly recommend booking a consultation with our team. We’re here to help.
