OpenAI Is Planning a Land Grab to Protect Against Legal Challenges

According to Bloomberg, OpenAI is in talks to license content from CNN, Time, and Fox, a move similar to the partnerships it started with The Associated Press and Axel Springer last year. The decision to finally pay for content is a noble one on OpenAI’s part, especially considering most large language models were born of endless web scraping of that very same content.

From Business Insider:

Time’s CEO Jessica Sibley told Bloomberg in a statement that the publisher “is in discussions with OpenAI and we are optimistic about reaching an agreement that reflects the fair value of our content.”

OpenAI has already inked deals with The Associated Press and Axel Springer, which owns Politico and Business Insider. The talks are part of a larger attempt by AI companies to gain access to content from news organizations without risking lawsuits.

It behooves us to remember that OpenAI took most of its content from the open web, the same open web championed by the freedom-loving techno-utopianists who left the news industry in shambles through aggregation. The resulting glut of repurposed reporting at first seemed like the answer to the celestial jukebox concept of information sharing but eventually ate into the very sources of that information.

As we learned a few days ago, OpenAI wanted to offer publishers as little as $1 million for full access to their archives. While this might be nice for a small outfit, that’s a weekly lunch budget for places like The New York Times, which you may have heard is suing OpenAI over its harvesting of the Times archives to train its AI models. This whiplash (OpenAI wanting to license, partners willing to negotiate, and the resulting numbers being so palpably silly that both parties look like fools) will continue as we price out this next round of media self-destruction.

My hope? That OpenAI actually pays these outlets what they’re worth. Millions if not tens of millions of dollars a year sounds like a good starting number; if AI platforms want access to current events, this information will be vital. The alternative seems to be scraping aggregated versions of those stories, which will more and more be written by AI. Training LLMs on a corpus of LLM-generated news will eventually lead to model collapse, where the AI just starts spitting out junk. The threat of that happening should, at least, keep journalists in some semblance of a job.

These are tough times for media. OpenAI and companies like it could be the way forward. The cynic in me, the journalist, says it won’t be so and that this is an effort to stave off more lawsuits.

The optimist in me is excited.
