The Media Copilot

How AI is changing Media, journalism and content creation


Cloudflare will block AI training crawlers by default on ad-supported sites

The company says new controls will let publishers separate search, agent use and model training

Exterior of Cloudflare's corporate headquarters
Cloudflare says that starting Sept. 15 it will block AI training and agent crawlers by default on ad-supported pages, while continuing to allow search indexing unless site owners choose otherwise. (Credit: Sundry Photography - stock.adobe.com)
Jul 1, 2026

By Romy Abu-Fadel

Cloudflare said Wednesday it will begin blocking AI training and agent crawlers by default on ad-supported websites, a change that could force companies such as Google, Apple and Microsoft to more clearly separate search indexing from AI training if they want continued access to large parts of the web.

The policy, scheduled to take effect Sept. 15, applies to new Cloudflare customers, new sites added by existing customers and existing Free-tier customers who have not changed their settings. Search crawlers will remain allowed by default, but training and agent crawlers will be blocked on pages that display ads.

The company said the changes are designed to help publishers remain visible in AI-powered search results while preventing their content from being used for AI training or autonomous agents without permission or compensation. 

“Now that the majority of traffic is non-human, we must go further and act faster so that a sustainable ecosystem can emerge,” said Matthew Prince, Cloudflare’s co-founder and CEO. 

Splitting up mixed-use crawlers

The Web giant said mixed-use crawlers, meaning bots that combine search, AI training and agent activity without letting site owners choose among those uses, will be blocked on ad-supported pages when training or agent access is blocked. In a company blog post, Cloudflare named Googlebot, Applebot and BingBot as multi-purpose crawlers that could be affected by the most restrictive applicable rules.

“We hope that our proposed default changes encourage mixed use crawlers to separate out search from agent use and training,” Prince said. 
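To make the rule concrete, here is a minimal Python sketch of how a most-restrictive-rule check could work on an ad-supported page. It is an illustration only, not Cloudflare's implementation: the names `Use`, `Crawler`, `AD_PAGE_DEFAULTS` and `is_allowed`, along with the example bots, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Use(Enum):
    SEARCH = "search"      # indexes content for later retrieval
    AGENT = "agent"        # accesses a site on behalf of a user in real time
    TRAINING = "training"  # collects content to train or fine-tune models


# Defaults the article describes for ad-supported pages:
# search stays allowed, training and agent access are blocked.
AD_PAGE_DEFAULTS = {Use.SEARCH: True, Use.AGENT: False, Use.TRAINING: False}


@dataclass(frozen=True)
class Crawler:
    name: str         # hypothetical bot name
    uses: frozenset   # every Use the bot combines under one identity


def is_allowed(crawler, settings):
    """Apply the most restrictive rule: one blocked use blocks the whole bot."""
    return all(settings[use] for use in crawler.uses)


# Hypothetical bots: one declares a single use, the other mixes two.
search_only = Crawler("example-search-bot", frozenset({Use.SEARCH}))
mixed_use = Crawler("example-mixed-bot", frozenset({Use.SEARCH, Use.TRAINING}))

print(is_allowed(search_only, AD_PAGE_DEFAULTS))  # True
print(is_allowed(mixed_use, AD_PAGE_DEFAULTS))    # False
```

Under these assumptions, the mixed-use bot loses search access as well, which is the pressure Prince describes: only a crawler that separates its uses can keep indexing while training stays blocked.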

Cloudflare said customers will be able to manage three categories of AI traffic: Search, which indexes content for later retrieval; Agent, which accesses a site on behalf of a user in real time; and Training, which collects content to train or fine-tune models. The controls are available to all Cloudflare customers, including those on the Free tier.

That distinction matters for smaller sites. A spokesperson for Cloudflare said the new controls are intended to give all website owners more options for managing AI traffic, not only publishers with ads or subscriptions. But the default blocking policy is tied to pages with advertising, and Cloudflare’s compensation plans remain focused on commercial use cases where AI systems access or surface publisher content.

Alongside the new crawler controls, Cloudflare is expanding analytics to show publishers how bots interact with their content and how much traffic AI platforms send back. The company is also pushing into what it calls Answer Engine Optimization, or AEO, offering tools it says will help customers understand how often their content is cited or surfaced in AI-generated answers.

Cloudflare also announced efforts to reduce unnecessary AI crawling. According to the company, more than half of AI crawler traffic is spent repeatedly checking web pages that have not changed. Because Cloudflare sits between websites and online traffic, it says it can signal to AI companies when pages have been updated and are worth revisiting. The company said it is testing those signals with AI firms and plans a broader rollout later this year.
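How such a signal might cut repeat visits can be sketched in a few lines of Python. This is an assumption-laden illustration, not Cloudflare's mechanism, which the company has not detailed: the function `pages_to_revisit`, the timestamp dictionaries and the example.com URLs are all hypothetical.

```python
def pages_to_revisit(last_crawled, last_changed):
    """Return URLs that changed after the last crawl or were never crawled."""
    return sorted(
        url
        for url, changed_at in last_changed.items()
        if url not in last_crawled or changed_at > last_crawled[url]
    )


# Hypothetical timestamps in seconds; a larger number is more recent.
last_crawled = {"https://example.com/a": 100, "https://example.com/b": 100}
last_changed = {
    "https://example.com/a": 50,   # unchanged since the last crawl: skip
    "https://example.com/b": 150,  # updated since the last crawl: revisit
    "https://example.com/c": 120,  # never crawled: visit
}

print(pages_to_revisit(last_crawled, last_changed))
# ['https://example.com/b', 'https://example.com/c']
```

In this toy case the crawler skips one of three pages. The saving Cloudflare points to would come from applying the same idea to pages that are checked repeatedly without having changed.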

New compensation model

The company is also expanding its publisher compensation strategy by evolving its Pay Per Crawl program into a new system called Pay Per Use. Rather than paying publishers when content is crawled, the new model is designed to compensate them when their content is actually used in AI products. Cloudflare said it is working with AI companies including Ceramic.ai and You.com on the initiative. Under the arrangements, publishers could be paid when their content appears in AI search results or when AI agents access premium content on demand. 

But the model does not yet answer the hardest compensation question: what happens when a publisher’s work is used for model training but never appears in a cited answer? Asked whether Pay Per Use compensates publishers in that scenario, the spokesperson said the program is aimed at “programmatic, real-time access and discovery,” and described Pay Per Crawl and Pay Per Use as only two possible economic frameworks.

“The digital landscape is evolving rapidly,” said Marrissa Holloway, speaking for Cloudflare. “We welcome ideas from publishers, creators, and AI companies alike on how to build a thriving agentic Internet.”

Holloway did not directly say what Cloudflare’s cut of any revenue generated would be. “It has always been our philosophy that our customers derive many multiples of value more than they pay us,” she said.


The Media Copilot’s take

Cloudflare is not solving AI compensation for the whole Web. It’s building a bargaining layer for larger publishers with enough traffic and revenue to measure, block and negotiate. That helps the larger outlets, while smaller sites and independent publishers get switches to turn on and off. Those switches are useful, but they do not amount to leverage. The long tail of the Web (indie blogs, community web pages and hobby sites) can say “no” more clearly, but there is still no obvious way for them to get paid when their work is used as AI training data and never comes back with a citation or link.

Contributors

  • Romy Abu-Fadel: Author

    Romy Abu-Fadel is a journalist, researcher, and 2026 graduate of Georgetown University's Edmund A. Walsh School of Foreign Service. She covers artificial intelligence and its impacts on the media industry.

  • Christopher Allbritton: Editor

    Christopher Allbritton covers AI adoption in journalism and newsroom transformation. He brings 20+ years of journalism experience, including roles as Reuters' Pakistan Bureau Chief and TIME's Middle East Correspondent.

Category: News
Tags: Cloudflare, webscraping, AI traffic, AEO


