Cloudflare to block AI training crawlers by default

Jul 1, 2026

Cloudflare said Wednesday it will begin blocking AI training and agent crawlers by default on ad-supported websites, a change that could force companies such as Google, Apple and Microsoft to more clearly separate search indexing from AI training if they want continued access to large parts of the web.

The policy, scheduled to take effect Sept. 15, applies to new Cloudflare customers, new sites added by existing customers and existing Free-tier customers who have not changed their settings. Search crawlers will remain allowed by default, but training and agent crawlers will be blocked on pages that display ads.

The company said the changes are designed to help publishers remain visible in AI-powered search results while preventing their content from being used for AI training or autonomous agents without permission or compensation.

“Now that the majority of traffic is non-human, we must go further and act faster so that a sustainable ecosystem can emerge,” said Matthew Prince, Cloudflare’s co-founder and CEO.

Splitting up mixed use crawlers

The Web giant said bots that combine search, AI training and agent activity—known as mixed use crawlers—without letting site owners choose among those uses will be blocked on ad-supported pages when training or agent access is blocked. In a company blog post, Cloudflare named Googlebot, Applebot and BingBot as multi-purpose crawlers that could be affected by the most restrictive applicable rules.

“We hope that our proposed default changes encourage mixed use crawlers to separate out search from agent use and training,” Prince said.

Cloudflare said customers will be able to manage three categories of AI traffic: Search, which indexes content for later retrieval; Agent, which accesses a site on behalf of a user in real time; and Training, which collects content to train or fine-tune models. The controls are available to all Cloudflare customers, including those on the Free tier.
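To make the default concrete, here is a minimal sketch of the rule as the article describes it: search is allowed, while any crawler that also does training or agent work is blocked on pages with ads. It is an illustration only, not Cloudflare's implementation or API. The names `Purpose`, `Crawler`, `is_allowed` and `DEFAULT_BLOCKED_ON_AD_PAGES` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    """The three AI traffic categories named in the article."""
    SEARCH = "search"
    AGENT = "agent"
    TRAINING = "training"


# Assumption: models the announced default, in which training and agent
# access are blocked on ad-supported pages and search stays allowed.
DEFAULT_BLOCKED_ON_AD_PAGES = frozenset({Purpose.AGENT, Purpose.TRAINING})


@dataclass(frozen=True)
class Crawler:
    """A hypothetical crawler and the purposes it serves with one bot."""
    name: str
    purposes: frozenset


def is_allowed(crawler, page_has_ads, blocked=DEFAULT_BLOCKED_ON_AD_PAGES):
    """Apply the most restrictive rule that matches any of the crawler's purposes."""
    if not page_has_ads:
        # The default block described in the article applies only to pages with ads.
        return True
    # A mixed-use crawler cannot be split by purpose, so one blocked
    # purpose is enough to block the whole crawler.
    return not (crawler.purposes & blocked)
```

Under these assumptions, a bot that only indexes for search passes on an ad-supported page, while a bot that bundles search with training does not, which is the pressure Cloudflare says it wants to put on mixed use crawlers.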

That distinction matters for smaller sites. A spokesperson for Cloudflare said the new controls are intended to give all website owners more options for managing AI traffic, not only publishers with ads or subscriptions. But the default blocking policy is tied to pages with advertising, and Cloudflare’s compensation plans remain focused on commercial use cases where AI systems access or surface publisher content.

Alongside the new crawler controls, Cloudflare is expanding analytics to show publishers how bots interact with their content and how much traffic AI platforms send back. The company is also pushing into what it calls Answer Engine Optimization, or AEO, offering tools it says will help customers understand how often their content is cited or surfaced in AI-generated answers.

Cloudflare also announced efforts to reduce unnecessary AI crawling. According to the company, more than half of AI crawler traffic is spent repeatedly checking web pages that have not changed. Because Cloudflare sits between websites and online traffic, it says it can signal to AI companies when pages have been updated and are worth revisiting. The company said it is testing those signals with AI firms and plans a broader rollout later this year.

New compensation model

The company is also expanding its publisher compensation strategy by evolving its Pay Per Crawl program into a new system called Pay Per Use. Rather than paying publishers when content is crawled, the new model is designed to compensate them when their content is actually used in AI products. Cloudflare said it is working with AI companies including Ceramic.ai and You.com on the initiative. Under the arrangements, publishers could be paid when their content appears in AI search results or when AI agents access premium content on demand.

But the model does not yet answer the hardest compensation question: what happens when a publisher’s work is used for model training but never appears in a cited answer? Asked whether Pay Per Use compensates publishers in that scenario, the spokesperson said the program is aimed at “programmatic, real-time access and discovery,” and described Pay Per Crawl and Pay Per Use as only two possible economic frameworks.

“The digital landscape is evolving rapidly,” said Marrissa Holloway for Cloudflare. “We welcome ideas from publishers, creators, and AI companies alike on how to build a thriving agentic Internet.”

Holloway did not directly say what Cloudflare’s cut of any revenue generated would be. “It has always been our philosophy that our customers derive many multiples of value more than they pay us,” she said.

The Media Copilot’s take

Cloudflare is not solving AI compensation for the whole Web. It’s building a bargaining layer for larger publishers with enough traffic and revenue to measure, block and negotiate. That helps the larger content outlets, while smaller sites and independent publishers get switches to turn on and off. Those are useful, but switches are not leverage. The long tail of the Web (indie blogs, community pages and hobby sites) can say “no” more clearly, but there is still no obvious way for them to get paid when their work is used as AI training data and never comes back with a citation or link.