Local queues within CacheDB buffer purge operations to ensure consistent throughput without latency spikes, while the cache proxies consult CacheDB to guarantee rapid, active purges. Our updated distribution pipeline broadcasts purges directly to CacheDB instances across machines, dramatically improving throughput and purge speed.
Using CacheDB, we've reduced storage requirements by 10x by eliminating lazy purge storage accumulation, instantly freeing valuable disk space. The freed storage enhances cache retention, boosting cache HIT ratios and minimizing origin egress. These savings in storage, together with the increased throughput, allowed us to scale to the point where we can offer Instant Purge to more customers.
For more information on how we designed the new Instant Purge system, please see the previous installment of our Purge series blog posts.
Striking the right balance: what to purge and when
Moving on to practical considerations of using these new purge methods, it’s important to use the right method for what you want to invalidate. Purging too aggressively can overwhelm origin servers with unnecessary requests, driving up egress costs and potentially causing downtime. Conversely, insufficient purging leaves visitors with outdated content. Balancing precision and speed is vital.
Cloudflare supports multiple targeted purge methods to help customers achieve this balance.
Purge by URL (single-file purge): Removes an individual asset from cache by specifying its exact URL.
Purge by cache-tag: Removes all cached assets labeled with a given cache tag.
Purge by hostname: Removes all cached assets served under a specific hostname.
Purge by prefix: Removes all cached assets whose URLs begin with a given path prefix, such as example.com/foo/bar.
Purge Everything: Clears all cached content associated with a website.
Users can select their purge method directly in the Cloudflare dashboard, located under the Cache tab in the configurations section, or via the Cloudflare API. Each purge request should clearly specify the URLs, hostnames, prefixes, or cache tags (known as purge keys) relevant to the selected purge type. For instance, a prefix purge request might specify a directory such as example.com/foo/bar. To maximize efficiency and throughput, batching multiple purge keys in a single request is recommended over sending individual purge requests each with a single key.
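As a rough illustration, a batched purge-by-tag request to the Cloudflare API might look like the following Go sketch; the zone ID, API token, and tag names are placeholders you would substitute with your own values.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	zoneID := "YOUR_ZONE_ID"     // placeholder
	apiToken := "YOUR_API_TOKEN" // placeholder

	// Batch multiple purge keys (cache tags here) into a single request.
	body, _ := json.Marshal(map[string][]string{
		"tags": {"product-page", "nav-menu", "footer"},
	})

	req, _ := http.NewRequest(
		"POST",
		fmt.Sprintf("https://api.cloudflare.com/client/v4/zones/%s/purge_cache", zoneID),
		bytes.NewReader(body),
	)
	req.Header.Set("Authorization", "Bearer "+apiToken)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("purge request status:", resp.Status)
}

The same endpoint accepts "files", "hosts", or "prefixes" in place of "tags", depending on the purge type you are using.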
The new rate limits for Cloudflare's purge by tag, prefix, hostname, and purge everything differ by plan type. We use a token bucket rate limit system, so each account has a token bucket with a maximum size based on plan type. When we receive a purge request, we first add tokens to the account's bucket based on the time elapsed since the account's last purge request multiplied by the refill rate for its plan type (which can add a fraction of a token). Then we check if there's at least one whole token in the bucket, and if so we remove it and process the purge request. If not, the purge request is rate limited. An easy way to think about this rate limit is that the refill rate represents the consistent rate of requests a user can send over a given period, while the bucket size represents the maximum burst of requests available.
For example, a free user starts with a bucket size of 25 requests and a refill rate of 5 requests per minute (one request per 12 seconds). If the user were to send 26 requests all at once, the first 25 would be processed, but the last request would be rate limited. They would need to wait 12 seconds and retry their last request for it to succeed.
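To make the refill-and-check logic concrete, here is a simplified Go sketch of the token bucket described above; it is an illustration, not Cloudflare's actual implementation.

package main

import (
	"fmt"
	"time"
)

// bucket models the per-account token bucket described above.
type bucket struct {
	tokens     float64   // current tokens, may be fractional
	maxTokens  float64   // burst capacity (bucket size)
	refillRate float64   // tokens added per second
	lastPurge  time.Time // time of the last purge request
}

// allow refills the bucket based on elapsed time, then tries to spend one token.
func (b *bucket) allow(now time.Time) bool {
	elapsed := now.Sub(b.lastPurge).Seconds()
	b.tokens += elapsed * b.refillRate
	if b.tokens > b.maxTokens {
		b.tokens = b.maxTokens
	}
	b.lastPurge = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false // rate limited
}

func main() {
	// Free plan: bucket size 25, refill 5 per minute (one token every 12 seconds).
	free := &bucket{tokens: 25, maxTokens: 25, refillRate: 5.0 / 60.0, lastPurge: time.Now()}
	now := time.Now()
	for i := 1; i <= 26; i++ {
		fmt.Printf("request %d allowed: %v\n", i, free.allow(now))
	}
}

Running this prints "allowed: true" for the first 25 requests and "allowed: false" for the 26th, matching the example above.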
The current limits are applied per account:
Plan         Bucket size     Request refill rate   Max keys per request   Total keys
Free         25 requests     5 per minute          100                    500 per minute
Pro          25 requests     5 per second          100                    500 per second
Biz          50 requests     10 per second         100                    1,000 per second
Enterprise   500 requests    50 per second         100                    5,000 per second
More detailed information on all purge rate limits can be found in our documentation.
We’ve spent a lot of time optimizing our purge platform, but we’re not done yet. Looking forward, we will continue to enhance the performance of Cloudflare’s single-file purge. The current P50 performance is around 250 ms, and we suspect we can optimize it further to bring it under 200 ms. We will also build out greater purge throughput across all of our systems, and continue to implement filtering techniques so we can keep scaling effectively and allow customers to purge whatever and whenever they choose.
We invite you to try out our new purge system today and deliver an instant, seamless experience to your visitors.
"],"published_at":[0,"2025-04-01T14:00+00:00"],"updated_at":[0,"2025-04-09T11:45:36.889Z"],"feature_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/65pxEF0Emq9kVoxKhD6lP3/da052dd60681437575753d231a98b22c/image1.png"],"tags":[1,[[0,{"id":[0,"5RrjSR5vIOJAfRdT8966hf"],"name":[0,"Cache"],"slug":[0,"cache"]}],[0,{"id":[0,"48r7QV00gLMWOIcM1CSDRy"],"name":[0,"Speed & Reliability"],"slug":[0,"speed-and-reliability"]}],[0,{"id":[0,"4gN0ARax0fHxjtZL07THOe"],"name":[0,"Performance"],"slug":[0,"performance"]}],[0,{"id":[0,"7m1u8osPfYlB9uhotDfTol"],"name":[0,"Cache Purge"],"slug":[0,"cache-purge"]}]]],"relatedTags":[0],"authors":[1,[[0,{"name":[0,"Alex Krivit"],"slug":[0,"alex"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1e9gtWsffhoAWG0rTN45cR/35204e3d77d1a90feeb57a32e388daca/alex.jpg"],"location":[0,null],"website":[0,null],"twitter":[0,"@ackriv"],"facebook":[0,null]}],[0,{"name":[0," Connor Harwood"],"slug":[0,"connor-harwood"],"bio":[0],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7ChQBYRPNIiEuQnGQdajbt/3b67f09fd6dae792eb8a1ffafaa1254a/_tmp_mini_magick20221017-43-y5gfah.jpg"],"location":[0],"website":[0],"twitter":[0],"facebook":[0]}],[0,{"name":[0,"Zaidoon Abd Al Hadi"],"slug":[0,"zaidoon"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3Ber7SvflfzvyO5J611w7l/59f62d05e428afdd88b10a82857527c0/zaidoon.png"],"location":[0,null],"website":[0,null],"twitter":[0,null],"facebook":[0,null]}]]],"meta_description":[0,"Following up on having the fastest purge in the industry, we have now increased Instant Purge quotas across all Cloudflare plans. "],"primary_author":[0,{}],"localeList":[0,{"name":[0,"blog-english-only"],"enUS":[0,"English for Locale"],"zhCN":[0,"No Page for Locale"],"zhHansCN":[0,"No Page for Locale"],"zhTW":[0,"No Page for Locale"],"frFR":[0,"No Page for Locale"],"deDE":[0,"No Page for Locale"],"itIT":[0,"No Page for Locale"],"jaJP":[0,"No Page for Locale"],"koKR":[0,"No Page for Locale"],"ptBR":[0,"No Page for Locale"],"esLA":[0,"No Page for Locale"],"esES":[0,"No Page for Locale"],"enAU":[0,"No Page for Locale"],"enCA":[0,"No Page for Locale"],"enIN":[0,"No Page for Locale"],"enGB":[0,"No Page for Locale"],"idID":[0,"No Page for Locale"],"ruRU":[0,"No Page for Locale"],"svSE":[0,"No Page for Locale"],"viVN":[0,"No Page for Locale"],"plPL":[0,"No Page for Locale"],"arAR":[0,"No Page for Locale"],"nlNL":[0,"No Page for Locale"],"thTH":[0,"No Page for Locale"],"trTR":[0,"No Page for Locale"],"heIL":[0,"No Page for Locale"],"lvLV":[0,"No Page for Locale"],"etEE":[0,"No Page for Locale"],"ltLT":[0,"No Page for Locale"]}],"url":[0,"https://blog.cloudflare.com/instant-purge-for-all"],"metadata":[0,{"title":[0,"“You get Instant Purge, and you get Instant Purge!” — all purge methods now available to all customers"],"description":[0,"Following up on having the fastest purge in the industry, we have now increased Instant Purge quotas across all Cloudflare plans. 
"],"imgPreview":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6serphOZ0YKsEL6E3EEiLp/3804087e55df56ca1c5dfcdc909cf181/_You_get_Instant_Purge__and_you_get_Instant_Purge_____all_purge_methods_now_available_to_all_customers-OG.png"]}]}],[0,{"id":[0,"2i9tHkPAfy7GxYAioGM100"],"title":[0,"Improving platform resilience at Cloudflare through automation"],"slug":[0,"improving-platform-resilience-at-cloudflare"],"excerpt":[0,"We realized that we need a way to automatically heal our platform from an operations perspective, and designed and built a workflow orchestration platform to provide these self-healing capabilities "],"featured":[0,false],"html":[0,"
Failure is an expected state in production systems, and no predictable failure of either software or hardware components should result in a negative experience for users. The exact failure mode may vary, but certain remediation steps must be taken after detection. A common example is when an error occurs on a server, rendering it unfit for production workloads, and requiring action to recover.
When operating at Cloudflare’s scale, it is important to ensure that our platform is able to recover from faults seamlessly. It can be tempting to rely on the expertise of world-class engineers to remediate these faults, but this would be manual, repetitive, unlikely to produce enduring value, and would not scale. In a word: toil. That is not a viable solution at our scale and rate of growth.
In this post we discuss how we built the foundations to enable a more scalable future, and what problems it has immediately allowed us to solve.
The Cloudflare Site Reliability Engineering (SRE) team builds and manages the platform that helps product teams deliver our extensive suite of offerings to customers. One important component of this platform is the collection of servers that power critical products such as Durable Objects, Workers, and DDoS mitigation. We also build and maintain foundational software services that power our product offerings, such as configuration management, provisioning, and IP address allocation systems.
As part of tactical operations work, we are often required to respond to failures in any of these components to minimize impact to users. Impact can vary from lack of access to a specific product feature, to total unavailability. The level of response required is determined by the priority, which is usually a reflection of the severity of impact on users. Lower-priority failures are more common — a server may run too hot, or experience an unrecoverable hardware error. Higher-priority failures are rare and are typically resolved via a well-defined incident response process, requiring collaboration with multiple other teams.
The commonality of lower-priority failures makes it obvious when the response required, as defined in runbooks, is “toilsome”. To reduce this toil, we had previously implemented a plethora of solutions to automate runbook actions: manually invoked shell scripts, cron jobs, and ad-hoc software services. These had grown organically over time and provided solutions on a case-by-case basis, which led to duplication of work, tight coupling, and a lack of context awareness across the solutions.
We also care about how long it takes to resolve any potential impact on users. A resolution process that involves the manual invocation of a script relies on human action, increasing the Mean-Time-To-Resolve (MTTR) and leaving room for human error. This risks increasing the number of errors we serve to users and degrading trust.
These problems proved that we needed a way to automatically heal these platform components. This especially applies to our servers, for which failure can cause impact across multiple product offerings. While we have mechanisms to automatically steer traffic away from these degraded servers, in some rare cases the breakage is sudden enough to be visible.
To provide a more reliable platform, we needed a new component that provides a common ground for remediation efforts. This would remove duplication of work, provide unified context-awareness and increase development speed, which ultimately saves hours of engineering time and effort.
A good solution would not only allow the SRE team to auto-remediate; it would empower the entire company. The key to adding self-healing capability was a generic interface for all teams to self-serve and quickly remediate failures at various levels: machine, service, network, or dependencies.
A good way to think about auto-remediation is in terms of workflows. A workflow is a sequence of steps to get to a desired outcome. This is not dissimilar to a manual shell script which executes what a human would otherwise do via runbook instructions. Because of this logical fit with workflows and durable execution, we decided to adopt an open-source platform called Temporal.
The concept of durable execution is useful to gracefully manage infrastructure failures such as network outages and transient failures in external service endpoints. This capability meant we only needed to build a way to schedule “workflow” tasks and have the code provide reliability guarantees by default, using Temporal. This allowed us to focus on building out the orchestration system to support the control and flow of workflow execution in our data centers.
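For illustration, a remediation workflow written against the Temporal Go SDK might look roughly like the sketch below. The workflow and activity names (draining a server, running diagnostics, filing a ticket) are hypothetical stand-ins, not our actual workflows.

package remediation

import (
	"context"
	"time"

	"go.temporal.io/sdk/workflow"
)

// DrainServerWorkflow is a hypothetical remediation workflow: it takes a
// degraded server out of production, runs diagnostics, and files a ticket.
// Temporal's durable execution retries activities and survives worker
// restarts, so the sequence completes even across transient failures.
func DrainServerWorkflow(ctx workflow.Context, serverID string) error {
	ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: 10 * time.Minute,
	})
	if err := workflow.ExecuteActivity(ctx, RemoveFromProduction, serverID).Get(ctx, nil); err != nil {
		return err
	}
	if err := workflow.ExecuteActivity(ctx, RunDiagnostics, serverID).Get(ctx, nil); err != nil {
		return err
	}
	return workflow.ExecuteActivity(ctx, FileTicket, serverID).Get(ctx, nil)
}

// The activities below are placeholder stubs for illustration only.
func RemoveFromProduction(ctx context.Context, serverID string) error { return nil } // steer traffic away
func RunDiagnostics(ctx context.Context, serverID string) error       { return nil } // collect hardware/system data
func FileTicket(ctx context.Context, serverID string) error           { return nil } // open a tracking ticket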
Below, we describe how our automatic remediation system works. It is essentially a way to schedule tasks across our global network with built-in reliability guarantees. With this system, teams can serve their customers more reliably. An unexpected failure mode can be recognized and immediately mitigated, while the root cause can be determined later via a more detailed analysis.
After our initial testing of Temporal, it became possible to write workflows. But we needed a way to schedule workflow tasks from other internal services. The coordinator was built to serve this purpose, and became the primary mechanism for the authorisation and scheduling of workflows.
The most important roles of the coordinator are authorisation, workflow task routing, and safety constraints enforcement. Each consumer is authorized via mTLS authentication, and the coordinator uses an ACL to determine whether to permit the execution of a workflow. An ACL configuration looks like the following example.
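The exact schema is internal, but conceptually each entry ties a consumer identity (from its mTLS certificate) to the workflows it may trigger. A hypothetical entry might look like the following; the field names are illustrative only.

# Hypothetical ACL entry: the service identified by this mTLS certificate
# may only trigger the listed workflows.
consumer {
  common_name       = "health-check-daemon.internal.example"
  allowed_workflows = ["drain-server", "reboot-server"]
}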
Each workflow specifies two key characteristics: where to run the tasks and the safety constraints, using an HCL configuration file. Example constraints could be whether to run on only a specific node type (such as a database), or if multiple parallel executions are allowed: if a task has been triggered too many times, that is a sign of a wider problem that might require human intervention. The coordinator uses the Temporal Visibility API to determine the current state of the executions in the Temporal cluster.
An example of a configuration file is shown below:
task_queue_target = "<target>"

# The following entries will ensure that:
# 1. This workflow is not run more than once concurrently within a 15m window.
# 2. This workflow will not run more than once an hour.
# 3. This workflow will not run more than 3 times in one day.
constraint {
  kind   = "concurrency"
  value  = "1"
  period = "15m"
}

constraint {
  kind   = "maxExecution"
  value  = "1"
  period = "1h"
}

constraint {
  kind      = "maxExecution"
  value     = "3"
  period    = "24h"
  is_global = true
}
An unforeseen benefit of using a central Temporal cluster was the discovery of Task Routing. This feature allows us to schedule a Workflow/Activity on any server that has a running Temporal Worker, and further segment by the type of server, its location, etc. For this reason, we have three primary task queues — the general queue in which tasks can be executed by any worker in the datacenter, the node type queue in which tasks can only be executed by a specific node type in the datacenter, and the individual node queue where we target a specific node for task execution.
We rely on this heavily to ensure the speed and efficiency of automated remediation. Certain tasks can be run in datacenters with known low latency to an external resource, or on a node type with better performance than others (due to differences in the underlying hardware). This reduces the overall failure rate and latency we see in task executions. Sometimes we are also constrained by certain types of tasks that can only run on a certain node type, such as a database.
Task Routing also means that we can configure certain task queues to have a higher priority for execution, although this is not a feature we have needed so far. A drawback of task routing is that every Workflow/Activity needs to be registered to the target task queue, which is a common gotcha. Thankfully, it is possible to catch this failure condition with proper testing.
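As a sketch of what task-queue registration looks like with the Temporal Go SDK, a worker can be started against a node-type-specific queue and must have every workflow and activity it may run registered there. The cluster address, queue name, and registered functions below are illustrative, reusing the hypothetical workflow from the earlier sketch.

package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	// Connect to the Temporal cluster (address is a placeholder).
	c, err := client.Dial(client.Options{HostPort: "temporal.internal:7233"})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// This worker only picks up tasks from the database node-type queue;
	// other workers would listen on the general or per-node queues.
	w := worker.New(c, "node-type-database", worker.Options{})

	// Every workflow and activity must be registered on the task queue it
	// targets, otherwise executions scheduled there are never picked up.
	w.RegisterWorkflow(DrainServerWorkflow)
	w.RegisterActivity(RunDiagnostics)

	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatal(err)
	}
}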
None of this would be relevant if we didn’t put it to good use. A primary design goal for the platform was to ensure we had easy, quick ways to trigger workflows on the most important failure conditions. The next step was to determine what the best sources to trigger the actions were. The answer to this was simple: we could trigger workflows from anywhere as long as they are properly authorized and detect the failure conditions accurately.
Example triggers are an alerting system, a log tailer, a health check daemon, or an authorized engineer via a chatbot. Such flexibility allows a high level of reuse, and lets us invest more in workflow quality and reliability.
As part of the solution, we built a daemon that is able to poll a signal source for any unwanted condition and trigger a configured workflow. We have initially found Prometheus useful as a source because it contains both service-level and hardware/system-level metrics. We are also exploring more event-based trigger mechanisms, which could eliminate the need to use precious system resources to poll for metrics.
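A simplified version of such a polling daemon, using the Prometheus Go client, could look like the following sketch. The Prometheus address, the metric name, and the triggering logic are illustrative assumptions, not our production configuration.

package main

import (
	"context"
	"log"
	"time"

	"github.com/prometheus/client_golang/api"
	v1 "github.com/prometheus/client_golang/api/prometheus/v1"
	"github.com/prometheus/common/model"
)

func main() {
	// The address and query below are illustrative placeholders.
	c, err := api.NewClient(api.Config{Address: "http://prometheus.internal:9090"})
	if err != nil {
		log.Fatal(err)
	}
	promAPI := v1.NewAPI(c)

	for range time.Tick(time.Minute) {
		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		// Any server reporting an unrecoverable hardware error is a candidate
		// for remediation (the metric name is hypothetical).
		result, _, err := promAPI.Query(ctx, `node_hardware_error_total > 0`, time.Now())
		cancel()
		if err != nil {
			log.Printf("query failed: %v", err)
			continue
		}
		if vec, ok := result.(model.Vector); ok {
			for _, sample := range vec {
				server := string(sample.Metric["instance"])
				// In the real system, this would call the coordinator to
				// schedule the configured remediation workflow for this server.
				log.Printf("triggering remediation workflow for %s", server)
			}
		}
	}
}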
We already had internal services that are able to detect widespread failure conditions for our customers, but were only able to page a human. With the adoption of auto-remediation, these systems are now able to react automatically. This ability to create an automatic feedback loop with our customers is the cornerstone of these self-healing capabilities, and we continue to work on stronger signals, faster reaction times, and better prevention of future occurrences.
The most exciting part, however, is the future possibility. Every customer cares about any negative impact from Cloudflare. With this platform we can onboard several services (especially those that are foundational for the critical path) and ensure we react quickly to any failure conditions, even before there is any visible impact.
The whole system is written in Go, and a single binary can implement each role. We distribute it as an apt package or a container for maximum ease of deployment.
We deploy a Temporal-based worker to every server we intend to run tasks on, and a daemon in datacenters where we intend to automatically trigger workflows based on the local conditions. The coordinator is more nuanced since we rely on task routing and can trigger from a central coordinator, but we have also found value in running coordinators locally in the datacenters. This is especially useful in datacenters with less capacity or degraded performance, removing the need for a round-trip to schedule the workflows.
Temporal provides native mechanisms to test an entire workflow, via a comprehensive test suite that supports end-to-end, integration, and unit testing, which we used extensively to prevent regressions while developing. We also ensured proper test coverage for all the critical platform components, especially the coordinator.
Despite the ease of writing tests, we quickly discovered that they were not enough. After writing workflows, engineers need an environment as close as possible to the target conditions. This is why we configured our staging environments to support quick and efficient testing. These environments receive the latest changes and point to a different (staging) Temporal cluster, which enables experimentation and easy validation of changes.
After a workflow is validated in the staging environment, we can then do a full release to production. It seems obvious, but catching simple configuration errors before releasing has saved us many hours of development and change-related work.
As you can guess from the title of this post, we put this in production to automatically react to server-specific errors and unrecoverable failures. To this end, we have a set of services that are able to detect single-server failure conditions based on analyzed traffic data. After deployment, we have successfully mitigated potential impact by taking any errant single sources of failure out of production.
We have also created a set of workflows to reduce internal toil and improve efficiency. These workflows can automatically test pull requests on target machines, wipe and reset servers after experiments are concluded, and take away manual processes that cost many hours in toil.
Building a system that is maintained by several SRE teams has allowed us to iterate faster, and rapidly tackle long-standing problems. We have set ambitious goals regarding toil elimination and are on course to achieve them, which will allow us to scale faster by eliminating the human bottleneck.
Our immediate plans are to leverage this system to provide a more reliable platform for our customers and drastically reduce operational toil, freeing up engineering resources to tackle larger-scale problems. We also intend to leverage more Temporal features such as Workflow Versioning, which will simplify the process of making changes to workflows by ensuring that triggered workflows run expected versions.
We are also interested in how others are solving problems using durable execution platforms such as Temporal, and general strategies to eliminate toil. If you would like to discuss this further, feel free to reach out on the Cloudflare Community and start a conversation!
"],"published_at":[0,"2024-10-09T14:00+01:00"],"updated_at":[0,"2025-02-14T14:43:57.072Z"],"feature_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7FvFSEX52ZrPRMJrE9SmhX/ff490b045cfdc52e117a8c8749b3b3b8/BLOG-2498_1.png"],"tags":[1,[[0,{"id":[0,"3OPPjcK7cKutTdeAjpThfG"],"name":[0,"Edge"],"slug":[0,"edge"]}],[0,{"id":[0,"4ggKJG4JUzqG9FpAPIxwij"],"name":[0,"Engineering"],"slug":[0,"engineering"]}],[0,{"id":[0,"5cye1Bh5KxFh3pKSnX8Dsy"],"name":[0,"Serverless"],"slug":[0,"serverless"]}],[0,{"id":[0,"3JAY3z7p7An94s6ScuSQPf"],"name":[0,"Developer Platform"],"slug":[0,"developer-platform"]}],[0,{"id":[0,"4HIPcb68qM0e26fIxyfzwQ"],"name":[0,"Developers"],"slug":[0,"developers"]}],[0,{"id":[0,"KDI5hQcs301H8vxpGKXO0"],"name":[0,"Go"],"slug":[0,"go"]}],[0,{"id":[0,"6QVJOBzgKXUO9xAPEpqxvK"],"name":[0,"Reliability"],"slug":[0,"reliability"]}],[0,{"id":[0,"48r7QV00gLMWOIcM1CSDRy"],"name":[0,"Speed & Reliability"],"slug":[0,"speed-and-reliability"]}]]],"relatedTags":[0],"authors":[1,[[0,{"name":[0,"Opeyemi Onikute"],"slug":[0,"opeyemi"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4JlqK0q5EMuONqQsWp1lMe/862893cec0f15fc80cafb9b0c092e56c/opeyemi.jpeg"],"location":[0,"London, United Kingdom"],"website":[0,"https://opeonikute.dev"],"twitter":[0,"@Ope__O"],"facebook":[0,null]}]]],"meta_description":[0,"We realized that we need a way to automatically heal our platform from an operations perspective, and designed and built a workflow orchestration platform to provide these self-healing capabilities across our global network. We explore how this has helped us to reduce the impact on our customers due to operational issues, and the rich variety of similar problems it has empowered us to solve."],"primary_author":[0,{}],"localeList":[0,{"name":[0,"blog-english-only"],"enUS":[0,"English for Locale"],"zhCN":[0,"No Page for Locale"],"zhHansCN":[0,"No Page for Locale"],"zhTW":[0,"No Page for Locale"],"frFR":[0,"No Page for Locale"],"deDE":[0,"No Page for Locale"],"itIT":[0,"No Page for Locale"],"jaJP":[0,"No Page for Locale"],"koKR":[0,"No Page for Locale"],"ptBR":[0,"No Page for Locale"],"esLA":[0,"No Page for Locale"],"esES":[0,"No Page for Locale"],"enAU":[0,"No Page for Locale"],"enCA":[0,"No Page for Locale"],"enIN":[0,"No Page for Locale"],"enGB":[0,"No Page for Locale"],"idID":[0,"No Page for Locale"],"ruRU":[0,"No Page for Locale"],"svSE":[0,"No Page for Locale"],"viVN":[0,"No Page for Locale"],"plPL":[0,"No Page for Locale"],"arAR":[0,"No Page for Locale"],"nlNL":[0,"No Page for Locale"],"thTH":[0,"No Page for Locale"],"trTR":[0,"No Page for Locale"],"heIL":[0,"No Page for Locale"],"lvLV":[0,"No Page for Locale"],"etEE":[0,"No Page for Locale"],"ltLT":[0,"No Page for Locale"]}],"url":[0,"https://blog.cloudflare.com/improving-platform-resilience-at-cloudflare"],"metadata":[0,{"title":[0,"Improving platform resilience at Cloudflare through automation"],"description":[0,"We realized that we need a way to automatically heal our platform from an operations perspective, and designed and built a workflow orchestration platform to provide these self-healing capabilities across our global network. 
We explore how this has helped us to reduce the impact on our customers due to operational issues, and the rich variety of similar problems it has empowered us to solve."],"imgPreview":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2w3GhocOnizExfpXjhITHW/5cdbf7106353ecd9d78b3d5f5047aa1d/BLOG-2498_OG.png"]}]}],[0,{"id":[0,"sC3G9YR2M9IIoRMg8slDl"],"title":[0,"Introducing Speed Brain: helping web pages load 45% faster"],"slug":[0,"introducing-speed-brain"],"excerpt":[0,"We are excited to announce the latest leap forward in speed – Speed Brain. Speed Brain uses the Speculation Rules API to prefetch content for the user's likely next navigations. The goal is to download a web page to the browser before a user navigates to it, allowing pages to load instantly. \n"],"featured":[0,false],"html":[0,"
Each time a user visits your web page, they are initiating a race to receive content as quickly as possible. Performance is a critical factor that influences how visitors interact with your site. Some might think that moving content across the globe introduces significant latency, but network transmission speeds have long been approaching their theoretical limits. To put this into perspective, data on Cloudflare can traverse the 11,000 kilometer round trip between New York and London in about 76 milliseconds – faster than the blink of an eye.
However, delays in loading web pages persist due to the complexities of processing requests, responses, and configurations. In addition to pushing advancements in connection establishment, compression, hardware, and software, we have built a new way to reduce page load latency by anticipating how visitors will interact with a given web page.
Today we are very excited to share the latest leap forward in speed: Speed Brain. It relies on the Speculation Rules API to prefetch the content of the user's likely next navigations. The main goal of Speed Brain is to download a web page to the browser cache before a user navigates to it, allowing pages to load almost instantly when the actual navigation takes place.
Our initial approach uses a conservative model that prefetches static content for the next page when a user starts a touch or click event. Through the fourth quarter of 2024 and into 2025, we will offer more aggressive speculation models, such as speculatively prerendering (not just fetching the page before the navigation happens but rendering it completely) for an even faster experience. Eventually, Speed Brain will learn how to eliminate latency for your static website, without any configuration, and work with browsers to make sure that it loads as fast as possible.
To illustrate, imagine an ecommerce website selling clothing. Using the insights from our global request logs, we can predict with high accuracy that a typical visitor is likely to click on ‘Shirts’ when viewing the parent page ‘Mens > Clothes’. Based on this, we can start delivering static content, like images, before the shopper even clicks the ‘Shirts’ link. As a result, when they inevitably click, the page loads instantly. Recent lab testing of our aggressive loading model implementation has shown up to a 75% reduction in Largest Contentful Paint (LCP), the time it takes for the largest visible element (like an image, video, or text block) to load and render in the browser.
The best part? We are making Speed Brain available to all plan types immediately and at no cost. Simply toggle on the Speed Brain feature for your website from the dashboard or the API. It’ll feel like magic, but behind the scenes it's a lot of clever engineering.
We have already enabled Speed Brain by default on all free domains and are seeing a reduction in LCP of 45% on successful prefetches. Pro, Business, and Enterprise domains need to enable Speed Brain manually. If you have not done so already, we strongly recommend also enabling Real User Measurements (RUM) via your dashboard so you can see your new and improved web page performance. As a bonus, enabling RUM for your domain will help us provide improved and customized prefetching and prerendering rules for your website in the near future!
Before discussing how Speed Brain can help load content exceptionally fast, we need to take a step back to review the complexity of loading content on browsers. Every time a user navigates to your web page, a series of request and response cycles must be completed.
After the browser establishes a secure connection with a server, it sends an HTTP request to retrieve the base document of the web page. The server processes the request, constructs the necessary HTML document and sends it back to the browser in the response.
When the browser receives an HTML document, it immediately begins parsing the content. During this process, it may encounter references to external resources such as CSS files, JavaScript, images, and fonts. These subresources are essential for rendering the page correctly, so the browser issues additional HTTP requests to fetch them. However, if these resources are available in the browser’s cache, the browser can retrieve them locally, significantly reducing network latency and improving page load times.
As the browser processes HTML, CSS, and JavaScript, the rendering engine begins to display content on the screen. Once the page’s visual elements are displayed, user interactions — like clicking a link — prompt the browser to restart much of this process to fetch new content for the next page. This workflow is typical of every browsing session: as users navigate, the browser continually fetches and renders new or uncached resources, introducing a delay before the new page fully loads.
Take the example of a user navigating the shopping site described above. As the shopper moves from the homepage to the ‘men's’ section of the site to the ‘clothing’ section to the ‘shirts’ section, the time spent on retrieving each of those subsequent pages can add up and contribute to the shopper leaving the site before they complete the transaction.
Ideally, having prefetched and prerendered pages present in the browser at the time each of those links are clicked would eliminate much of the network latency impact, allowing the browser to load content instantly and providing a smoother user experience.
Wait, I’ve heard this story before (how did we get to Speed Brain?)
We know what you’re thinking. We’ve had prefetching for years. There have even been several speculative prefetching efforts in the past. You’ve heard this all before. How is this different now?
You’re right, of course. Over the years, there has been a constant effort by developers and browser vendors to optimize page load times and enhance user experience across the web. Numerous techniques have been developed, spanning various layers of the Internet stack — from optimizing network layer connectivity to preloading application content closer to the client.
Early prefetching: lack of data and flexibility
Web prefetching has been one such technique that has existed for more than a decade. It is based on the assumption that certain subresources are likely to be needed in the near future, so why not fetch them proactively? This could include anything from HTML pages to images, stylesheets, or scripts that the user might need as they navigate through a website. In fact, the core concept of speculative execution is not new, as it's a general technique that's been employed in various areas of computer science for years, with branch prediction in CPUs as a prime example.
In the early days of the web, several custom prefetching solutions emerged to enhance performance. For example, in 2005, Google introduced the Google Web Accelerator, a client-side application aimed at speeding up browsing for broadband users. Though innovative, the project was short-lived due to privacy and compatibility issues (we will describe how Speed Brain is different below). Predictive prefetching at that time lacked the data insights and API support needed to capture user behavior safely, especially on pages handling sensitive actions like deletions or purchases.
Static lists and manual effort
Traditionally, prefetching has been accomplished through the use of the <link rel="prefetch"> attribute as one of the Resource Hints. Developers had to manually specify the attribute on each page for each resource they wanted the browser to preemptively fetch and cache in memory. This manual effort has not only been laborious but developers often lacked insight into what resources should be prefetched, which reduced the quality of their specified hints.
In a similar vein, Cloudflare has offered a URL prefetching feature since 2015. Instead of prefetching in browser cache, Cloudflare allows customers to prefetch a static list of resources into the CDN cache. The feature allows prefetching resources in advance of when they are actually needed, usually during idle time or when network conditions are favorable. However, similar concerns apply for CDN prefetching, since customers have to manually decide on what resources are good candidates for prefetching for each page they own. If misconfigured, static link prefetching can be a footgun, causing the web page load time to actually slow down.
Server Push and its struggles
HTTP/2’s "server push" was another attempt to improve web performance by pushing resources to the client before they were requested. In theory, this would reduce latency by eliminating the need for additional round trips for future assets. However, the server-centric dictatorial nature of "pushing" resources to the client raised significant challenges, primarily due to lack of context about what was already cached in the browser. This not only wasted bandwidth but had the potential to slow down the delivery of critical resources, like base HTML and CSS, due to race conditions on browser fetches when rendering the page. The proposed solution of cache digests, which would have informed servers about client cache contents, never gained widespread implementation, leaving servers to push resources blindly. In October 2022, Google Chrome removed Server Push support, and in September 2024, Firefox followed suit.
A step forward with Early Hints
As a successor, Early Hints was specified in 2017 but not widely adopted until 2022, when we partnered with browsers and key customers to deploy it. It offers a more efficient alternative by "hinting" to clients which resources to load, allowing better prioritization based on what the browser needs. Specifically, the server sends a 103 Early Hints HTTP status code with a list of key page assets that the browser should start loading while the main response is still being prepared. This gives the browser a head start in fetching essential resources and avoids redundant preloading if assets are already cached. Although Early Hints doesn't adapt to user behaviors or dynamic page conditions (yet), its use is primarily limited to preloading specific assets rather than full web pages — in particular, cases where there is a long server “think time” to produce HTML.
As the web evolves, tools that can handle complex, dynamic user interactions will become increasingly important to balance the performance gains of speculative execution with its potential drawbacks for end-users. For years Cloudflare has offered performance-based solutions that adapt to user behavior and work to balance the speed and correctness decisions across the Internet like Argo Smart Routing, Smart Tiered Cache, and Smart Placement. Today we take another step forward toward an adaptable framework for serving content lightning-fast.
Speed Brain offers a robust approach for implementing predictive prefetching strategies directly within the browser based on the ruleset returned by our servers. By building on lessons from previous attempts, it shifts the responsibility for resource prediction to the client, enabling more dynamic and personalized optimizations based on user interaction – like hovering over a link, for example – and their device capabilities. Instead of the browser sitting idly waiting for the next web page to be requested by the user, it takes cues from how a user is interacting with a page and begins asking for the next web page before the user finishes clicking on a link.
Behind the scenes, all of this magic is made possible by the Speculation Rules API, which is an emerging standard in the web performance space from Google. When Cloudflare’s Speed Brain feature is enabled, an HTTP header called Speculation-Rules is added to web page responses. The value for this header is a URL that hosts an opinionated Rules configuration. This configuration instructs the browser to initiate prefetch requests for future navigations. Speed Brain does not improve page load time for the first page that is visited on a website, but it can improve it for subsequent web pages that are visited on the same site.
The idea seems simple enough, but prefetching comes with challenges, as some prefetched content may never end up being used. With the initial release of Speed Brain, we have designed a solution with guardrails that addresses two important but distinct issues that limited previous speculation efforts — stale prefetch configuration and incorrect prefetching. The Speculation Rules API configuration we have chosen for this initial release has been carefully designed to balance safety of prefetching while still maintaining broad applicability of rules for the entire site.
Stale prefetch configuration
As websites inevitably change over time, static prefetch configurations often become outdated, leading to inefficient or ineffective prefetching. This has been especially true for techniques like the rel=prefetch attribute or static CDN prefetching URL sets, which have required developers to manually maintain relevant prefetchable URL lists for each page of their website. Most static prefetch lists are based on developer intuition rather than real user navigation data, potentially missing important prefetch opportunities or wasting resources on unnecessary prefetches.
Incorrect prefetching
Since prefetch requests are just like normal requests except with a `sec-purpose` HTTP request header, they incur the same overhead on the client, network, and server. However, the crucial difference is that prefetch requests anticipate user behavior and the response might not end up being used, so all that overhead might be wasted. This makes prefetch accuracy extremely important — that is, maximizing the percentage of prefetched pages that end up being viewed by the user. Incorrect prefetching can lead to inefficiencies and unneeded costs, such as caching resources that aren't requested, or wasting bandwidth and network resources, which is especially critical on metered mobile networks or in low-bandwidth environments.
With the initial release of Speed Brain, we have designed a solution with important side effect prevention guardrails that completely removes the chance of stale prefetch configuration, and minimizes the risk of incorrect prefetching. This opinionated configuration is achieved by leveraging the document rules and eagerness settings from the Speculation Rules API. Our chosen configuration looks like the following:
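Cloudflare does not publish the exact file, but based on the document rules and eagerness settings described below, the served ruleset is essentially of this shape:

{
  "prefetch": [
    {
      "source": "document",
      "where": {
        "href_matches": "/*",
        "relative_to": "document"
      },
      "eagerness": "conservative"
    }
  ]
}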
Document Rules, indicated by "source": "document" and the "where" key in the configuration, allows prefetching to be applied dynamically over the entire web page. This eliminates the need for a predefined static URL list for prefetching. Hence, we remove the problem of stale prefetch configuration as prefetch candidate links are determined based on the active page structure.
Our use of "relative_to": "document" in the where clause instructs the browser to limit prefetching to same-site links. This has the added bonus of allowing our implementation to avoid cross-origin prefetches to avoid any privacy implications for users, as it doesn’t follow them around the web.
Eagerness
Eagerness controls how aggressively the browser prefetches content. There are four possible settings:
immediate: Used as soon as possible on page load — generally as soon as the rule value is seen by the browser, it starts prefetching the next page.
eager: Identical to the immediate setting above, but the prefetch trigger additionally relies on slight user interaction events, such as moving the cursor towards the link (coming soon).
moderate: Prefetches if you hold the pointer over a link for more than 200 milliseconds (or on the pointerdown event if that is sooner, and on mobile where there is no hover event).
conservative: Prefetches on pointer or touch down on the link.
Our initial release of Speed Brain makes use of the conservative eagerness value to minimize the risk of incorrect prefetching (which can lead to unintended resource waste) while still making your websites noticeably faster. While we lose out on the potential performance improvements that the more aggressive eagerness settings offer, we chose this cautious approach to prioritize safety for our users. Looking ahead, we plan to explore more dynamic eagerness settings for sites that could benefit from a more liberal setting, and we'll also expand our rules to include prerendering.
Another important safeguard we implement is to only accept prefetch requests for static content that is already stored in our CDN cache. If the content isn't in the cache, we reject the prefetch request. Retrieving content directly from our CDN cache for prefetching requests lets us bypass concerns about their cache eligibility. The rationale for this is straightforward: if a page is not eligible for caching, we don't want it to be prefetched in the browser cache, as it could lead to unintended consequences and increased origin load. For instance, prefetching a logout page might log the user out prematurely before the user actually finishes their action. Stateful prefetching or prerendering requests can have unpredictable effects, potentially altering the server's state for actions the client has not confirmed. By only allowing prefetching for pages already in our CDN cache, we have confidence those pages will not negatively impact the user experience.
These guardrails were implemented to work in performance-sensitive environments. We measured the impact of our baseline conservative deployment model on all pages across Cloudflare’s developer documentation in early July 2024. We found that we were able to prefetch the correct content, content that would in fact be navigated to by the users, 94% of the time. We did this while improving the performance of the navigation by reducing LCP at the p75 quantile by 40%, without inducing any unintended side effects. The results were amazing!
Our global network spans over 330 cities and operates within 50 milliseconds of 95% of the Internet-connected population. This extensive reach allows us to significantly improve the performance of cacheable assets for our customers. By leveraging this network for smart prefetching with Speed Brain, Cloudflare can serve prefetched content directly from the CDN cache, reducing network latency to practically instant.
Our unique position on the network provides us the leverage to automatically enable Speed Brain without requiring any changes from our customers to their origin server configurations. It's as simple as flipping a switch! Our first version of Speed Brain is now live.
Upon receiving a request for a web page with Speed Brain enabled, the Cloudflare server returns an additional "Speculation-Rules" HTTP response header. The value for this header is a URL that hosts an opinionated Rules configuration (as mentioned above).
When the browser begins parsing the response header, it fetches our Speculation-Rules configuration, and loads it as part of the web page.
The configuration guides the browser on when to prefetch the next likely page from Cloudflare that the visitor may navigate to, based on how the visitor is engaging with the page.
When a user action (such as mouse down event on the next page link) triggers the Rules application, the browser sends a prefetch request for that page with the "sec-purpose: prefetch" HTTP request header.
Our server parses the request header to identify the prefetch request. If the requested content is present in our cache, we return it; otherwise, we return a 503 HTTP status code and deny the prefetch request. This removes the risk of unsafe side effects from sending requests to origins or Cloudflare Workers that are unaware of prefetching. Only content already present in the cache is returned.
On a successful response, the browser stores the prefetched content in memory, and when the visitor navigates to that page, the browser loads the web page directly from the browser cache for immediate rendering.
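To make the cache-only guardrail concrete, here is a simplified Go sketch of that server-side decision: identify prefetches by the sec-purpose header, serve them only on a cache HIT, and otherwise return 503. This is an illustration of the logic described above, not Cloudflare's actual code, and the cache lookup is a hypothetical stub.

package main

import (
	"net/http"
	"strings"
)

// cacheLookup stands in for the CDN cache; it returns the cached body and
// whether the asset was present. The implementation here is a hypothetical stub.
func cacheLookup(path string) ([]byte, bool) { return nil, false }

func handler(w http.ResponseWriter, r *http.Request) {
	isPrefetch := strings.Contains(r.Header.Get("Sec-Purpose"), "prefetch")

	body, cached := cacheLookup(r.URL.Path)
	if cached {
		w.Write(body) // cache HIT: safe to serve, prefetch or not
		return
	}

	if isPrefetch {
		// Cache MISS on a speculative request: refuse rather than hit the
		// origin, so prefetching can never cause side effects upstream.
		w.WriteHeader(http.StatusServiceUnavailable) // 503
		return
	}

	// A normal (non-prefetch) miss would be forwarded to the origin here.
	http.Error(w, "origin fetch not implemented in this sketch", http.StatusBadGateway)
}

func main() {
	http.HandleFunc("/", handler)
	http.ListenAndServe(":8080", nil)
}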
Support for Speed Brain relies on the Speculation Rules API, an emerging web standard. As of September 2024, support for this emerging standard is limited to Chromium-based browsers (version 121 or later), such as Google Chrome and Microsoft Edge. As the web community reaches consensus on API standardization, we hope to see wider adoption across other browser vendors.
Prefetching by nature does not apply to dynamic content, as the state of such content can change, potentially leading to stale or outdated data being delivered to the end user as well as increased origin load. Therefore, Speed Brain will only work for non-dynamic pages of your website that are cached on our network. It has no impact on the loading of dynamic pages. To get the most benefit out of Speed Brain, we suggest making use of cache rules to ensure that all static content (especially HTML content) on your site is eligible for caching.
When the browser receives a 503 HTTP status code in response to a speculative prefetch request (marked by the sec-purpose: prefetch header), it cancels the prefetch attempt. Although a 503 error appearing in the browser's console may seem alarming, it is completely harmless for prefetch request cancellation. In our early tests, the 503 response code has caused some site owners concern. We are working with our partners to iterate on this and improve the client experience, but for now we follow the specification guidance, which suggests a 503 response for the browser to safely discard the speculative request. We're in active discussions with Chrome, based on feedback from early beta testers, and believe a new, dedicated non-error response code would be more appropriate and cause less confusion. In the meantime, 503 response logs for prefetch requests related to Speed Brain are harmless. If your tooling makes ignoring these requests difficult, you can temporarily disable Speed Brain until we work out something better with the Chrome team.
Additionally, when a website uses both its own custom Speculation Rules and Cloudflare's Speed Brain feature, both rule sets can operate simultaneously. Cloudflare’s guardrails will limit speculation rules to cacheable pages, which may be an unexpected limitation for those with existing implementations. If you observe such behavior, consider disabling one of the implementations for your site to ensure consistency in behavior. Note that if your origin server responses include the Speculation-Rules header, it will not be overridden. Therefore, the potential for ruleset conflicts primarily applies to predefined in-line speculation rules.
In general, we suggest that you use Speed Brain and most other Cloudflare performance features with our RUM performance measurement tool enabled. Our RUM feature helps developers and website operators understand how their end users are experiencing the performance of their application, providing visibility into:
Loading: How long did it take for content to become available?
Interactivity: How responsive is the website when users interact with it?
Visual stability: How much does the page move around while loading?
With RUM enabled, you can navigate to the Web Analytics section in the dashboard to see important information about how Speed Brain is helping reduce latency in your core web vitals metrics like Largest Contentful Paint (LCP) and load time.
Example RUM dashboard for a website with a high amount of prefetchable content that enabled Speed Brain around September 16.
Cloudflare currently has tens of millions of domains using Speed Brain. We have measured the LCP at the 75th percentile (p75) for these sites and found an improvement between 40% and 50% (averaging around 45%).
We found this improvement by comparing navigational prefetches to normal (non-prefetched) page loads for the same set of domains.
Before Speed Brain is enabled, free websites on Cloudflare experience a p75 LCP of around 2.2 seconds. With Speed Brain enabled, these sites see significant latency savings on LCP. In aggregate, Speed Brain saves about 0.88 seconds on the low end and up to 1.1 seconds on each successful prefetch!
Currently, the Speculation Rules API is only available in Chromium browsers. From Cloudflare Radar, we can see that approximately 70% of requests from visitors are from Chromium (Chrome, Edge, etc) browsers.
Cloudflare sees hundreds of billions of requests for HTML content each day. Of these requests, about half are cached (make sure your HTML is cacheable!). Around 1% of those requests are for navigational prefetching made by the visitors. This represents significant savings every day for visitors to websites with Speed Brain enabled. Every 24 hours, Speed Brain can save more than 82 years worth of latency!
Our unique position on the Internet provides us valuable insights into web browsing patterns, which we can leverage for improving web performance while maintaining individual user privacy. By employing a generalized data-driven machine learning approach, we can define more accurate and site-specific prefetch predictors for users’ pages.
We are in the process of developing an adaptive speculative model that significantly improves upon our current conservative offering. This model uses a privacy-preserving method to generate a user traversal graph for each site based on same-site Referrer headers. For any two pages connected by a navigational hop, our model predicts the likelihood of a typical user moving between them, using insights extracted from our aggregated traffic data.
This model enables us to tailor rule sets with custom eagerness values to each relevant next-page link on your site. For pages where the model predicts high confidence in user navigation, the system will aggressively prefetch or prerender them. If the model does not provide a rule for a page, it defaults to our existing conservative approach, maintaining the benefits of the baseline Speed Brain model. These signals guide browsers in prefetching and prerendering the appropriate pages, which helps speed up navigation for users while maintaining our current safety guardrails.
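Conceptually, the per-site model reduces to estimating transition probabilities between pages from aggregated same-site navigations. The toy Go sketch below shows that computation on made-up data; the real model is trained on aggregated, privacy-preserving traffic data at a very different scale.

package main

import "fmt"

// hop is one same-site navigation observed from a Referer header:
// the user moved from `from` to `to`.
type hop struct{ from, to string }

func main() {
	// Toy sample of aggregated navigations for one site.
	hops := []hop{
		{"/mens/clothes", "/mens/clothes/shirts"},
		{"/mens/clothes", "/mens/clothes/shirts"},
		{"/mens/clothes", "/mens/clothes/trousers"},
	}

	// Count transitions and totals per source page.
	counts := map[string]map[string]int{}
	totals := map[string]int{}
	for _, h := range hops {
		if counts[h.from] == nil {
			counts[h.from] = map[string]int{}
		}
		counts[h.from][h.to]++
		totals[h.from]++
	}

	// P(next = to | current = from); high-probability links would get a more
	// eager speculation rule, the rest fall back to the conservative default.
	for from, tos := range counts {
		for to, n := range tos {
			p := float64(n) / float64(totals[from])
			fmt.Printf("P(%s -> %s) = %.2f\n", from, to, p)
		}
	}
}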
In lab tests, our ML model improved LCP latency by 75% and predicted visitor navigation with ~98% accuracy, ensuring the correct pages were being prefetched to prevent resource waste for users. As we move toward scaling this solution, we are focused on periodic training of the model to adapt to varying user behaviors and evolving websites. Using an online machine learning approach will drastically reduce the need for manual updates and help the model keep pace with content drift, while maintaining high accuracy: a Speed Brain solution that gets smarter over time!
As we’ve mentioned, we believe that our RUM tools offer the best insights for how Speed Brain is helping the performance of your website. In the future, we plan on offering the ability to filter RUM tooling by navigation type so that you can compare the browser rendering of prefetched content versus non-prefetched content.
We are currently offering the ability for prefetching on cacheable content. Prefetching downloads the main document resource of the page before the user’s navigation, but it does not instruct the browser to prerender the page or download any additional subresources.
In the future, Cloudflare’s Speed Brain offering will prefetch content into our CDN cache and then work with browsers to know what are the best prospects for prerendering. This will help get static content even closer to instant rendering.
Argo Smart Browsing: Speed Brain & Smart Routing
Speed Brain, in its initial implementation, provides an incredible performance boost while remaining conservative, both from an eagerness and a resource consumption perspective.
As outlined earlier in the post, lab testing of a more aggressive model, powered by machine learning and a higher eagerness setting, yielded a 75% reduction in LCP. We are investigating bundling this more aggressive implementation of Speed Brain with Argo Smart Routing into a product called “Argo Smart Browsing”.
Cloudflare customers will be free to continue using Speed Brain, but those who want even more performance improvement will be able to enable Argo Smart Browsing with a single button click. With Argo Smart Browsing, not only will cacheable static content load up to 75% faster in the browser thanks to the more aggressive models, but when content can’t be cached and the request must go forward to an origin server, it will be sent over the most performant network path, resulting in an average 33% performance increase. Performance optimizations are applied to almost every segment of the request lifecycle, regardless of whether the content is static or dynamic, cached or not.
We strongly recommend that customers also enable RUM, found in the same section of the dashboard, to give visibility into the performance improvements provided by Speed Brain and other Cloudflare features and products.
We’re excited to continue to build products and features that make web performance reliably fast. If you’re an engineer interested in improving the performance of the web for all, come join us!
"],"published_at":[0,"2024-09-25T14:00+01:00"],"updated_at":[0,"2024-10-09T23:05:02.341Z"],"feature_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/6pwJnviaM9hXlQcCbb4uXe/deedbd33b4e0df917e700a29b0f4724b/BLOG-2422_1.png"],"tags":[1,[[0,{"id":[0,"1Cv5JjXzKWKEA10JdYbXu1"],"name":[0,"Birthday Week"],"slug":[0,"birthday-week"]}],[0,{"id":[0,"48r7QV00gLMWOIcM1CSDRy"],"name":[0,"Speed & Reliability"],"slug":[0,"speed-and-reliability"]}],[0,{"id":[0,"1x7tpPmKIUCt19EDgM1Tsl"],"name":[0,"Research"],"slug":[0,"research"]}],[0,{"id":[0,"5RrjSR5vIOJAfRdT8966hf"],"name":[0,"Cache"],"slug":[0,"cache"]}],[0,{"id":[0,"1q6r2yWvvqOfafxEeKdl8J"],"name":[0,"Speed Brain"],"slug":[0,"speed-brain"]}],[0,{"id":[0,"6QktrXeEFcl4e2dZUTZVGl"],"name":[0,"Product News"],"slug":[0,"product-news"]}]]],"relatedTags":[0],"authors":[1,[[0,{"name":[0,"Alex Krivit"],"slug":[0,"alex"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1e9gtWsffhoAWG0rTN45cR/35204e3d77d1a90feeb57a32e388daca/alex.jpg"],"location":[0,null],"website":[0,null],"twitter":[0,"@ackriv"],"facebook":[0,null]}],[0,{"name":[0,"Suleman Ahmad"],"slug":[0,"suleman"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1WnbvI0pMk4oVLXHdvpIqH/51084688ace222ec1c1aafbb3bf0fc8b/suleman.png"],"location":[0,"Austin, TX"],"website":[0,null],"twitter":[0,"@sulemanahmadd"],"facebook":[0,null]}],[0,{"name":[0,"William Woodhead"],"slug":[0,"william"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5dOjEvIFAlo8aibHNxah1w/7bb1c36947686c4b3eda4839ba2e85c3/william.jpg"],"location":[0,null],"website":[0,null],"twitter":[0,null],"facebook":[0,null]}]]],"meta_description":[0,"We are excited to announce the latest leap forward in speed – Speed Brain. Speed Brain uses the Speculation Rules API to prefetch content for the user's likely next navigations. The goal is to download a web page to the browser before a user navigates to it, allowing pages to load instantly. \n"],"primary_author":[0,{}],"localeList":[0,{"name":[0,"Loc: Introducing Speed Brain: helping web pages load 45% faster "],"enUS":[0,"English for Locale"],"zhCN":[0,"Translated for Locale"],"zhHansCN":[0,"No Page for Locale"],"zhTW":[0,"Translated for Locale"],"frFR":[0,"Translated for Locale"],"deDE":[0,"Translated for Locale"],"itIT":[0,"No Page for Locale"],"jaJP":[0,"Translated for Locale"],"koKR":[0,"Translated for Locale"],"ptBR":[0,"No Page for Locale"],"esLA":[0,"No Page for Locale"],"esES":[0,"Translated for Locale"],"enAU":[0,"No Page for Locale"],"enCA":[0,"No Page for Locale"],"enIN":[0,"No Page for Locale"],"enGB":[0,"No Page for Locale"],"idID":[0,"No Page for Locale"],"ruRU":[0,"No Page for Locale"],"svSE":[0,"No Page for Locale"],"viVN":[0,"No Page for Locale"],"plPL":[0,"No Page for Locale"],"arAR":[0,"No Page for Locale"],"nlNL":[0,"No Page for Locale"],"thTH":[0,"No Page for Locale"],"trTR":[0,"No Page for Locale"],"heIL":[0,"No Page for Locale"],"lvLV":[0,"No Page for Locale"],"etEE":[0,"No Page for Locale"],"ltLT":[0,"No Page for Locale"]}],"url":[0,"https://blog.cloudflare.com/introducing-speed-brain"],"metadata":[0,{"title":[0,"Introducing Speed Brain: helping web pages load 45% faster "],"description":[0,"We are excited to announce the latest leap forward in speed – Speed Brain. Speed Brain uses the Speculation Rules API to prefetch content for the user's likely next navigations. 
The goal is to download a web page to the browser before a user navigates to it, allowing pages to load instantly. "],"imgPreview":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2VHvuIVSAEUvuWpfCFPNDu/3c99d9b6fac2add7f36ccbf34e784562/BLOG-2422_OG.png"]}]}],[0,{"id":[0,"11EWaw0wCUNwPTM30w7oUN"],"title":[0,"Instant Purge: invalidating cached content in under 150ms"],"slug":[0,"instant-purge"],"excerpt":[0,"Today we’re excited to share that we’ve built the fastest cache purge in the industry. We now offer a global purge latency for purge by tags, hostnames, and prefixes of less than 150ms on average (P50), representing a 90% improvement since May 2022. "],"featured":[0,false],"html":[0,"
Over the past 14 years, Cloudflare has evolved far beyond a Content Delivery Network (CDN), expanding its offerings to include a comprehensive Zero Trust security portfolio, network security & performance services, application security & performance optimizations, and a powerful developer platform. But customers also continue to rely on Cloudflare for caching and delivering static website content. CDNs are often judged on their ability to return content to visitors as quickly as possible. However, the speed at which content is removed from a CDN's global cache is just as crucial.
When customers frequently update content such as news, scores, or other data, it is essential they avoid serving stale, out-of-date information from cache to visitors. This can lead to a subpar experience where users might see invalid prices or incorrect news. The goal is to remove the stale content and cache the new version of the file on the CDN as quickly as possible. And that starts by issuing a “purge.”
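For illustration, here is a minimal sketch of what issuing such a purge could look like against Cloudflare's purge_cache API endpoint, written in Rust with the reqwest (blocking, with the json feature) and serde_json crates. The zone ID, token handling, and tag names are placeholders, and the crate choice is ours for illustration, not something prescribed by this post.

```rust
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Placeholder zone ID; a real request would use your own zone's identifier.
    let zone_id = "023e105f4ecef8ad9ca31a8372d0c353";
    let api_token = std::env::var("CLOUDFLARE_API_TOKEN")?;

    // Purge everything tagged with these (illustrative) cache-tags.
    let body = json!({ "tags": ["product-listing", "homepage-banner"] });

    let resp = reqwest::blocking::Client::new()
        .post(format!(
            "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache"
        ))
        .bearer_auth(api_token)
        .json(&body)
        .send()?;

    println!("purge request status: {}", resp.status());
    Ok(())
}
```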
In May 2022, we released the first part of the series detailing our efforts to rebuild and publicly document the steps taken to improve the system our customers use to purge their cached content. Our goal was to increase scalability and, importantly, the speed of our customers’ purges. In that initial post, we explained how our purge system worked and the design constraints we found when scaling. We outlined how, after more than a decade, we had outgrown our purge system and started building an entirely new one, and provided benchmarks of the purge performance users experienced at the time. We set ourselves a lofty goal: to be the fastest.
Today, we’re excited to share that we’ve built the fastest cache purge in the industry. We now offer a global purge latency for purge by tags, hostnames, and prefixes of less than 150ms on average (P50), representing a 90% improvement since May 2022. Users can now purge from anywhere, (almost) instantly. By the time you hit enter on a purge request and your eyes blink, the file has been removed from our global network — including data centers in 330 cities and 120+ countries.
But that’s not all. It wouldn’t be Birthday Week if we stopped at just being the fastest purge. We are also announcing that we’re opening up more purge options to Free, Pro, and Business plans. Historically, only Enterprise customers had access to the full arsenal of cache purge methods supported by Cloudflare, such as purge by cache-tags, hostnames, and URL prefixes. In rebuilding our purge infrastructure, we have not only become faster, we can also scale well beyond our previous capacity, which lets more customers use different types of purge. We are excited to offer these capabilities to all plan types once we finish rolling out our new purge infrastructure, and expect to begin doing so in early 2025.
Caching content is like pulling off a spectacular magic trick. It makes loading website content lightning-fast for visitors, slashes the load on origin servers and the cost to operate them, and enables global scalability with a single button press. But here's the catch: for the magic to work, caching requires predicting the future. The right content needs to be cached in the right data center, at the right moment when requests arrive, and in the ideal format. This guarantees astonishing performance for visitors and game-changing scalability for web properties.
Cloudflare helps make this caching magic trick easy. But regular users of our cache know that getting content into cache is only part of what makes it useful. When content is updated on an origin, it must also be updated in the cache. The beauty of caching is that it holds content until it expires or is evicted. To update the content, it must be actively removed and updated across the globe quickly and completely. If data centers are not uniformly updated or are updated at drastically different times, visitors risk getting different data depending on where they are located. This is where cache “purging” (also known as “cache invalidation”) comes in.
Back in part 2 of the blog series, we touched on how there are multiple ways of purging cache: by URL, cache-tag, hostname, URL prefix, and “purge everything”, and discussed a necessary distinction between purging by URL and the other four kinds of purge — referred to as flexible purges — based on the scope of their impact.
The reason flexible purge isn’t also fully coreless yet is that it’s a more complex task than “purge this object”; flexible purge requests can end up purging multiple objects – or even entire zones – from cache. They do this through an entirely different process that isn’t coreless compatible, so to make flexible purge fully coreless we would have needed to come up with an entirely new multi-purge mechanism on top of redesigning distribution. We chose instead to start with just purge by URL, so we could focus purely on the most impactful improvement, revamping distribution, without reworking the logic a data center uses to actually remove an object from cache.
We said our next steps included a redesign of flexible purges at Cloudflare, and today we’d like to walk you through the resulting system. But first, a brief history of flexible cache purges at Cloudflare and elaboration on why the old flexible purge system wasn’t “coreless compatible”.
“Cache” within a given data center is made up of many machines, all contributing disk space to store customer content. When a request comes in for an asset, the URL and headers are used to calculate a cache key, which is the filename for that content on disk and also determines which machine in the datacenter that file lives on. The filename is the same for every data center, and every data center knows how to use it to find the right machine to cache the content. A purge request for a URL (plus headers) therefore contains everything needed to generate the cache key — the pointer to the response object on disk — and getting that key to every data center is the hardest part of carrying out the purge.
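As a rough sketch of that idea (not Cloudflare's actual key format or placement scheme), deriving a deterministic key from the URL plus relevant headers, and mapping it to a machine, might look like this; the hash function and modulo placement below are stand-ins for illustration only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Derive a deterministic cache key from the request URL plus the headers that
// affect the cached response. The same inputs always produce the same key.
fn cache_key(url: &str, vary_headers: &[(&str, &str)]) -> u64 {
    let mut h = DefaultHasher::new();
    url.hash(&mut h);
    for (name, value) in vary_headers {
        name.hash(&mut h);
        value.hash(&mut h);
    }
    h.finish()
}

// Every data center can compute this locally: the same key always maps to the
// same machine, so a purge-by-URL request carries everything needed to find
// and remove the object.
fn owning_machine(key: u64, machines_in_dc: u64) -> u64 {
    key % machines_in_dc
}

fn main() {
    let key = cache_key("https://example.com/foo/bar.jpg", &[("accept-encoding", "gzip")]);
    println!("cache key {:x} lives on machine {}", key, owning_machine(key, 64));
}
```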
Purging content based on response properties has a different hardest part. If a customer wants to purge all content with the cache-tag “foo”, for example, there’s no way for us to generate all the cache keys that will point to the files with that cache-tag at request time. Cache-tags are response headers, and the decision of where to store a file is based on request attributes only. To find all files with matching cache-tags, we would need to look at every file in every cache disk on every machine in every data center. That’s thousands upon thousands of machines we would be scanning for each purge-by-tag request. There are ways to avoid actually continuously scanning all disks worldwide (foreshadowing!) but for our first implementation of our flexible purge system, we hoped to avoid the problem space altogether.
An alternative approach to going to every machine and looking for all files that match some criteria to actively delete from disk was something we affectionately referred to as “lazy purge”. Instead of deleting all matching files as soon as we process a purge request, we wait to do so when we get an end user request for one of those files. Whenever a request comes in, and we have the file in cache, we can compare the timestamp of any recent purge requests from the file owner to the insertion timestamp of the file we have on disk. If the purge timestamp is fresher than the insertion timestamp, we pretend we didn’t find the file on disk. For this to work, we needed to keep track of purge requests going back further than a data center’s maximum cache eviction age to be sure that any file a customer sends a matching flex purge to clear from cache will either be naturally evicted, or forced to cache MISS and get refreshed from the origin. With this approach, we just needed a distribution and storage system for keeping track of flexible purges.
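A minimal sketch of that comparison, with timestamps reduced to plain epoch milliseconds and the purge-history lookup left abstract; the type and function names are ours, not the production system's.

```rust
struct CachedFile {
    inserted_at_ms: u64, // when this copy was written to disk
}

/// Returns true if the cached copy may be served, false if we must pretend it
/// was a MISS because a matching purge arrived after the file was written.
fn can_serve(file: &CachedFile, latest_matching_purge_ms: Option<u64>) -> bool {
    match latest_matching_purge_ms {
        Some(purge_ts) => file.inserted_at_ms > purge_ts,
        None => true, // no matching purge on record: serve the cached copy
    }
}
```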
At Cloudflare there is a lot of configuration data that needs to go “everywhere”: cache configuration, load balancer settings, firewall rules, host metadata — countless products, features, and services that depend on configuration data that’s managed through Cloudflare’s control plane APIs. This data needs to be accessible by every machine in every datacenter in our network. The vast majority of that data is distributed via a system introduced several years ago called Quicksilver. The system works very, very well (sub-second p99 replication lag, globally). It’s extremely flexible and reliable, and reads are lightning fast. The team responsible for the system has done such a good job that Quicksilver has become a hammer that, when wielded, makes everything look like a nail… like flexible purges.
Core-based purge request entering a data center and getting backhauled to a core data center where Quicksilver distributes the request to all network data centers (hub and spoke).
Our first version of the flexible purge system used Quicksilver’s spoke-hub distribution to send purges from a core data center to every other data center in our network. It took less than a second for flexible purges to propagate, and once in a given data center, the purge key lookups in the hot path to force cache misses were in the low hundreds of microseconds. We were quite happy with this system at the time, especially because of the simplicity. Using well-supported internal infrastructure meant we weren’t having to manage database clusters or worry about transport between data centers ourselves, since we got that “for free”. Flexible purge was a new feature set and the performance seemed pretty good, especially since we had no predecessor to compare against.
Our first version of flexible purge didn’t start showing cracks for years, but eventually both our network and our customer base grew large enough that our system was reaching the limits of what it could scale to. As mentioned above, we needed to store purge requests beyond our maximum eviction age. Purge requests are relatively small, and compress well, but thousands of customers using the API millions of times a day adds up to quite a bit of storage that Quicksilver needed on each machine to maintain purge history, and all of that storage cut into disk space we could otherwise be using to cache customer content. We also found the limits of Quicksilver in terms of how many writes per second it could handle without replication slowing down. We bought ourselves more runway by putting Kafka queues in front of Quicksilver to buffer and throttle writes, evening out traffic spikes, and by increasing batching, but all of those protections introduced latency. We knew we needed to come up with a solution without such a strong correlation between usage and operational costs.
Another pain point exposed by our growing user base that we mentioned in Part 2 was the excessive round trip times experienced by customers furthest away from our core data centers. A purge request sent by a customer in Australia would have to cross the Pacific Ocean and back before local customers would see the new content.
To summarize, three issues were plaguing us:
Latency corresponding to how far a customer was from the centralized ingest point.
Latency due to the bottleneck for writes at the centralized ingest point.
Storage needs in all data centers correlating strongly with throughput demand.
The first two issues affected all types of purge. The spoke-hub distribution model was problematic for purge-by-URL just as much as it was for flexible purges. So we embarked on the path to peer-to-peer distribution for purge-by-URL to address the latency and throughput issues, and the results of that project were good enough that we wanted to propagate flexible purges through the same system. But doing so meant we’d have to replace our use of Quicksilver; it was so good at what it does (fast/reliable replication network-wide, extremely fast/high read throughput) in large part because of the core assumption of spoke-hub distribution it could optimize for. That meant there was no way to write to Quicksilver from “spoke” data centers, and we would need to find another storage system for our purges.
We decided that if we were going to replace our storage system, we should dig into exactly what our needs were and find the best fit. It was time to revisit some of our oldest conclusions to see if they still held true, and one of the earliest was that proactively purging content from disk would be difficult to do efficiently given our storage layout.
But was that true? Or could we make active cache purge fast and efficient (enough)? What would it take to quickly find files on disk based on their metadata? “Indexes!” you’re probably screaming, and for good reason. Indexing files’ hostnames, cache-tags, and URLs would undoubtedly make querying for relevant files trivial, but a few aspects of our network make it less straightforward.
Cloudflare has hundreds of data centers that see trillions of unique files, so any kind of global index — even ignoring the networking hurdles of aggregation — would suffer the same type of bottlenecking issues as our previous spoke-hub system. Scoping the indices to the data center level would be better, but data centers vary in size up to several hundred machines. Managing a database cluster in each data center, scaled to the appropriate size for the aggregate traffic of all its machines, was a daunting proposition; it could easily end up being enough work on its own for a separate team, not something we should take on as a side hustle.
The next step down in scope was an index per machine. Indexing on the same machine as the cache proxy had some compelling upsides:
The proxy could talk to the index over UDS (Unix domain sockets), avoiding networking complexities in the hottest paths (see the sketch after this list).
As a sidecar service, the index just had to be running anytime the machine was accepting traffic. If a machine died, so would the index, but that didn’t matter, so there wasn’t any need to deal with the complexities of distributed databases.
While data centers were frequently adding and removing machines, machines weren’t frequently adding and removing disks. An index could reasonably count on its maximum size being predictable and constant based on overall disk size.
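To make the sidecar idea concrete, here is a minimal sketch of a cache proxy notifying a local index service over a Unix domain socket whenever a file is written to cache. The socket path and the one-line text protocol are hypothetical stand-ins, not CacheDB's actual interface.

```rust
use std::io::Write;
use std::os::unix::net::UnixStream;

// Tell the local (same-machine) index about a newly cached file so it can be
// found later by tag. Hypothetical "INDEX <key> <tags>" protocol.
fn record_cache_write(cache_key: &str, cache_tags: &[&str]) -> std::io::Result<()> {
    let mut stream = UnixStream::connect("/var/run/cachedb.sock")?; // hypothetical path
    writeln!(stream, "INDEX {} {}", cache_key, cache_tags.join(","))?;
    Ok(())
}
```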
But we wanted to make sure it was feasible on our machines. We analyzed representative cache disks from across our fleet, gathering data like the number of cached assets per terabyte and the average number of cache-tags per asset. We looked at cache MISS, REVALIDATED, and EXPIRED rates to estimate the required write throughput.
After conducting a thorough analysis, we were convinced the design would work. With a clearer understanding of the anticipated read/write throughput, we started looking at databases that could meet our needs. After benchmarking several relational and non-relational databases, we ultimately chose RocksDB, a high-performance embedded key-value store. We found that with proper tuning, it could be extremely good at the types of queries we needed.
And so CacheDB was born — a service written in Rust and built on RocksDB, which operates on each machine alongside the cache proxy to manage the indexing and purging of cached files. We integrated the cache proxy with CacheDB to ensure that indices are stored whenever a file is cached or updated, and they’re deleted when a file is removed due to eviction or purging. In addition to indexing data, CacheDB maintains a local queue for buffering incoming purge operations. A background process reads purge operations in the queue, looking up all matching files using the indices, and deleting the matched files from disk. Once all matched files for an operation have been deleted, the process clears the indices and removes the purge operation from the queue.
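As an illustration of the kind of index this enables (not CacheDB's actual schema), a per-machine RocksDB keyspace mapping cache-tags to cache keys might look like the following. The "tag:{tag}:{cache_key}" layout and helper names are ours, and the rocksdb crate's iterator API differs slightly between versions.

```rust
use rocksdb::DB;

// Record one index entry per tag so files can later be found by any of their tags.
fn index_file(db: &DB, cache_key: &str, tags: &[&str], path_on_disk: &str) -> Result<(), rocksdb::Error> {
    for tag in tags {
        db.put(format!("tag:{tag}:{cache_key}"), path_on_disk)?;
    }
    Ok(())
}

/// Find every indexed file carrying `tag`, so a background worker can unlink
/// those files from disk and then clear the corresponding index entries.
fn files_with_tag(db: &DB, tag: &str) -> Result<Vec<String>, rocksdb::Error> {
    let prefix = format!("tag:{tag}:");
    let mut paths = Vec::new();
    for item in db.prefix_iterator(prefix.as_bytes()) {
        let (key, value) = item?;
        if !key.starts_with(prefix.as_bytes()) {
            break; // ran past this tag's key range
        }
        paths.push(String::from_utf8_lossy(&value).into_owned());
    }
    Ok(paths)
}
```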
To further optimize the speed of purges taking effect, the cache proxy was updated to check with CacheDB — similar to the previous lazy purge approach — when a cache HIT occurs before returning the asset. CacheDB does a quick scan of its local queue to see if there are any pending purge operations that match the asset in question, dictating whether the cache proxy should respond with the cached file or fetch a new copy. This means purges will prevent the cache proxy from returning a matching cached file as soon as a purge reaches the machine, even if there are millions of files that correspond to a purge key, and it takes a while to actually delete them all from disk.
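A simplified sketch of that HIT-path check, with the matching rules for tags, hostnames, and prefixes reduced to their essence; the names and match logic are illustrative only.

```rust
// Purge operations that have reached this machine but not yet been executed.
enum PendingPurge {
    Tag(String),
    Hostname(String),
    Prefix(String),
}

struct Asset<'a> {
    url: &'a str,        // e.g. "example.com/foo/bar.jpg"
    hostname: &'a str,   // e.g. "example.com"
    tags: &'a [&'a str], // cache-tags from the cached response
}

/// True if any queued purge matches the asset, meaning the proxy should treat
/// the lookup as a MISS even though the file is still on disk.
fn matches_pending_purge(asset: &Asset, queue: &[PendingPurge]) -> bool {
    queue.iter().any(|op| match op {
        PendingPurge::Tag(t) => asset.tags.contains(&t.as_str()),
        PendingPurge::Hostname(h) => asset.hostname == h.as_str(),
        PendingPurge::Prefix(p) => asset.url.starts_with(p.as_str()),
    })
}
```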
Coreless purge using CacheDB and Durable Objects to distribute purges without needing to first stop at a core data center.
The last piece to change was the distribution pipeline, updated to broadcast flexible purges not just to every data center, but to the CacheDB service running on every machine. We opted for CacheDB to handle the last-mile fan-out from machine to machine within a data center, using Consul to keep each machine informed of the health of its peers. This choice let us keep the Workers largely the same for purge-by-URL (more here) and flexible purge handling, despite the difference in termination points.
Our new approach reduced the long tail of the lazy purge, saving 10x on storage. Better yet, we can now delete purged content immediately instead of waiting for the lazy purge to happen or for the content to expire. This newfound storage will improve cache retention on disk for all users, leading to improved cache HIT ratios and reduced egress from your origin.
The shift from lazy content purging (left) to the new Coreless Purge architecture allows us to actively delete content (right). This helps reduce storage needs and increase cache retention times across our service.
With the new coreless cache purge, we can now get a purge request into any data center, distribute the keys to purge, and instantly purge the content from the cache database. This all occurs in less than 150 milliseconds at P50 for tags, hostnames, and URL prefixes, covering data centers in 330 cities and 120+ countries.
To measure Instant Purge, we wanted to make sure that we were looking at real user metrics — that these were purges customers were actually issuing and performance that was representative of what we were seeing under real conditions, rather than marketing numbers.
The time we measure starts when a request enters the local data center and ends when the purge has been executed in every data center. When the local data center receives the request, one of the first things we do is add a timestamp to the purge request. When all data centers have completed the purge action, another timestamp is added to “stop the clock.” Each purge request generates this performance data, which is then sent to a database so we can measure the appropriate quantiles and understand how we can improve further.
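In other words, each purge yields one latency sample, and the reported figures are percentiles over many such samples. A small sketch of that arithmetic, with field and function names of our choosing:

```rust
// Elapsed time between the ingest timestamp and the "all data centers done"
// timestamp, in milliseconds.
fn purge_latency_ms(ingest_ts_ms: u64, all_dcs_done_ts_ms: u64) -> u64 {
    all_dcs_done_ts_ms.saturating_sub(ingest_ts_ms)
}

// Simple nearest-rank style percentile over collected samples; p = 0.50 gives
// the P50 figures reported below.
fn percentile(samples: &mut Vec<u64>, p: f64) -> Option<u64> {
    if samples.is_empty() {
        return None;
    }
    samples.sort_unstable();
    let idx = ((samples.len() as f64 - 1.0) * p).round() as usize;
    Some(samples[idx])
}
```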
In August 2024, we took purge performance data and segmented our collected data by region based on where the local data center receiving the request was located.
Region | P50 Aug 2024 (Coreless) | P50 May 2022 (Core-based) | Improvement
Africa | 303ms | 1,420ms | 78.66%
Asia Pacific Region (APAC) | 199ms | 1,300ms | 84.69%
Eastern Europe (EEUR) | 140ms | 1,240ms | 88.70%
Eastern North America (ENAM) | 119ms | 1,080ms | 88.98%
Oceania | 191ms | 1,160ms | 83.53%
South America (SA) | 196ms | 1,250ms | 84.32%
Western Europe (WEUR) | 131ms | 1,190ms | 88.99%
Western North America (WNAM) | 115ms | 1,000ms | 88.5%
Global | 149ms | 1,570ms | 90.5%
Note: Global latency numbers for the core-based measurements (May 2022) may be larger than the regional numbers because they represent all of our data centers instead of only a regional portion, so outliers and retries might have an outsized effect.
We are currently wrapping up the roll-out of the last throughput changes that allow us to efficiently scale purge requests. As that happens, we will revise our rate limits and open up purge by tag, hostname, and prefix to all plan types! We expect to begin rolling out the additional purge types to all plans and users in early 2025.
In addition, in the process of implementing this new approach, we have identified improvements that will shave a few more milliseconds off our single-file purge. Currently, single-file purges have a P50 of 234ms. However, we want to, and can, bring that number down to below 200ms.
If you want to come work on the world's fastest purge system, check out our open positions.
"],"published_at":[0,"2024-09-25T00:00+01:00"],"updated_at":[0,"2024-10-09T23:05:06.504Z"],"feature_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/59kdx20ZFbCY9AqWfUCsew/d06d6b1e998e56e616690ce8a77c62bf/image4.png"],"tags":[1,[[0,{"id":[0,"1Cv5JjXzKWKEA10JdYbXu1"],"name":[0,"Birthday Week"],"slug":[0,"birthday-week"]}],[0,{"id":[0,"4gN0ARax0fHxjtZL07THOe"],"name":[0,"Performance"],"slug":[0,"performance"]}],[0,{"id":[0,"5RrjSR5vIOJAfRdT8966hf"],"name":[0,"Cache"],"slug":[0,"cache"]}],[0,{"id":[0,"48r7QV00gLMWOIcM1CSDRy"],"name":[0,"Speed & Reliability"],"slug":[0,"speed-and-reliability"]}]]],"relatedTags":[0],"authors":[1,[[0,{"name":[0,"Alex Krivit"],"slug":[0,"alex"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1e9gtWsffhoAWG0rTN45cR/35204e3d77d1a90feeb57a32e388daca/alex.jpg"],"location":[0,null],"website":[0,null],"twitter":[0,"@ackriv"],"facebook":[0,null]}],[0,{"name":[0,"Tim Kornhammar"],"slug":[0,"tim-kornhammar"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/48jH4cWwD5fakrdEvnB67d/5b3dae96cf6bc0b1032cb17a7a300b74/tim-kornhammar.jpg"],"location":[0,"Texas, USA"],"website":[0,null],"twitter":[0,null],"facebook":[0,null]}],[0,{"name":[0," Connor Harwood"],"slug":[0,"connor-harwood"],"bio":[0],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/7ChQBYRPNIiEuQnGQdajbt/3b67f09fd6dae792eb8a1ffafaa1254a/_tmp_mini_magick20221017-43-y5gfah.jpg"],"location":[0],"website":[0],"twitter":[0],"facebook":[0]}]]],"meta_description":[0,"Today we’re excited to share that we’ve built the fastest cache purge in the industry. We now offer a global purge latency for purge by tags, hostnames, and prefixes of less than 150ms on average (P50), representing a 90% improvement since May 2022. "],"primary_author":[0,{}],"localeList":[0,{"name":[0,"blog-english-only"],"enUS":[0,"English for Locale"],"zhCN":[0,"No Page for Locale"],"zhHansCN":[0,"No Page for Locale"],"zhTW":[0,"No Page for Locale"],"frFR":[0,"No Page for Locale"],"deDE":[0,"No Page for Locale"],"itIT":[0,"No Page for Locale"],"jaJP":[0,"No Page for Locale"],"koKR":[0,"No Page for Locale"],"ptBR":[0,"No Page for Locale"],"esLA":[0,"No Page for Locale"],"esES":[0,"No Page for Locale"],"enAU":[0,"No Page for Locale"],"enCA":[0,"No Page for Locale"],"enIN":[0,"No Page for Locale"],"enGB":[0,"No Page for Locale"],"idID":[0,"No Page for Locale"],"ruRU":[0,"No Page for Locale"],"svSE":[0,"No Page for Locale"],"viVN":[0,"No Page for Locale"],"plPL":[0,"No Page for Locale"],"arAR":[0,"No Page for Locale"],"nlNL":[0,"No Page for Locale"],"thTH":[0,"No Page for Locale"],"trTR":[0,"No Page for Locale"],"heIL":[0,"No Page for Locale"],"lvLV":[0,"No Page for Locale"],"etEE":[0,"No Page for Locale"],"ltLT":[0,"No Page for Locale"]}],"url":[0,"https://blog.cloudflare.com/instant-purge"],"metadata":[0,{"title":[0,"Instant Purge: invalidating cached content in under 150ms"],"description":[0,"Today we’re excited to share that we’ve built the fastest cache purge in the industry. We now offer a global purge latency for purge by tags, hostnames, and prefixes of less than 150ms on average (P50), representing a 90% improvement since May 2022. 
"],"imgPreview":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/18ZVwqqO5THZ7TrRFKjxR6/e3c93c852b89567e4dde5cb76240f5d7/Instant_Purge-_invalidating_cached_content_in_under_150ms-OG.png"]}]}]]],"locale":[0,"en-us"],"translations":[0,{"posts.by":[0,"By"],"footer.gdpr":[0,"GDPR"],"lang_blurb1":[0,"This post is also available in {lang1}."],"lang_blurb2":[0,"This post is also available in {lang1} and {lang2}."],"lang_blurb3":[0,"This post is also available in {lang1}, {lang2} and {lang3}."],"footer.press":[0,"Press"],"header.title":[0,"The Cloudflare Blog"],"search.clear":[0,"Clear"],"search.filter":[0,"Filter"],"search.source":[0,"Source"],"footer.careers":[0,"Careers"],"footer.company":[0,"Company"],"footer.support":[0,"Support"],"footer.the_net":[0,"theNet"],"search.filters":[0,"Filters"],"footer.our_team":[0,"Our team"],"footer.webinars":[0,"Webinars"],"page.more_posts":[0,"More posts"],"posts.time_read":[0,"{time} min read"],"search.language":[0,"Language"],"footer.community":[0,"Community"],"footer.resources":[0,"Resources"],"footer.solutions":[0,"Solutions"],"footer.trademark":[0,"Trademark"],"header.subscribe":[0,"Subscribe"],"footer.compliance":[0,"Compliance"],"footer.free_plans":[0,"Free plans"],"footer.impact_ESG":[0,"Impact/ESG"],"posts.follow_on_X":[0,"Follow on X"],"footer.help_center":[0,"Help center"],"footer.network_map":[0,"Network Map"],"header.please_wait":[0,"Please Wait"],"page.related_posts":[0,"Related posts"],"search.result_stat":[0,"Results {search_range} of {search_total} for {search_keyword}"],"footer.case_studies":[0,"Case Studies"],"footer.connect_2024":[0,"Connect 2024"],"footer.terms_of_use":[0,"Terms of Use"],"footer.white_papers":[0,"White Papers"],"footer.cloudflare_tv":[0,"Cloudflare TV"],"footer.community_hub":[0,"Community Hub"],"footer.compare_plans":[0,"Compare plans"],"footer.contact_sales":[0,"Contact Sales"],"header.contact_sales":[0,"Contact Sales"],"header.email_address":[0,"Email Address"],"page.error.not_found":[0,"Page not found"],"footer.developer_docs":[0,"Developer docs"],"footer.privacy_policy":[0,"Privacy Policy"],"footer.request_a_demo":[0,"Request a demo"],"page.continue_reading":[0,"Continue reading"],"footer.analysts_report":[0,"Analyst reports"],"footer.for_enterprises":[0,"For enterprises"],"footer.getting_started":[0,"Getting Started"],"footer.learning_center":[0,"Learning Center"],"footer.project_galileo":[0,"Project Galileo"],"pagination.newer_posts":[0,"Newer Posts"],"pagination.older_posts":[0,"Older Posts"],"posts.social_buttons.x":[0,"Discuss on X"],"search.icon_aria_label":[0,"Search"],"search.source_location":[0,"Source/Location"],"footer.about_cloudflare":[0,"About Cloudflare"],"footer.athenian_project":[0,"Athenian Project"],"footer.become_a_partner":[0,"Become a partner"],"footer.cloudflare_radar":[0,"Cloudflare Radar"],"footer.network_services":[0,"Network services"],"footer.trust_and_safety":[0,"Trust & Safety"],"header.get_started_free":[0,"Get Started Free"],"page.search.placeholder":[0,"Search Cloudflare"],"footer.cloudflare_status":[0,"Cloudflare Status"],"footer.cookie_preference":[0,"Cookie Preferences"],"header.valid_email_error":[0,"Must be valid email."],"search.result_stat_empty":[0,"Results {search_range} of {search_total}"],"footer.connectivity_cloud":[0,"Connectivity cloud"],"footer.developer_services":[0,"Developer services"],"footer.investor_relations":[0,"Investor relations"],"page.not_found.error_code":[0,"Error Code: 404"],"search.autocomplete_title":[0,"Insert a query. 
Press enter to send"],"footer.logos_and_press_kit":[0,"Logos & press kit"],"footer.application_services":[0,"Application services"],"footer.get_a_recommendation":[0,"Get a recommendation"],"posts.social_buttons.reddit":[0,"Discuss on Reddit"],"footer.sse_and_sase_services":[0,"SSE and SASE services"],"page.not_found.outdated_link":[0,"You may have used an outdated link, or you may have typed the address incorrectly."],"footer.report_security_issues":[0,"Report Security Issues"],"page.error.error_message_page":[0,"Sorry, we can't find the page you are looking for."],"header.subscribe_notifications":[0,"Subscribe to receive notifications of new posts:"],"footer.cloudflare_for_campaigns":[0,"Cloudflare for Campaigns"],"header.subscription_confimation":[0,"Subscription confirmed. Thank you for subscribing!"],"posts.social_buttons.hackernews":[0,"Discuss on Hacker News"],"footer.diversity_equity_inclusion":[0,"Diversity, equity & inclusion"],"footer.critical_infrastructure_defense_project":[0,"Critical Infrastructure Defense Project"]}],"localesAvailable":[1,[]],"footerBlurb":[0,"Cloudflare's connectivity cloud protects entire corporate networks, helps customers build Internet-scale applications efficiently, accelerates any website or Internet application, wards off DDoS attacks, keeps hackers at bay, and can help you on your journey to Zero Trust.
Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.
To learn more about our mission to help build a better Internet, start here. If you're looking for a new career direction, check out our open positions."]}" client="load" opts="{"name":"Post","value":true}" await-children="">