The Cloudflare Blog

Get better visibility for the WAF with payload logging

Paschal Obba — Mon, 24 Nov 2025 14:00:00 GMT

As the surface area for attacks on the web increases, Cloudflare’s Web Application Firewall (WAF) provides a myriad of solutions to mitigate these attacks. This is great for our customers, but the cardinality in the workloads of the millions of requests we service means that generating false positives is inevitable. This means that the default configuration we have for our customers has to be fine-tuned.

Fine-tuning isn’t an opaque process: customers have to get some data points and then decide what works for them. This post explains the technologies we offer to enable customers to see why the WAF takes certain actions — and the improvements that have been made to reduce noise and increase signal.

The Log action is great — can we do more?

Cloudflare’s WAF protects origin servers from different kinds of layer 7 attacks, which are attacks that target the application layer. Protection is provided with various tools like:

Managed rules, which security analysts at Cloudflare write to address common vulnerabilities and exposures (CVE), OWASP security risks, and vulnerabilities like Log4Shell.
Custom rules, where customers can write rules with the expressive Rules language.
Rate limiting rules, malicious uploads detection, leaked credentials detection, etc.

These tools are built on the Rulesets engine. When there is a match on a Rule expression, the engine executes an action.

The Log action is used to simulate the behaviour of rules. This action proves that a rule expression is matched by the engine and emits a log event which can be accessed via Security Analytics, Security Events, Logpush or Edge Log Delivery.

Logs are great at validating that a rule works as expected on the traffic it was expected to match, but showing that the rule matches isn’t sufficient, especially when a rule expression can take many code paths. In pseudocode, an expression can look like:

If any of the http request headers contains an “authorization” key OR the lowercased representation of the http host header starts with “cloudflare” THEN log

The Rules language syntax will be:

any(http.request.headers[*] contains "authorization") or starts_with(lower(http.host), "cloudflare")

Debugging this expression poses a couple of problems. Is it the left-hand side (LHS) or right-hand side (RHS) of the OR expression above that matches? Functions such as Base64 decoding, URL decoding, and in this case lowercasing can apply transformations to the original representation of these fields, which leads to further ambiguity as to which characteristics of the request led to a match.

To further complicate this, many rules in a ruleset can register matches. Rulesets like Cloudflare OWASP use a cumulative score of different rules to trigger an action when the score crosses a set threshold.

Additionally, the expressions of the Cloudflare Managed and OWASP rules are private. This increases our security posture – but it also means that customers can only guess what these rules do from their titles, tags and descriptions. For instance, one might be labeled “SonicWall SMA - Remote Code Execution - CVE:CVE-2025-32819.”

Which raises questions: What part of my request led to a match in the Rulesets engine? Are these false positives?

This is where payload logging shines. It can help us drill down to the specific fields and their respective values, post-transformation, in the rule that led to a match.

Payload logging

Payload logging is a feature that logs which fields in the request are associated with a rule that led to the WAF taking an action. This reduces ambiguity and provides useful information that can help spot check false positives, guarantee correctness, and aid in fine-tuning of these rules for better performance.

From the example above, a payload log entry will contain either the LHS or RHS of the expression, but not both.
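To make the ambiguity concrete, the example expression can be modelled in a few lines of Rust. This is only an illustrative sketch, not the Rulesets engine: the `Request` struct, the `MatchedBranch` enum, the `matched_branch` function and the assumption that the left-hand side is tried first are all hypothetical. It returns the single branch that matched, together with the post-transformation value that satisfied it.

```rust
/// Hypothetical, simplified view of a request (not a wirefilter type).
struct Request<'a> {
    header_names: Vec<&'a str>,
    host: &'a str,
}

/// Which side of the OR matched, and the value that satisfied it.
#[derive(Debug, PartialEq)]
enum MatchedBranch {
    /// LHS: a header name containing "authorization".
    Lhs { field: &'static str, value: String },
    /// RHS: the lowercased host starting with "cloudflare".
    Rhs { field: &'static str, value: String },
}

fn matched_branch(req: &Request) -> Option<MatchedBranch> {
    // Assumption: the LHS is evaluated first and the OR short-circuits.
    if let Some(name) = req.header_names.iter().find(|n| n.contains("authorization")) {
        return Some(MatchedBranch::Lhs {
            field: "http.request.headers",
            value: name.to_string(),
        });
    }
    // The RHS compares the transformed (lowercased) host, so that is what gets reported.
    let lowered = req.host.to_lowercase();
    if lowered.starts_with("cloudflare") {
        return Some(MatchedBranch::Rhs {
            field: "http.host",
            value: lowered,
        });
    }
    None
}
```

A request with host "CloudFlare.com" and no authorization header would report the RHS with the value "cloudflare.com", which is the lowercased form the expression actually compared.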

How does payload logging work?

The payload logging and Rulesets engines are built on Wirefilter, which has been explained extensively.

Fundamentally, these engines are objects written in Rust which implement a compiler trait. This trait drives the compilation of the abstract syntax trees (ASTs) derived from these expressions.

struct PayloadLoggingCompiler {
    // Compiled regexes keyed by pattern, so each pattern is compiled once and shared.
    regex_cache: HashMap<String, Arc<Regex>>,
}

impl wirefilter::Compiler for PayloadLoggingCompiler {
    type U = PayloadLoggingUserData;

    fn compile_logical_expr(&mut self, node: LogicalExpr) -> CompiledExpr {
        // ...
        let regex = self.regex_cache.entry(regex_pattern)
            .or_insert_with(|| Arc::new(regex));
        // ...
    }
}

The Rulesets Engine executes an expression and if it evaluates to true, the expression and its execution context are sent to the payload logging compiler for re-evaluation. The execution context provides all the runtime values needed to evaluate the expression.

After re-evaluation is done, the fields involved in branches of the expression that evaluate to true are logged.

The structure of the log is a map of wirefilter fields to their values:

{
    "http.host": "cloudflare.com",
    "http.method": "get",
    "http.user_agent": "mozilla"
}

Note: These logs are encrypted with the public key provided by the customer.

These logs go through our logging pipeline and can be read in different ways. Customers can configure a Logpush job to write to a custom Worker we built that uses the customer’s private key to automatically decrypt these logs. The Payload logging CLI tool, Worker, or the Cloudflare dashboard can also be used for decryption.

What improvements have been shipped?

In wirefilter, some fields are array types. The field http.request.headers.names is an array of all the header names in a request. For example:

["content-type", "content-length", "authorization", "host"]

An expression that reads any(http.request.headers.names[*] contains "c") evaluates to true because at least one of the header names contains the letter “c”. With the previous version of the payload logging compiler, all the header names in the http.request.headers.names field were logged, since the field is part of the expression that evaluates to true.

Payload log (previous)

http.request.headers.names[*] = ["content-type", "content-length", "authorization", "host"]

Now, we partially evaluate the array fields and log only the indexes that match the expression's constraint. In this case, it’ll be just the header names that contain a “c”!

Payload log (new)

http.request.headers.names[0,1] = ["content-type", "content-length"]
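The partial evaluation of an array field can be sketched as a filter that keeps the index alongside each matching element. This is a minimal illustration rather than the compiler's code; the function name `matching_elements` is hypothetical.

```rust
/// Returns the index and value of every array element that contains `needle`.
/// The `any(...)` expression is true exactly when the result is non-empty.
fn matching_elements<'a>(values: &[&'a str], needle: &str) -> Vec<(usize, &'a str)> {
    values
        .iter()
        .enumerate()
        .filter(|(_, value)| value.contains(needle))
        .map(|(index, value)| (index, *value))
        .collect()
}
```

Applied to the four header names above with the needle "c", only indexes 0 and 1 survive, which corresponds to the new payload log.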

Operators

This brings us to operators in wirefilter. Some operators, like “eq”, result in exact matches, e.g. http.host eq "a.com". There are other operators that result in “partial” matches, like “in”, “contains” and “matches”, which work alongside regexes. The expression in this example, `any(http.request.headers.names[*] contains "c")`, uses the “contains” operator, which produces a partial match. It also uses the “any” function, which we can say produces a partial match: if at least one of the headers contains a “c”, then we should log that header, not all the headers as we did in the previous version.

With the improvements to the payload logging compiler, when these expressions are evaluated, we log just the partial matches. In this case, the new payload logging compiler handles the “contains” operator similarly to the “find” method for bytes in the Rust standard library. This improves our payload log to:

http.request.headers.names[0,1] = ["c", "c"]

This makes things a lot clearer. It also saves our logging pipeline from processing millions of bytes. For example, a field that is analyzed a lot is the request body — http.request.body.raw — which can be tens of kilobytes in size. Sometimes the expressions are checking for a regex pattern that should match three characters. In this case we’ll be logging 3 bytes instead of kilobytes!

Context

I know, I know, ["c", "c"] doesn’t really mean much. Even if we’ve provided the exact reason for the match and are significantly reducing the volume of bytes written to our customers’ storage destinations, the key goal is to provide useful debugging information to the customer. As part of the payload logging improvements, the compiler now also logs a “before” and “after” (if applicable) for partial matches. Each of these buffers is currently 15 bytes in size. This means our payload log now looks like:

http.request.headers.names[0,1] = [
    {
        before: null, // isn't included in the final log
        content: "c",
        after: "ontent-type"
    },
    {
        before: null, // isn't included in the final log
        content: "c",
        after: "ontent-length"
    }
]
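The “before”, “content” and “after” pieces can be sketched as a substring search that also slices out up to 15 bytes on either side of the match. This is an assumption-laden illustration, not the payload logging compiler: `PartialMatch`, `find_with_context` and `CONTEXT_LEN` are hypothetical names, only the first occurrence is reported, and the search uses a simple sliding window over bytes.

```rust
/// Size of each context buffer, taken from the 15 bytes mentioned in the text.
const CONTEXT_LEN: usize = 15;

#[derive(Debug, PartialEq)]
struct PartialMatch<'a> {
    /// Up to CONTEXT_LEN bytes preceding the match; None when there are none.
    before: Option<&'a [u8]>,
    /// The bytes that matched.
    content: &'a [u8],
    /// Up to CONTEXT_LEN bytes following the match; None when there are none.
    after: Option<&'a [u8]>,
}

/// Finds the first occurrence of `needle` in `haystack` and returns it with context.
fn find_with_context<'a>(haystack: &'a [u8], needle: &[u8]) -> Option<PartialMatch<'a>> {
    if needle.is_empty() {
        return None; // `windows(0)` would panic, and an empty match carries no information
    }
    let start = haystack.windows(needle.len()).position(|w| w == needle)?;
    let end = start + needle.len();
    let before = &haystack[start.saturating_sub(CONTEXT_LEN)..start];
    let after = &haystack[end..haystack.len().min(end + CONTEXT_LEN)];
    Some(PartialMatch {
        before: (!before.is_empty()).then_some(before),
        content: &haystack[start..end],
        after: (!after.is_empty()).then_some(after),
    })
}
```

Because the slices are taken at byte offsets, a context buffer may cut through a multi-byte UTF-8 character; a real implementation would have to decide how to present such fragments.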

Example of payload log (previous)

Example of payload log (new)

In the previous log, we have all the header values. In the new log, we have only the 8th index, which is a malicious script in an HTTP header. The match is on the “.

You should see the following returned:

And a few moments later in livetail:

{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"958052","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"958051","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"973300","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"973307","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"973331","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"981176","Action":"drop","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}

Note that for this one malicious request Cloudflare Logs actually sent 6 separate Firewall Events to Sumo Logic. The reason for this is that this specific request triggered a variety of different Managed Rules: #958051, 958052, 973300, 973307, 973331, and 981176.

Seeing it all in action

Here's a demo of launching livetail, making a malicious request in a browser, and then seeing the result sent from the Cloudflare Logpush job:

Firewall Analytics: Now available to all paid plans

Alex Cruz Farmer — Mon, 09 Dec 2019 15:16:00 GMT

Our Firewall Analytics tool enables customers to quickly identify and investigate security threats using an intuitive interface. Until now, this tool had only been available to our Enterprise customers, who have been using it to get detailed insights into their traffic and better tailor their security configurations. Today, we are excited to make Firewall Analytics available to all paid plans and share details on several recent improvements we have made.

All paid plans are now able to take advantage of these capabilities, along with several important enhancements we’ve made to improve our customers’ workflow and productivity.

Increased Data Retention and Adaptive Sampling

Previously, Enterprise customers could view 14 days of Firewall Analytics for their domains. Today we’re increasing that retention to 30 days, and again to 90 days in the coming months. Business and Professional plan zones will get 30 and 3 days of retention, respectively.

In addition to the extended retention, we are introducing adaptive sampling to guarantee that Firewall Analytics results are displayed in the Cloudflare Dashboard quickly and reliably, even when you are under a massive attack or otherwise receiving a large volume of requests.

Adaptive sampling works similarly to Netflix: when your internet connection runs low on bandwidth, you receive a slightly downscaled version of the video stream you are watching. When your bandwidth recovers, Netflix then upscales back to the highest quality available.
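One way to picture this trade-off is a query planner that picks the finest sampling interval that still fits a row budget, then scales sampled counts back up. The sketch below is purely illustrative and is not how Firewall Analytics is implemented: the intervals, the row budget and both function names are assumptions.

```rust
/// Hypothetical sample intervals: 1 means "every event", 10 means "1 in 10", and so on.
const SAMPLE_INTERVALS: [u64; 4] = [1, 10, 100, 1000];

/// Picks the finest interval whose sampled row count fits within `row_budget`,
/// falling back to the coarsest interval when nothing fits.
fn choose_sample_interval(estimated_events: u64, row_budget: u64) -> u64 {
    SAMPLE_INTERVALS
        .iter()
        .copied()
        .find(|interval| estimated_events / interval <= row_budget)
        .unwrap_or(SAMPLE_INTERVALS[SAMPLE_INTERVALS.len() - 1])
}

/// Scales a count observed in sampled data back up to an estimate of the true total.
fn estimate_total(sampled_count: u64, interval: u64) -> u64 {
    sampled_count * interval
}
```

With a budget of 10,000 rows, a quiet zone with 5,000 events is read unsampled, while a zone with 5,000,000 events during an attack is read at 1 in 1000 and its counts are multiplied back up.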

Firewall Analytics does this sampling on each query, ensuring that customers see the best precision available in the UI given current load on the zone. When results are sampled, the sampling rate will be displayed as shown below:

Event-Based Logging

As adoption of our expressive Firewall Rules engine has grown, one consistent ask we’ve heard from customers is for a more streamlined way to see all Firewall Events generated by a specific rule. Until today, if a malicious request matched multiple rules, only the last one to execute was shown in the Activity Log, requiring customers to click into the request to see if the rule they’re investigating was listed as an “Additional match”.

To streamline this process, we’ve changed how the Firewall Analytics UI interacts with the Activity Log. Customers can now filter by a specific rule (or any other criteria) and see a row for each event generated by that rule. This change also makes it easier to review all requests that would have been blocked by a rule by creating it in Log mode first before changing it to Block.

Challenge Solve Rates to help reduce False Positives

When our customers write rules to block undesired, automated traffic they want to make sure they’re not blocking or challenging desired traffic, e.g., humans wanting to make a purchase should be allowed but not bots scraping pricing.

To help customers determine what percent of CAPTCHA challenges returned to users may have been unnecessary, i.e., false positives, we are now showing the Challenge Solve Rate (CSR) for each rule. If you’re seeing rates higher than expected, e.g., for your Bot Management rules, you may want to relax the rule criteria. If the rate you see is 0% indicating that no CAPTCHAs are being solved, you may want to change the rule to Block outright rather than challenge.
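The rate itself is simple arithmetic: challenges solved divided by challenges issued, expressed as a percentage. A minimal sketch follows; the function name is hypothetical, and returning `None` when no challenges were issued is a choice made here, not documented behaviour.

```rust
/// Challenge Solve Rate as a percentage, or None when no challenges were issued.
fn challenge_solve_rate(issued: u64, solved: u64) -> Option<f64> {
    if issued == 0 {
        return None; // avoid dividing by zero; there is no meaningful rate yet
    }
    Some(solved as f64 / issued as f64 * 100.0)
}
```

For example, a rule that issued 200 CAPTCHAs of which 50 were solved has a CSR of 25%, while a rule with 0 solved out of 10 issued sits at 0% and may be a candidate for an outright Block.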

Hovering over the CSR will reveal the number of CAPTCHAs issued vs. solved:

Exporting Firewall Events

Business and Enterprise customers can now export a set of 500 events from the Activity Log. The data exported are those events that remain after any selected filters have been applied.

Column Customization

Sometimes the columns shown in the Activity Log do not contain the details you want to see to analyze the threat. When this happens, you can now click “Edit Columns” to select the fields you want to see. For example, a customer diagnosing a bot-related issue may also want to view the User-Agent and the source country, whereas a customer investigating a DDoS attack may want to see IP addresses, ASNs, Path, and other attributes. You can now customize what you’d like to see as shown below.

We would love to hear your feedback and suggestions, so feel free to reach out to us via our Community forums or through your Customer Success team.

If you’d like to receive more updates like this one directly to your inbox, please subscribe to our Blog!

Supercharging Firewall Events for Self-Serve

Alex Cruz Farmer — Thu, 22 Aug 2019 13:00:00 GMT

Today, I’m very pleased to announce the release of a completely overhauled version of our Firewall Event log to our Free, Pro and Business customers. This new Firewall Events log is now available in your Dashboard, and you are not required to do anything to receive this new capability.

No more modals!

We have done away with those pesky modals, providing a much smoother user experience. To review more detailed information about an event, you simply click anywhere on the event list row.

In the expanded view, you are provided with all the information you may need to identify or diagnose issues with your Firewall or find more details about a potential threat to your application.

Additional matches per event

Cloudflare has several Firewall features to give customers granular control of their security. With this control comes some complexity when debugging why a request was stopped by the Firewall. To help clarify what happened, we have provided an “Additional matches” count at the bottom for events triggered by multiple services or rules for the same request. Clicking the number expands a list showing each rule and service along with the corresponding action.

Search for any field within a Firewall Event

This is one of my favourite parts of our new Firewall Event Log. Many of our customers have expressed their frustration with the difficulty of pinpointing specific events. This is where our new search capabilities come into their own. Customers can now filter and freeform search for any field that is visible in a Firewall Event!

Let’s say you want to find all the requests originating from a specific ISP or country where your Firewall Rules issued a JavaScript challenge. There are two different ways to do this in the UI.

Firstly, when in the detail view, you can create an include or exclude filter for that field value.

Secondly, you can create a freeform filter using the “+ Add Filter” button at the top, or edit one of the already filtered fields:

As illustrated above, with our WAF Managed Rules enabled in log-only mode, we can see all the rules which would have triggered if this was a legitimate attack. This allows you to confirm that your configuration is working as expected.

Scoping your search to a specific date and time

In our old Firewall Event Log, to find an event, users had to traverse through many pages to find Events from a specific date. The last major change we have added is the capability to select a time window to view events between two points in time over the last 2 weeks. In the time selection window, Free and Pro customers can choose a 24 hour time window and our Business customers can view up to 72 hours.

We want your feedback!

We need your help! Please feel free to leave any feedback on our Community forums, or open a Support ticket with any problems you find. Your feedback is critical to our product improvement process, and we look forward to hearing from you.

Protecting Project Galileo websites from HTTP attacks

Maxime Guerreiro — Thu, 13 Jun 2019 13:00:00 GMT

Yesterday, we celebrated the fifth anniversary of Project Galileo. More than 550 websites are part of this program, and they have something in common: each and every one of them has been subject to attacks in the last month. In this blog post, we will look at the security events we observed between 23 April 2019 and 23 May 2019.

Project Galileo sites are protected by the Cloudflare Firewall and Advanced DDoS Protection which contain a number of features that can be used to detect and mitigate different types of attack and suspicious traffic. The following table shows how each of these features contributed to the protection of sites on Project Galileo.

WAF (Web Application Firewall)

Although not the most impressive in terms of blocked requests, the WAF is the most interesting as it identifies and blocks malicious requests, based on heuristics and rules that are the result of seeing attacks across all of our customers and learning from those. The WAF is available to all of our paying customers, protecting them against 0-days, SQL/XSS exploits and more. For the Project Galileo customers, the WAF blocked more than 4.5 million requests in the month that we looked at, approximately 150k requests per day, with over 130 WAF rules matching.

Heat map showing the attacks seen on customer sites (rows) per day (columns)

This heat map may initially appear confusing but reading one is easy once you know what to expect so bear with us! It is a table where each line is a website on Project Galileo and each column is a day. The color represents the number of requests triggering WAF rules - on a scale from 0 (white) to a lot (dark red). The darker the cell, the more requests were blocked on this day.

We observe malicious traffic on a daily basis for most websites we protect. The average Project Galileo site saw malicious traffic for 27 days in the 1 month observed, and for almost 60% of the sites we noticed daily events.

Fortunately, the vast majority of websites only receive a few malicious requests per day, likely from automated scanners. In some cases, we notice a net increase in attacks against some websites - and a few websites are under a constant influx of attacks.

Heat map showing the attacks blocked for each WAF rule (rows) per day (columns)

This heat map shows the WAF rules that blocked requests by day. At first, it seems some rules are useless as they never match malicious requests, but this plot makes it obvious that some attack vectors become active all of a sudden (isolated dark cells). This is especially true for 0-days: malicious traffic starts once an exploit is published and is very active in the first few days. The dark active lines are the most common malicious requests, and these WAF rules protect against things like XSS and SQL injection attacks.

DoS (Denial of Service)

A DoS attack prevents legitimate visitors from accessing a website by flooding it with bad traffic. Due to the way Cloudflare works, websites protected by Cloudflare are immune to many DoS vectors, out of the box. We block layer 3 and 4 attacks, which includes SYN floods and UDP amplifications. DNS nameservers, often described as the Internet’s phone book, are fully managed by Cloudflare, and protected - visitors know how to reach the websites.

Line plot - requests per second to a website under DoS attack

Can you spot the attack?

As for layer 7 attacks (for instance, HTTP floods), we rely on Gatebot, an automated tool to detect, analyse and block DoS attacks, so you can sleep. The graph shows the requests per second we received on a zone, and whether or not it reached the origin server. As you can see, the bad traffic was identified automatically by Gatebot, and more than 1.6 million requests were blocked as a result.

Firewall Rules

For websites with specific requirements we provide tools to allow customers to block traffic to precisely fit their needs. Customers can easily implement complex logic using Firewall Rules to filter out specific chunks of traffic, or block IPs, networks and countries using Access Rules, and Project Galileo sites have done just that. Let’s see a few examples.

Firewall Rules allows website owners to challenge or block as much or as little traffic as they desire, and this can be done as a surgical tool “block just this request” or as a general tool “challenge every request”.

For instance, a well-known website used Firewall Rules to prevent twenty IPs from fetching specific pages. Three of these IPs were then used to send a total of 4.5 million requests over a short period of time, and the following chart shows the requests seen for this website. When this happened, Cloudflare mitigated the traffic, ensuring that the website remained available.

Cumulative line plot. Requests per second to a website

Another website, built with WordPress, is using Cloudflare to cache their webpages. As POST requests are not cacheable, they always hit the origin machine and increase load on the origin server - that’s why this website is using firewall rules to block POST requests, except on their administration backend. Smart!

Website owners can also deny or challenge requests based on the visitor’s IP address, Autonomous System Number (ASN) or Country. This feature, dubbed Access Rules, is enforced on all pages of a website, hassle-free.

For example, a news website is using Cloudflare’s Access Rules to challenge visitors from countries outside of their geographic region who are accessing their website. We enforce the rules globally even for cached resources, and take care of GeoIP database updates for them, so they don’t have to.

The Zone Lockdown utility restricts a specific URL to specific IP addresses. This is useful to protect an internal but public path being accessed by external IP addresses. A non-profit based in the United Kingdom is using Zone Lockdown to restrict access to their WordPress admin panel and login page, hardening their website without relying on unofficial plugins. Although it does not prevent very sophisticated attacks, it shields them against automated attacks and phishing attempts - as even if their credentials are stolen, they can’t be used as easily.

Rate Limiting

Cloudflare acts as a CDN, caching resources and happily serving them, reducing bandwidth used by the origin server … and indirectly the costs. Unfortunately, not all requests can be cached and some requests are very expensive to handle. Malicious users may abuse this to increase load on the server, and website owners can rely on our Rate Limit to help them: they define thresholds, expressed in requests over a time span, and we make sure to enforce this threshold. A non-profit fighting against poverty relies on rate limits to protect their donation page, and we are glad to help!
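A threshold “expressed in requests over a time span” can be illustrated with a fixed-window counter. This is a generic textbook sketch and not Cloudflare's Rate Limiting implementation; the type name, the per-client string key and the choice of fixed (rather than sliding) windows are all assumptions. Time is passed in explicitly so the behaviour is easy to reason about.

```rust
use std::collections::HashMap;

/// Allows at most `limit` requests per client in each `window_secs`-long window.
struct FixedWindowLimiter {
    limit: u32,
    window_secs: u64,
    /// Per client: (index of the current window, requests counted in that window).
    counters: HashMap<String, (u64, u32)>,
}

impl FixedWindowLimiter {
    fn new(limit: u32, window_secs: u64) -> Self {
        assert!(window_secs > 0, "window must be at least one second");
        Self { limit, window_secs, counters: HashMap::new() }
    }

    /// Returns true if the request is within the threshold, false if it should be mitigated.
    fn allow(&mut self, client: &str, now_secs: u64) -> bool {
        let window = now_secs / self.window_secs;
        let entry = self.counters.entry(client.to_string()).or_insert((window, 0));
        if entry.0 != window {
            *entry = (window, 0); // a new window has started: reset the counter
        }
        if entry.1 < self.limit {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}
```

With a limit of 3 requests per 60 seconds, a client's fourth request inside the same minute is rejected, and the count starts again when the next window begins.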

Security Level

Last but not least, one of Cloudflare’s greatest assets is our threat intelligence. With such a wide lens of the threat landscape, Cloudflare uses our Firewall data, combined with machine learning, to curate our IP Reputation databases. This data is provided to all Cloudflare customers, and is configured through our Security Level feature. Customers then may define their threshold sensitivity, ranging from Essentially Off to I’m Under Attack. For every incoming request, we ask visitors to complete a challenge if the score is above a customer-defined threshold. This system alone is responsible for 25% of the requests we mitigated: it’s extremely easy to use, and it constantly learns from the other protections.

Conclusion

When taken together, the Cloudflare Firewall features provide our Project Galileo customers comprehensive and effective security that enables them to ensure their important work is available. The majority of security events were handled automatically, and this is our strength - security that is always on, always available, always learning.

Spectrum for UDP: DDoS protection and firewalling for unreliable protocols

Achiel van der Mandele — Wed, 20 Mar 2019 15:01:00 GMT

Today, we're announcing Spectrum for UDP. Spectrum for UDP works the same as Spectrum for TCP: Spectrum sits between your clients and your origin. Incoming connections are proxied through, whilst applying our DDoS protection and IP Firewall rules. This allows you to protect your services from all sorts of nasty attacks and completely hides your origin behind Cloudflare.

Last year, we launched Spectrum. Spectrum brought the power of our DDoS and firewall features to all TCP ports and services. Spectrum for TCP allows you to protect your SSH services, gaming protocols, and as of last month, even FTP servers. We’ve seen customers running all sorts of applications behind Spectrum, such as Bitfly, Nicehash, and Hypixel.

This is great if you're running TCP services, but plenty of our customers also have workloads running over UDP. As an example, many multiplayer games prefer the low cost and lighter weight of UDP and don't care about whether packets arrive or not.

UDP applications have historically been hard to protect and secure, which is why we built Spectrum for UDP. Spectrum for UDP allows you to protect standard UDP services (such as RDP over UDP), but can also protect any custom protocol you come up with! The only requirement is that it uses UDP as an underlying protocol.

Configuring a UDP application on Spectrum

To configure on the dashboard, simply switch the application type from TCP to UDP:

Retrieving client information

With Spectrum, we terminate the connection and open a new one to your origin. But, what if you want to still see who's actually connecting to you? For TCP, there's Proxy Protocol. Whilst initially introduced by HAProxy, it has since been adopted by more parties, such as nginx. We added support late 2018, allowing you to easily read the client's IP and port from a header that precedes each data stream.

Unfortunately, there is no equivalent for UDP, so we're rolling our own. Due to the fact that UDP is connection-less, we can't get away with the Proxy Protocol approach for TCP, which prepends the entire stream with one header. Instead, we are forced to prepend each packet with a small header that specifies:

the original client IP
the Spectrum IP
the original client port
the Spectrum port

Schema representing a UDP packet prefaced with our Simple Proxy Protocol header.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Magic Number         |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
+                                                               +
|                                                               |
+                         Client Address                        +
|                                                               |
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
+                                                               +
|                                                               |
+                         Proxy Address                         +
|                                                               |
+                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |         Client Port           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Proxy Port          |          Payload...           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
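On the origin side, reading this header amounts to slicing fixed-width fields off the front of each datagram. The sketch below follows the field widths in the diagram (16-bit magic number, two 128-bit addresses, two 16-bit ports, 38 bytes in total). Everything else is an assumption made for illustration: the names `SppHeader` and `parse_spp` are hypothetical, fields are assumed to be in network byte order, addresses are treated as IPv6-sized values, and the magic number is returned without being validated.

```rust
use std::net::Ipv6Addr;

/// 2 (magic) + 16 (client address) + 16 (proxy address) + 2 + 2 (ports) bytes.
const HEADER_LEN: usize = 38;

#[derive(Debug, PartialEq)]
struct SppHeader {
    magic: u16,
    client_addr: Ipv6Addr,
    proxy_addr: Ipv6Addr,
    client_port: u16,
    proxy_port: u16,
}

/// Splits a datagram into its header and payload, or returns None if it is too short.
fn parse_spp(packet: &[u8]) -> Option<(SppHeader, &[u8])> {
    if packet.len() < HEADER_LEN {
        return None;
    }
    let (head, payload) = packet.split_at(HEADER_LEN);
    // Assumption: multi-byte fields are big-endian (network byte order).
    let u16_at = |i: usize| u16::from_be_bytes([head[i], head[i + 1]]);
    let addr_at = |i: usize| {
        let mut octets = [0u8; 16];
        octets.copy_from_slice(&head[i..i + 16]);
        Ipv6Addr::from(octets)
    };
    let header = SppHeader {
        magic: u16_at(0),
        client_addr: addr_at(2),
        proxy_addr: addr_at(18),
        client_port: u16_at(34),
        proxy_port: u16_at(36),
    };
    Some((header, payload))
}
```

A real receiver would also compare `magic` against the documented value before trusting the rest of the header, so that packets arriving without the header are not misread.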

Simple Proxy Protocol is turned off by default, which means UDP packets will arrive at your origin as if they were sent from Spectrum. To use it, just enable it on your Spectrum app.

Getting access to Spectrum for UDP

We're excited about launching this and even more excited to see what you'll build and protect with it. In fact, what if you could build serverless services on Spectrum, without actually having an origin running? Stay tuned for some cool announcements in the near future.

Spectrum for UDP is currently an Enterprise-only feature. To get UDP enabled for your account, please reach out to your account team and we’ll get you set up.

One more thing... if you’re at GDC this year, say hello at booth P1639! We’d love to talk more and learn about what you’d like to do with Spectrum.

How we made Firewall Rules

David Kitchen — Mon, 04 Mar 2019 13:00:00 GMT

Recently we launched Firewall Rules, a new feature that allows you to construct expressions that perform complex matching against HTTP requests and then choose how that traffic is handled. As a Firewall feature you can, of course, block traffic. The expressions we support within Firewall Rules, along with powerful control over the order in which they are applied, allow complex new behaviour.

In this blog post I tell the story of Cloudflare’s Page Rules mechanism and how Firewall Rules came to be. Along the way I’ll look at the technical choices that led to us building the new matching engine in Rust.

The evolution of the Cloudflare Firewall

Cloudflare offers two types of firewall for web applications, a managed firewall in the form of a WAF where we write and maintain the rules for you, and a configurable firewall where you write and maintain rules. In this article, we will focus on the configurable firewall.

One of the earliest Cloudflare firewall features was the IP Access Rule. It dates back to the earliest versions of the Cloudflare Firewall and simply allows you to block traffic from specific IP addresses:

if request IP equals 203.0.113.1 then block the request

As attackers and spammers frequently launched attacks from a given network we also introduced the ASN matching capability:

if request IP belongs to ASN 64496 then block the request

We also allowed blocking ranges of addresses defined by CIDR notation when an IP was too specific and an ASN too broad:

if request IP is within 203.0.113.0/24 then block the request

Blocking is not the only action you might need and so other actions are available:

Allowlist = apply no other firewall rules and allow the request to pass this part of the firewall
Challenge = issue a CAPTCHA and if this is passed then allow the request but otherwise deny. This would be used to determine if the request came from a human operator
JavaScript challenge = issue an automated JavaScript challenge and if this is passed then allow the request. This would be used to determine if the request came from a simple stateless bot (perhaps a wget or curl script)
Block = deny the request

Cloudflare also has Page Rules. Page Rules allow you to match full URIs and then perform actions such as redirecting or raising the security level, which can be considered firewall functions:

if request URI matches /nullroute then redirect to http://127.0.0.1

Cloudflare also added GeoIP information within an HTTP header and the firewall was extended to include that:

if request IP originates from country GB then CAPTCHA the request

All of the above existed in Cloudflare pre-2014, and then during 2016 we set about to identify the most commonly requested firewall features (according to Customer Support tickets and feedback from paying customers) and provide a self-service solution. From that analysis, we added three new capabilities during late 2016: Rate Limiting, User Agent Rules, and Zone Lockdown.

Whilst Cloudflare automatically stops very large denial of service attacks, rate limiting allowed customers to stop smaller attacks that were a real concern to them but were low enough volume that Cloudflare’s DDoS defences were not being applied.

if request method is POST and request URI matches /wp-admin/index.php and response status code is 403 and more than 3 requests like this occur in a 15 minute time period then block the traffic for 2 hours
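A rule like this boils down to a per-client counter with a time window and a penalty period. The following is a rough sketch in Go of that idea, not Cloudflare's rate limiting implementation: the Limiter type and its methods are hypothetical, and the caller is assumed to have already decided that a request matches the method, URI and status code criteria.

```go
package main

import (
	"fmt"
	"time"
)

// Limiter counts matching requests per client and blocks clients that
// exceed Threshold within Window.
type Limiter struct {
	Threshold int
	Window    time.Duration
	BlockFor  time.Duration
	hits      map[string][]time.Time // recent matching requests per client
	blocked   map[string]time.Time   // client -> time at which the block expires
}

func NewLimiter(threshold int, window, blockFor time.Duration) *Limiter {
	return &Limiter{
		Threshold: threshold, Window: window, BlockFor: blockFor,
		hits: map[string][]time.Time{}, blocked: map[string]time.Time{},
	}
}

// Blocked reports whether the client is currently serving a block.
func (l *Limiter) Blocked(client string, now time.Time) bool {
	return now.Before(l.blocked[client])
}

// Record notes one request that matched the rule and starts a block when
// the client has made more than Threshold such requests within Window.
func (l *Limiter) Record(client string, now time.Time) {
	cutoff := now.Add(-l.Window)
	recent := l.hits[client][:0]
	for _, t := range l.hits[client] {
		if t.After(cutoff) { // keep only hits inside the window
			recent = append(recent, t)
		}
	}
	recent = append(recent, now)
	l.hits[client] = recent
	if len(recent) > l.Threshold {
		l.blocked[client] = now.Add(l.BlockFor)
		delete(l.hits, client)
	}
}

// wordpressRule mirrors the example: more than 3 matching requests in
// 15 minutes blocks the client for 2 hours.
func wordpressRule() *Limiter { return NewLimiter(3, 15*time.Minute, 2*time.Hour) }

// at returns a fixed reference time plus the given number of minutes.
func at(minutes int) time.Time {
	return time.Date(2019, 3, 4, 13, 0, 0, 0, time.UTC).Add(time.Duration(minutes) * time.Minute)
}

func main() {
	l := wordpressRule()
	for i := 0; i < 4; i++ {
		l.Record("203.0.113.1", at(i))
	}
	fmt.Println(l.Blocked("203.0.113.1", at(5)))
}
```

Passing the clock in as an argument keeps the sketch deterministic; a production limiter would also need to share counters across machines.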

User Agent rules are as simple as:

if request user_agent is `Fake User Agent` then CAPTCHA the request

Zone Lockdown, however, was the first default-deny feature. The Cloudflare Firewall could be thought of as “allow all traffic, except where a rule exists to block it”. Zone Lockdown is the opposite: “for a given URI, block all traffic, except where a rule exists to allow it”.

Zone Lockdown allowed customers to block access to a public website for all but a few IP addresses or IP ranges. For example, many customers wanted access to a staging website to only be available to their office IP addresses.

if request URI matches https://staging.example.com/ and request IP not in 203.0.113.0/24 then block the request
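The default-deny inversion is easy to see in code. Here is a small sketch in Go, where the Lockdown type and the blocks helper are invented for illustration and URI matching is simplified to a prefix check:

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// Lockdown denies every request under URLPrefix unless the client address
// falls inside one of the Allowed ranges.
type Lockdown struct {
	URLPrefix string
	Allowed   []netip.Prefix
}

// Blocks reports whether the lockdown denies this request.
func (z Lockdown) Blocks(url string, client netip.Addr) bool {
	if !strings.HasPrefix(url, z.URLPrefix) {
		return false // the lockdown does not cover this URL
	}
	for _, p := range z.Allowed {
		if p.Contains(client) {
			return false // explicitly allowed
		}
	}
	return true // default deny
}

// staging mirrors the example rule above.
var staging = Lockdown{
	URLPrefix: "https://staging.example.com/",
	Allowed:   []netip.Prefix{netip.MustParsePrefix("203.0.113.0/24")},
}

// blocks is a convenience wrapper that accepts the client IP as a string.
func blocks(url, ip string) bool {
	return staging.Blocks(url, netip.MustParseAddr(ip))
}

func main() {
	fmt.Println(blocks("https://staging.example.com/login", "203.0.113.20"))
	fmt.Println(blocks("https://staging.example.com/login", "198.51.100.7"))
}
```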

Finally, an Enterprise customer could also contact Cloudflare and have a truly bespoke rule created for them within the WAF engine.

Seeing the problem

The firewall worked well for simple mitigation, but it didn’t fully meet the needs of our customers.

Each of the firewall features had targeted a single attribute, and the interfaces and implementations reflected that. Whilst the Cloudflare Firewall had evolved to solve each problem as it arose, the resulting features did not work together. In late 2017 you could sum up the firewall capabilities as:

You can block any attack traffic on any criteria, so long as you only pick one of:

IP
CIDR
ASN
Country
User Agent
URI

We saw the problem, but how to fix it?

We match our firewall rules in two ways:

Lookup matching
String pattern matching

Lookup matching covers the IP, CIDR, ASN, Country and User Agent rules. We would create a key in our globally distributed key/value data store Quicksilver, and store the action in the value:

Key   = zone:www.example.com_ip:203.0.113.1
Value = block

When a request for www.example.com is received, we look at the IP address of the client that made the request, construct the key and perform the lookup. If the key exists in the store, then the value would tell us what action to perform, in this case if the client IP were 203.0.113.1 then we would block the request.
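In Go terms, the lookup described above might look like the sketch below. Quicksilver is stood in for by an in-memory map, and the Store type and ipKey helper are hypothetical; only the key format is taken from the example.

```go
package main

import "fmt"

// Store stands in for the key/value data store; a map is enough to show
// the access pattern.
type Store map[string]string

// ipKey builds the lookup key for an IP rule in the format shown above.
func ipKey(zone, ip string) string {
	return fmt.Sprintf("zone:%s_ip:%s", zone, ip)
}

// Action returns the configured action for this zone and client IP, if any.
// It costs one lookup no matter how many IP rules the zone has.
func (s Store) Action(zone, ip string) (string, bool) {
	action, ok := s[ipKey(zone, ip)]
	return action, ok
}

func main() {
	store := Store{ipKey("www.example.com", "203.0.113.1"): "block"}

	if action, ok := store.Action("www.example.com", "203.0.113.1"); ok {
		fmt.Println("apply action:", action)
	}
	if _, ok := store.Action("www.example.com", "198.51.100.7"); !ok {
		fmt.Println("no IP rule, continue processing")
	}
}
```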

Lookup matching is a joy to work with. It has O(1) complexity, meaning that a single request performs only a single lookup for an IP rule regardless of how many IP rules a customer has. Whilst most customers had a few rules, some customers had hundreds of thousands of rules (typically created automatically by combining fail2ban or similar with a Cloudflare API call).

Lookups work well when you are only looking up a single value. If we needed to combine an IP and a User Agent, we would need to produce keys that composed these values together. This massively increases the number of keys that need to be published.

String pattern matching occurs where URI matching is required. For our Page Rules feature this meant combining all of the Page Rules into a single regular expression that we would apply to the request URI whilst handling a request.

If you had Page Rules that said (in order):

Match */wp-admin/index.php and then block
Then match */xmlrpc.php and then block

These are converted into:

^(?(?:.*/wp-admin/index.php))|(?(?:.*/xmlrpc.php))$

Yes, you read that correctly. Each Page Rule was appended to a single regular expression in the order of execution, and the named group is used as an overload for the desired action.
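To make the trick concrete, here is a minimal sketch in Go of combining rules into one regular expression and reading the action back from whichever named group matched. The PageRule type, the compile and action helpers, and the group names (the action followed by the rule number) are hypothetical rather than the names Cloudflare used. The sketch also wraps the alternation in a non-capturing group so that both anchors apply to every rule.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// PageRule pairs a wildcard URI pattern with the action to take on a match.
type PageRule struct {
	Pattern string
	Action  string
}

// compile joins the rules, in order, into a single alternation in which
// each rule is a named group called <action>_<position>.
func compile(rules []PageRule) *regexp.Regexp {
	parts := make([]string, len(rules))
	for i, r := range rules {
		// Quote the pattern, then turn the "*" wildcard into ".*".
		pat := strings.ReplaceAll(regexp.QuoteMeta(r.Pattern), `\*`, `.*`)
		parts[i] = fmt.Sprintf("(?P<%s_%d>%s)", r.Action, i+1, pat)
	}
	return regexp.MustCompile("^(?:" + strings.Join(parts, "|") + ")$")
}

// action returns the action encoded in the name of the group that matched.
func action(re *regexp.Regexp, uri string) (string, bool) {
	m := re.FindStringSubmatchIndex(uri)
	if m == nil {
		return "", false
	}
	for i, name := range re.SubexpNames() {
		// m[2*i] is -1 for groups that did not take part in the match.
		if name != "" && m[2*i] >= 0 {
			return name[:strings.LastIndex(name, "_")], true
		}
	}
	return "", false
}

func main() {
	re := compile([]PageRule{
		{Pattern: "*/wp-admin/index.php", Action: "block"},
		{Pattern: "*/xmlrpc.php", Action: "block"},
	})
	fmt.Println(re.String())
	fmt.Println(action(re, "/blog/xmlrpc.php"))
}
```

A single pass over the URI answers both "did any rule match?" and "which action applies?", which is why the approach held up for single-field matching.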

This works surprisingly well as regular expression matching can be simple and fast especially when the regular expression matches against a single value like the URI, but as soon as you want to match the URI plus an IP range it becomes less obvious how to extend this.

This is what we had: a set of features that worked really well provided you wanted to match a single property of a request. The implementation also meant that none of these features could be trivially extended to embrace multiple properties at a time. We needed something else, a fast way to compute if a request matches a rule that could contain multiple properties as well as pattern matching.

A solution that works now and in the future

Over time Cloudflare engineers authored internal posts exploring how a new matching engine might work. The first thing that occurred to every engineer was that the matching must be an expression. These ideas followed a similar approach, in which we would construct an expression within JSON as a DSL (Domain Specific Language) for our expression language. This DSL could describe matching a request, a UI could render it, and a backend could process it.

Early proposals looked like this:

{
  "And": [
    {
      "Equals": {
        "host": "www.example.com"
      }
    },
    {
      "Or": [
        {
          "Regex": {
            "path": "^(?:.*/wp-admin/index.php)$"
          }
        },
        {
          "Regex": {
            "path": "^(?:.*/xmlrpc.php)$"
          }
        }
      ]
    }
  ]
}

The JSON describes an expression that computers can easily turn into a rule to apply, but people find this hard to read and work with.
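To show how little work the computer has to do, here is a sketch in Go of a recursive evaluator for a well-formed version of that proposal. This is an illustration only, not the engine Cloudflare built: the Eval function, the matches helper, and the convention that every node is an object with exactly one operator key are assumptions.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"regexp"
)

// proposal is the JSON expression from above, written as valid JSON.
const proposal = `{"And": [
  {"Equals": {"host": "www.example.com"}},
  {"Or": [
    {"Regex": {"path": "^(?:.*/wp-admin/index.php)$"}},
    {"Regex": {"path": "^(?:.*/xmlrpc.php)$"}}
  ]}
]}`

// Eval evaluates one node of the expression against the request fields.
func Eval(raw []byte, req map[string]string) (bool, error) {
	var node map[string]json.RawMessage
	if err := json.Unmarshal(raw, &node); err != nil {
		return false, err
	}
	if len(node) != 1 {
		return false, errors.New("each node must have exactly one operator")
	}
	for op, operand := range node {
		switch op {
		case "And", "Or":
			var children []json.RawMessage
			if err := json.Unmarshal(operand, &children); err != nil {
				return false, err
			}
			for _, child := range children {
				ok, err := Eval(child, req)
				if err != nil {
					return false, err
				}
				// Short-circuit: And stops on false, Or stops on true.
				if ok != (op == "And") {
					return ok, nil
				}
			}
			return op == "And", nil
		case "Equals", "Regex":
			var fields map[string]string
			if err := json.Unmarshal(operand, &fields); err != nil {
				return false, err
			}
			for field, want := range fields {
				if op == "Equals" {
					if req[field] != want {
						return false, nil
					}
					continue
				}
				matched, err := regexp.MatchString(want, req[field])
				if err != nil || !matched {
					return false, err
				}
			}
			return true, nil
		default:
			return false, fmt.Errorf("unknown operator %q", op)
		}
	}
	return false, nil // not reached: the node has exactly one key
}

// matches evaluates the proposal for a given host and path.
func matches(host, path string) bool {
	ok, err := Eval([]byte(proposal), map[string]string{"host": host, "path": path})
	return err == nil && ok
}

func main() {
	fmt.Println(matches("www.example.com", "/xmlrpc.php"))
	fmt.Println(matches("www.example.com", "/about"))
}
```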

As we did not wish to display JSON like this in our dashboard we thought about how we might summarise it for a UI:

if request host equals www.example.com and (request path matches ^(?:.*/wp-admin/index.php)$ or request path matches ^(?:.*/xmlrpc.php)$)

And there came an epiphany. As working engineers we had seen an expression language similar to this before, so may I introduce to you our old friend Wireshark®.

Wireshark is a network protocol analyzer. To use it you must run a packet capture to record network traffic from a capture device (usually a network card). This is then saved to disk as a .pcap file which you subsequently open in the Wireshark GUI. The Wireshark GUI has a display filter entry box, and when you fill in a display filter the GUI will dissect the saved packet capture such that it will determine which packets match the expression and then show those in the GUI.

But we don’t need to do that. In fact, for our scenario that approach does not work as we have a firewall and need to make decisions in real-time as part of the HTTP request handling rather than via the packet capture process.

For Cloudflare, we would want to use something like the expression language that is the Wireshark Display Filters but without the capture and dissection as we would want to do this potentially thousands of times per request without noticeable delay.

If we were able to use a Wireshark-style expression language then we could reduce the JSON encapsulated expression above to:

http.host eq "www.example.com" and (http.request.path ~ "wp-admin/index\.php" or http.request.path ~ "xmlrpc\.php")

This is human readable, machine parseable, succinct.

It also benefits from being highly similar to Wireshark. For security engineers used to working with Wireshark when investigating attacks it offers a degree of portability from an investigation tool to a mitigation engine.

To make this work we would need to collect the properties of the request into a simple data structure to match the expressions against. Unlike the packet capture approach we run our firewall within the context of an HTTP server and the web server has already computed the request properties, so we can avoid dissection and populate the fields from the web server knowledge:

Field	Value
http.cookie	`session=8521F670545D7865F79C3D7BEDC29CCE;-background=light`
http.host	`www.example.com`
http.referer
http.request.method	`GET`
http.request.uri	`/articles/index?section=539061&expand=comments`
http.request.uri.path	`/articles/index`
http.request.uri.query	`section=539061&expand=comments`
http.user_agent	`Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36`
http.x_forwarded_for
ip.src	`203.0.113.1`
ip.geoip.asnum	`64496`
ip.geoip.country	`GB`
ssl	`true`

With a table of HTTP request properties and an expression language that can provide a matching expression we were 90% of the way towards a solution! All we needed for the last 90% was the matching engine itself that would provide us with an answer to the question: Does this request match one of the expressions?

Enter wirefilter.

Wirefilter is the name of the Rust library that Cloudflare has created, and it provides:

The ability for Cloudflare to define a set of fields of types, i.e. ip.src is a field of type IPAddress
The ability to define a table of properties from all of the fields that are defined
The ability to parse an expression and to say whether it is syntactically valid, whether the fields in the expression are valid against the fields defined, and whether the operators used for a field are valid for the type of the field
The ability to apply an expression to a table and return a true|false response indicating whether the evaluated expression matches the request
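The third capability, checking that an operator makes sense for a field's type, can be pictured with a few lines of Go. This is a toy model rather than wirefilter's API: the Type constants, the operators table and the Check function are invented for the illustration, and the operator lists are a guess at a plausible subset.

```go
package main

import "fmt"

// Type is the type of a field in the scheme.
type Type int

const (
	TypeString Type = iota
	TypeIP
	TypeNumber
	TypeBool
)

// scheme lists a few of the fields an expression may reference.
var scheme = map[string]Type{
	"http.host":      TypeString,
	"ip.src":         TypeIP,
	"ip.geoip.asnum": TypeNumber,
	"ssl":            TypeBool,
}

// operators lists which comparison operators each type accepts.
// The lists are hypothetical and only serve the example.
var operators = map[Type][]string{
	TypeString: {"eq", "ne", "contains", "matches"},
	TypeIP:     {"eq", "ne", "in"},
	TypeNumber: {"eq", "ne", "lt", "le", "gt", "ge"},
	TypeBool:   nil, // used bare in this sketch, so no comparison operators
}

// Check reports an error if the field is unknown or the operator is not
// valid for the field's type.
func Check(field, op string) error {
	t, ok := scheme[field]
	if !ok {
		return fmt.Errorf("unknown field %q", field)
	}
	for _, allowed := range operators[t] {
		if allowed == op {
			return nil
		}
	}
	return fmt.Errorf("operator %q is not valid for field %q", op, field)
}

func main() {
	fmt.Println(Check("ip.src", "eq"))
	fmt.Println(Check("ip.src", "matches"))
	fmt.Println(Check("http.hots", "eq"))
}
```

Rejecting an expression at this stage means a typo in a field name is reported to the person writing the rule instead of silently never matching.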

It is named wirefilter as a hat tip towards Wireshark for inspiring our Wireshark-like expression language and also because in our context of the Cloudflare Firewall these expressions act as a filter over traffic.

The implementation of wirefilter allows us to embed this matching engine within our REST API which is written in Go:

// scheme stores the list of fields and their types that an expression can use
var scheme = filterexpr.Scheme{
	"http.cookie":                     filterexpr.TypeString,
	"http.host":                       filterexpr.TypeString,
	"http.referer":                    filterexpr.TypeString,
	"http.request.full_uri":           filterexpr.TypeString,
	"http.request.method":             filterexpr.TypeString,
	"http.request.uri":                filterexpr.TypeString,
	"http.request.uri.path":           filterexpr.TypeString,
	"http.request.uri.query":          filterexpr.TypeString,
	"http.user_agent":                 filterexpr.TypeString,
	"http.x_forwarded_for":            filterexpr.TypeString,
	"ip.src":                          filterexpr.TypeIP,
	"ip.geoip.asnum":                  filterexpr.TypeNumber,
	"ip.geoip.country":                filterexpr.TypeString,
	"ssl":                             filterexpr.TypeBool,
}

Later we validate expressions provided to the API:

// expression here is a string that may look like:
// `ip.src eq 203.0.113.1`
expressionHash, err := filterexpr.ValidateFilter(scheme, expression)
if fve, ok := err.(*filterexpr.ValidationError); ok {
	validationErrs = append(validationErrs, fve.Ascii)
} else if err != nil {
	return nil, stderrors.Errorf("failed to validate filter: %v", err)
}

This tells us whether the expression is syntactically correct and also whether the field operators and values match the field type. If the expression is valid then we can use the returned hash to determine uniqueness (the hash is generated inside wirefilter so that uniqueness can ignore blanks and minor differences).

The expressions are then published to our global network of PoPs and are consumed by Lua within our web proxy. The web proxy has the same list of fields that the API does, and is now responsible for building the table of properties from the context within the web proxy:

-- The `traits` table defines the mapping between the fields and
-- the corresponding values from the nginx evaluation context.
local traits = {
   ['http.host'] = field.str(function(ctx) return ctx.host end),
   ['http.cookie'] = field.str(function(ctx)
      local value = ctx.req_headers.cookie or ''
      if type(value) == 'table' then
         value = table.concat(value, ";")
      end
      return value
   end),
   ['http.referer'] = field.str(function(ctx) return ctx.req_headers.referer or '' end),
   ['http.request.method'] = field.str(function(ctx) return ctx.method end),
   ['http.request.uri'] = field.str(function(ctx)
      return ctx.rewrite_uri or ctx.request_uri
   end),
   ['http.request.uri.path'] = field.str(function(ctx)
      return ctx.uri or '/'
   end),
   ...

With this per-request table describing a request we can now test the filters. In our case what we’re doing here is:

Fetch a list of all the expressions we would like to match against the request
Check whether any expression, when applied via wirefilter to the table above, returns true as having matched
For all matched expressions check the associated actions and their priority

The actions are not part of the matching itself. Once we have a list of matched expressions we determine which action takes precedence and that is the one that we will execute.

Wirefilter, then, is a generic library that provides this matching capability that we’ve plugged into our Go APIs and our Lua web proxy, and we use that to power the Cloudflare Firewall.

We chose Rust for wirefilter because early in the project we recognised that if we attempted to implement this separately in Go and Lua, it would result in inconsistencies that attackers may be able to exploit. We needed our API and edge proxy to behave exactly the same. For this we needed a library that both could call. We could have chosen one of our existing languages at the edge, like C, C++, Go or Lua, or even implemented this not as a library but as a worker in JavaScript. With a mixed set of requirements covering performance, memory safety, low memory use, and the capability to be part of other products that we’re working on like Spectrum, Rust stood out as the strongest option.

With a library in place and the ability to now match all HTTP traffic, how to get that to a public API and UI without diluting the capability? The problems that arose related to specificity and mutual exclusion.

In the past all of our firewall rules had a single dimension to them, e.g. act on IP addresses. This meant that we had a single property of a single type, and whilst there were occasionally edge cases, for the most part there were strategies to answer the question “Which is the most specific rule?”. For example, an IP address is more specific than a /24, which is more specific than a /8. Likewise with URI matching, an overly simplistic strategy is that the longer a URI, the more specific it is. And if we had 2 IP rules, then only 1 could ever have matched, as a request does not come from 2 IPs at once, so mutual exclusion is in effect.

The old system meant that given 2 rules, we could implicitly and trivially say “this rule is most specific so use the action associated with this rule”.
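For IP-based rules, "most specific" can be computed as a longest-prefix match. A minimal sketch in Go follows, with the IPRule type, the sample rules and the helper names made up for the example:

```go
package main

import (
	"fmt"
	"net/netip"
)

// IPRule pairs an address range with the action to take.
type IPRule struct {
	Prefix netip.Prefix
	Action string
}

// mostSpecific returns the matching rule with the longest prefix, so a
// single address beats a /24, which in turn beats a /8.
func mostSpecific(rules []IPRule, client netip.Addr) (IPRule, bool) {
	var best IPRule
	found := false
	for _, r := range rules {
		if !r.Prefix.Contains(client) {
			continue
		}
		if !found || r.Prefix.Bits() > best.Prefix.Bits() {
			best, found = r, true
		}
	}
	return best, found
}

// rules is sample data for the demonstration.
var rules = []IPRule{
	{Prefix: netip.MustParsePrefix("203.0.0.0/8"), Action: "challenge"},
	{Prefix: netip.MustParsePrefix("203.0.113.0/24"), Action: "block"},
	{Prefix: netip.MustParsePrefix("203.0.113.1/32"), Action: "allow"},
}

// actionFor returns the action of the most specific rule, or "" if none match.
func actionFor(ip string) string {
	r, ok := mostSpecific(rules, netip.MustParseAddr(ip))
	if !ok {
		return ""
	}
	return r.Action
}

func main() {
	for _, ip := range []string{"203.0.113.1", "203.0.113.9", "203.9.9.9", "198.51.100.1"} {
		fmt.Printf("%s -> %q\n", ip, actionFor(ip))
	}
}
```

The comparison works only because every rule has the same single dimension; there is no equivalent ordering between a prefix length and, say, a URI length.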

With wirefilter powering Firewall Rules, it isn’t obvious whether an IP address is more or less specific than a URI. It gets even more complex when a rule can have negation: a rule that mentions only a single IP looks more specific than one that matches a /8, yet if it is negated it matches the whole address space except that IP. This is one of the gotchas of Firewall Rules, and it is also a source of its power: you can invert your firewall into a positive security model.

As we couldn’t answer specificity using the expression alone, we needed another aspect of the Firewall Rule to provide us this guidance and we realised that customers already had a mechanism to tell us which rules were important… the action.

Given a set of rules, we have logically ordered them according to their action (Log has highest priority, Block has lowest):

Log
Allow
Challenge (CAPTCHA)
JavaScript Challenge
Block

For the vast majority of scenarios this proves to be good enough.

What about when that isn’t good enough though? Do we have examples of complex configuration that break that approach? Yes!

Because the expression language within Firewall Rules is so powerful, and we can support many Firewall Rules, it means that we can now create different firewall configurations for different parts of a web site, e.g. /blog could have wholly different rules than /shop, or for different audiences, e.g. visitors from your office IPs might be allowed on a given URI but everyone else trying to access that URI may be blocked.

In this scenario you need the ability to say “run all of these rules first, and then run the other rules”.

In single machine firewalls like iptables, OS X Firewall, or your home router firewall, the firewall rules were explicitly ordered so that when you match the first rule it terminates execution and you won’t hit the next rule. When you add a new rule the entire set of rules is republished and this helps to guarantee this behaviour. But this approach does not work well for a Cloud Firewall as a large website with many web applications typically also has a large number of firewall rules. Republishing all of these rules in a single transaction can be slow and if you are adding lots of rules quickly this can lead to delays to the final state being live.

If we published individual rules and supported explicit ordering, we risked race conditions where two rules that were both configured in position 4 might exist at the same time, and the behaviour if they both matched the request would be non-deterministic.

We solved this by introducing a priority value, where 1 is the highest priority and as an int32 you can create low priority rules all the way down to priority = 2147483647. Not providing a priority value is the equivalent of “lowest” and runs after all rules that have a priority.

Priority does not have to be a unique value within Firewall Rules. If two rules are of equal priority then we resort to the order of precedence of the actions as defined earlier.
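Put together, the ordering of matched rules can be sketched in Go as a two-level sort: priority first, then action precedence as the tie-break. The Rule type, the action labels and the convention that a zero Priority means "not set" are assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// precedence ranks actions as listed earlier: Log first, Block last.
var precedence = map[string]int{
	"log": 0, "allow": 1, "challenge": 2, "js_challenge": 3, "block": 4,
}

// Rule is a matched Firewall Rule. Priority 0 means no priority was set.
type Rule struct {
	ID       string
	Priority int32
	Action   string
}

// effective maps an unset priority to a value just past the largest int32,
// so that such rules run after every rule that has a priority.
func effective(r Rule) int64 {
	if r.Priority == 0 {
		return math.MaxInt32 + 1
	}
	return int64(r.Priority)
}

// order returns the matched rules in evaluation order: lowest priority
// number first, ties broken by action precedence.
func order(matched []Rule) []Rule {
	out := append([]Rule(nil), matched...)
	sort.SliceStable(out, func(i, j int) bool {
		pi, pj := effective(out[i]), effective(out[j])
		if pi != pj {
			return pi < pj
		}
		return precedence[out[i].Action] < precedence[out[j].Action]
	})
	return out
}

func main() {
	matched := []Rule{
		{ID: "a", Action: "log"},
		{ID: "b", Priority: 5000, Action: "block"},
		{ID: "c", Priority: 5000, Action: "allow"},
		{ID: "d", Priority: 1, Action: "block"},
	}
	for _, r := range order(matched) {
		fmt.Println(r.ID, r.Priority, r.Action)
	}
}
```

Because the comparison never needs priorities to be unique, a new rule can be published on its own without renumbering the rules that already exist.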

This provides us a few benefits:

Because priority allows rules that share a priority to exist we can publish rules 1 at a time… when you add a new rule the speed at which we deploy that globally is not affected by the number of rules you already have.
If you do have existing rules in a system that does sequentially order the rules, you can import those into Firewall Rules and preserve their order, i.e. this rule should always run before that rule.
But you don’t have to use priority exclusively for ordering as you can also use priority for grouping. For example you may say that all spammers are priority=10000 and all trolls are priority = 5000.

Finally… let’s look at those fields again: http.request.path. Notice that http prefix? By following the naming convention Wireshark has (see their Display Filter Reference) we have not limited this firewall capability solely to an HTTP web proxy. It is a small leap to imagine that if a Spectrum application declares itself as running SMTP that we could also define fields that understand SMTP and allow filtering of traffic on other application protocols or even at layer 4.

What we have built in Firewall Rules gives us these features today:

Rich expression language capable of targeting traffic precisely and in real-time
Fast global deployment of individual rules
A lot of control over the management and organisation of Firewall Rules

And in the future, we have a product that can go beyond HTTP and be a true Cloud Firewall for all protocols…the Cloudflare Firewall with Firewall Rules.

New Firewall Tab and Analytics

Alex Cruz Farmer — Fri, 01 Mar 2019 10:00:00 GMT

At Cloudflare, one of our top priorities is to make our products and services intuitive so that we can enable customers to accelerate and protect their Internet properties. We're excited to launch two improvements designed to make our Firewall easier to use and more accessible, and to help our customers better manage and visualize their threat-related data.

New Firewall Tabs for ease of access

We have reorganised our features into meaningful pages: Events, Firewall Rules, Managed Rules, Tools, and Settings. Our customers will see an Overview tab, which contains our new Firewall Analytics, detailed below.

All the features you know and love are still available, and can be found in one of the four new tabs. Here is a breakdown of their new locations.

Feature	New Location
Firewall Event Log	Events (Overview for Enterprise only)
Firewall Rules	Firewall Rules
Web Application Firewall	Managed Rules
IP Access Rules (IP Firewall)	Tools
Rate Limiting	Tools
User Agent Blocking	Tools
Zone Lockdown	Tools
Browser Integrity Check	Settings
Challenge Passage	Settings
Privacy Pass	Settings
Security Level	Settings

If the new sub navigation has not appeared, you may need to re-login to the dashboard or clear your browser’s cookies.

New Firewall Analytics for analysing events and maintaining optimal configurations

Insights into security events are critical for monitoring the health of your web applications. Furthermore, distinguishing actual threats from false positives is essential for maintaining an optimal security configuration. Today, we are very pleased to announce our new Firewall Analytics, which will help our Enterprise customers get detailed insights into firewall events, helping them to tailor their security configurations more effectively.

Our new Firewall Analytics now enables our Enterprise customers to:

visualise and analyse Firewall Events in one place to better understand their threat landscape
identify, mitigate, and review attacks more effectively

After speaking with many of our customers, we learned a lot about their processes to identify and analyse attacks and the kinds of insights they needed to improve these processes. We then translated these learnings into useful features and charts that would help answer some of the most common questions such as ‘What kinds of security events occurred in a certain time frame?’ and ‘What caused a spike in a certain type of security event?’.

Firewall Analytics and Firewall Configuration can be found together in the Firewall tab. A tight feedback loop between Firewall configuration and the resulting events allows for rapid iteration, ideal for security-focused teams.

To best demonstrate the power of Firewall Analytics, here’s a workflow that would answer a popular question our customers ask: “Why did I have a spike in threats?”. In the screenshot below, we can see a set of activity which triggered a number of ‘Block’ events:

To minimize the possibility of polluting our TopN statistics with event types other than ‘Block’ and get the most accurate diagnostic information, we will need to filter down to just ‘Block’ actions.

Now that only Block events are displayed, checking the Service Breakdowns will help us to identify which of our Firewall features was triggered.

From the Events breakdown, we can see that the Block events were triggered by a Country Block configured within Access Rules. Digging deeper and looking at our TopN breakdowns, we start to get a much more granular understanding of which Networks, IPs, User-Agents, Paths etc, were targeted.

From here, we can see that two specific IP addresses were targeting the “/” path of the application.

To get the most detailed information, we can drill down further in the refreshed Firewall Event log, now controlled inline.

Whilst these TopNs and filters are great for clearly identifiable threats, they can also help identify false positives. Using the power of Cloudflare’s filters, it is possible to add a user-defined filter, which can be a RayID, User-Agent or IP address.

This is just one example of how the new Firewall Analytics can help expedite the process of identifying and mitigating threats. Firewall Analytics is now live for all Enterprise customers. Let us know your feedback by reaching out to your Enterprise Account Team.