Created on November 12, 2023 at 11:39 am

Suppose, hypothetically, that you have some DNS servers that are exposed to the Internet behind an OpenBSD PF-based firewall. Since you’re a sensible person, you have various rate limits set in your DNS servers to prevent or at least mitigate various forms of denial of service attacks. One day, your DNS servers become extremely popular for whatever reason, your rate limits kick in, and your firewall abruptly stops allowing new connections in or out. What on earth happened?

The answer is that you ran out of room in the PF state table. OpenBSD PF mostly works through state table entries, and when a rule that normally would create a new state table entry is unable to do so, the packet is dropped. This is somewhat documented in places like the max option for stateful rules:

Limits the number of concurrent states the rule may create. When this limit is reached, further packets that would create state are dropped until existing states time out.

(That this is more or less explicitly documented is better than it once was.)

One of the reasons that you can run out of state table entries despite your DNS servers dutifully rate-limiting their responses is that DNS is primarily UDP based, so PF doesn’t really know whether a given UDP ‘connection’ is ‘closed’ and should have its state table entries cleaned up more aggressively. Instead, all PF does for UDP is guess timeouts based on packet counts, and those packet counts are for each unique set of source IP, source port, destination IP, and destination port. If your DNS query sources vary their source port for each query, this can add up fast.

(As we’ve seen, even TCP connections can linger in the state table for some time after they’re closed.)

The current OpenBSD 7.3 manual page for pf.conf says that the default maximum size of the state table is only 100,000 entries, which is often effectively 50,000 ‘connections’ (it’s not uncommon for each connection to create two state table entries). It doesn’t take a huge amount of bandwidth or a huge packets per second rate to exhaust that many state table entries, and it mostly doesn’t matter whether or not your DNS servers actually respond to the queries.
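To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes that every new query flow (a unique source IP and source port) creates a fixed number of state entries and that each entry lives for a full 60 second timeout after a single packet. The function name and the steady-state model are mine, purely for illustration, not anything PF itself computes.

```python
def max_sustained_flow_rate(state_limit, entries_per_flow, timeout_seconds):
    """New flows per second at which the state table sits at its limit.

    In steady state the table holds roughly
        rate * timeout_seconds * entries_per_flow
    entries, so it is full once the rate reaches
        state_limit / (entries_per_flow * timeout_seconds).
    """
    return state_limit / (entries_per_flow * timeout_seconds)


# Default limit of 100,000 entries, two entries per 'connection',
# and an assumed 60 second lifetime for each entry.
rate = max_sustained_flow_rate(100_000, 2, 60)
print(f"{rate:.0f} new flows per second fills the table")
```

Under these assumptions the table fills at a bit over 800 new flows per second. If you further assume queries of around 100 bytes each, that is well under a megabit per second of traffic.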

That may sound odd so let’s cover it explicitly. PF has three states for UDP traffic; ‘first’ if the source has only sent one packet, ‘multiple’ if both ends have sent packets, ie your DNS server responded, and ‘single’ if the source has sent multiple packets (with the same source port) without a response, ie your DNS server is dropping their queries and they’re retrying. The first two states default to 60 second timeouts and the third defaults to a 30 second timeout, and that’s after packets stop flowing. A DNS query source that keeps re-sending its query every fifteen seconds (with the same source port) will keep even a ‘single’ state entry alive forever.
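The way retries keep an entry alive can be seen in a small toy model, sketched here in Python. It is a simplification and not how PF is implemented: it uses a single fixed timeout rather than switching between the ‘first’ and ‘single’ timeouts, and it assumes that every matching packet arriving before expiry resets the timer. The function name is hypothetical.

```python
def state_expiry_time(packet_times, timeout):
    """Return when a state entry expires, given the times (in seconds)
    at which packets matching it arrive.

    Each packet that arrives before the entry has expired pushes the
    expiry out to (arrival time + timeout).
    """
    expires = packet_times[0] + timeout
    for t in packet_times[1:]:
        if t >= expires:
            break  # entry is already gone; this packet would start a new one
        expires = t + timeout
    return expires


# A single query that is never repeated: the entry goes away after 30s.
print(state_expiry_time([0], 30))

# The same query re-sent every 15 seconds for ten minutes: the entry
# only expires 30 seconds after the last retry.
retries = list(range(0, 601, 15))
print(state_expiry_time(retries, 30))
```

In this model the entry lasts for as long as the retries do, plus one timeout. Only retries spaced further apart than the timeout let it expire.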

As far as I can see, the only really good way to limit states created by UDP traffic is to set a max option on the rules involved. Often this will cover only half of the states created by this traffic (for reasons covered in my entry on state table entries). You can try to limit the number of source IPs and states per IP that can be created (and do so across relevant rules), but it’s hard to come up with sensible numbers for both that won’t block legitimate traffic while also not letting people blow out your state table.

(I assume without checking that you can set all of max, max-src-nodes, and max-src-states, and then have the total number of state entries limited by max instead of the product of the latter two. This could be useful if you want some per-IP firewall limits in addition to the total state limit, perhaps to ensure that one or a few IPs can’t eat up all of the total allowed states.)

All of this is surprising if you’re thinking of rate limiting and denial of service issues from the normal perspective of services on your hosts (such as DNS servers, or even web servers). In the host services world, if you reject or drop traffic through rate limiting, you’re done with the traffic and you don’t need to worry further (okay, yes, SYN cookies for TCP connection attempt traffic floods, but most things do that automatically today). But your OpenBSD PF firewall is still keeping state for that traffic your host rate-limited or dropped, and that state can (and will) add up, especially for UDP traffic.
