pf sanity check

So I think I got my OpenBSD firewall woes worked out, but I could use a sanity check... What I have seems to be working, but I'd like some confirmation that my assumptions are true.

<LJ-CUT text=" Read on if you understand this crap. --More--( 6%) ">


The setup:

  • OpenBSD 3.5 firewall PC with multiple interfaces;
  • Multiple 100baseT internal networks;
  • One uplink network to the T1.


The goals:

  • Don't let any attempted bandwidth-hogging activity between any internal network and the T1 affect the stability of the webcast uplinks;
  • Beyond that, pretty much first-come, first-served.

How I did this (which may or may not be the only way):

  • From reading the documentation, I had been under the impression that if we did "pass in" with "keep state" to allow some packet to enter the firewall, then that packet could also automatically exit the firewall (pass through it on its way to the destination.)

    Everything I've read made it sound like once you did "keep state" to establish a state table entry, the state table took precedence over any rules: when there's a state table match, the rules aren't consulted at all.

    This doesn't seem to be the case. It seems that two rules must fire for each session: one to allow the first packet to enter the firewall on one interface (and establish a state table entry); and then a second rule to allow that same packet to exit the firewall on a different interface (and establish a second state table entry.) After that, subsequent packets are processed by the state table, and rules are never consulted for the rest of the session. But that first packet needs two rules.

    So, to ever make anything work, there needs to be a default "pass out" rule (either that, or two rules for every "session", which would be just ridiculously verbose.)

  • Secondly, it would seem that it is the "pass out" rules that must assign packets to queues. This is because my queues are intended to throttle packets that are going from the internal networks up the T1; this means they must be defined on the T1's network interface.

    So the way to accomplish this is, the "pass in" rules (wherein the knowledge of what connections we allow and what they are for is encoded) tag the packets with the name of the queue they are destined for. Then, the following "pass out" rules assign the packets to queues based on how they were tagged on the way in.

    This seems needlessly complicated to me: I don't understand why I can't assign the packets to the queues on the inbound packets (and have that be automatically carried over when the packets are outbound) but that seems to be how it works. (Right?)


    The queues:

    • mp3_queue -- Icecast uplink. Needs 128Kbps+.
    • real_queue -- RealProducer/RealVideo uplink. Needs 650Kbps+.
    • std_queue -- The default queue. Mostly this will be packets originating on office and DMZ hosts.
    • guest_queue -- All packets originating on kiosk and wireless networks. These are lowest priority.

    All queues have a minimum bandwidth guaranteed to them, and all are allowed to use more bandwidth if it is available/unused. Aside from that, they all have the same "priority", meaning that within their bandwidth limits, all packets are handled first-come, first-served.

      altq on $ext_if cbq bandwidth 1.4Mb \
      queue { mp3_queue, real_queue, guest_queue, std_queue }
      queue mp3_queue bandwidth 160Kb cbq (borrow)
      queue real_queue bandwidth 700Kb cbq (borrow)
      queue std_queue bandwidth 200Kb cbq (borrow default)
      queue guest_queue bandwidth 200Kb cbq (borrow)


      block in log all # default deny

      pass out all keep state # goes to std_queue by default
      pass out on $ext_if tagged mp3_queue queue mp3_queue keep state
      pass out on $ext_if tagged real_queue queue real_queue keep state
      pass out on $ext_if tagged std_queue queue std_queue keep state
      pass out on $ext_if tagged guest_queue queue guest_queue keep state


      # icecast/mp3 uplink (tcp incoming)
      pass in proto tcp \
      from $external_icecast \
      to $internal_icecast port = 8000 \
      flags S/SA keep state \
      tag mp3_queue

      # realvideo uplink (tcp outgoing; udp both ways)
      pass in proto tcp \
      from $internal_real \
      to $external_real port { 554, 4040 } \
      flags S/SA keep state \
      tag real_queue

      pass in proto udp \
      from { $internal_real, $external_real } \
      to { $internal_real, $external_real } \
      keep state \
      tag real_queue


    Does that all make sense? Or could it be simplified?

Current Music: Screamin

14 Responses:

  1. wire_on_fire says:

    I think most of your patrons are *well* acquainted with the "pass out" rule. ;)

    (I'd say something insightful, but I have yet to screw with those features on my OpenBSD+pf firewall)

  2. Okay, a caveat: I'm working from Darren Reed's IPF, not Theo de Raadt's pf. I understand the latter was designed to be mostly-a-lot-like the former. So I'm going to try to stick to broad-ish terms here.

    First, keep state. Yes, it would be nice if it worked the way that you believed that it should, but what you understand to be true is true. Note that most real people are totally okay with passing just about anything out (with the exception that you should not pass out the RFC-1918 networks that aren't supposed to be routable and might want to avoid passing out certain ICMP types). No, it has not escaped my notice that this is a clear case of it being very easy for the software to do what you meant, not what you said. Neither of us wants to have that argument with Theo (or Darren either, really).

    This tagging hoo-ha does seem needlessly complex (and might be a pf-specific concept, so I'm probably on thin ice here). Couldn't you do:

    pass in proto tcp \
    from $external_icecast \
    to $internal_icecast port = 8000 \
    flags S/SA keep state


    pass out from $icecast_if to $ext_if   queue mp3_queue   keep state

    Or is that exactly what you thought should work but didn't? Oh, wait, I get it: the Icecast and Real servers are in the same IP network. That is, what I'm suggesting above only works if the Icecast server comes in one Ethernet interface, the Real server in another, the default network in a third, and the guest network in a fourth, with the T1 on still another. (I actually assumed you were doing it my way. I'm spoiled by my 4-port Starfire cards.)

    The problem you're hitting is that you want to make decisions that pf does at the Ethernet layer (queues) based on information at the IP layer (source/dest IP addresses). You must explicitly transfer the information about the source IP network for that to make sense. Thus the tags. This complexity can be better hidden from you, but it can't not exist, because of the way you rightly ken keep state to behave. (I actually prefer this way, since I don't trust something that hides a transfer of information like this not to do it in the least efficient way possible. These "tags" will probably be tossed into the extra bits available in the headers on the Ethernet frames, which burns a bit of bandwidth, but is really the best way to do the job. That could all be verified in the code, of course.)

    Another way to do this is more like what Cisco does internally, where you've got virtual networks, which can exist at the IP layer for them. Then you make mp3_queue, real_queue, std_queue, and guest_queue each be a virtual network, bandwidth limit them inside your shiny Cisco hardware, and make all their default routes be the T1 (with longer-match routes for the other three networks, should you so desire). We're talking about network hardware that costs more than a new Kia here, if you actually want to use real bandwidth.

    Cheer up: there's a whole bunch of hidden complexity in this queue thing, which looks pretty hot, if it works as advertised. (Does it?) People (well, corporations, anyway) pay a lot of money for hardware devices that do "bandwidth shaping".

    Does your system work for getting the mp3 and real streams from within the real or guest networks? (It should.) Does it limit bandwidth there, or can a jackass with wireless in your bar bring your OpenBSD box, through which all those packets are being routed, to its knees demanding multiple mp3 streams? (Oh, dear... well, the server'd probably hit its outgoing limit on number of streams first.)

    • jwz says:

      Couldn't you

        pass in proto tcp from $external_icecast to $internal_icecast port = 8000 flags S/SA keep state


        pass out from $icecast_if to $ext_if queue mp3_queue keep state

      Or is that exactly what you thought should work but didn't?

      No, that works, but it doubles the number of rules. Let's say I have 100 rules right now, for the 100 distinct "A-to-B-port-X" paths I allow through the firewall. Doing it that way, now I have 200 (instead of 101), and I have to edit every change on two lines instead of one. That's broken.

      The problem you're hitting is that you want to make decisions that pf does at the Ethernet layer (queues) based on information at the IP layer (source/dest IP addresses). You must explicitly transfer the information about the source IP network for that to make sense.

      But that's exactly what "keep state" does: it is exactly for transferring information between layers. I can see no sensible reason that someone would use "keep state" and not expect it to automatically imply that the packet could actually, you know, get through. The current situation is that "keep state" lets packets 2-N through just fine, but you need to do a second little dance to make the second half of packet 1 get through. That's dumb.

      can a jackass with wireless in your bar bring your OpenBSD box, through which all those packets are being routed, to its knees demanding multiple mp3 streams?

      I have not tested this, but if they can, there's not a hell of a lot I could do about it anyway. "Never test for an error condition you don't know how to fix."

      • No, that works, but it doubles the number of rules.

        It's getting pretty late now, but I think you're misunderstanding what I was suggesting. My suggestion was a modification of what you were doing: basing the casting into queues on the ethernet interface of the firewall box on which a packet came in, rather than on the source IP address (or range). This shouldn't change the number of rules; it just removes the tags. But it also requires as many interfaces on the inside as there are queues. (Which is why I backed away from it.)

        But that's exactly what "keep state" does: it is exactly for transferring information between layers.

        I'm sorry, but that's simply incorrect. I agree with a world in which that is what it meant, but what keep state actually means (to IPF and also to pf, from what I've read here and elsewhere) is to maintain TCP/IP packets with consecutive sequence numbers as being part of the same two-way stream. It doesn't do anything with UDP packets, it doesn't do anything with Ethernet datagrams. pf's queues (and all bandwidth shaping) only do things with Ethernet datagrams. So, within pf, you really do have to communicate that fact down a layer on the ol' 4 (5, but who's actually counting) layer Internet stack, which they're doing with these tags.

        • jwz says:

          Yeah, I might be able to get away with dividing up queues by interface, but it wouldn't be exactly the same division of labor as I have now. It might be close enough, though. (That would mean that MP3 and Real would be in the same queue, along with a few other random hosts, but that's probably ok.)

          "keep state" does, in fact, work on UDP pseudo-connections, in both ipf and pf. (Maybe on ICMP too?) But I get your point.

          • Well, if what you have now actually works, it's probably not worth changing anything, especially if it means you get less useful manipulation of bandwidth.

            You're completely right on keep state (I guess I've just never cared about a connection-oriented UDP-based protocol... never mind ICMP):

                   state  keeps information about the flow of a communication
                   session.  State can be kept for TCP, UDP, and ICMP.

            (That's from IPF.) The real trouble is whether you're handling an IP packet or an Ethernet datagram, and what your packet filter knows how to do with it at each layer of disencapsulation.

  3. edm says:

    Yes, the pf (and ipf) filtering policy is like the Cisco filtering policy, in that you filter traffic both into and out of the firewall. (Unlike linux which has the -- useful -- notion of "forward" for things that are just passing through, and hence you can do it all in one rule safe in the knowledge that no traffic into the host (firewall) will be affected.)

    It's useful to observe that "pass in all/pass out SELECTIVE" and "pass in SELECTIVE/pass out all" are logically equivalent in the "traffic passing through" situation (ie, the linux "forward" situation), provided you're not trying to, eg, send reject packets back; they differ only for traffic that is going to the firewall itself (plus a little overhead, since a packet sees more processing when the "in" rule allows it but the "out" rule rejects it). So you could filter only outbound traffic, plus inbound traffic destined to an IP address bound to an interface on the firewall.
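    To make that concrete, a quick sketch of the two equivalent styles for forwarded traffic (interface and macro names here are placeholders):

      # style A: selective in, permissive out
      pass in on $int_if proto tcp from $lan to any port 80 keep state
      pass out all keep state

      # style B: permissive in, selective out
      pass in all keep state
      pass out on $ext_if proto tcp from $lan to any port 80 keep state

    For traffic merely passing through, both admit the same flows; they differ only for traffic addressed to the firewall itself.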

    Since you need to assign queues based on the outgoing traffic, doing your filtering on the output side would seem to make sense and simplify things somewhat, compared to what you have at present.

    Incidentally "pass in quick" and "pass out quick" are your friend; they make the rule (if it applies) be the definitive word on what happens, instead of the pf (and ipf) default behaviour of using the last match found. In terms of performance it doesn't make _that_ much difference if you religiously use "keep state", but in terms of understandable firewall rulesets it can definitely help to (a) order your rules for "first match wins" and (b) mark them all "pass {in|out} quick". (Linux uses an implicit "pass {in|out} quick" approach -- first rule to match wins.)
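    For instance (a hedged sketch; $bad_host is a placeholder), with the default last-match-wins evaluation the block comes after the pass, but with "quick" the first match is final and the order flips:

      # default (last match wins): the later block overrides the pass
      pass in on $ext_if proto tcp from any to any port 22 keep state
      block in on $ext_if proto tcp from $bad_host to any port 22

      # with quick (first match wins): put the block first
      block in quick on $ext_if proto tcp from $bad_host to any port 22
      pass in quick on $ext_if proto tcp from any to any port 22 keep state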

    Otherwise it seems like what you're doing does make sense.


    PS: Sometimes in situations like these it makes sense to generate the pair of rules (in and out) from a single common source containing a table something like in-int:out-int:queue:protocol:..., which can be done in a few lines of the language of your choice.

    • jwz says:

      Since you need to assign queues based on the outgoing traffic, doing your filtering on the output side would seem to make sense and simplify things somewhat more than what you have at present.

      Well, there are only three possible ways to do it:

      1. Two rules for each "path" through the firewall:
        pass in from $a to $b keep state
        pass out from $a to $b keep state
      2. Passing in with global out:
        pass in from $a to $b keep state
        pass out all keep state
      3. Passing out with global in:
        pass in all keep state
        pass out from $a to $b keep state

      I think they are all functionally equivalent (except that I don't like #1 because it doubles the amount of typing by doubling the number of rules.)

      However, in case #2 (what I'm doing now) a small number of packets are allowed in to the firewall, and everything is allowed out. This is safe, because if hostile forces are able to generate arbitrary packets on the firewall itself then it's game over.

      Case #3 lets everything in unconditionally, then only lets certain things back out. That strikes me as potentially less safe, because why let hostile packets into the machine at all? Maybe the code is structured such that it doesn't actually make a difference, but I don't know enough about it to be sure.

      #1 or #3 would let me put the "queue" specification on the rules directly, and with #2 I have to play the "tag/tagged" games.

      • edm says:

        As I mentioned at the end of my previous comment, you can do your case 1 (in and out filtering rules) without additional typing with a short generation script.

        Given a table looking something like this:


        it's trivial (in the "5 lines of perl" sense) to generate something like this:

        pass in quick on $in-int proto $protocol from $srcip port = $srcport to $destip port = $destport keep state
        pass out quick on $out-int tagged $queue queue $queue proto $protocol from $srcip port = $srcport to $destip port = $destport keep state

        And then you can maintain the in/out/queue rules in a single central location and, eg, run "make" to compile the ruleset into something pf will load.
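        For illustration, a sketch of such a generator in Python (edm suggests perl, but any scripting language works; the table's field layout here is hypothetical):

```python
# Sketch of edm's rule-generator idea. The colon-separated field
# layout is hypothetical: in-int:out-int:queue:proto:src:dst:dport
TABLE = """\
$int_if:$ext_if:mp3_queue:tcp:$external_icecast:$internal_icecast:8000
$int_if:$ext_if:real_queue:tcp:$internal_real:$external_real:554
"""

def generate(table):
    rules = []
    for line in table.strip().splitlines():
        iif, oif, queue, proto, src, dst, dport = line.split(":")
        # the "pass in" rule encodes the policy and tags the session
        rules.append(
            f"pass in quick on {iif} proto {proto} "
            f"from {src} to {dst} port = {dport} "
            f"flags S/SA keep state tag {queue}"
        )
        # the matching "pass out" rule assigns the queue on the uplink
        rules.append(
            f"pass out quick on {oif} tagged {queue} "
            f"queue {queue} keep state"
        )
    return "\n".join(rules)

print(generate(TABLE))
```

        Writing that output to a file (from a Makefile, say) and loading it with pfctl -f keeps the policy in one place.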


        PS: I'm not aware of anything in OpenBSD that would make your case 3 unsafe (the possible simplification I suggested) although I agree with you that in the event there were some combination of packet features that caused, eg, excessive CPU to be expended on the firewall it would be less safe than cases 1 or 2. (I do some variation on case 1 in the firewalls that are configured in most-paranoid mode.) But as I said, I'm not aware of a bug/feature of that nature in OpenBSD at present.

  4. eaterofhands says:

    Keep state doesn't imply that other rules won't apply. To put it simply, it means that all future packets in a connection will adhere to the rules applied to the first packet. The significance of this is that subsequent packets don't need to be processed through the entire ruleset. The state table basically short circuits the need to reprocess the rules for each packet.

    Your rules can probably be simplified. Personally, I run a bridging firewall, so my rule sets are a bit different. I explicitly pass everything out, but then restrict what is allowed in on each interface. For example:

    pass out all
    pass in all

    pass out on $ext_if proto tcp from $ext_if to any flags S/SA \
    keep state queue (q_def, q_pri)

    pass in on $ext_if proto tcp from any to $ext_if flags S/SA \
    keep state queue (q_def, q_pri)
    # block all port 25 traffic
    block return in log on $wireless_if proto tcp from any to any port 25
    # but allow connections to
    pass in on $wireless_if proto tcp from any to port 25

  5. costaricakj says:

    Actually, tagging/queue assignments on the inbound interface should be sufficient, as long as you don't override them with a "pass out" rule (i.e., don't rewrite the tag/queue).

    Here's Daniel (dhartmei)'s reply:

    a) Yes, he's filtering each connection on two interfaces, so he needs to create two state entries. That doesn't mean he has to duplicate the number of rules or use tagging. A generic 'pass out on $ext_if keep state' rule is usually perfectly appropriate. Anything trying to pass out on $ext_if must have previously passed more detailed rules on one of the internal interfaces (unless the firewall itself initiates connections he doesn't trust). I'd also suggest using 'state-policy if-bound', but I don't think he wants to hear about the difference :)

    b) No, he doesn't need to tag for queueing on the external interface. Tagging on the internal interface rules ('pass in on $intif keep state queue mp3_queue') is sufficient, as long as he takes care to not override the tag with a queue option on a rule for the external interface. When the queue option on an internal interface rule adds the mbuf tag, the tag will persist on the mbuf and affect queueing on the external interface. At least that's how I understand how it's supposed to work :)

    So, in short, there should be no need to use tagging or duplicate rules.
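    Put together, that reduces to something like this (a sketch using the macros from the post, with the realvideo rule as the example; untested):

      block in log all

      # generic pass out: anything reaching $ext_if already passed
      # a more specific rule on the way in
      pass out on $ext_if keep state

      # queue assigned on the inbound rule; the mbuf tag persists
      # and affects queueing when the packet leaves on $ext_if
      pass in proto tcp \
      from $internal_real \
      to $external_real port { 554, 4040 } \
      flags S/SA keep state queue real_queue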

  6. scosol says:

    hmmm ive always done things like this with dummynet on freebsd (i think its available for openbsd but maybe not?)
    the webcast uplinks are at a set bitrate and do not change-
    if you can put the webcast uplinks on a different nic than the internal-net traffic then things become really easy-
    just set up a dummynet rule to limit bandwidth on the internal-net nic to some_value_x that allocates enough of the rest of the T1 for the webcasts

    yeah yeah i know thats kinda equivalent to "just use a different linux distribution" but it really is simple and easy with dummynet :)
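    For example, a rough dummynet sketch (the interface name and numbers are made up):

      # cap everything arriving from the internal-net nic, leaving
      # headroom on the T1 for the webcast uplinks
      ipfw pipe 1 config bw 600Kbit/s
      ipfw add pipe 1 ip from any to any in recv fxp1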

  7. transiit says:

    I don't know how feasible (or sensible) this is, but I'll throw out the idea anyhow:

    Have you considered adding a second intarweb path (i.e., a relatively cheap DSL link?)

    Provided you've got the lines and meet all the distance req's, it seems like you'd get a lot more flexibility for pushing the mp3 and real streams out the side with the guaranteed bandwidth, and could push the website, the kiosks, and everything else out the other link.

    I'll confess that I'm not familiar with every possible way the telcos screw over the small businessman, so I don't know if this is even remotely cost-effective. It'd cost money. It'd take some time to get just right. It might make you hate computers/networking/sysadminning more.

    On the other hand, it might make things easier: Lock out everything on the T1 that isn't related to the mp3 or real streams (or maybe mp3+real+webserver). Be able to screw around with the DSL (or whatever) for the mostly internal (and easy to troubleshoot) traffic.

    It's just an idea.