paulgorman.org/technical

nftables

(February 2017, January 2018)

[There are murmurings that nftables may have missed the boat, and that most people will move from iptables directly to bpfilter. As of 2018, it’s hard to know.]

Nftables is a Linux firewall (a packet classification framework) intended to replace the older iptables. It is stable and production-ready now. Nftables reuses existing, familiar Netfilter kernel hooks, NAT, and userspace queuing and logging.

Compared to iptables, nftables has a simpler syntax, inspired by tcpdump, and adds features like sets to make rules more concise.

Like iptables, nftables has tables and chains. Chains group rules together. Tables group chains together. For example, iptables has filter, nat, and mangle tables, and the filter table includes INPUT, FORWARD, and OUTPUT chains. Unlike iptables, nftables doesn’t come with pre-defined tables or chains.

Is nftables a first-match-wins firewall (like iptables) or a last-match-wins firewall (like PF)? It’s not that simple. In nftables, a packet traverses a number of stages (“hooks” provided by the Netfilter framework that correspond to stages in the kernel’s networking stack). If a packet gets dropped at an early stage, it won’t be reevaluated at a later stage. In that way, nftables is a first-match-wins firewall (like iptables). Each hook may hold multiple chains, and a chain contains multiple rules. Some statements in a rule are “terminal” while others are “non-terminal”. The accept and drop verdicts are terminal statements. If a packet matches a rule that ends in an accept or drop verdict, evaluation of rules in that chain ends. But nftables may attach several chains to each hook, and an accept verdict only ends evaluation of the current chain: the packet still passes through every later chain (one with a higher priority value) on the same hook, and any of them may drop it. A drop verdict, by contrast, is final, and no later chain sees the packet. In that limited way, a later chain can overrule an earlier accept, which resembles a last-match-wins firewall (like PF).

On Debian, install the user tools (i.e., the nft command) with apt-get install nftables. On RHEL/CentOS (7+), run yum install nftables.

The man page nft(8) is helpful. Also see /usr/share/doc/nftables/. The official docs seem to be at https://wiki.nftables.org.

Tables

A table simply contains chains for easier management (e.g., we can flush all chains grouped in a table with one command). The only restriction on which chains may be included in a particular table is that the chains must all affect the same protocol family. Nftables recognizes these five protocol families: ip (IPv4), ip6 (IPv6), inet (both IPv4 and IPv6), arp, and bridge.

# nft list tables
# nft add table inet foo
# nft delete table inet foo
# nft flush table inet foo

Chains and Hooks

A chain can have a type, hook, priority, and policy.

A chain is one of three types: filter, route, or nat.

# nft add chain [<family>] <table-name> <chain-name> { type <type> hook <hook> priority <value> \; [policy <policy>] }

# nft add chain ip foo input { type filter hook input priority 0 \; }
# nft delete chain ip foo input
# nft flush chain foo input

(The escaped semi-colon is only necessary to not confuse the shell.)

A hook is a callback into a particular stage in the kernel’s networking stack. A chain may register on one of the following hooks: ingress, prerouting, input, forward, output, or postrouting.

This diagram represents packet flow through the hooks:

                                                         Local
                                                        process
                                                          ^  |      .-----------.
                               .-----------.              |  |      |  Routing  |
                               |           |-----> input /    \---> |  Decision |----> output \
--> ingress --> prerouting --->|  Routing  |                        .-----------.              \
                               | Decision  |                                                     --> postrouting
                               |           |                                                    /
                               |           |---------------> forward ---------------------------
                               .-----------.

Chains not registered with a hook do not get packets directly, but may be used as jump targets to organize rules.
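A minimal sketch of a chain that is not registered on any hook, assuming a hypothetical table named demo and made-up chain names (ssh_checks, input); 192.0.2.0/24 is a documentation address range. The script only writes a rule file; it loads nothing into the kernel.

```shell
#!/bin/sh
# Write an example rule file with one regular chain and one base chain.
rulefile="${TMPDIR:-/tmp}/regular-chain-demo.nft"

cat > "$rulefile" <<'EOF'
table inet demo {
    # Regular chain: no "type ... hook ... priority" line, so it
    # receives packets only when another chain jumps to it.
    chain ssh_checks {
        ip saddr 192.0.2.0/24 accept
        drop
    }

    # Base chain: registered on the input hook.
    chain input {
        type filter hook input priority 0; policy accept;
        tcp dport 22 jump ssh_checks
    }
}
EOF

echo "wrote $rulefile"
# To check the syntax without applying it (as root, if your nft
# supports the -c flag):  nft -c -f "$rulefile"
```

The regular chain is declared first so that it already exists when the jump rule refers to it.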

N.B.: nftables evaluates a packet against every chain on the same hook, in priority order, for as long as the packet keeps being accepted. A drop verdict is final. An accept verdict only ends evaluation in the current chain, so a later chain on the hook (i.e., one with a higher priority value) can still drop a packet that an earlier chain accepted.

Iptables comes standard with one chain for each hook, with the chains named the same as the hooks. Nftables does not have default chains. Nftables chains can be named after the hooks or anything else the user likes.

Setting the priority of a chain sets its order. The priority may place the chain before or after some Netfilter internal operations, like connection tracking (priority -200), destination NAT (priority -100), or source NAT (priority 100).

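Chains with a lower priority value are evaluated before chains with a higher priority value (a chain at priority -10 runs before a chain at priority 0, which runs before a chain at priority 10). A later chain can drop a packet that an earlier chain on the same hook accepted, but a packet dropped by an earlier chain never reaches the later one.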
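A minimal sketch of two chains sharing one hook, assuming a hypothetical table named demo and made-up chain names (early, late); 192.0.2.0/24 is a documentation address range. The script only writes a rule file; it loads nothing into the kernel.

```shell
#!/bin/sh
# Write an example rule file with two base chains on the input hook.
rulefile="${TMPDIR:-/tmp}/two-chains-demo.nft"

cat > "$rulefile" <<'EOF'
table inet demo {
    # Lower priority value: evaluated first.
    chain early {
        type filter hook input priority -10; policy accept;
        tcp dport 22 accept
    }

    # Higher priority value: evaluated second. It still sees the
    # packets that "early" accepted and may drop them.
    chain late {
        type filter hook input priority 10; policy accept;
        ip saddr 192.0.2.0/24 tcp dport 22 drop
    }
}
EOF

echo "wrote $rulefile"
# To check the syntax without applying it (as root, if your nft
# supports the -c flag):  nft -c -f "$rulefile"
```

If this file were loaded, SSH traffic from 192.0.2.0/24 would be accepted in early and then dropped in late. Swapping the two verdicts would not work the other way around: a packet dropped in early would never reach late.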

Chains have a base policy that applies to packets that haven’t matched an earlier rule: accept or drop.

Rules

Chains contain rules. Each rule has a “handle” and a “position”. The handle is an internal number that identifies the rule. The position is an internal number that places the rule before a particular handle (i.e., insert this rule right before this other rule).

# nft add rule mytable mychain ip daddr 8.8.8.8 counter
# nft add rule mytable mychain position 8 ip daddr 127.0.0.8 drop
# nft insert rule mytable mychain position 8 ip daddr 127.0.0.6 drop
# nft delete rule mytable mychain handle 5
# nft replace rule mytable mychain handle 9 ip daddr 127.0.0.3 drop
# nft -n -a list table mytable

(add places the rule after the position, insert places the rule before the position.)
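The handle workflow can be sketched as a dry-run script. The run helper and the HANDLE placeholder are assumptions made for illustration; the real handle numbers come from the "# handle N" comments that nft prints with the -a flag. By default the script only prints the commands.

```shell
#!/bin/sh
# Dry run by default: print each nft command instead of running it.
# Set DRY_RUN=0 (as root, with mytable/mychain already created) to apply.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Placeholder handle; look up the real one in the listing output.
HANDLE="${HANDLE:-8}"

demo() {
    run nft add rule mytable mychain ip daddr 127.0.0.8 counter
    # -a makes the listing show each rule's handle.
    run nft -a list chain mytable mychain
    # add ... position N: the new rule lands after rule N.
    run nft add rule mytable mychain position "$HANDLE" ip daddr 127.0.0.9 drop
    # insert ... position N: the new rule lands before rule N.
    run nft insert rule mytable mychain position "$HANDLE" ip daddr 127.0.0.7 drop
    run nft delete rule mytable mychain handle "$HANDLE"
}

demo
```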

In iptables, each rule has one target (e.g., -j ACCEPT or -j LOG). In nftables, one rule may perform several actions.

Nftables provides these operations for rules: eq (==), ne (!=), lt (<), gt (>), le (<=), and ge (>=).

Remember to escape < and > in the shell, like \< and \>.

Match all TCP traffic whose destination port is not 22:

# nft add rule mytable mychain tcp dport != 22

Match traffic to high ports:

# nft add rule mytable mychain tcp dport \>= 1024
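One way to sidestep the escaping is to single-quote the rule, so the shell passes it through untouched. The show_args helper below is hypothetical; it only reveals the arguments the shell would hand to a command. Since nft joins its command-line arguments with spaces before parsing them, both spellings should describe the same rule.

```shell
#!/bin/sh
# Print each argument in brackets, so word boundaries are visible.
show_args() {
    for a in "$@"; do
        printf '[%s]' "$a"
    done
    printf '\n'
}

# Backslash-escaped operator: every word is its own argument.
show_args nft add rule mytable mychain tcp dport \>= 1024
# prints: [nft][add][rule][mytable][mychain][tcp][dport][>=][1024]

# Single-quoted rule: the expression arrives as one argument.
show_args nft add rule mytable mychain 'tcp dport >= 1024'
# prints: [nft][add][rule][mytable][mychain][tcp dport >= 1024]
```

Left unescaped and unquoted, the shell would treat > as an output redirection and nft would never see the operator.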

Nftables provides a number of matching criteria. The available criteria vary somewhat by type (i.e., ip, tcp, ip6, udp, arp, ct, vlan, etc.). See https://wiki.nftables.org/wiki-nftables/index.php/Quick_reference-nftables_in_10_minutes#Rules. These examples are non-exhaustive:

# nft add rule mytable mychain ip length 333-435 drop
# nft add rule mytable mychain ip ttl \> 200 drop
# nft add rule mytable mychain ip protocol icmp drop
# nft add rule mytable mychain ip saddr 192.168.2.0/24 accept
# nft add rule mytable mychain ip daddr { 192.168.0.1-192.168.0.250 } drop
# nft add rule mytable mychain tcp dport { telnet, http, https } accept
# nft add rule mytable mychain icmp type { echo-reply, destination-unreachable, redirect, echo-request } accept
# nft add rule mytable mychain ct status expected accept
# nft add rule mytable mychain ct helper "ftp" log
# nft add rule mytable mychain meta iifname "eth2" continue

The “statement” of a rule is the action performed on matching packets. A statement may be “terminal” or “non-terminal”. A rule may include several non-terminal statements but only one terminal statement. These verdict statements alter control flow in the ruleset and issue policy decisions for packets: accept, drop, queue, continue, return, jump, and goto.

Additional actions:

# nft add rule filter input iif lo log tcp dport 22 accept
# nft add rule nat postrouting ip saddr 192.168.1.0/24 oif eth0 snat 1.2.3.4
# nft add rule mangle prerouting dup to 172.20.0.2
# nft add rule filter input ip protocol tcp counter
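A small sketch of one rule carrying several statements, assuming a hypothetical table named demo and a made-up log prefix. The script only writes a rule file; it loads nothing into the kernel.

```shell
#!/bin/sh
# Write an example rule file whose rule combines several statements.
rulefile="${TMPDIR:-/tmp}/multi-statement-demo.nft"

cat > "$rulefile" <<'EOF'
table inet demo {
    chain input {
        type filter hook input priority 0; policy accept;
        # Matches come first (tcp dport, ct state). The non-terminal
        # statements (log, counter) then run in order, and the
        # terminal verdict (accept) comes last and ends the rule.
        tcp dport 22 ct state new log prefix "new ssh: " counter accept
    }
}
EOF

echo "wrote $rulefile"
# To check the syntax without applying it (as root, if your nft
# supports the -c flag):  nft -c -f "$rulefile"
```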

Sets

Nftables adds another basic type not found in iptables: sets. In many scenarios, the use of sets dramatically increases performance versus implementing the functionality with individual rules. Use sets liberally!

Sets can be anonymous or named. Anonymous sets are bound to a rule, and can’t be updated without replacing the rule.

# nft add rule filter output tcp dport { 22, 23 } counter

Named sets are not tied to rules and may be updated.

# nft add set filter myset { type ipv4_addr\;}
# nft add element filter myset { 192.168.3.4 }
# nft add element filter myset { 192.168.1.4, 192.168.1.5 }
# nft add rule filter input ip saddr @myset accept
# nft list set filter myset

Named sets can have several characteristics:
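As a sketch of a named set declared in a rule file, here is one with a type, the interval flag (which lets the set hold prefixes and ranges as well as single addresses), and initial elements. The table name demo, the set name blocklist, and the addresses (documentation ranges) are made up for illustration. The script only writes the file.

```shell
#!/bin/sh
# Write an example rule file that declares and uses a named set.
rulefile="${TMPDIR:-/tmp}/named-set-demo.nft"

cat > "$rulefile" <<'EOF'
table inet demo {
    set blocklist {
        type ipv4_addr
        flags interval
        elements = { 192.0.2.0/24, 198.51.100.7 }
    }

    chain input {
        type filter hook input priority 0; policy accept;
        # @name refers to the set declared above.
        ip saddr @blocklist drop
    }
}
EOF

echo "wrote $rulefile"
# To check the syntax without applying it (as root, if your nft
# supports the -c flag):  nft -c -f "$rulefile"
```

Once such a table is loaded, elements can be added to the set with nft add element, without touching the rule that references it.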

Saving and Restoring Rule Sets

With iptables, a common configuration method used a shell script to execute a series of iptables commands. Unfortunately, that was not an atomic operation. Nftables loads a rule file atomically with the -f flag:

# nft -f myrulefile

Save the current rules to a file:

# echo "flush ruleset" > myrulefile
# nft list ruleset >> myrulefile

In a rule file, nftables treats a line beginning with # as a comment.
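The save step can be wrapped in a small function, sketched below. The function name save_ruleset, the NFT override variable, and the example path are assumptions made for illustration. The function writes to a temporary file first, so a failed nft list does not leave a half-written rule file behind.

```shell
#!/bin/sh
# NFT can be overridden (e.g., for testing); defaults to the nft binary.
NFT="${NFT:-nft}"

# save_ruleset FILE: write a reloadable snapshot of the live ruleset.
save_ruleset() {
    out="$1"
    tmp="$out.tmp"
    # "flush ruleset" goes first so that loading the file replaces the
    # running rules instead of adding to them.
    if { echo "flush ruleset"; "$NFT" list ruleset; } > "$tmp"; then
        mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage (as root; the path is only an example):
#   save_ruleset /root/myrulefile
#   nft -f /root/myrulefile        # atomic reload
```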

Example 0

Filter traffic for a workstation (so we don’t need a forward chain):

# nft add table ip filter
# nft add chain ip filter input { type filter hook input priority 0 \; policy drop \; }
# nft add chain ip filter output { type filter hook output priority 0 \; policy accept \; }
# nft add rule filter input ct state established,related accept
# nft add rule filter output ip daddr 8.8.8.8 counter

Example 1

flush ruleset

table inet filter {
	chain input {
		type filter hook input priority 0;
		iif lo accept
		ct state established,related accept
		tcp dport { 22, 80, 443 } ct state new accept
		ip6 nexthdr icmpv6 icmpv6 type { nd-neighbor-solicit,  nd-router-advert, nd-neighbor-advert } accept
		counter drop
	}
}

Example 1 Discussion

This is the simple example ruleset for a workstation found in /usr/share/doc/nftables/examples/syntax/workstation.

flush ruleset clears existing rules. Nftables can flush individual tables too (i.e., nft flush table mytable).

table inet filter { begins and declares a new table for IPv4 and IPv6 traffic (inet family) named “filter”.

chain input { creates the “input” chain. The “input” here is simply a name.

type filter hook input priority 0; sets the chain’s type to filter, attaches the chain to the input hook, and sets its priority (zero is the expected priority for filtering).

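iif lo accept accepts all traffic coming in on the loopback lo interface. (According to nft(8), iif matches on the interface index while iifname matches on the name string. Writing iif lo works because nft resolves the name to its index when the rule loads; lo always exists, so that lookup is safe here.)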

ct state established,related accept uses connection tracking to accept traffic with an existing state (i.e., related to connections that originated from us).

tcp dport { 22, 80, 443 } ct state new accept lets us serve ssh and web traffic. Note use of a set to specify the ports.

ip6 nexthdr icmpv6 icmpv6 type { nd-neighbor-solicit, nd-router-advert, nd-neighbor-advert } accept allows IPv6 neighbor discovery (or else IPv6 breaks!). Note use of a set to specify the ICMPv6 message types.

counter drop counts and drops any traffic not covered by an earlier rule.

Example 2

flush ruleset

table firewall {
  chain incoming {
    type filter hook input priority 0; policy drop;
    ct state established,related accept
    iifname lo accept
    icmp type echo-request accept
    tcp dport {ssh, http} accept
  }
}

table ip6 firewall {
  chain incoming {
    type filter hook input priority 0; policy drop;
    ct state established,related accept
    ct state invalid drop
    iifname lo accept
    # routers may also want: mld-listener-query, nd-router-solicit
    icmpv6 type {echo-request,nd-neighbor-solicit} accept
    tcp dport {ssh, http} accept
  }
}

Example 2 Discussion

This is the simple firewall ruleset from https://wiki.nftables.org/wiki-nftables/index.php/Quick_reference-nftables_in_10_minutes.

flush ruleset clears existing rules.

table firewall { declares a new table. Note the declaration does not specify a protocol family, so it defaults to the ip (IPv4) family.

chain incoming { declares a new chain.

type filter hook input priority 0; policy drop; sets the chain’s type to filter, attaches the chain to the input hook, sets its priority (zero is the expected priority for filtering), and sets the default policy to drop.

ct state established,related accept uses connection tracking to accept traffic with an existing state (i.e., related to connections that originated from us).

iifname lo accept accepts all traffic coming in on the loopback lo interface.

icmp type echo-request accept allows pings.

tcp dport {ssh, http} accept allows ssh and web serving.

table ip6 firewall { declares a new table for IPv6.

chain incoming { declares a new chain named “incoming”.

type filter hook input priority 0; policy drop; sets the chain’s type to filter, attaches the chain to the input hook, sets its priority (zero is the expected priority for filtering), and sets the default policy to drop.

ct state established,related accept uses connection tracking to accept traffic with an existing state (i.e., related to connections that originated from us).

ct state invalid drop drops invalid connections.

iifname lo accept accepts traffic coming in on the loopback interface.

icmpv6 type {echo-request,nd-neighbor-solicit} accept accepts some ICMPv6 traffic.

tcp dport {ssh, http} accept allows ssh and web serving.

A Brief Refresher on Connection Tracking

Connection tracking isn’t unique to nftables; Netfilter provides it. Connection tracking filters packets based on criteria that IP header information alone cannot provide. In other words: stateful firewalling. Connection tracking keeps facts about a connection, such as its source and destination addresses, protocol, ports, and timeout. A connection may have one of the following states: new, established, related, or invalid.

These states have nothing to do with TCP states; even UDP connections can be stateful in the sense of connection tracking.

Connection tracking works primarily at layer 3, although some of the modules operate at higher layers.

Connection tracking facilitates some application-layer protocols with hard-to-track properties, like FTP. A connection tracking “helper” has a set of expectations about the properties of connections. The FTP helper expects that, within a given time and from a given source and destination, a passive FTP connection will open a second high-number port for data transmission. The helper inspects packet contents in order to find the necessary information. The helper is application-aware. In the case of FTP, the helper digs through packet payloads on the control connection looking for the port announcement (the client’s PORT command in active mode, or the server’s reply to PASV in passive mode). When its expectations are met, the helper marks the expected data connection as related.
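A hedged sketch of wiring up the FTP helper in an nftables rule file. It assumes a reasonably recent nftables (helper objects are not available in early releases) and a kernel with the FTP conntrack module available. The table name demo and the helper object name ftp-standard are made up for illustration. The script only writes the file.

```shell
#!/bin/sh
# Write an example rule file that assigns the FTP conntrack helper.
rulefile="${TMPDIR:-/tmp}/ftp-helper-demo.nft"

cat > "$rulefile" <<'EOF'
table inet demo {
    # Helper object: binds the kernel's "ftp" helper to TCP.
    ct helper ftp-standard {
        type "ftp" protocol tcp
    }

    # Attach the helper to FTP control connections.
    chain prerouting {
        type filter hook prerouting priority 0;
        tcp dport 21 ct helper set "ftp-standard"
    }

    chain input {
        type filter hook input priority 0; policy drop;
        # Data connections the helper expected show up as "related".
        ct state established,related accept
        tcp dport 21 ct state new accept
    }
}
EOF

echo "wrote $rulefile"
# To check the syntax without applying it (as root, if your nft
# supports the -c flag):  nft -c -f "$rulefile"
```

The prerouting chain sits at priority 0 so that it runs after connection tracking (priority -200) has created the connection entry the helper attaches to.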

Helpers exist for IRC, SIP, SNMP, H323, etc.