
nftables

(February 2017, January 2018)

Nftables is a Linux firewall that replaces the older iptables. Nftables reuses existing Netfilter kernel hooks, NAT, and userspace queuing and logging.

The nftables syntax, inspired by tcpdump, adds features like sets to make rules more concise.

Like iptables, nftables organizes rules with tables and chains. Chains order rules. Tables logically group chains for administrative convenience. For example, iptables has filter, nat, and mangle tables; the filter table includes packet filtering rules from the INPUT, FORWARD, and OUTPUT chains. Unlike iptables, nftables isn’t limited to pre-defined tables or chains.

How does nftables make decisions about packets? A packet traverses the stages in the kernel’s networking stack, with Netfilter providing a hook for each stage. Nftables attaches user-defined rules to those hooks. At each stage, a packet is evaluated against each rule until it matches a terminal rule (accept/drop). A packet dropped at an early stage won’t be reevaluated at a later stage.

On Debian, install the user tools (i.e., the nft command) with apt-get install nftables. On RHEL/CentOS 7, run yum install nftables. RHEL 8 uses nftables by default.

The man page nft(8) is helpful. Also see /usr/share/doc/nftables/examples/ or /usr/share/doc/nftables/. The official docs seem to be at https://wiki.nftables.org.

Overview of a common configuration and packet flow

A host acting as a simple firewall and gateway may define only a small number of nft chains, each matching a kernel hook: input, output, and forward, plus prerouting and postrouting if NAT is required.

For configuration convenience and by convention, we group the input, output, and forward chains into a filter table. Most rules in setups like this attach to the forward chain.

If NAT is required, we follow the convention of creating a nat table to hold the prerouting and postrouting chains. Source-NAT rules (where we rewrite the packet source) attach to the postrouting chain, and destination-NAT rules (where we rewrite the packet’s destination) attach to the prerouting chain.

Packet flow is straightforward. Only one chain attaches to each hook. The first accept or drop rule a packet matches wins.
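The layout described above can be sketched as a rule file. This is a minimal sketch, not a recommended policy: the interface names lan0 and wan0 and the internal web server address 192.168.1.10 are hypothetical, and the "dnat to" spelling assumes a reasonably recent nft.

```nft
# Hypothetical gateway: "lan0" faces the LAN, "wan0" faces the internet.
flush ruleset

table inet filter {
	chain input {
		type filter hook input priority 0; policy drop;
		ct state established,related accept
		iifname "lo" accept
		# Allow SSH to the gateway itself from the LAN only.
		iifname "lan0" tcp dport 22 accept
	}

	chain forward {
		type filter hook forward priority 0; policy drop;
		ct state established,related accept
		# Let LAN hosts open connections to the outside.
		iifname "lan0" oifname "wan0" accept
	}

	chain output {
		type filter hook output priority 0; policy accept;
	}
}

table ip nat {
	chain prerouting {
		type nat hook prerouting priority -100;
		# Destination NAT: send inbound web traffic to a hypothetical internal server.
		iifname "wan0" tcp dport 80 dnat to 192.168.1.10
	}

	chain postrouting {
		type nat hook postrouting priority 100;
		# Source NAT: rewrite LAN source addresses to the address of wan0.
		oifname "wan0" masquerade
	}
}
```

Each hook has exactly one chain here, so the first accept or drop a packet matches on a given hook decides its fate on that hook.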

Hooks

A hook is a callback into a particular stage in the kernel’s networking stack. A chain may register on one of the following hooks:

inbound traffic ▸ PREROUTING hook ▸ routing decision ▸ INPUT hook ▸ local system
inbound traffic ▸ PREROUTING hook ▸ routing decision ▸ FORWARD hook ▸ POSTROUTING hook ▸ outbound traffic
local system ▸ OUTPUT hook ▸ routing decision ▸ POSTROUTING hook ▸ outbound traffic
(Some outbound traffic, e.g., loopback traffic, re-enters as inbound traffic.)

Example of simple NFT configuration

Allow everything out, and filter everything incoming except for SSH and pings:

🐚 ~ $ sudo apt install nftables
🐚 ~ $ sudo nft flush ruleset
🐚 ~ $ sudo nft add table inet filter
🐚 ~ $ sudo nft add chain inet filter input { type filter hook input priority 0\; policy accept\; }
🐚 ~ $ sudo nft add chain inet filter forward { type filter hook forward priority 0\; policy accept\; }
🐚 ~ $ sudo nft add chain inet filter output { type filter hook output priority 0\; policy accept\; }
🐚 ~ $ sudo nft add rule inet filter input ct state invalid drop
🐚 ~ $ sudo nft add rule inet filter input meta iif lo ct state new accept
🐚 ~ $ sudo nft add rule inet filter input ct state established,related accept
🐚 ~ $ sudo nft add rule inet filter input tcp dport ssh accept
🐚 ~ $ sudo nft add rule inet filter input icmp type echo-request accept
🐚 ~ $ sudo nft add rule inet filter input icmpv6 type { nd-neighbor-solicit, echo-request, nd-router-advert, nd-neighbor-advert } accept
🐚 ~ $ sudo nft add chain inet filter input { type filter hook input priority 0\; policy drop\; }
🐚 ~ $ sudo nft list ruleset
table inet filter {
	chain input {
		type filter hook input priority 0; policy drop;
		ct state invalid drop
		iif "lo" ct state new accept
		ct state established,related accept
		tcp dport ssh accept
		icmp type echo-request accept
		icmpv6 type { echo-request, nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept
	}

	chain forward {
		type filter hook forward priority 0; policy accept;
	}

	chain output {
		type filter hook output priority 0; policy accept;
	}
}
🐚 ~ $ sudo sh -c 'echo "flush ruleset" > /etc/nftables.conf'
🐚 ~ $ sudo sh -c 'nft list ruleset >> /etc/nftables.conf'
🐚 ~ $ sudo nft --check --file /etc/nftables.conf
🐚 ~ $ sudo systemctl enable nftables.service
🐚 ~ $ sudo systemctl start nftables.service

Families

Nftables recognizes these five protocol families. All traffic belongs to a family. Each rule applies to a particular family.

Tables

A table simply groups chains for easier management (e.g., we can flush all chains grouped in a table with one command). The only restriction on which chains may be included in a particular table is that the chains must all affect the same protocol family.

🐚 # nft list tables
🐚 # nft add table inet foo
🐚 # nft delete table inet foo
🐚 # nft flush table inet foo

Tables do no packet processing themselves; they are containers for grouping/categorizing chains. For example, we create a “filter” table to group together chains with filtering rules. We’re likely to include the “input” chain (if we have one) in the filter table, because we do lots of filtering on input. But other tables could hold their own chains on the input hook too.

Chains

A chain groups together rules, and attaches those rules to a Netfilter kernel hook for packet processing.

A chain can have a type, hook, priority, and policy.

A chain’s policy determines what happens to a packet that hasn’t matched any of the chain’s rules.

A chain is one of three types: filter, route, or nat.

🐚 # nft add chain [<family>] <table-name> <chain-name> { type <type> hook <hook> priority <value> \; [policy <policy> \;] }
🐚 # nft add chain ip foo input { type filter hook input priority 0 \; }
🐚 # nft delete chain ip foo input
🐚 # nft flush chain foo input

(The escaped semi-colon is only necessary to not confuse the shell.)

Chains not registered with a hook do not get packets directly, but other chains can jump to them, which makes them useful for organizing rules.
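A minimal sketch of a chain without a hook, assuming a hypothetical chain name ssh_checks and the documentation network 192.0.2.0/24 as the trusted source:

```nft
table inet filter {
	# No "type ... hook ..." line, so this chain is not registered on any hook.
	# It sees packets only when another chain jumps to it.
	chain ssh_checks {
		ip saddr 192.0.2.0/24 accept
		counter drop
	}

	chain input {
		type filter hook input priority 0; policy accept;
		# Hand all SSH traffic to the unhooked chain above.
		tcp dport 22 jump ssh_checks
	}
}
```

Keeping the SSH rules in their own chain means the input chain stays short, and packets for other ports never evaluate the SSH rules at all.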

Iptables comes standard with one chain for each hook, with the chains named the same as the hooks. A user can give nftables chains any name, and create many chains per hook. In practice, however, many nftables configurations follow the convention of one chain per hook.

Nftables evaluates each packet against the chains on the same hook in priority order. Each chain has its own priority, which the user may set. A drop is final: the packet goes no further. An accept only ends evaluation in the current chain, so a later chain on the same hook (one with a higher priority number) still sees the packet and may drop it.

Note: a chain’s priority may place the chain before or after some Netfilter internal operations, like connection tracking (priority -200), destination NAT (-100), and source NAT (100).

Chains with a low priority (negative, zero) are evaluated before chains with a higher priority (positive). A chain with a higher priority can drop a packet that an earlier chain on the same hook accepted, but it cannot rescue a packet that an earlier chain dropped.
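As an illustration of priority ordering, here is a sketch with two chains on the input hook. The table and chain names (demo, early, late) and the address range are hypothetical.

```nft
table inet demo {
	chain early {
		# Priority -10 runs before priority 10 on the same hook.
		type filter hook input priority -10; policy accept;
		# Accepted here, but the packet still goes on to the "late" chain.
		tcp dport 22 accept
		# Dropped here for good; the "late" chain never sees the packet.
		tcp dport 23 drop
	}

	chain late {
		type filter hook input priority 10; policy accept;
		# Can still drop SSH traffic that "early" accepted.
		tcp dport 22 ip saddr != 192.0.2.0/24 drop
		# Has no effect on telnet packets, which "early" already dropped.
		tcp dport 23 accept
	}
}
```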

Rules

A rule describes the type of traffic that matches it, and the action to take for matching traffic.

Each rule has a place in the order of rules on its chain.

When ordering rules, a user may need to refer to a rule’s “handle” or to a “position”. The handle is an internal ID that identifies the rule. The position is the handle of an existing rule, next to which a new rule is placed (i.e., put this rule right by this other rule).

🐚 # nft add rule mytable mychain ip daddr 8.8.8.8 counter
🐚 # nft add rule mytable mychain position 8 ip daddr 127.0.0.8 drop
🐚 # nft insert rule mytable mychain position 8 ip daddr 127.0.0.6 drop
🐚 # nft delete rule mytable mychain handle 5
🐚 # nft replace rule mytable mychain handle 9 ip daddr 127.0.0.3 drop
🐚 # nft -n -a list table mytable

(add places the rule after the position, insert places the rule before the position.)

In iptables, each rule has one target (e.g., -j ACCEPT or -j LOG). In nftables, one rule may perform several actions.
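For instance, a single rule can count, log, and accept matching packets. A minimal sketch (the log prefix is an arbitrary choice):

```nft
table inet filter {
	chain input {
		type filter hook input priority 0; policy accept;
		# One rule, three actions: update a counter, write a log line, accept.
		tcp dport 22 counter log prefix "ssh: " accept
	}
}
```

The equivalent in iptables would take separate rules for the LOG and ACCEPT targets.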

Nftables provides these comparison operators for rules: == (eq), != (ne), < (lt), > (gt), <= (le), and >= (ge).

Remember to escape < and > in the shell, like \< and \>.

Match all incoming traffic not arriving on TCP port 22:

🐚 # nft add rule mytable mychain tcp dport != 22

Match traffic to high ports:

🐚 # nft add rule mytable mychain tcp dport \>= 1024

Nftables provides a number of matching criteria. The available criteria vary somewhat by type (i.e., ip, tcp, ip6, udp, arp, ct, vlan, etc.). See https://wiki.nftables.org/wiki-nftables/index.php/Quick_reference-nftables_in_10_minutes#Rules. These examples are non-exhaustive:

🐚 # nft add rule mytable mychain ip length 333-435 drop
🐚 # nft add rule mytable mychain ip ttl \> 200 drop
🐚 # nft add rule mytable mychain ip protocol icmp drop
🐚 # nft add rule mytable mychain ip saddr 192.168.2.0/24 accept
🐚 # nft add rule mytable mychain ip daddr { 192.168.0.1-192.168.0.250 } drop
🐚 # nft add rule mytable mychain tcp dport { telnet, http, https } accept
🐚 # nft add rule mytable mychain icmp type { echo-reply, destination-unreachable, redirect, echo-request } accept
🐚 # nft add rule mytable mychain ct status expected accept
🐚 # nft add rule mytable mychain ct helper "ftp" log
🐚 # nft add rule mytable mychain meta iifname "eth2" continue

The “statement” of a rule is the action performed on matching packets. A statement may be “terminal” or “non-terminal”. A rule may include several non-terminal statements but only one terminal statement. These verdict statements alter control flow in the ruleset and issue policy decisions for packets: accept, drop, queue, continue, return, jump, and goto.

Additional actions:

🐚 # nft add rule filter input iif lo log tcp dport 22 accept
🐚 # nft add rule nat postrouting ip saddr 192.168.1.0/24 oif eth0 snat 1.2.3.4
🐚 # nft add rule mangle prerouting dup to 172.20.0.2
🐚 # nft add rule filter input ip protocol tcp counter

Sets

Nftables adds another basic type not found in iptables: sets. In many scenarios, the use of sets dramatically increases performance versus implementing the functionality with individual rules. Use sets liberally!

Sets can be anonymous or named. Anonymous sets are bound to a rule, and can’t be updated without replacing the rule.

🐚 # nft add rule filter output tcp dport { 22, 23 } counter

Named sets are not tied to rules and may be updated.

🐚 # nft add set filter myset { type ipv4_addr\;}
🐚 # nft add element filter myset { 192.168.3.4 }
🐚 # nft add element filter myset { 192.168.1.4, 192.168.1.5 }
🐚 # nft add rule filter input ip saddr @myset accept
🐚 # nft list set filter myset

Named sets can have several characteristics:
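A sketch of two named sets declared in a rule file, showing a few such characteristics (type, flags, timeout, and initial elements). The set names and addresses are hypothetical.

```nft
table inet filter {
	# "flags interval" lets the set hold prefixes and ranges, not just single addresses.
	set trusted_nets {
		type ipv4_addr
		flags interval
		elements = { 192.0.2.0/24, 198.51.100.10 }
	}

	# Elements added to this set expire one hour after they are added.
	set recent_hosts {
		type ipv4_addr
		flags timeout
		timeout 1h
	}

	chain input {
		type filter hook input priority 0; policy accept;
		ip saddr @trusted_nets accept
		# Count traffic from any host currently in the (initially empty) set.
		ip saddr @recent_hosts counter
	}
}
```

Elements can then be added to either set at runtime with nft add element, without touching the rules that reference the set.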

Example 0

Filter traffic for a workstation (so we don’t need a forward chain):

🐚 # nft add table ip filter
🐚 # nft add chain ip filter input { type filter hook input priority 0 \; policy drop \; }
🐚 # nft add chain ip filter output { type filter hook output priority 0 \; policy accept \; }
🐚 # nft add rule filter input ct state established,related accept
🐚 # nft add rule filter output ip daddr 8.8.8.8 counter

Example 1

flush ruleset

table inet filter {
	chain input {
		type filter hook input priority 0;
		iif lo accept
		ct state established,related accept
		tcp dport { 22, 80, 443 } ct state new accept
		ip6 nexthdr icmpv6 icmpv6 type { nd-neighbor-solicit,  nd-router-advert, nd-neighbor-advert } accept
		counter drop
	}
}

Example 1 Discussion

This is the simple example ruleset for a workstation found in /usr/share/doc/nftables/examples/syntax/workstation.

flush ruleset clears existing rules. Nftables can flush individual tables too (e.g., nft flush table mytable).

table inet filter { begins and declares a new table for IPv4 and IPv6 traffic (inet family) named “filter”.

chain input { creates the “input” chain. The “input” here is simply a name.

type filter hook input priority 0; sets the chain’s type to filter, attaches the chain to the input hook, and sets its priority (zero is the expected priority for filtering).

iif lo accept accepts all traffic coming in on the loopback lo interface. (According to nft(8), iif matches an “interface index” while iifname matches a string/name. Nft accepts an interface name with iif and resolves it to the index when the rule loads.)

ct state established,related accept uses connection tracking to accept traffic with an existing state (i.e., related to connections that originated from us).

tcp dport { 22, 80, 443 } ct state new accept lets us serve ssh and web traffic. Note use of a set to specify the ports.

ip6 nexthdr icmpv6 icmpv6 type { nd-neighbor-solicit, nd-router-advert, nd-neighbor-advert } accept allows IPv6 neighbor discovery (or else IPv6 breaks!). Note use of a set to specify the ICMPv6 message types.

counter drop counts and drops any traffic not covered by an earlier rule.

Example 2

flush ruleset

table firewall {
  chain incoming {
    type filter hook input priority 0; policy drop;
    ct state established,related accept
    iifname lo accept
    icmp type echo-request accept
    tcp dport {ssh, http} accept
  }
}

table ip6 firewall {
  chain incoming {
    type filter hook input priority 0; policy drop;
    ct state established,related accept
    ct state invalid drop
    iifname lo accept
    # routers may also want: mld-listener-query, nd-router-solicit
    icmpv6 type {echo-request,nd-neighbor-solicit} accept
    tcp dport {ssh, http} accept
  }
}

Example 2 Discussion

This is the simple firewall ruleset from https://wiki.nftables.org/wiki-nftables/index.php/Quick_reference-nftables_in_10_minutes.

flush ruleset clears existing rules.

table firewall { declares a new table. Note the declaration does not specify a protocol family, so it defaults to ip (IPv4).

chain incoming { declares a new chain.

type filter hook input priority 0; policy drop; sets the chain’s type to filter, attaches the chain to the input hook, sets its priority (zero is the expected priority for filtering), and sets the default policy to drop.

ct state established,related accept uses connection tracking to accept traffic with an existing state (i.e., related to connections that originated from us).

iifname lo accept accepts all traffic coming in on the loopback lo interface.

icmp type echo-request accept allows pings.

tcp dport {ssh, http} accept allows ssh and web serving.

table ip6 firewall { declares a new table for IPv6.

chain incoming { declares a new chain named “incoming”.

type filter hook input priority 0; policy drop; sets the chain’s type to filter, attaches the chain to the input hook, sets its priority (zero is the expected priority for filtering), and sets the default policy to drop.

ct state established,related accept uses connection tracking to accept traffic with an existing state (i.e., related to connections that originated from us).

ct state invalid drop drops invalid connections.

iifname lo accept accepts traffic coming in on the loopback interface.

icmpv6 type {echo-request,nd-neighbor-solicit} accept accepts some ICMPv6 traffic.

tcp dport {ssh, http} accept allows ssh and web serving.

A Brief Refresher on Connection Tracking

Connection tracking isn’t unique to nftables; Netfilter provides it. Connection tracking filters packets based on criteria that IP header information alone cannot provide. In other words: stateful firewalling. Connection tracking keeps facts about a connection: its source and destination addresses, protocol, ports, timeout, etc. A connection may have one of the following states: new, established, related, invalid, or untracked.

These states have nothing to do with TCP states; even UDP connections can be stateful in the sense of connection tracking.

Connection tracking works primarily at layer 3, although some of the modules operate at higher layers.

Connection tracking facilitates some application-layer protocols with hard-to-track properties, like FTP. A connection tracking “helper” has a set of expectations about the properties of connections. The FTP helper expects that, within a given time and from a given source and destination, a passive FTP session will open a second connection on a high-numbered port for data transmission. The helper is application-aware: it inspects packet contents in order to find the necessary information. In the case of FTP, the helper digs through packet payloads looking for the server’s reply to the client’s PASV command, which names the data port. When its expectations are met, the helper marks the new connection as related.

Helpers exist for IRC, SIP, SNMP, H323, etc.
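On recent kernels a helper usually has to be assigned to traffic explicitly. The sketch below follows the pattern shown on the nftables wiki; it assumes an nft and kernel new enough to support ct helper objects, and the object name ftp-standard is an arbitrary label.

```nft
table inet filter {
	# Declare a helper object backed by the kernel's "ftp" conntrack helper.
	ct helper ftp-standard {
		type "ftp" protocol tcp
	}

	chain prerouting {
		type filter hook prerouting priority 0;
		# Attach the helper to FTP control connections.
		tcp dport 21 ct helper set "ftp-standard"
	}

	chain input {
		type filter hook input priority 0; policy drop;
		# Data connections the helper expected arrive in the "related" state.
		ct state established,related accept
		tcp dport 21 accept
	}
}
```

With this in place, the input chain needs no rule for the high-numbered data port.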

Saving and Restoring Rule Sets

With iptables, a common configuration method used a shell script to execute a series of iptables commands. Unfortunately, that was not an atomic operation. Nftables loads a rule file atomically with the -f flag:

🐚 # nft -f myrulefile

Many Linux distributions (Debian among them) read nftables rules from /etc/nftables.conf. Save the current rules to this file so they persist after a reboot:

🐚 # echo "flush ruleset" > /etc/nftables.conf
🐚 # nft list ruleset >> /etc/nftables.conf

In a rule file, nftables treats a line beginning with # as a comment.
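A small rule file might look like the sketch below. The variable name admin_ports is hypothetical; define is nft’s way of naming a value for reuse later in the file.

```nft
#!/usr/sbin/nft -f
# Lines starting with "#" are comments.

# Start from a clean slate so that loading this file is repeatable.
flush ruleset

# A symbolic variable, referenced below as $admin_ports.
define admin_ports = { 22, 443 }

table inet filter {
	chain input {
		type filter hook input priority 0; policy drop;
		ct state established,related accept
		iifname "lo" accept
		tcp dport $admin_ports accept
	}
}
```

Because the whole file loads as one transaction, a syntax error anywhere in it leaves the running ruleset untouched.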

Check the syntax of a rules file:

🐚 $ /usr/sbin/nft --check --file /etc/nftables.conf

Under systemd, make sure to enable the nftables service so that the rules load on reboot:

🐚 $ sudo systemctl enable nftables.service
🐚 $ sudo systemctl start nftables.service