paulgorman.org/technical

Linux Bridges and Virtual Networking

(July 2017)

N.B.: Changes made with ip commands do not persist across reboots. To persist them, put them in the distribution's network configuration files or run the ip commands from a script at boot. On Red Hat, use ifcfg files in /etc/sysconfig/network-scripts/. On Debian, use files in /etc/network/interfaces.d/; see INTERFACES(5). For example:

iface tap0 inet manual
	pre-up ip tuntap add tap0 mode tap user root
	pre-up ip addr add 192.168.10.2/24 dev tap0
	up ip link set dev tap0 up
	post-up ip route del 192.168.10.0/24 dev tap0
	post-up ip route add 192.168.10.2/32 dev tap0
	post-down ip link del dev tap0

Bridges

A Linux bridge implements a layer-2 switch in software.

From https://en.wikipedia.org/wiki/Network_switch#Layer_2: A network bridge, operating at the data link layer, interconnects a small number of network devices. This is a trivial case of bridging, in which the bridge learns the MAC address of each connected device. Classic bridges may also prevent loops in the LAN with a spanning tree protocol. Unlike routed networks, a spanning tree topology allows only one active path between any two points.

Bridges connect to both physical (eth0) and virtual (vnet0) network devices.

Create a bridge:

# ip link add br0 type bridge

Plug an interface into the bridge:

# ip link set dev eth0 master br0

Unplug it from the bridge:

# ip link set dev eth0 nomaster

Show devices plugged into a bridge:

# ip link show master br0

See BRIDGE(8). The bridge utility manipulates bridges in various ways, such as managing VLANs.

# bridge vlan show dev br0
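
The bridge steps above can be collected into one script. This is a minimal sketch, assuming a bridge named br0 and a port named eth0. The run helper, the bridge_up function, and the DRY_RUN variable are hypothetical names, not part of iproute2. By default the script only prints the commands; set DRY_RUN=0 and run it as root to apply them.

```shell
#!/bin/sh
# Sketch: create a bridge, plug one interface into it, and bring both up.
# run(), bridge_up() and DRY_RUN are illustrative helpers, not part of iproute2.
# With DRY_RUN=1 (the default) each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

bridge_up() {
    br=$1
    port=$2
    run ip link add "$br" type bridge
    run ip link set dev "$port" master "$br"
    # Newly created bridges start out down, so bring the port and bridge up.
    run ip link set dev "$port" up
    run ip link set dev "$br" up
    run ip link show master "$br"
}

bridge_up br0 eth0
```

Note that plugging a host's only uplink into a bridge usually means moving the host's IP address from that interface to the bridge, which this sketch does not do.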

TUN/TAP and veth Devices

TUN and TAP are very similar. Both virtual network device types move traffic between userland and the kernel. TUN (tunnel) devices operate at layer 3, reading and writing data as IP packets. TAP (network tap) devices operate at layer 2, reading and writing Ethernet frames.

A TAP device only connects to one bridge at a time.

Libvirt can automatically create a virtual network device when starting a VM. Although named like “vnet0”, these are specifically TAP type devices.

List devices by type:

# ip link show type bridge
3: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
	link/ether bc:5f:f4:44:7e:56 brd ff:ff:ff:ff:ff:ff
4: lxcbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
	link/ether 00:16:3e:00:00:00 brd ff:ff:ff:ff:ff:ff
# ip link show type tun
6: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UNKNOWN mode DEFAULT group default qlen 1000
	link/ether fe:54:00:80:ef:da brd ff:ff:ff:ff:ff:ff

Both TUN and TAP devices report as type “tun” in ip link output. Distinguish between them, if necessary, with ip tuntap show:

# ip tuntap add vnet1 mode tun
$ ip tuntap show
vnet1: tun UNKNOWN_FLAGS:800
vnet0: tap vnet_hdr
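
To pull the mode of a single device out of that listing, something like the following may help. It is a sketch only: tuntap_mode is a hypothetical helper name, and it assumes the "name: mode flags" line format shown above.

```shell
#!/bin/sh
# Sketch: print "tun" or "tap" for one device, given `ip tuntap show`
# output on standard input. tuntap_mode is a hypothetical helper name.
tuntap_mode() {
    # The first field is the device name followed by a colon,
    # and the second field is the mode.
    awk -v dev="$1" '$1 == (dev ":") { print $2 }'
}

# Usage on a live system: ip tuntap show | tuntap_mode vnet0
# Here we feed it the sample output from above instead.
printf '%s\n' 'vnet1: tun UNKNOWN_FLAGS:800' 'vnet0: tap vnet_hdr' | tuntap_mode vnet0
```

The function prints nothing if the named device is not in the listing.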

veth type devices operate in pairs, like the two ends of a patch cable. veth devices are often used to connect two bridges or to cross between two network namespaces.

# ip link add veth0 type veth peer name veth1

Dummy Devices

Dummy devices act as a virtual “stub” for an IP address, like a loopback interface. A service that needs to bind to an interface or IP address can bind to a dummy. Because the dummy interface does not depend on hardware, the service's address stays available even when a physical interface goes down or changes IP address.

# ip link add dummy1 type dummy
# ip addr add 10.0.1.30/24 dev dummy1

Network Namespaces

By default, Linux has one set of network interfaces and routes. Network namespaces add the ability to have multiple, segregated sets of interfaces and routes.

$ ip netns show
# ip netns add orange
# ip netns show
orange
# ip addr add 192.168.111.10/24 dev dummy1
# ip link set dummy1 netns orange

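After changing the namespace of “dummy1”, it no longer appears in the output of ip link show in the default namespace. Moving an interface to another namespace also flushes its addresses, so the address is added again further down. The interface does appear inside “orange”: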

# ip netns exec orange ip link show
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
5: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 5a:b3:ee:e4:a3:53 brd ff:ff:ff:ff:ff:ff

“dummy1” is also no longer pingable from the default namespace. Use a veth pair to cross namespace boundaries.

# ip link add veth0 type veth peer name veth1
# ip link set veth1 netns orange
# ip addr add 192.168.111.1/24 dev veth0
# ip link set veth0 up
# ip netns exec orange ip addr add 192.168.111.2/24 dev veth1
# ip netns exec orange ip link set veth1 up
# ip netns exec orange ip route add default via 192.168.111.1
# ip route add 192.168.111.10 via 192.168.111.2
# ip netns exec orange ip link set dummy1 up
# ip netns exec orange ip addr add 192.168.111.10/24 dev dummy1

…and we can ping 192.168.111.10 from the default namespace.
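
A reduced version of the same idea, without the dummy interface, can be generated as a reviewable command list. This is a sketch under assumptions: netns_veth_commands and its arguments are hypothetical names, the interfaces are fixed as veth0 and veth1, and a /24 prefix is assumed. Nothing is executed; the function only prints commands, with each side's route pointing at the peer's address.

```shell
#!/bin/sh
# Sketch: print the commands that join a namespace to the default one with a
# veth pair. netns_veth_commands and its arguments are hypothetical names.
# Nothing is executed here; review the output, then pipe it to `sh -e` as root.
netns_veth_commands() {
    ns=$1        # namespace name, e.g. orange
    host_ip=$2   # address for veth0 in the default namespace
    ns_ip=$3     # address for veth1 inside the namespace
    cat <<EOF
ip netns add $ns
ip link add veth0 type veth peer name veth1
ip link set veth1 netns $ns
ip addr add $host_ip/24 dev veth0
ip link set veth0 up
ip netns exec $ns ip addr add $ns_ip/24 dev veth1
ip netns exec $ns ip link set veth1 up
ip netns exec $ns ip link set lo up
ip netns exec $ns ip route add default via $host_ip
EOF
}

netns_veth_commands orange 192.168.111.1 192.168.111.2
```

After applying the commands, ping -c 1 192.168.111.2 from the default namespace should reach veth1. Deleting the namespace with ip netns del orange should also remove the veth pair, since veth1 lives inside it.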

Connecting libvirt Guests

See virsh(1).

(The net-list, net-define, net-update, etc. commands apply to guests on a NAT’d virtual bridge, rather than guests fully bridged to an exposed physical interface.)

From a hypervisor with several guests:

# virsh iface-dumpxml br0
<interface type='bridge' name='br0'>
  <protocol family='ipv4'>
	<ip address='10.0.0.32' prefix='24'/>
  </protocol>
  <protocol family='ipv6'>
	<ip address='fd80:c96e:7a06:0:ec4:7aff:fe97:8938' prefix='64'/>
	<ip address='fe80::ec4:7aff:fe97:8938' prefix='64'/>
  </protocol>
  <bridge>
	<interface type='ethernet' name='eth0'>
	  <link speed='1000' state='up'/>
	  <mac address='0c:c4:7a:97:89:38'/>
	</interface>
	<interface type='ethernet' name='veth2TIP25'>
	  <link speed='10000' state='up'/>
	  <mac address='fe:13:55:ab:5f:cb'/>
	</interface>
	<interface type='ethernet' name='vnet2'>
	  <link state='unknown'/>
	  <mac address='fe:54:00:6c:4c:b6'/>
	</interface>
	<interface type='ethernet' name='vnet0'>
	  <link state='unknown'/>
	  <mac address='fe:54:00:fd:cd:6b'/>
	</interface>
	<interface type='ethernet' name='vethG88GDN'>
	  <link speed='10000' state='up'/>
	  <mac address='fe:f2:61:04:bd:a9'/>
	</interface>
	<interface type='ethernet' name='vnet1'>
	  <link state='unknown'/>
	  <mac address='fe:54:00:b9:2c:2f'/>
	</interface>
  </bridge>
</interface>
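
To reduce that XML to a plain list of port names, a small filter may be enough. This is a sketch: bridge_ports is a hypothetical helper, and it relies on the one-element-per-line layout shown above rather than on a real XML parser. (ip link show master br0 gives similar information without libvirt.)

```shell
#!/bin/sh
# Sketch: list the port names from `virsh iface-dumpxml` output on stdin.
# bridge_ports is a hypothetical helper; it matches lines textually and
# assumes each <interface> element sits on its own line.
bridge_ports() {
    sed -n "s/^[[:space:]]*<interface type='ethernet' name='\([^']*\)'.*/\1/p"
}

# Usage on a live hypervisor: virsh iface-dumpxml br0 | bridge_ports
# Here we feed it a two-port excerpt of the XML above.
bridge_ports <<'EOF'
<interface type='bridge' name='br0'>
  <bridge>
    <interface type='ethernet' name='eth0'>
      <link speed='1000' state='up'/>
    </interface>
    <interface type='ethernet' name='vnet0'>
      <link state='unknown'/>
    </interface>
  </bridge>
</interface>
EOF
```

The outer bridge element is skipped because its type is 'bridge', not 'ethernet'.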

If we edit a guest’s interface XML and omit the MAC address, libvirt will helpfully auto-generate one.