IPsec provides security for IPv4 and IPv6 (access control, connectionless integrity, data origin authentication, replay detection, and confidentiality). Many platforms implement IPsec. Regardless of platform, these documents provide a good grounding in IPsec concepts:
ipsec(4), the OpenBSD man page
Two protocols provide most of the security for IPsec: Authentication Header (AH) and Encapsulating Security Payload (ESP). ESP provides encryption, authentication, and integrity (or optionally it can provide only encryption while relying on AH for authentication and integrity). AH provides only authentication and integrity (and can optionally work with ESP for encryption).
Besides ESP and AH, the other protocol closely associated with IPsec is IKEv2 (Internet Key Exchange version 2). IKE handles peer authentication and connection negotiation.
IPsec operates in either Transport or Tunnel mode.
Transport mode is generally used for end-to-end (i.e., client-server or client-client) communication. In Transport mode, AH provides authentication and integrity guarantees for packets, but not confidentiality (it doesn’t encrypt IP headers or data payloads); the traffic can be seen by third parties but not modified. In Transport mode, ESP provides confidentiality by encrypting the data payload of packets; it can be configured to also provide authentication and integrity itself, or it can rely upon AH for authentication and integrity. Transport mode (with AH or ESP) does not encrypt the IP headers at all.
Tunnel mode is generally used for network-to-network or host-to-network communication (e.g., VPN’s between routers). With ESP, Tunnel mode encrypts the entire original packet (both its IP header and its payload) and wraps it in a new outer IP header, which is not encrypted. Tunnel mode can use ESP alone to provide encryption, authentication, and integrity; or ESP for encryption together with AH for authentication and integrity.
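The difference between the two modes is easiest to see as a packet layout. The sketch below lists the fields of an ESP-protected IPv4 packet in wire order and marks which ones are encrypted. The field names and the ESP_LAYOUT table are illustrative labels chosen for this example, not identifiers from any IPsec implementation.

```python
# (field, encrypted?) in on-the-wire order for an ESP-protected IPv4 packet.
# Field names are descriptive labels for this sketch only.
ESP_LAYOUT = {
    "transport": [
        ("original IP header", False),
        ("ESP header (SPI, sequence number)", False),
        ("payload", True),
        ("ESP trailer (padding, next header)", True),
        ("ESP integrity check value", False),
    ],
    "tunnel": [
        ("new outer IP header", False),
        ("ESP header (SPI, sequence number)", False),
        ("original IP header", True),
        ("payload", True),
        ("ESP trailer (padding, next header)", True),
        ("ESP integrity check value", False),
    ],
}


def encrypted_fields(mode):
    """Return the names of the fields ESP encrypts in the given mode."""
    return [name for name, encrypted in ESP_LAYOUT[mode] if encrypted]


for mode in ("transport", "tunnel"):
    print(mode, "->", encrypted_fields(mode))
```

In transport mode the original IP header stays in the clear; in tunnel mode it moves inside the encrypted region and only the new outer header is visible.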
flow esp in from 10.0.46.0/24 to 10.0.0.0/24 peer 203.0.113.190 srcid 10.0.0.50/32 dstid 20n.nn.219.190/32
flow esp out from 10.0.0.0/24 to 10.0.46.0/24 peer 203.0.113.190 srcid 10.0.0.50/32 dstid 20n.nn.219.190/32
esp tunnel from 203.0.113.190 to 10.0.0.50 spi 0x7247a33a auth hmac-sha1 enc aes
esp tunnel from 10.0.0.50 to 203.0.113.190 spi 0xff779a61 auth hmac-sha1 enc aes
The policies in the SPD define what must be done with packets. The security associations in the SAD define how to transform the packets.
† An upcall “allows the kernel to execute a function in userspace, and potentially be returned information as a result.”
“Phase 1” and “phase 2” are the two stages of IKE negotiation.
Phase 1 is the Diffie-Hellman exchange (and similar), where peers that do not yet trust each other create a shared secret for subsequent communication. I.e., by merely exchanging unencrypted non-sensitive information, the two peers each create one secret they both know, without either of them disclosing the secret to each other or any eavesdropping third party.
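A toy Diffie-Hellman exchange shows how two peers can arrive at the same secret while only ever sending public values. This is a sketch for intuition only: the prime P and base G below are small values assumed for the example, whereas real IKE uses standardized groups with far larger parameters, and the function names are hypothetical.

```python
import secrets

# Toy parameters, NOT secure: real IKE groups use primes of 2048 bits or
# more (or elliptic-curve groups).
P = 2**31 - 1   # a small, well-known Mersenne prime
G = 7           # public base assumed for this sketch


def make_keypair():
    """Return (private, public), where public = G**private mod P."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P-2]
    return private, pow(G, private, P)


def shared_secret(own_private, peer_public):
    """Combine our private value with the peer's public value."""
    return pow(peer_public, own_private, P)


# Each peer keeps its private value and sends only the public value.
a_priv, a_pub = make_keypair()
b_priv, b_pub = make_keypair()

secret_a = shared_secret(a_priv, b_pub)
secret_b = shared_secret(b_priv, a_pub)
print(secret_a == secret_b)
```

An eavesdropper sees only P, G, a_pub, and b_pub; recovering the secret from those values is the hard problem the exchange relies on.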
After phase 1 completes, the two peers can communicate securely in phase 2. During phase 2, IKE negotiates between the peers to agree on the security parameters necessary to create SA’s.
The end result of completing phase 1 and phase 2 is that the peers each hold a matched pair of security associations they can use to securely exchange packets.
IKE is a protocol to manage keying for IPsec. IKE authenticates peers, negotiates keying material, and creates/updates/deletes SA’s.
Prefer IKEv2 instead of IKEv1.
RFC 5996: Internet Key Exchange Protocol Version 2 (IKEv2) (since obsoleted by RFC 7296)
The OS/kernel tests each packet against policies in an SPD. If a packet matches, the kernel transforms the packet according to the security association tied to the security policy. If a packet doesn’t match any policy in the SPD, the kernel either drops the packet or forwards it according to non-IPsec system policies.
Theoretically, perhaps, but in most practical implementations, no.
Dead peer detection periodically (e.g., every 10 seconds) sends “are you alive?” queries to the peer. If the peer fails to respond to a number of consecutive queries, IPsec deletes the SA from the SAD.
RFC 3706: A Traffic-Based Method of Detecting Dead Internet Key Exchange (IKE) Peers
When two peers communicate with IKE and IPSec, the situation may arise in which connectivity between the two goes down unexpectedly. This situation can arise because of routing problems, one host rebooting, etc., and in such cases, there is often no way for IKE and IPSec to identify the loss of peer connectivity. As such, the SAs can remain until their lifetimes naturally expire, resulting in a “black hole” situation where packets are tunneled to oblivion. It is often desirable to recognize black holes as soon as possible so that an entity can failover to a different peer quickly. Likewise, it is sometimes necessary to detect black holes to recover lost resources.
[…] consider two DPD peers A and B. If there is ongoing valid IPSec traffic between the two, there is little need for proof of liveliness. The IPSec traffic itself serves as the proof of liveliness. If, on the other hand, a period of time lapses during which no packet exchange occurs, the liveliness of each peer is questionable. Knowledge of the peer’s liveliness, however, is only urgently necessary if there is traffic to be sent. For example, if peer A has some IPSec packets to send after the period of idleness, it will need to know if peer B is still alive. At this point, peer A can initiate the DPD exchange.
To this end, each peer may have different requirements for detecting proof of liveliness. Peer A, for example, may require rapid failover, whereas peer B’s requirements for resource cleanup are less urgent. In DPD, each peer can define its own “worry metric” - an interval that defines the urgency of the DPD exchange. Continuing the example, peer A might define its DPD interval to be 10 seconds. Then, if peer A sends outbound IPSec traffic, but fails to receive any inbound traffic for 10 seconds, it can initiate a DPD exchange.
Peer B, on the other hand, defines its less urgent DPD interval to be 5 minutes. If the IPSec session is idle for 5 minutes, peer B can initiate a DPD exchange the next time it sends IPSec packets to A.
It is important to note that the decision about when to initiate a DPD exchange is implementation specific. An implementation might even define the DPD messages to be at regular intervals following idle periods.
The DPD exchange is a bidirectional (HELLO/ACK) Notify message.
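The "worry metric" logic described in RFC 3706 can be sketched as a small decision function. This is one possible reading, not the behavior of any particular IKE daemon: the DpdState class and should_probe function are hypothetical names, and the rule used here (probe only when there is traffic to send and nothing has been received for the interval) is an assumption drawn from the quoted example.

```python
from dataclasses import dataclass


@dataclass
class DpdState:
    interval: float       # this peer's "worry metric", in seconds
    last_inbound: float   # timestamp of the last valid inbound IPsec packet


def should_probe(state, now, has_outbound_traffic):
    """Return True when initiating a DPD exchange looks worthwhile.

    Liveness only matters if we have traffic to send, and recent inbound
    traffic already serves as proof that the peer is alive.
    """
    idle = now - state.last_inbound
    return has_outbound_traffic and idle >= state.interval


# Peer A wants rapid failover; peer B is less urgent.
peer_a = DpdState(interval=10.0, last_inbound=100.0)
peer_b = DpdState(interval=300.0, last_inbound=100.0)

print(should_probe(peer_a, now=111.0, has_outbound_traffic=True))
print(should_probe(peer_b, now=111.0, has_outbound_traffic=True))
```

With these values, peer A would start a DPD exchange after 11 seconds of silence, while peer B would keep waiting.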
In addition to the SPD and SAD, IPsec-v3 specifies a third database type: the Peer Authorization Database. The PAD holds information needed during peer authentication, linking IPsec with the key management protocol (e.g., IKE). Neither Linux nor the BSD’s appear to use a PAD (as of 2018).
IPsec creates a boundary between unprotected and protected interfaces, for a host or a network (see Figure 1 below). Traffic traversing the boundary is subject to the access controls specified by the user or administrator responsible for the IPsec configuration. These controls indicate whether packets cross the boundary unimpeded, are afforded security services via AH or ESP, or are discarded.
Figure 1 from RFC 4301:
                       Unprotected
                        ^       ^
                        |       |
          +-------------|-------|-------+
          | +-------+   |       |       |
          | |Discard|<--|       V       |
          | +-------+   |B  +--------+  |
        ................|y..| AH/ESP |..... IPsec Boundary
          |   +---+     |p  +--------+  |
          |   |IKE|<----|a      ^       |
          |   +---+     |s      |       |
          | +-------+   |s      |       |
          | |Discard|<--|       |       |
          | +-------+   |       |       |
          +-------------|-------|-------+
                        |       |
                        V       V
                        Protected
The protection offered by IPsec is based on requirements defined by a Security Policy Database (SPD) […] packets are selected for one of three processing actions based on IP and next layer header information (“Selectors”, Section 4.4.1.1) matched against entries in the SPD. Each packet is either PROTECTed using IPsec security services, DISCARDed, or allowed to BYPASS IPsec protection, based on the applicable SPD policies identified by the Selectors.
Each policy in the SPD specifies: security protocol (AH or ESP), mode (transport or tunnel), security service options, cryptographic algorithms, and in what combinations to use those protocols and services.
Security Associations (SA’s).
The concept of a “Security Association” (SA) is fundamental to IPsec. Both AH and ESP make use of SAs, and a major function of IKE is the establishment and maintenance of SAs.
An SA is a simplex “connection” that affords security services to the traffic carried by it. Security services are afforded to an SA by the use of AH, or ESP […] To secure typical, bi-directional communication between two IPsec-enabled systems, a pair of SAs (one in each direction) is required. IKE explicitly creates SA pairs in recognition of this common usage requirement.
For an SA used to carry unicast traffic, the Security Parameters Index (SPI) by itself suffices to specify an SA. […for multicast traffic, the SPI isn’t sufficient…] Each entry in the SA Database (SAD) (Section 4.4.2) must indicate whether the SA lookup makes use of the destination IP address, or the destination and source IP addresses, in addition to the SPI.
(The Index maps a policy in the SPD to an SA.)
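A SAD lookup for an inbound packet can be sketched as a dictionary search that tries the most specific key first. The ordering used here (SPI plus destination plus source, then SPI plus destination, then SPI alone) is my reading of RFC 4301 Section 4.1; the function name, the dictionary layout, the multicast SPI, and the multicast addresses are all assumptions made for the example.

```python
def lookup_sa(sad, spi, src, dst):
    """Find the SA for an inbound packet, most specific match first."""
    candidates = (
        (spi, dst, src),     # SPI + destination + source
        (spi, dst, None),    # SPI + destination
        (spi, None, None),   # SPI alone (sufficient for unicast)
    )
    for key in candidates:
        sa = sad.get(key)
        if sa is not None:
            return sa
    return None


# Keys are (spi, destination, source); None means "not part of the lookup".
sad = {
    (0x7247A33A, None, None): "unicast ESP SA",
    (0x00001000, "239.1.1.1", None): "multicast SA, any source",
    (0x00001000, "239.1.1.1", "198.51.100.7"): "multicast SA, one source",
}

print(lookup_sa(sad, 0x7247A33A, "203.0.113.190", "10.0.0.50"))
print(lookup_sa(sad, 0x00001000, "198.51.100.7", "239.1.1.1"))
```

For unicast traffic only the SPI-only key is ever needed, which is why the SPI by itself suffices there.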
RFC 4301 conceives of three databases in an IPsec implementation:
IPsec decisions preempt system routing:
The IPsec model described here embodies a clear separation between forwarding (routing) and security decisions, to accommodate a wide range of contexts where IPsec may be employed. Forwarding may be trivial, in the case where there are only two interfaces, or it may be complex, e.g., if the context in which IPsec is implemented employs a sophisticated forwarding function. IPsec assumes only that outbound and inbound traffic that has passed through IPsec processing is forwarded in a fashion consistent with the context in which IPsec is implemented. Support for nested SAs is optional; if required, it requires coordination between forwarding tables and SPD entries to cause a packet to traverse the IPsec boundary more than once.
The Security Policy Database (SPD)
An SA is a management construct used to enforce security policy for traffic crossing the IPsec boundary. Thus, an essential element of SA processing is an underlying Security Policy Database (SPD) that specifies what services are to be offered to IP datagrams and in what fashion. The form of the database and its interface are outside the scope of this specification. However, this section specifies minimum management functionality that must be provided, to allow a user or system administrator to control whether and how IPsec is applied to traffic transmitted or received by a host or transiting a security gateway. The SPD, or relevant caches, must be consulted during the processing of all traffic (inbound and outbound), including traffic not protected by IPsec, that traverses the IPsec boundary. This includes IPsec management traffic such as IKE. An IPsec implementation MUST have at least one SPD, and it MAY support multiple SPDs, if appropriate for the context in which the IPsec implementation operates. There is no requirement to maintain SPDs on a per-interface basis, as was specified in RFC 2401 [RFC2401]. However, if an implementation supports multiple SPDs, then it MUST include an explicit SPD selection function that is invoked to select the appropriate SPD for outbound traffic processing. The inputs to this function are the outbound packet and any local metadata (e.g., the interface via which the packet arrived) required to effect the SPD selection function. The output of the function is an SPD identifier (SPD-ID).
The SPD is an ordered database, consistent with the use of Access Control Lists (ACLs) or packet filters in firewalls, routers, etc. The ordering requirement arises because entries often will overlap due to the presence of (non-trivial) ranges as values for selectors. Thus, a user or administrator MUST be able to order the entries to express a desired access control policy. There is no way to impose a general, canonical order on SPD entries, because of the allowed use of wildcards for selector values and because the different types of selectors are not hierarchically related.
Processing Choices: DISCARD, BYPASS, PROTECT
An SPD must discriminate among traffic that is afforded IPsec protection and traffic that is allowed to bypass IPsec. This applies to the IPsec protection to be applied by a sender and to the IPsec protection that must be present at the receiver. For any outbound or inbound datagram, three processing choices are possible: DISCARD, BYPASS IPsec, or PROTECT using IPsec.
Each SPD entry specifies packet disposition as BYPASS, DISCARD, or PROTECT. The entry is keyed by a list of one or more selectors. The SPD contains an ordered list of these entries. The required selector types are defined in Section 4.4.1.1. These selectors are used to define the granularity of the SAs that are created in response to an outbound packet or in response to a proposal from a peer.
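Because the SPD is an ordered list, a lookup is a first-match search, and reordering overlapping entries changes the result. The sketch below matches on local and remote addresses only; the SpdEntry class, the spd_lookup function, and the host exception 10.0.46.5 are hypothetical, and falling back to DISCARD when nothing matches follows the RFC 4301 wording rather than any specific kernel.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class SpdEntry:
    local: str    # local address range, CIDR notation
    remote: str   # remote address range, CIDR notation
    action: str   # "PROTECT", "BYPASS", or "DISCARD"


def spd_lookup(spd, local_ip, remote_ip):
    """Return the action of the first entry whose selectors match."""
    local = ipaddress.ip_address(local_ip)
    remote = ipaddress.ip_address(remote_ip)
    for entry in spd:
        if (local in ipaddress.ip_network(entry.local)
                and remote in ipaddress.ip_network(entry.remote)):
            return entry.action
    return "DISCARD"   # no matching policy


spd = [
    # A narrow exception must come before the broader rule it overlaps.
    SpdEntry("10.0.0.0/24", "10.0.46.5/32", "BYPASS"),
    SpdEntry("10.0.0.0/24", "10.0.46.0/24", "PROTECT"),
]

print(spd_lookup(spd, "10.0.0.7", "10.0.46.5"))
print(spd_lookup(spd, "10.0.0.7", "10.0.46.9"))
print(spd_lookup(list(reversed(spd)), "10.0.0.7", "10.0.46.5"))
```

The last call reverses the list, so the broad PROTECT entry shadows the exception. That is the overlap problem the ordering requirement exists to solve.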
For traffic protected by IPsec, the Local and Remote address and ports in an SPD entry are swapped to represent directionality, consistent with IKE conventions. In general, the protocols that IPsec deals with have the property of requiring symmetric SAs with flipped Local/Remote IP addresses.
How to Derive the Values for an SAD Entry
For each selector in an SPD entry, the entry specifies how to derive the corresponding values for a new SA Database (SAD, see Section 4.4.2) entry from those in the SPD and the packet. The goal is to allow an SAD entry and an SPD cache entry to be created based on specific selector values from the packet, or from the matching SPD entry.
An SA may be fine-grained or coarse-grained, depending on the selectors used to define the set of traffic for the SA. For example, all traffic between two hosts may be carried via a single SA, and afforded a uniform set of security services. Alternatively, traffic between a pair of hosts might be spread over multiple SAs, depending on the applications being used (as defined by the Next Layer Protocol and related fields, e.g., ports), with different security services offered by different SAs. Similarly, all traffic between a pair of security gateways could be carried on a single SA, or one SA could be assigned for each communicating host pair. The following selector parameters MUST be supported by all IPsec implementations to facilitate control of SA granularity. Note that both Local and Remote addresses should either be IPv4 or IPv6, but not a mix of address types.
- Remote IP Address(es) (IPv4 or IPv6): […] a single IP address (via a trivial range), or a list of addresses (each a trivial range), or a range of addresses […]
- Local IP Address(es) (IPv4 or IPv6): [as above]
- Next Layer Protocol: […] Local and Remote Ports […]
- Name: This is not a selector like the others above. It is not acquired from a packet. [It’s inherited from a named SPD, and might be used, e.g., to identify a “road warrior”.]
IPsec doesn’t route. That is, policy-based IPsec traffic doesn’t hit the kernel routing tables. How, then, do we provide redundancy?
IPsec has two aspects:
Establishing the connection happens with IKE (internet key exchange). IKE has two phases:
On OpenBSD, phase two results in the creation of a pair of security associations (SA’s) on each peer. Each SA in the pair applies to a single protocol (AH or ESP) for a single direction of information flow. So, a functioning connection includes two SA’s: one to send (encrypt) data and another to receive (decrypt) it. OpenBSD stores these SA’s in its security association database (SAD).
How does IPsec know which packets to send down which tunnels (and which packets to leave alone)? Phase 2 also results in the creation of security policies (which OpenBSD also calls “flows”). When a new packet comes in, OpenBSD checks the packet against each security policy stored in its security policy database (SPD) to see if it matches.
For site-to-site VPN’s between two routers, we typically deploy IPsec in tunnel mode. A pair of security associations (SA’s) at each end — one to send/encrypt, another to receive/decrypt — connect the two sites. An SA describes how the connection will be achieved, and includes the destination IP address. IPsec stores SA’s in a security association database (SAD).
IPsec also keeps a security policy database (SPD). The SPD contains rules that IPsec tests against all packets. If a packet matches, IPsec applies the policy.
OpenBSD calls the rules in the SPD “flows”. Here we see the flows/SPD rules and the SAD entries:
bsd # ipsecctl -s all
FLOWS:
flow esp in from 10.0.46.0/24 to 10.0.0.0/24 peer 20n.nn.219.190 srcid 10.0.0.50/32 dstid 20n.nn.219.190/32 type use
flow esp out from 10.0.0.0/24 to 10.0.46.0/24 peer 20n.nn.219.190 srcid 10.0.0.50/32 dstid 20n.nn.219.190/32 type require
flow esp in from 10.0.10.0/24 to 10.0.0.0/24 peer 20n.nn.217.230 srcid 10.0.0.50/32 dstid 20n.nn.217.230/32 type use
flow esp out from 10.0.0.0/24 to 10.0.10.0/24 peer 20n.nn.217.230 srcid 10.0.0.50/32 dstid 20n.nn.217.230/32 type require
flow esp in from 10.0.1.0/24 to 10.0.0.0/24 peer 20n.nn.204.134 srcid 10.0.0.50/32 dstid 20n.nn.204.134/32 type use
flow esp out from 10.0.0.0/24 to 10.0.1.0/24 peer 20n.nn.204.134 srcid 10.0.0.50/32 dstid 20n.nn.204.134/32 type require

SAD:
esp tunnel from 20n.nn.217.230 to 10.0.0.50 spi 0x338d7414 auth hmac-sha1 enc 3des-cbc
esp tunnel from 10.0.0.50 to 20n.nn.217.230 spi 0x58db0d95 auth hmac-sha1 enc 3des-cbc
esp tunnel from 20n.nn.204.134 to 10.0.0.50 spi 0x5dfa6164 auth hmac-sha1 enc des-cbc
esp tunnel from 20n.nn.219.190 to 10.0.0.50 spi 0x7247a33a auth hmac-sha1 enc aes
esp tunnel from 10.0.0.50 to 20n.nn.204.134 spi 0x96c0978c auth hmac-sha1 enc des-cbc
esp tunnel from 10.0.0.50 to 20n.nn.219.190 spi 0xff779a61 auth hmac-sha1 enc aes
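To see the "one SA per direction" pairing in output like this, the SAD lines can be parsed and grouped by peer. The sketch assumes the line format shown above stays exactly as printed and is not a general ipsecctl parser; the function names are hypothetical, and the sample text uses a documentation address in place of the anonymized peer.

```python
import re

# Matches SAD lines of the form shown above.
SA_LINE = re.compile(
    r"esp tunnel from (?P<src>\S+) to (?P<dst>\S+) "
    r"spi (?P<spi>0x[0-9a-f]+) auth (?P<auth>\S+) enc (?P<enc>\S+)"
)


def parse_sad(text):
    """Return one dict per SA line found in the text."""
    return [m.groupdict() for m in SA_LINE.finditer(text)]


def pair_by_peer(sas, local):
    """Group SPIs into {"in": ..., "out": ...} per remote peer."""
    pairs = {}
    for sa in sas:
        outbound = sa["src"] == local
        peer = sa["dst"] if outbound else sa["src"]
        pairs.setdefault(peer, {})["out" if outbound else "in"] = sa["spi"]
    return pairs


sample = """\
esp tunnel from 203.0.113.190 to 10.0.0.50 spi 0x7247a33a auth hmac-sha1 enc aes
esp tunnel from 10.0.0.50 to 203.0.113.190 spi 0xff779a61 auth hmac-sha1 enc aes
"""

pairs = pair_by_peer(parse_sad(sample), local="10.0.0.50")
print(pairs)
```

Each peer should end up with exactly one inbound and one outbound SPI; a peer with only one of the two suggests a half-established tunnel.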
The life of an IPsec tunnel looks something like this:
Note that all packets are checked against the security policies. A packet that matches one of the policies goes into the IPsec flow. A packet that does not match any of the policies goes on to normal kernel packet flow, including system routing.
What if we need redundant tunnels or more complex routing than security policies provide?
Use system/kernel routing, with static routes or a routing protocol. Two main ways exist to do this: either wrap IPsec in a GRE tunnel, or create a virtual tunnel interface (VTI).
Peer Authorization Database (PAD)
The Peer Authorization Database (PAD) provides the link between the SPD and a security association management protocol such as IKE. It embodies several critical functions:
- identifies the peers or groups of peers that are authorized to communicate with this IPsec entity
- specifies the protocol and method used to authenticate each peer
- provides the authentication data for each peer
- constrains the types and values of IDs that can be asserted by a peer with regard to child SA creation, to ensure that the peer does not assert identities for lookup in the SPD that it is not authorized to represent, when child SAs are created
- peer gateway location info, e.g., IP address(es) or DNS names, MAY be included for peers that are known to be “behind” a security gateway
The PAD provides these functions for an IKE peer when the peer acts as either the initiator or the responder.
To perform these functions, the PAD contains an entry for each peer or group of peers with which the IPsec entity will communicate. An entry names an individual peer (a user, end system or security gateway) or specifies a group of peers (using ID matching rules defined below). The entry specifies the authentication protocol (e.g., IKEv1, IKEv2, KINK) method used (e.g., certificates or pre- shared secrets) and the authentication data (e.g., the pre-shared secret or the trust anchor relative to which the peer’s certificate will be validated).
Six types of IDs are supported for entries in the PAD, consistent with the symbolic name types and IP addresses used to identify SPD entries. The ID for each entry acts as the index for the PAD, i.e., it is the value used to select an entry. All of these ID types can be used to match IKE ID payload types. The six types are:
- DNS name (specific or partial)
- Distinguished Name (complete or sub-tree constrained)
- RFC 822 email address (complete or partially qualified)
- IPv4 address (range)
- IPv6 address (range)
- Key ID (exact match only)
The first three name types can accommodate sub-tree matching as well as exact matches. A DNS name may be fully qualified and thus match exactly one name, e.g., foo.example.com. Alternatively, the name may encompass a group of peers by being partially specified, e.g., the string “.example.com” could be used to match any DNS name ending in these two domain name components.
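The DNS-name matching rule can be sketched as follows. This is an interpretation of the quoted text, not code from an IKE implementation: the function name is hypothetical, and treating a leading dot as "match any name ending in this suffix" is an assumption based on the ".example.com" example.

```python
def dns_id_matches(pad_name, asserted):
    """Check an asserted DNS ID against a PAD entry's DNS name.

    A PAD name starting with "." is treated as a partial name that
    matches any asserted name ending in that suffix; anything else
    must match exactly. DNS names compare case-insensitively.
    """
    pad_name = pad_name.lower()
    asserted = asserted.lower()
    if pad_name.startswith("."):
        return asserted.endswith(pad_name)
    return asserted == pad_name


print(dns_id_matches("foo.example.com", "foo.example.com"))
print(dns_id_matches(".example.com", "vpn.example.com"))
print(dns_id_matches(".example.com", "badexample.com"))
```

Keeping the leading dot in the comparison prevents a name such as badexample.com from matching a rule meant for the example.com subtree.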
Once an entry is located based on an ordered search of the PAD based on ID field matching, it is necessary to verify the asserted identity, i.e., to authenticate the asserted ID. For each PAD entry, there is an indication of the type of authentication to be performed. This document requires support for two required authentication data types:
- X.509 certificate
- pre-shared secret
Once an IKE peer is authenticated, child SAs may be created. Each PAD entry contains data to constrain the set of IDs that can be asserted by an IKE peer, for matching against the SPD. Each PAD entry indicates whether the IKE ID is to be used as a symbolic name for SPD matching, or whether an IP address asserted in a traffic selector payload is to be used.
During the initial IKE exchange, the initiator and responder each assert their identity via the IKE ID payload and send an AUTH payload to verify the asserted identity. One or more CERT payloads may be transmitted to facilitate the verification of each asserted identity.
When an IKE entity receives an IKE ID payload, it uses the asserted ID to locate an entry in the PAD, using the matching rules described above. The PAD entry specifies the authentication method to be employed for the identified peer. This ensures that the right method is used for each peer and that different methods can be used for different peers. The entry also specifies the authentication data that will be used to verify the asserted identity. This data is employed in conjunction with the specified method to authenticate the peer, before any CHILD SAs are created.
Child SAs are created based on the exchange of traffic selector payloads, either at the end of the initial IKE exchange or in subsequent CREATE_CHILD_SA exchanges. The PAD entry for the (now authenticated) IKE peer is used to constrain creation of child SAs; specifically, the PAD entry specifies how the SPD is searched using a traffic selector proposal from a peer. There are two choices: either the IKE ID asserted by the peer is used to find an SPD entry via its symbolic name, or peer IP addresses asserted in traffic selector payloads are used for SPD lookups based on the remote IP address field portion of an SPD entry. It is necessary to impose these constraints on creation of child SAs to prevent an authenticated peer from spoofing IDs associated with other, legitimate peers.
The default automated key management protocol selected for use with IPsec is IKEv2.
When an automated SA/key management protocol is employed, the output from this protocol is used to generate multiple keys for a single SA. This also occurs because distinct keys are used for each of the two SAs created by IKE. If both integrity and confidentiality are employed, then a minimum of four keys are required. Additionally, some cryptographic algorithms may require multiple keys, e.g., 3DES.
Consider a situation in which a remote host (SH1) is using the Internet to gain access to a server or other machine (H2) and there is a security gateway (SG2), e.g., a firewall, through which H1’s traffic must pass. An example of this situation would be a mobile host crossing the Internet to his home organization’s firewall (SG2). This situation raises several issues:
- How does SH1 know/learn about the existence of the security gateway SG2?
- How does it authenticate SG2, and once it has authenticated SG2, how does it confirm that SG2 has been authorized to represent H2?
- How does SG2 authenticate SH1 and verify that SH1 is authorized to contact H2?
- How does SH1 know/learn about any additional gateways that provide alternate paths to H2?
To address these problems, an IPsec-supporting host or security gateway MUST have an administrative interface that allows the user/administrator to configure the address of one or more security gateways for ranges of destination addresses that require its use. This includes the ability to configure information for locating and authenticating one or more security gateways and verifying the authorization of these gateways to represent the destination host. (The authorization function is implied in the PAD.) This document does not address the issue of how to automate the discovery/verification of security gateways.
As mentioned in Section 4.4.1, “The Security Policy Database (SPD)”, the SPD (or associated caches) MUST be consulted during the processing of all traffic that crosses the IPsec protection boundary, including IPsec management traffic. If no policy is found in the SPD that matches a packet (for either inbound or outbound traffic), the packet MUST be discarded. To simplify processing, and to allow for very fast SA lookups (for SG/BITS/BITW), this document introduces the notion of an SPD cache for all outbound traffic (SPD-O plus SPD-S), and a cache for inbound, non-IPsec-protected traffic (SPD-I). (As mentioned earlier, the SAD acts as a cache for checking the selectors of inbound IPsec-protected traffic arriving on SAs.) There is nominally one cache per SPD. For the purposes of this specification, it is assumed that each cached entry will map to exactly one SA.
For inbound IPsec traffic, the SAD entry selected by the SPI serves as the cache for the selectors to be matched against arriving IPsec packets, after AH or ESP processing has been performed.
Generic routing encapsulation (GRE) provides encapsulation of packets but no encryption. GRE is truly generic; it can encapsulate almost any protocol (any OSI layer 3 protocol). For example, GRE enables transmission of non-IP protocols like IPX or AppleTalk over IP networks. GRE can tunnel any layer-3 protocols, routing protocols, multicast, etc.
GRE is light-weight. It simply wraps the tunneled protocol at one end with a new IP header and a short GRE header, and strips those additional headers at the other end of the GRE tunnel. That amounts to only an additional 24 bytes per packet (a 20-byte IPv4 header plus a 4-byte GRE header), though additional GRE options can add up to another 12 bytes.
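The overhead arithmetic can be written out as a short calculation. The sketch assumes an IPv4 outer header without options (20 bytes), a 4-byte base GRE header, and three optional 4-byte GRE fields (checksum, key, sequence number); the constant and function names are made up for the example.

```python
OUTER_IPV4_HEADER = 20   # bytes, IPv4 header without options
GRE_BASE_HEADER = 4      # bytes
GRE_OPTION_SIZE = 4      # bytes each: checksum, key, sequence number


def gre_overhead(checksum=False, key=False, sequence=False):
    """Bytes added to each packet by GRE-in-IPv4 encapsulation."""
    options = sum((checksum, key, sequence))
    return OUTER_IPV4_HEADER + GRE_BASE_HEADER + options * GRE_OPTION_SIZE


def inner_mtu(link_mtu, **gre_options):
    """Largest inner packet that fits without fragmentation."""
    return link_mtu - gre_overhead(**gre_options)


print(gre_overhead())
print(gre_overhead(checksum=True, key=True, sequence=True))
print(inner_mtu(1500))
```

On a 1500-byte link this leaves 1476 bytes for the tunneled packet before any IPsec overhead is added on top.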
Configuration of a GRE tunnel includes:
Additionally, each end of the tunnel is assigned an “inside” IP address. The addresses for both ends of the tunnel must share a subnet. These addresses are only used to route traffic down the GRE tunnel.
“IPsec over GRE” encrypts data in IPsec, then sends that encrypted data through a GRE tunnel. This exposes any data communicated outside IPsec, potentially including routing protocols.
“GRE over IPsec” sends data into a GRE tunnel, then encrypts all the data with IPsec.
A virtual tunnel interface (VTI) is a fake/software network interface. For our use case, routing IPsec, VTI works much like a GRE tunnel, but without the minor overhead of encapsulation.
VTI devices are a local feature. Because VTI adds no extra encapsulation, the other end doesn’t necessarily need to have a matching VTI interface so long as it has appropriate IPsec policies.
$ ip tunnel add vti20 remote 203.0.113.123 mode vti key 20
$ ip -s tunnel show vti20
$ ip tunnel del vti20
The device name can be anything but should start with vti. Linux treats devices named vti… specially in a few cases, like when it reports statistics.
“xfrm” is “IPsec transform”.
Virtual xfrm interfaces, added to the Linux kernel in 2018, address some limitations of VTI:
But xfrm doesn’t seem like it replaces VTI for the user. Are the changes just in the kernel?