VMware, like any overlay, imposes additional overhead on traffic that traverses the network. This section first describes the overhead added in a traditional IPsec network and compares it with the overhead added by VMware. It then explains how this overhead relates to MTU and packet fragmentation behavior in the network.
IPsec Tunnel Overhead
- Padding
  - AES encrypts data in 16-byte blocks; 16 bytes is referred to as the "block size".
  - If the body of a packet is smaller than the block size or is not an exact multiple of it, the body is padded up to the next multiple of the block size.
  - Examples:
    - A 1-byte packet becomes 16 bytes with 15 bytes of padding.
    - A 1400-byte packet becomes 1408 bytes with 8 bytes of padding.
    - A 64-byte packet does not require any padding.
- IPsec headers and trailers:
  - UDP header for NAT Traversal (NAT-T).
  - IP header for IPsec tunnel mode.
  - ESP header and trailer.
Element | Size in Bytes |
---|---|
IP Header | 20 |
UDP Header | 8 |
IPsec Sequence Number | 4 |
IPsec SPI | 4 |
Initialization Vector | 16 |
Padding | 0 – 15 |
Padding Length | 1 |
Next Header | 1 |
Authentication Data | 12 |
Total | 66 – 81 |
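The padding rule and the table total can be checked with a short calculation. The following is a minimal sketch, not a description of any real IPsec implementation: it follows the simplified rule stated above (only the packet body is padded to the block size), and the names `aes_padding` and `ipsec_overhead` are hypothetical.

```python
AES_BLOCK_SIZE = 16  # bytes, the block size stated in the text

# Fixed per-packet fields from the table (everything except padding).
IPSEC_FIXED_OVERHEAD = {
    "ip_header": 20,
    "udp_header": 8,
    "sequence_number": 4,
    "spi": 4,
    "initialization_vector": 16,
    "padding_length": 1,
    "next_header": 1,
    "authentication_data": 12,
}


def aes_padding(body_len: int) -> int:
    """Bytes of padding needed to reach the next 16-byte boundary."""
    return (-body_len) % AES_BLOCK_SIZE


def ipsec_overhead(body_len: int) -> int:
    """Total overhead in bytes for a packet body of the given length."""
    return sum(IPSEC_FIXED_OVERHEAD.values()) + aes_padding(body_len)
```

With these definitions, `aes_padding(1400)` evaluates to 8, and `ipsec_overhead` ranges from 66 bytes (no padding) to 81 bytes (15 bytes of padding), matching the table.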
VMware Tunnel Overhead
To support Dynamic Multipath Optimization™ (DMPO), VMware encapsulates packets in a protocol called the VeloCloud Multipath Protocol (VCMP). VCMP adds 31 bytes of overhead to user packets to support resequencing, error correction, network analysis, and network segmentation within a single tunnel. VCMP uses the IANA-registered UDP port 2426. To ensure consistent behavior in all potential scenarios (unencrypted, encrypted and behind a NAT, encrypted but not behind a NAT), VCMP is encrypted using transport mode IPsec, and NAT-T is always enabled with 2426 as the NAT-T port.
Packets sent to the Internet via the SD-WAN Gateway are not encrypted by default, since they will egress to the open Internet upon exiting the Gateway. As a result, the overhead for Internet Multipath traffic is lower than the overhead for VPN traffic.
VPN Traffic
Element | Size in Bytes |
---|---|
IP Header | 20 |
UDP Header | 8 |
IPsec Sequence Number | 4 |
IPsec SPI | 4 |
VCMP Header | 23 |
VCMP Data Header | 8 |
Initialization Vector | 16 |
Padding | 0 – 15 |
Padding Length | 1 |
Next Header | 1 |
Authentication Data | 12 |
Total | 97 – 112 |
Internet Multipath Traffic
Element | Size in Bytes |
---|---|
IP Header | 20 |
UDP Header | 8 |
VCMP Header | 23 |
VCMP Data Header | 8 |
Total | 59 |
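The two totals can be reproduced by summing the table rows. The sketch below is illustrative only; the constant names, the `TrafficType` enum, and `overhead_range` are hypothetical and do not correspond to any VMware API.

```python
from enum import Enum


class TrafficType(Enum):
    VPN = "vpn"
    INTERNET_MULTIPATH = "internet_multipath"


OUTER_HEADERS = 20 + 8                  # IP header + UDP header
VCMP_HEADERS = 23 + 8                   # VCMP header + VCMP data header
IPSEC_FIELDS = 4 + 4 + 16 + 1 + 1 + 12  # seq, SPI, IV, pad length, next header, auth data
MAX_PADDING = 15                        # padding is 0 to 15 bytes


def overhead_range(traffic: TrafficType) -> tuple:
    """Return (minimum, maximum) tunnel overhead in bytes for a traffic type."""
    if traffic is TrafficType.VPN:
        low = OUTER_HEADERS + VCMP_HEADERS + IPSEC_FIELDS
        return (low, low + MAX_PADDING)
    # Internet Multipath traffic is not encrypted by default, so it has
    # no IPsec fields and no padding.
    fixed = OUTER_HEADERS + VCMP_HEADERS
    return (fixed, fixed)
```

The difference between the VPN minimum and the Internet Multipath total is exactly the IPsec fields (38 bytes).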
Path MTU Discovery
After it is determined how much overhead will be applied, the SD-WAN Edge must discover the maximum permissible MTU in order to calculate the effective MTU for customer packets. To find the maximum permissible MTU, the Edge performs Path MTU Discovery:
- For public Internet WAN links:
  - Path MTU discovery is performed to all Gateways.
  - The MTU for all tunnels will be set to the minimum MTU discovered.
- For private WAN links:
  - Path MTU discovery is performed to all other Edges in the customer network.
  - The MTU for each tunnel is set based on the results of Path MTU discovery.
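The difference between the two link types can be sketched as follows. This is an illustration of the rules in the list above, assuming discovery results are available as a mapping from peer name to discovered MTU; the function name `tunnel_mtus` and the peer names are hypothetical.

```python
def tunnel_mtus(link_type, discovered):
    """Map each peer to the MTU its tunnel would use on this WAN link.

    link_type:  "public" or "private"
    discovered: dict of peer name -> MTU found by Path MTU discovery
    """
    if not discovered:
        return {}
    if link_type == "public":
        # Public links: every tunnel uses the minimum MTU discovered
        # across all Gateways.
        lowest = min(discovered.values())
        return {peer: lowest for peer in discovered}
    if link_type == "private":
        # Private links: each tunnel uses its own discovery result.
        return dict(discovered)
    raise ValueError(f"unknown link type: {link_type!r}")
```

For example, a public link that discovers 1500 bytes to one Gateway and 1420 bytes to another would use 1420 bytes for both tunnels, while a private link with the same results would keep 1500 and 1420 separately.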
The Edge will first attempt RFC 1191 Path MTU discovery, where a packet of the current known link MTU (default: 1500 bytes) is sent to the peer with the "Don’t Fragment" (DF) bit set in the IP header. If this packet is received by the remote Edge or Gateway, an acknowledgement packet of the same size is returned to the Edge. If the packet cannot reach the remote Edge or Gateway due to MTU constraints, the intermediate device that cannot forward it is expected to send an ICMP destination unreachable (fragmentation needed) message. When the Edge receives the ICMP unreachable message, it validates the message (to ensure the MTU value reported is sane) and, once the message is validated, adjusts the MTU. The process then repeats until the MTU is discovered.
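The probe-and-adjust loop described above can be simulated without a network. The sketch below makes several assumptions: the path is modeled as a list of per-hop MTUs, `send_probe` is a hypothetical stand-in for sending a DF-marked packet, and the "sanity" check is assumed to mean that the reported MTU is at least 68 bytes (the minimum every IPv4 hop must forward unfragmented) and smaller than the size just probed. The actual validation performed by the Edge is not specified here.

```python
MIN_IPV4_MTU = 68  # every IPv4 hop must forward 68 bytes unfragmented


def send_probe(size, hop_mtus):
    """Simulate a DF-marked probe across a path of per-hop MTUs.

    Returns ("ack", size) if the probe fits every hop, otherwise
    ("icmp", next_hop_mtu) from the first hop that cannot forward it.
    """
    for mtu in hop_mtus:
        if size > mtu:
            return ("icmp", mtu)
    return ("ack", size)


def discover_path_mtu(link_mtu, hop_mtus):
    """Repeat probing, lowering the MTU on each validated ICMP message."""
    current = link_mtu
    while True:
        kind, value = send_probe(current, hop_mtus)
        if kind == "ack":
            return current
        # Assumed sanity check: reject values that are too small or
        # that would not actually reduce the current MTU.
        if not (MIN_IPV4_MTU <= value < current):
            raise ValueError(f"implausible MTU reported: {value}")
        current = value
```

Because each accepted ICMP message strictly lowers the probe size, the loop always terminates. For a path with hop MTUs of 1500, 1492, and 1400, the probe sizes would be 1500, 1492, and finally 1400.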
In some cases (e.g. USB LTE dongles), the intermediate device will not send an ICMP unreachable message even if the packet is too large. If RFC 1191 discovery fails (the Edge receives neither an acknowledgement nor an ICMP unreachable message), the Edge falls back to RFC 4821 Packetization Layer Path MTU Discovery and performs a binary search to discover the MTU.
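A binary search for the MTU needs only a yes/no answer per probe, which is why it works when no ICMP messages arrive. The following is a minimal sketch under stated assumptions: `probe` is a hypothetical callback that returns True when a packet of the given size is acknowledged, and the search bounds of 68 and 1500 bytes are illustrative defaults, not values taken from the product.

```python
def binary_search_mtu(probe, low=68, high=1500):
    """Return the largest size in [low, high] that probe() accepts.

    probe(size) must return True if a packet of that size was
    acknowledged. Returns None if even the smallest size fails.
    """
    if not probe(low):
        return None
    # Invariant: probe(low) succeeded, and the answer lies in [low, high].
    while low < high:
        mid = (low + high + 1) // 2  # round up so the range always shrinks
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

As a usage example, `binary_search_mtu(lambda size: size <= 1428)` converges on 1428 in about eleven probes rather than testing every size between 68 and 1500.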
When an MTU is discovered for a peer, all tunnels to this peer are set to the same MTU. That means that if an Edge has one link with an MTU of 1400 bytes and one link with an MTU of 1500 bytes, all tunnels will have an MTU of 1400 bytes. This ensures that packets can be sent on any tunnel at any time using the same MTU. This value is referred to as the Effective Edge MTU. Based on the destination (VPN or Internet Multipath), the overhead outlined above is subtracted from the Effective Edge MTU to compute the Effective Packet MTU. For Direct Internet or other underlay traffic, the overhead is 0 bytes, and because link failover is not required, the Effective Packet MTU is identical to the discovered WAN Link MTU.
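The arithmetic for the 1400/1500-byte example can be sketched as follows. One point is an assumption of this sketch rather than something stated above: for VPN traffic it subtracts the worst-case overhead of 112 bytes, since the padding varies per packet. The function names are hypothetical.

```python
# Overhead per destination type, taken from the tables above. Using the
# worst-case VPN figure (112 bytes) is an assumption of this sketch.
OVERHEAD_BYTES = {"vpn": 112, "internet_multipath": 59}


def effective_edge_mtu(link_mtus):
    """Lowest MTU across the links, so any tunnel can carry any packet."""
    return min(link_mtus)


def effective_packet_mtu(destination, link_mtus):
    """Effective Edge MTU minus the tunnel overhead for the destination."""
    return effective_edge_mtu(link_mtus) - OVERHEAD_BYTES[destination]


def direct_packet_mtu(wan_link_mtu):
    """Direct/underlay traffic has 0 bytes of overhead and no failover."""
    return wan_link_mtu
```

With links of 1400 and 1500 bytes, this yields an Effective Edge MTU of 1400 bytes, and under the worst-case assumption an Effective Packet MTU of 1288 bytes for VPN traffic and 1341 bytes for Internet Multipath traffic. Direct traffic leaving the 1500-byte link keeps the full 1500 bytes.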
VPN Traffic and MTU
Now that the SD-WAN Edge has discovered the MTU and calculated the overheads, an effective MTU can be computed for client traffic. The Edge will attempt to enforce this MTU as efficiently as possible for the various potential types of traffic received.
TCP Traffic
The Edge automatically performs TCP MSS (Maximum Segment Size) adjustment for TCP packets received. As SYN and SYN|ACK packets traverse the Edge, the MSS is rewritten based on the Effective Packet MTU.
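MSS adjustment amounts to capping the advertised MSS so that a full-sized segment still fits in the Effective Packet MTU. The sketch below assumes IPv4 with a 20-byte IP header and a 20-byte TCP header (no options counted); the function name `clamp_mss` is hypothetical, and the exact calculation used by the Edge is not documented here.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, fixed TCP header


def clamp_mss(advertised_mss, effective_packet_mtu):
    """Return the MSS to write into a SYN or SYN|ACK passing the Edge."""
    max_mss = effective_packet_mtu - IPV4_HEADER - TCP_HEADER
    # Only ever lower the MSS; a smaller advertised value is left alone.
    return min(advertised_mss, max_mss)
```

For example, with an Effective Packet MTU of 1288 bytes, a client advertising the common Ethernet MSS of 1460 would have it rewritten to 1248, while a client advertising 1200 would be unaffected.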
Non-TCP Traffic without DF bit set
If the packet is larger than the Effective Packet MTU, the Edge automatically performs IP fragmentation as per RFC 791.
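RFC 791 fragmentation splits the IP payload into pieces whose lengths are multiples of 8 bytes (except the last), because the fragment offset field counts in 8-byte units. The following sketch computes the fragment layout only; it assumes a 20-byte IPv4 header on every fragment, and the function name `fragment` is hypothetical.

```python
def fragment(payload_len, mtu, header_len=20):
    """Plan RFC 791 fragments for an IP payload.

    Returns a list of (offset_in_8_byte_units, data_len, more_fragments).
    """
    # Data per fragment must be a multiple of 8 bytes (except the last).
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments
```

A 1500-byte packet (1480 bytes of payload) sent through an Effective Packet MTU of 1288 bytes would be split into one fragment carrying 1264 bytes and a second carrying the remaining 216 bytes at offset 158.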
Non-TCP Traffic with DF bit set
If the packet is larger than the Effective Packet MTU:
- The first time a packet is received for this flow (IP 5-tuple), the Edge drops the packet and sends an ICMP destination unreachable (fragmentation needed) message as per RFC 792.
- Subsequent packets for the same flow that are still too large are fragmented into multiple VCMP packets and reassembled transparently before handoff at the remote end.
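The two-step behavior for oversized DF packets can be modeled as a small per-flow state check. This is a sketch of the decision logic only, assuming flows are tracked in a set keyed by the IP 5-tuple; the class name `DfFlowHandler` and the returned action strings are hypothetical.

```python
class DfFlowHandler:
    """Decide what to do with non-TCP packets that have the DF bit set."""

    def __init__(self, effective_packet_mtu):
        self.mtu = effective_packet_mtu
        self.notified_flows = set()  # flows that already received an ICMP

    def handle(self, flow, packet_len):
        """flow is a 5-tuple: (src_ip, dst_ip, protocol, src_port, dst_port)."""
        if packet_len <= self.mtu:
            return "forward"
        if flow not in self.notified_flows:
            # First oversized packet: drop it and tell the sender.
            self.notified_flows.add(flow)
            return "drop_and_send_icmp"
        # The sender kept sending oversized packets: fragment inside VCMP
        # and let the remote end reassemble before handoff.
        return "fragment_in_vcmp"
```

This gives a sender that honors the ICMP message a chance to reduce its packet size, while a sender that ignores it still has its traffic delivered.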