Consider the requirements for configuring Tier-0 and Tier-1 gateways to implement BGP routing in VMware Cloud Foundation, and the best practices for optimal traffic routing on a standard or stretched cluster in an environment with a single or multiple VMware Cloud Foundation instances.

BGP Routing

The BGP routing design has the following characteristics:

  • Enables dynamic routing by using NSX.

  • Offers increased scale and flexibility.

  • Is a proven protocol that is designed for peering between networks under independent administrative control, such as data center networks and the NSX SDN.

Note:

These design recommendations do not include BFD. However, if you require convergence faster than the BGP timers allow, you must enable BFD both on the physical network and on the NSX Tier-0 gateway.
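The trade-off in this note can be illustrated with a little arithmetic. The following is a minimal sketch, not a sizing tool: the 12-second hold timer comes from the timer recommendation in this design, while the BFD interval and multiplier are assumed example values, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BfdSession:
    """Hypothetical BFD parameters; not values mandated by this design."""
    interval_ms: int   # negotiated transmit/receive interval
    multiplier: int    # missed packets tolerated before the session is declared down

    def detection_time_s(self) -> float:
        # Detection time is the negotiated interval times the detect multiplier.
        return self.interval_ms * self.multiplier / 1000.0


def bfd_needed(required_convergence_s: float, bgp_hold_time_s: float) -> bool:
    """BFD is needed when the target is tighter than the BGP hold timer allows."""
    return required_convergence_s < bgp_hold_time_s


bgp_hold_time_s = 12.0                           # hold timer from the timer recommendation
bfd = BfdSession(interval_ms=500, multiplier=3)  # assumed example values

print(bfd.detection_time_s())                    # 1.5
print(bfd_needed(required_convergence_s=3.0, bgp_hold_time_s=bgp_hold_time_s))  # True
```

With these assumed values, BFD detects a failed neighbor in 1.5 seconds, while BGP alone would wait for the 12-second hold timer to expire.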

BGP Routing Design Requirements

You must meet the following design requirements for standard and stretched clusters in your routing design for a single VMware Cloud Foundation instance. For NSX Federation, additional requirements exist.

Table 1. BGP Routing Design Requirements for VMware Cloud Foundation

Requirement ID

Design Requirement

Justification

Implication

VCF-NSX-BGP-REQD-CFG-001

To enable ECMP between the Tier-0 gateway and the Layer 3 devices (ToR switches or upstream devices), create two VLANs.

The ToR switches or upstream Layer 3 devices have an SVI on one of the two VLANs, and each Edge node in the cluster has an interface on each VLAN.

Supports multiple equal-cost routes on the Tier-0 gateway and provides more resiliency and better bandwidth use in the network.

Additional VLANs are required.

VCF-NSX-BGP-REQD-CFG-002

Assign a named teaming policy to the VLAN segments that connect to the Layer 3 device pair.

Pins the VLAN traffic on each segment to its target edge node interface. From there, the traffic is directed to the host physical NIC that is connected to the target top of rack switch.

None.

VCF-NSX-BGP-REQD-CFG-003

Create a VLAN transport zone for edge uplink traffic.

Enables the configuration of VLAN segments on the N-VDS in the edge nodes.

Additional VLAN transport zones might be required if the edge nodes are not connected to the same top of rack switch pair.

VCF-NSX-BGP-REQD-CFG-004

Deploy a Tier-1 gateway and connect it to the Tier-0 gateway.

Creates a two-tier routing architecture.

Abstracts the NSX logical components which interact with the physical data center from the logical components which provide SDN services.

A Tier-1 gateway can only be connected to a single Tier-0 gateway.

In cases where multiple Tier-0 gateways are required, you must create multiple Tier-1 gateways.

VCF-NSX-BGP-REQD-CFG-005

Deploy a Tier-1 gateway to the NSX Edge cluster.

Enables stateful services, such as load balancers and NAT, for SDDC management components.

Because a Tier-1 gateway always works in active-standby mode, the gateway supports stateful services.

None.
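The uplink layout described in VCF-NSX-BGP-REQD-CFG-001 can be checked with a short model. This is only a sketch under stated assumptions: the VLAN IDs, node names, and the `EdgeNode`, `TorSwitch`, `validate_uplinks`, and `ecmp_next_hops` names are hypothetical and do not correspond to any NSX API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeNode:
    name: str
    uplink_vlans: frozenset  # VLAN IDs on which this edge node has an uplink interface


@dataclass(frozen=True)
class TorSwitch:
    name: str
    svi_vlan: int  # the single uplink VLAN on which this ToR switch has an SVI


def validate_uplinks(vlans, edges, tors):
    """Return a list of problems; an empty list means the layout matches the requirement."""
    problems = []
    if len(vlans) != 2:
        problems.append(f"expected 2 uplink VLANs, found {len(vlans)}")
    for edge in edges:
        missing = set(vlans) - edge.uplink_vlans
        if missing:
            problems.append(f"{edge.name} has no interface on VLAN(s) {sorted(missing)}")
    for tor in tors:
        if tor.svi_vlan not in vlans:
            problems.append(f"{tor.name} SVI is on VLAN {tor.svi_vlan}, not an uplink VLAN")
    uncovered = set(vlans) - {tor.svi_vlan for tor in tors}
    if uncovered:
        problems.append(f"no ToR SVI on VLAN(s) {sorted(uncovered)}")
    return problems


def ecmp_next_hops(edge, tors):
    """ToR switches that an edge node can peer with, one per shared VLAN."""
    return [tor.name for tor in tors if tor.svi_vlan in edge.uplink_vlans]


vlans = {2081, 2082}  # hypothetical uplink VLAN IDs
edges = [EdgeNode("edge-01", frozenset({2081, 2082})),
         EdgeNode("edge-02", frozenset({2081, 2082}))]
tors = [TorSwitch("tor-a", 2081), TorSwitch("tor-b", 2082)]

print(validate_uplinks(vlans, edges, tors))  # []
print(ecmp_next_hops(edges[0], tors))        # ['tor-a', 'tor-b']
```

Each edge node ends up with two next hops, one per uplink VLAN, which is what allows the Tier-0 gateway to install equal-cost routes.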

Table 2. BGP Routing Design Requirements for Stretched Clusters in VMware Cloud Foundation

Requirement ID

Design Requirement

Justification

Implication

VCF-NSX-BGP-REQD-CFG-006

Extend the uplink VLANs to the top of rack switches so that the VLANs are stretched between both availability zones.

Because the NSX Edge nodes fail over between the availability zones, this ensures uplink connectivity to the top of rack switches in both availability zones regardless of the zone in which the NSX Edge nodes currently reside.

You must configure a stretched Layer 2 network between the availability zones by using physical network infrastructure.

VCF-NSX-BGP-REQD-CFG-007

Provide this SVI configuration on the top of rack switches:

  • In the second availability zone, configure the top of rack switches or upstream Layer 3 devices with an SVI on each of the two uplink VLANs.

  • Make the top of rack switch SVI in both availability zones part of a common stretched Layer 2 network between the availability zones.

Enables the communication of the NSX Edge nodes to the top of rack switches in both availability zones over the same uplink VLANs.

You must configure a stretched Layer 2 network between the availability zones by using the physical network infrastructure.

VCF-NSX-BGP-REQD-CFG-008

Provide this VLAN configuration:

  • Use two VLANs to enable ECMP between the Tier-0 gateway and the Layer 3 devices (top of rack switches or leaf switches).

  • The ToR switches or upstream Layer 3 devices have an SVI on one of the two VLANs, and each NSX Edge node has an interface on each VLAN.

Supports multiple equal-cost routes on the Tier-0 gateway, and provides more resiliency and better bandwidth use in the network.

  • Extra VLANs are required.

  • Requires stretching uplink VLANs between availability zones.

VCF-NSX-BGP-REQD-CFG-009

Create an IP prefix list that permits access to route advertisement by any network instead of using the default IP prefix list.

Used in a route map to prepend a path to one or more autonomous systems (AS-path prepend) for BGP neighbors in the second availability zone.

You must manually create an IP prefix list that is identical to the default one.

VCF-NSX-BGP-REQD-CFG-010

Create a route map-out that contains the custom IP prefix list and an AS-path prepend value set to the Tier-0 local AS added twice.

  • Used for configuring neighbor relationships with the Layer 3 devices in the second availability zone.

  • Ensures that all ingress traffic passes through the first availability zone.

You must manually create the route map.

The two NSX Edge nodes route north-south traffic through the second availability zone only if the connection to their BGP neighbors in the first availability zone is lost, for example, if the top of rack switch pair or the entire availability zone fails.

VCF-NSX-BGP-REQD-CFG-011

Create an IP prefix list that permits access to route advertisement by network 0.0.0.0/0 instead of using the default IP prefix list.

Used in a route map to configure local-preference on the learned default route for BGP neighbors in the second availability zone.

You must manually create an IP prefix list that is identical to the default one.

VCF-NSX-BGP-REQD-CFG-012

Apply a route map-in that contains the IP prefix list for the default route 0.0.0.0/0, and assign a lower local-preference, for example, 80, to the learned default route and a lower local-preference, for example, 90, to any other learned routes.

  • Used for configuring neighbor relationships with the Layer 3 devices in the second availability zone.

  • Ensures that all egress traffic passes through the first availability zone.

You must manually create the route map.

The two NSX Edge nodes route north-south traffic through the second availability zone only if the connection to their BGP neighbors in the first availability zone is lost, for example, if the top of rack switch pair or the entire availability zone fails.

VCF-NSX-BGP-REQD-CFG-013

Configure the neighbors of the second availability zone to use the route maps as In and Out filters respectively.

Makes the path in and out of the second availability zone less preferred because the AS path is longer and the local preference is lower. As a result, all traffic passes through the first zone.

The two NSX Edge nodes route north-south traffic through the second availability zone only if the connection to their BGP neighbors in the first availability zone is lost, for example, if the top of rack switch pair or the entire availability zone fails.
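The effect of the route maps in requirements VCF-NSX-BGP-REQD-CFG-009 through VCF-NSX-BGP-REQD-CFG-013 can be sketched with a simplified best-path comparison. The model below considers only local preference and AS-path length, which are the two attributes this design manipulates; real BGP best-path selection has more steps. The AS numbers, neighbor names, prefixes, and the default local preference of 100 are assumptions for illustration.

```python
from dataclasses import dataclass

DEFAULT_LOCAL_PREF = 100  # common BGP default; assumed for first-zone neighbors
TIER0_AS = 65003          # hypothetical Tier-0 local AS


@dataclass(frozen=True)
class Path:
    neighbor: str
    zone: str
    prefix: str
    as_path: tuple
    local_pref: int = DEFAULT_LOCAL_PREF


def route_map_in(path):
    """Inbound policy for second-zone neighbors: lower the local preference."""
    if path.zone != "az2":
        return path
    local_pref = 80 if path.prefix == "0.0.0.0/0" else 90
    return Path(path.neighbor, path.zone, path.prefix, path.as_path, local_pref)


def route_map_out(path):
    """Outbound policy for second-zone neighbors: prepend the Tier-0 local AS twice."""
    if path.zone != "az2":
        return path
    return Path(path.neighbor, path.zone, path.prefix,
                (TIER0_AS, TIER0_AS) + path.as_path, path.local_pref)


def best_path(paths):
    """Simplified selection: highest local preference, then shortest AS path."""
    return max(paths, key=lambda p: (p.local_pref, -len(p.as_path)))


# Egress: default routes learned by the Tier-0 gateway from both zones.
learned = [route_map_in(Path("tor-az1", "az1", "0.0.0.0/0", (65001,))),
           route_map_in(Path("tor-az2", "az2", "0.0.0.0/0", (65002,)))]
print(best_path(learned).neighbor)      # tor-az1
print(best_path(learned[1:]).neighbor)  # tor-az2 (first zone lost)

# Ingress: an NSX prefix as advertised toward each zone.
advertised = [route_map_out(Path("tor-az1", "az1", "10.10.0.0/24", (TIER0_AS,))),
              route_map_out(Path("tor-az2", "az2", "10.10.0.0/24", (TIER0_AS,)))]
print(best_path(advertised).neighbor)   # tor-az1
```

In both directions the second availability zone is selected only when no path through the first zone remains, which matches the implication stated for these requirements.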

Table 3. BGP Routing Design Requirements for NSX Federation in VMware Cloud Foundation

Requirement ID

Design Requirement

Justification

Implication

VCF-NSX-BGP-REQD-CFG-014

Extend the Tier-0 gateway to the second VMware Cloud Foundation instance.

  • Supports ECMP north-south routing on all nodes in the NSX Edge cluster.

  • Enables support for cross-instance Tier-1 gateways and cross-instance network segments.

The Tier-0 gateway deployed in the second instance is removed.

VCF-NSX-BGP-REQD-CFG-015

Set the Tier-0 gateway as primary for all VMware Cloud Foundation instances.

  • In NSX Federation, a Tier-0 gateway allows egress traffic from connected Tier-1 gateways only in its primary locations.

  • Local ingress and egress traffic is controlled independently at the Tier-1 level. No segments are provisioned directly to the Tier-0 gateway.

  • A mixture of network spans (local to a VMware Cloud Foundation instance or spanning multiple instances) is enabled without requiring additional Tier-0 gateways and hence edge nodes.

  • If a failure in a VMware Cloud Foundation instance occurs, the local-instance networking in the other instance remains available without manual intervention.

None.

VCF-NSX-BGP-REQD-CFG-016

From the global Tier-0 gateway, establish BGP neighbor peering to the ToR switches connected to the second VMware Cloud Foundation instance.

  • Enables the learning and advertising of routes in the second VMware Cloud Foundation instance.

  • Facilitates a potential automated failover of networks from the first to the second VMware Cloud Foundation instance.

None.

VCF-NSX-BGP-REQD-CFG-017

Use a stretched Tier-1 gateway and connect it to the Tier-0 gateway for cross-instance networking.

  • Enables network span between the VMware Cloud Foundation instances because NSX network segments follow the span of the gateway they are attached to.

  • Creates a two-tier routing architecture.

None.

VCF-NSX-BGP-REQD-CFG-018

Assign the NSX Edge cluster in each VMware Cloud Foundation instance to the stretched Tier-1 gateway. Set the first VMware Cloud Foundation instance as primary and the second instance as secondary.

  • Enables cross-instance network span between the first and second VMware Cloud Foundation instances.

  • Enables deterministic ingress and egress traffic for the cross-instance network.

  • If a VMware Cloud Foundation instance failure occurs, enables deterministic failover of the Tier-1 traffic flow.

  • During the recovery of the inaccessible VMware Cloud Foundation instance, enables deterministic failback of the Tier-1 traffic flow, preventing unintended asymmetrical routing.

  • Eliminates the need to use BGP attributes in the first and second VMware Cloud Foundation instances to influence location preference and failover.

You must manually fail over and fail back the cross-instance network from the standby NSX Global Manager.

VCF-NSX-BGP-REQD-CFG-019

Assign the NSX Edge cluster in each VMware Cloud Foundation instance to the local Tier-1 gateway for that VMware Cloud Foundation instance.

  • Enables instance-specific networks to be isolated to their specific instances.

  • Enables deterministic flow of ingress and egress traffic for the instance-specific networks.

You can use the service router that is created for the Tier-1 gateway for networking services. However, such configuration is not required for network connectivity.

VCF-NSX-BGP-REQD-CFG-020

Set each local Tier-1 gateway only as primary in that instance. Avoid setting the gateway as secondary in the other instances.

Prevents the need to use BGP attributes in primary and secondary instances to influence the instance ingress-egress preference.

None.
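The primary and secondary assignments in requirements VCF-NSX-BGP-REQD-CFG-015 through VCF-NSX-BGP-REQD-CFG-020 can be illustrated with a small model. This is a sketch under simplifying assumptions: it treats a location as an egress point only when both the Tier-1 and the Tier-0 gateway are primary there, and all instance names, gateway names, and function names are hypothetical rather than part of the NSX Global Manager API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gateway:
    name: str
    primary: frozenset                  # locations where the gateway is primary
    secondary: frozenset = frozenset()  # locations where the gateway is secondary


INSTANCES = frozenset({"vcf-a", "vcf-b"})  # hypothetical instance names

tier0 = Gateway("t0-global", primary=INSTANCES)
tier1_stretched = Gateway("t1-cross", primary=frozenset({"vcf-a"}),
                          secondary=frozenset({"vcf-b"}))
tier1_local_b = Gateway("t1-local-b", primary=frozenset({"vcf-b"}))


def egress_locations(tier1, tier0, failed=frozenset()):
    """Locations where traffic from a Tier-1 gateway can leave through the Tier-0."""
    return (tier1.primary & tier0.primary) - failed


def fail_over(tier1, to_location):
    """Manual failover of a stretched Tier-1: promote a secondary location to primary."""
    if to_location not in tier1.secondary:
        raise ValueError(f"{to_location} is not a secondary location of {tier1.name}")
    new_secondary = (tier1.secondary - {to_location}) | tier1.primary
    return Gateway(tier1.name, frozenset({to_location}), new_secondary)


down = frozenset({"vcf-a"})  # simulate a failure of the first instance

print(sorted(egress_locations(tier1_stretched, tier0)))        # ['vcf-a']
print(sorted(egress_locations(tier1_local_b, tier0, down)))    # ['vcf-b']
print(sorted(egress_locations(tier1_stretched, tier0, down)))  # []

promoted = fail_over(tier1_stretched, "vcf-b")
print(sorted(egress_locations(promoted, tier0, down)))         # ['vcf-b']
```

The local Tier-1 gateway in the surviving instance keeps working without intervention, while the cross-instance network regains egress only after the manual failover step.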

BGP Routing Design Recommendations

In your routing design for a single VMware Cloud Foundation instance, you can apply certain best practices for standard and stretched clusters. For NSX Federation, additional recommendations are available.

Table 4. BGP Routing Design Recommendations for VMware Cloud Foundation

Recommendation ID

Design Recommendation

Recommendation Justification

Recommendation Implication

VCF-NSX-BGP-RCMD-CFG-001

Deploy an active-active Tier-0 gateway.

Supports ECMP north-south routing on all Edge nodes in the NSX Edge cluster.

Active-active Tier-0 gateways cannot provide stateful services such as NAT.

VCF-NSX-BGP-RCMD-CFG-002

Configure the BGP Keep Alive Timer to 4 seconds and the Hold Down Timer to 12 seconds or lower between the top of rack switches and the Tier-0 gateway.

Balances fast failure detection between the top of rack switches and the Tier-0 gateway against overburdening the top of rack switches with keep-alive traffic.

With longer timers, data about an unresponsive router remains in the routing table longer. As a result, the active router continues to send traffic to a router that is down.

These timers must be aligned with the data center fabric design of your organization.

VCF-NSX-BGP-RCMD-CFG-003

Do not enable Graceful Restart between BGP neighbors.

Avoids loss of traffic.

On the Tier-0 gateway, BGP peers from all the gateways are always active. On a failover, the Graceful Restart capability increases the time a remote neighbor takes to select an alternate Tier-0 gateway. As a result, BFD-based convergence is delayed.

None.

VCF-NSX-BGP-RCMD-CFG-004

Enable helper mode for Graceful Restart mode between BGP neighbors.

Avoids loss of traffic.

During a router restart, helper mode works with the graceful restart capability of upstream routers to maintain the forwarding table, which in turn forwards packets to a down neighbor even after the BGP timers have expired, causing loss of traffic.

None.

VCF-NSX-BGP-RCMD-CFG-005

Enable Inter-SR iBGP routing.

If all northbound eBGP sessions of an edge node are down, north-south traffic continues to flow because it is routed through a different edge node.

None.

VCF-NSX-BGP-RCMD-CFG-006

Deploy a Tier-1 gateway in non-preemptive failover mode.

Ensures that after a failed NSX Edge transport node is back online, it does not take over the gateway services, thus preventing a short service outage.

None.

VCF-NSX-BGP-RCMD-CFG-007

Enable standby relocation of the Tier-1 gateway.

Ensures that if an edge failure occurs, a standby Tier-1 gateway is created on another edge node.

None.
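The difference between preemptive and non-preemptive failover in VCF-NSX-BGP-RCMD-CFG-006 can be seen in a toy state model. This is a minimal sketch, not NSX behavior in detail: the `Tier1Gateway` class and the edge node names are hypothetical, and each change of the active node is counted as one switchover, that is, one potential short outage.

```python
class Tier1Gateway:
    """Toy active-standby model of a Tier-1 gateway on two edge nodes."""

    def __init__(self, preferred, standby, preemptive):
        self.preferred = preferred    # edge node that is active initially
        self.standby = standby
        self.preemptive = preemptive
        self.active = preferred
        self.switchovers = 0          # each switchover is a potential short outage

    def node_failed(self, node):
        # Only a failure of the currently active node triggers a switchover.
        if node == self.active:
            self.active = self.standby if node == self.preferred else self.preferred
            self.switchovers += 1

    def node_recovered(self, node):
        # In preemptive mode the preferred node takes the services back.
        if self.preemptive and node == self.preferred and self.active != node:
            self.active = node
            self.switchovers += 1


def fail_and_recover(preemptive):
    gateway = Tier1Gateway("edge-01", "edge-02", preemptive)
    gateway.node_failed("edge-01")
    gateway.node_recovered("edge-01")
    return gateway.active, gateway.switchovers


print(fail_and_recover(preemptive=False))  # ('edge-02', 1)
print(fail_and_recover(preemptive=True))   # ('edge-01', 2)
```

In non-preemptive mode the recovered node stays in standby, so the failure causes one switchover instead of two.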

Table 5. BGP Routing Design Recommendations for NSX Federation in VMware Cloud Foundation

Recommendation ID

Design Recommendation

Justification

Implication

VCF-NSX-BGP-RCMD-CFG-008

Use Tier-1 gateways to control the span of networks and ingress and egress traffic in the VMware Cloud Foundation instances.

Enables a mixture of network spans (isolated to a VMware Cloud Foundation instance or spanning multiple instances) without requiring additional Tier-0 gateways and hence edge nodes.

To control location span, a Tier-1 gateway must be assigned to an edge cluster and hence has the Tier-1 SR component. East-west traffic between Tier-1 gateways with SRs needs to physically traverse an edge node.

VCF-NSX-BGP-RCMD-CFG-009

Allocate a Tier-1 gateway in each instance for instance-specific networks and connect it to the stretched Tier-0 gateway.

  • Creates a two-tier routing architecture.

  • Enables local-instance networks that do not span the VMware Cloud Foundation instances.

  • Guarantees that local-instance networks remain available if a failure occurs in another VMware Cloud Foundation instance.

None.