The routing design considers the different levels of routing in the environment and uses them to define a set of principles for designing a scalable routing solution.


The Provider Logical Router (PLR) handles the North-South traffic to and from a tenant and the management applications inside application virtual networks.


Internal East-West routing at the layer beneath the PLR deals with the application workloads.

Table 1. Design Decisions on the Routing Model of NSX

Decision ID

Design Decision

Design Justification

Design Implications


Deploy a minimum of two NSX Edge services gateways (ESGs) in an ECMP configuration for North-South routing.

  • You use an NSX ESG for directing North-South traffic. Using ECMP provides multiple paths in and out of the SDDC.

  • Failover is faster than deploying ESGs in HA mode.

ECMP requires two VLANs in each availability zone and region for uplinks, which adds one VLAN compared with a traditional HA ESG configuration.


Deploy a single NSX DLR in HA mode to provide East-West routing.

Using the DLR reduces the hop count between nodes attached to it to 1. This reduces latency and improves performance.

A DLR is limited to 1,000 logical interfaces. If that limit is reached, you must deploy a new DLR.


Use BGP as the dynamic routing protocol inside the SDDC.

Using BGP as opposed to OSPF eases the implementation of dynamic routing. There is no need to plan and design access to OSPF area 0 inside the SDDC. OSPF area 0 varies based on customer configuration.

BGP requires configuring each ESG and DLR with the remote router that it exchanges routes with.


Set the BGP Keep Alive Timer to 1 second and the Hold Down Timer to 3 seconds between the DLR and all ESGs that provide North-South routing.

With the Keep Alive and Hold Down Timers between the DLR and the ECMP ESGs set low, a failure is detected more quickly and the routing table is updated faster.

If an ESXi host becomes resource constrained, the ESG running on that ESXi host might no longer be used even though it is still up.


Set the BGP Keep Alive Timer to 4 seconds and the Hold Down Timer to 12 seconds between the ToR switches and all ESGs that provide North-South routing.

These values balance fast failure detection between the ToR switches and the ESGs against overburdening the ToR switches with keep alive traffic.

Because longer timers are used to detect a dead router, a dead router stays in the routing table longer and traffic continues to be sent to it during that time.


Create one or more static routes on ECMP-enabled edges for subnets behind the DLR, with a higher admin cost than the dynamically learned routes.

When the DLR control VM fails over, router adjacency is lost, and the routes from upstream devices, such as the ToR switches, to subnets behind the DLR are lost with it.

You must configure each ECMP edge device with static routes to the DLR. If any new subnets are added behind the DLR, the routes must be updated on the ECMP edges.


Disable Graceful Restart on all ECMP Edges and Logical Router Control Virtual Machines.

Graceful Restart maintains the forwarding table, so packets continue to be forwarded to a down neighbor even after the BGP timers have expired, which causes loss of traffic.



In the consolidated cluster, do not create an anti-affinity rule to separate ECMP edges and Logical Router Control Virtual Machines.

  • Because the cluster contains four hosts, an anti-affinity rule that contains four virtual machines prevents a host from entering maintenance mode for life cycle activities.

  • During a host failure, vSphere HA cannot restart the virtual machine because of the anti-affinity rule.

If the active Logical Router control virtual machine and an ECMP edge reside on the same host and that host fails, a dead path in the routing table appears until the standby Logical Router control virtual machine starts its routing process and updates the routing tables.

To avoid this situation, add an additional host to the cluster and create an anti-affinity rule to keep these virtual machines separated.
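
The timer and static-route decisions in Table 1 can be illustrated with a small Python sketch. It is not NSX code or an NSX API; it only models two ideas from the table: the trade-off between detection time and keep alive traffic for the 1/3 and 4/12 timer pairs, and a static route that takes over when the dynamically learned route disappears. The prefix, next-hop address, and distance values (20 and 240) are hypothetical, and `admin_distance` stands in for what the table calls admin cost.

```python
from dataclasses import dataclass


def bgp_timer_profile(keepalive_s: int, hold_s: int) -> dict:
    """Summarize the trade-off for one keep alive / hold down timer pair (seconds)."""
    return {
        # A silent peer is declared down once the hold down timer expires.
        "worst_case_detection_s": hold_s,
        # Keep alive messages each side sends per minute on one BGP session.
        "keepalives_per_minute": 60 / keepalive_s,
    }


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    source: str          # "bgp" or "static"
    admin_distance: int  # lower value wins (hypothetical values below)


def best_route(routes, prefix):
    """Return the lowest-distance route for a prefix, or None if there is none."""
    candidates = [r for r in routes if r.prefix == prefix]
    return min(candidates, key=lambda r: r.admin_distance, default=None)


# Timer pairs from Table 1.
dlr_to_esg = bgp_timer_profile(keepalive_s=1, hold_s=3)
tor_to_esg = bgp_timer_profile(keepalive_s=4, hold_s=12)

# Hypothetical subnet behind the DLR, known on an ECMP edge through both BGP
# and a static route with a higher admin cost.
prefix = "192.168.11.0/24"
bgp_route = Route(prefix, "192.168.10.3", "bgp", 20)
static_route = Route(prefix, "192.168.10.3", "static", 240)

normal = best_route([bgp_route, static_route], prefix)  # BGP route preferred
failover = best_route([static_route], prefix)           # BGP route withdrawn
```

While the BGP adjacency is up, the dynamically learned route is selected. When the DLR control VM fails over and the BGP route is withdrawn, the static route is the only remaining candidate, so the edge keeps a path to the subnet.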

Transit Network and Dynamic Routing

Dedicated networks are needed to carry traffic between the distributed logical router (DLR) and the edge gateways, and between the edge gateways and the top of rack (ToR) switches. These networks are used for exchanging routing tables and for carrying transit traffic.

Table 2. Design Decisions on the Transit Network

Decision ID

Design Decision

Design Justification

Design Implications


Create a virtual switch for use as the transit network between the DLR and ESGs.

The virtual switch allows the DLR and ESGs to exchange routing information. The DLR provides East-West routing in the stack while the ESGs provide North-South routing.

A virtual switch for use as a transit network is required.


Create two VLANs to enable ECMP between the North-South ESGs and the Layer 3 device (ToR switch or upstream device). The ToR switches or upstream Layer 3 devices have an SVI on one of the two VLANs, and each North-South ESG has an interface on each VLAN.

This enables the ESGs to have multiple equal-cost routes and provides more resiliency and better bandwidth use in the network.

Extra VLANs are required.
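
To show why multiple equal-cost routes improve resiliency and bandwidth use, the following Python sketch models a generic ECMP selection: each flow is hashed to one of the available next hops, so packets of the same flow stay on one path while different flows spread across paths. The source does not describe the hashing that NSX or the ToR switches actually use, so the hash function, the flow tuple layout, and the next-hop names are all assumptions made for illustration.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Map a flow tuple to one of the equal-cost next hops.

    Generic illustration only: a stable hash keeps a flow on one path
    for as long as the set of next hops does not change.
    """
    if not next_hops:
        raise ValueError("no next hops available")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical next hops: two ESGs, each reachable over both uplink VLANs.
paths = ["esg-01/vlan-a", "esg-01/vlan-b", "esg-02/vlan-a", "esg-02/vlan-b"]

# Hypothetical flow: (source IP, destination IP, protocol, source port, destination port).
flow = ("172.16.11.10", "192.168.11.20", "tcp", 49152, 443)

chosen = pick_next_hop(flow, paths)

# If one ESG fails, its paths are removed and flows are hashed over the rest.
surviving = [p for p in paths if not p.startswith("esg-01/")]
rerouted = pick_next_hop(flow, surviving)
```

With a simple modulo scheme like this one, removing a path can also move flows that were not using the failed path. Real devices may use different hashing to limit that effect.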