Network Design for Private AI Ready Infrastructure for VMware Cloud Foundation

Private AI ready infrastructure requires multiple networks. The network design includes choosing the physical network devices and setting up the physical network for running AI workloads.

The network device requirements for AI workloads depend on the specific task, dataset size, model complexity, and performance expectations.

Table 1. Recommended Physical Network Devices
Category	Hardware	Description	Example of Optimal Configuration (Based on NVIDIA DGX)
Management network	VMware Compatibility Guide - NICs	10 Gbps, 25 Gbps, or above. Host baseboard management controller (BMC) with RJ45.	NIC: Broadcom 57504, Mellanox ConnectX-4, or similar Intel products. Switch: Broadcom StrataXGS Switch Solutions BCM56080 Series or similar products.
Workloads and VMware services such as vSphere vMotion, vSAN, network overlays, and so on.	VMware Compatibility Guide - NICs with SR-IOV and RoCE v2	For LLM inference and fine-tuning within a single host, two standard 25 Gbps Ethernet ports are sufficient. For fine-tuning models larger than 40B parameters, efficient multi-node communication requires low latency, and optimal performance requires a 100 Gbps or higher RDMA network (for example, RoCE or InfiniBand).	RDMA over Converged Ethernet (RoCE): NIC Broadcom 5750X, NVIDIA Mellanox ConnectX-5/6/7, or similar products; switch Broadcom StrataXGS Switch Solutions (Trident4-X11C/BCM56890 Series) or similar products. InfiniBand: Host Channel Adapter (HCA) NVIDIA Mellanox ConnectX-5/6/7 VPI; switch NVIDIA QM9700.
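
The bandwidth guidance in Table 1 can be put into rough numbers. The following is a minimal sketch, not a sizing tool: it assumes 16-bit weights (2 bytes per parameter) and an ideal link running at full line rate with no protocol overhead, neither of which is stated in this design. The function name `transfer_seconds` is illustrative only.

```python
def transfer_seconds(params: float, bytes_per_param: int, link_gbps: float) -> float:
    """Ideal time to move one full copy of the model weights over a link."""
    bits = params * bytes_per_param * 8
    return bits / (link_gbps * 1e9)


PARAMS = 40e9          # 40B parameters, the threshold named in Table 1
BYTES_PER_PARAM = 2    # assumption: 16-bit weights

# 25 and 50 Gbps correspond to one or two 25 Gbps ports; 100 Gbps is the RDMA guidance.
for gbps in (25, 50, 100):
    seconds = transfer_seconds(PARAMS, BYTES_PER_PARAM, gbps)
    print(f"{gbps:>3} Gbps: {seconds:.1f} s per full weight exchange")
```

Under these assumptions, one full exchange of a 40B-parameter model takes roughly four times longer at 25 Gbps than at 100 Gbps. Multi-node fine-tuning repeats such exchanges many times, which is why the faster RDMA network is recommended above that model size.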

Note: When you install NICs or HCAs in servers, check that the PCIe generation and lane count of the motherboard slot are compatible with the adapter so that the devices can reach their optimal data transfer speed.
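
The PCIe consideration in the note above can be checked with simple arithmetic. The sketch below uses the published per-lane signaling rates for PCIe Gen3, Gen4, and Gen5 and their 128b/130b line encoding. It ignores packet-level protocol overhead, so real throughput is somewhat lower than the figures it computes. The helper names `pcie_gbps` and `slot_supports` are hypothetical.

```python
# Raw signaling rate per lane in GT/s for each PCIe generation.
PCIE_GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}

# Gen3 and later use 128b/130b encoding: 128 payload bits per 130 bits on the wire.
ENCODING_EFFICIENCY = 128 / 130


def pcie_gbps(generation: int, lanes: int) -> float:
    """Approximate one-direction slot bandwidth in Gbps, before protocol overhead."""
    return PCIE_GT_PER_LANE[generation] * ENCODING_EFFICIENCY * lanes


def slot_supports(generation: int, lanes: int, nic_gbps: float) -> bool:
    """True if the slot bandwidth is at least the total port speed of the adapter."""
    return pcie_gbps(generation, lanes) >= nic_gbps


for gen, lanes, nic in [(3, 8, 100), (3, 16, 100), (4, 8, 100), (4, 16, 400)]:
    verdict = "ok" if slot_supports(gen, lanes, nic) else "bottleneck"
    print(f"Gen{gen} x{lanes}: {pcie_gbps(gen, lanes):6.1f} Gbps vs {nic} Gbps NIC -> {verdict}")
```

For a dual-port adapter, pass the sum of both port speeds as `nic_gbps`. A slot that is physically x16 but electrically wired for fewer lanes should be evaluated at its wired lane count.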

Management Domain VLANs

For the management network, each network type is associated with a specific VLAN. See the VMware Cloud Foundation Design Guide.

Workload Domain Network VLANs

The workload network on the workload cluster is configured with a dedicated switch and dedicated network adapters for optimal performance. You deploy all vSphere with Tanzu workloads to overlay-backed NSX segments. NSX Edge nodes in the shared edge and workload vSphere cluster are deployed to VLAN-backed port groups.

Figure 1. Networks for vSphere with Tanzu in a Workload Domain. The Tier-0 gateway connects to five Tier-1 gateways, which connect to the overlay-backed NSX segments.

Table 2. Networks Used by vSphere with Tanzu
Network	Routable / NAT	Usage
Supervisor Control Plane network	Routable	Used by the Supervisor control plane nodes.
Pod Networks	NAT	Used by Kubernetes pods that run in the cluster. Any Tanzu Kubernetes Grid clusters instantiated in the Supervisor also use this pool. For LLM inference and fine-tuning tasks within a single host, existing 25 Gbps Ethernet network infrastructure is sufficient to accommodate the bandwidth requirements of textual data. For fine-tuning models with more than 40B parameters on GPUs across different nodes, the substantial demand for information exchange (including weights) requires RDMA networking (RoCE or InfiniBand) with 100 Gbps or higher bandwidth for optimal performance.
Service IP Pool Network	NAT	Used by Kubernetes applications that need a service IP address.
Ingress IP Pool Network	Routable	Used by NSX to create an IP pool for load balancing.
Egress IP Pool Network	Routable	Used by NSX to create an IP pool for NAT endpoint use.
Namespace Networks	NAT	When you create a namespace, a /28 overlay-backed NSX segment and a corresponding IP pool are instantiated to service pods in that namespace. If that IP space runs out, an additional /28 overlay-backed NSX segment and IP pool are instantiated.
Tanzu Kubernetes Grid Networks	NAT	When you create a Tanzu Kubernetes Grid cluster, an NSX Tier-1 Gateway is instantiated in NSX. On that NSX Tier-1 Gateway, a /28 overlay-backed NSX segment and IP pool are also instantiated.
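
The /28 allocations described in Table 2 can be illustrated with Python's standard `ipaddress` module. This is a sketch under two assumptions: the /28 segments are carved out of the pod network pool, and that pool is a /20. The CIDR `10.244.0.0/20` is a hypothetical placeholder, not a value from this design.

```python
import ipaddress

# Hypothetical pod network pool; substitute the private range used in your environment.
pod_pool = ipaddress.ip_network("10.244.0.0/20")

# Each namespace or Tanzu Kubernetes Grid cluster receives /28 segments from the pool.
segments = list(pod_pool.subnets(new_prefix=28))

print(f"Pool {pod_pool} holds {pod_pool.num_addresses} addresses")
print(f"It can be split into {len(segments)} /28 segments")
print(f"Each /28 segment spans {segments[0].num_addresses} addresses")
print(f"First two segments: {segments[0]}, {segments[1]}")
```

Each /28 segment spans 16 addresses, and some of them are consumed by the network itself, so a namespace with many pods draws several segments from the pool.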

Design Decisions on the Network Design for Private AI Ready Infrastructure

Table 3. Design Decisions on Networking for Private AI Ready Infrastructure for VMware Cloud Foundation
Decision ID	Design Decision	Design Justification	Design Implication
AIR-TZU-NET-001	Set up networking for 100 Gbps or higher if possible.	100 Gbps networking provides enough bandwidth and very low latency for inference and fine-tuning use cases backed by vSAN ESA.	The cost of the solution is increased.
AIR-TZU-NET-002	Add a /28 overlay-backed NSX segment for use by the Supervisor control plane nodes.	Supports the Supervisor control plane nodes.	You must create the overlay-backed NSX segment.
AIR-TZU-NET-003	Use a dedicated /20 subnet for pod networking.	A single /20 subnet is sufficient to meet the design requirement of 2000 pods.	You must set up a private IP space behind a NAT that you can use in multiple Supervisors.
AIR-TZU-NET-004	Use a dedicated /22 subnet for services.	A single /22 subnet is sufficient to meet the design requirement of 2000 pods.	You must set up a private IP space behind a NAT that you can use in multiple Supervisors.
AIR-TZU-NET-005	Use a dedicated /24 or larger subnet on your corporate network for ingress endpoints.	A /24 subnet is sufficient to meet the design requirement of 2000 pods in most cases.	This subnet must be routable to the rest of the corporate network. A /24 subnet will be sufficient for most use cases, but you should evaluate your ingress needs before deployment.
AIR-TZU-NET-006	Use a dedicated /24 or larger subnet on your corporate network for egress endpoints.	A /24 subnet is sufficient to meet the design requirement of 2000 pods in most cases.	This subnet must be routable to the rest of the corporate network. A /24 subnet will be sufficient for most use cases, but you should evaluate your egress needs before deployment.
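
The subnet sizes in Table 3 can be sanity-checked before deployment. The following is a minimal sketch using the standard `ipaddress` module. All four CIDR ranges are hypothetical placeholders chosen only to match the prefix lengths in the design decisions, and the 2000-pod figure is the design requirement cited in the table.

```python
import ipaddress
from itertools import combinations

POD_REQUIREMENT = 2000  # design requirement cited in Table 3

# Hypothetical address ranges; only the prefix lengths come from the design decisions.
SUBNETS = {
    "AIR-TZU-NET-003 pods": "10.244.0.0/20",
    "AIR-TZU-NET-004 services": "10.96.0.0/22",
    "AIR-TZU-NET-005 ingress": "192.168.10.0/24",
    "AIR-TZU-NET-006 egress": "192.168.11.0/24",
}

networks = {name: ipaddress.ip_network(cidr) for name, cidr in SUBNETS.items()}
sizes = {name: net.num_addresses for name, net in networks.items()}

# Flag any pair of ranges that overlap, which would cause addressing conflicts.
overlaps = [
    (a, b)
    for (a, net_a), (b, net_b) in combinations(networks.items(), 2)
    if net_a.overlaps(net_b)
]

for name, count in sizes.items():
    print(f"{name}: {networks[name]} -> {count} addresses")

pod_ok = sizes["AIR-TZU-NET-003 pods"] >= POD_REQUIREMENT
print("Pod subnet covers the pod requirement:", pod_ok)
print("Overlapping ranges:", overlaps or "none")
```

The address counts are total block sizes. Evaluate ingress and egress needs separately, as the design implications for AIR-TZU-NET-005 and AIR-TZU-NET-006 recommend.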