Nova provisions virtual compute instances on top of the SDDC infrastructure. Nova comprises a set of daemons running as Kubernetes Pods on top of the Tanzu Kubernetes Grid cluster to provide the compute provisioning service.
Nova compute consists of the following daemon processes:
nova-api: Accepts and responds to end-user compute API requests such as VM boot, reset, resize, and so on.
nova-compute: Creates and terminates VM instances.
nova-scheduler: Takes a VM instance request from the queue and determines which compute host it should run on.
nova-conductor: Handles requests that need coordination, and acts as a database proxy. Nova conductor communicates between Nova processes.
Nova Compute
Unlike a traditional KVM-based approach where each hypervisor is represented as a nova compute, the VMware vCenter driver enables the nova-compute service to communicate with a VMware vCenter Server instance.
The VMware vCenter driver aggregates all ESXi hosts within each cluster and presents one large hypervisor to the nova scheduler. VIO deploys a nova-compute Pod for each vSphere ESXi cluster that it manages. Because individual ESXi hosts are not exposed to the nova scheduler, the nova scheduler assigns hypervisor compute hosts at the granularity of vSphere clusters, and vCenter selects the actual ESXi host within the cluster based on advanced DRS placement settings. Both automated and partially automated DRS are supported for standard VM workloads. DRS must be deactivated when SR-IOV is used.
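The aggregation described above can be pictured with a small sketch. It is illustrative only: the `EsxiHost` record, the `cluster_as_hypervisor` helper, and the host and cluster names are hypothetical and are not part of the vCenter driver or of VIO. It only shows the idea of a whole cluster appearing to the scheduler as one capacity record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EsxiHost:
    """Hypothetical record for one ESXi host; not a VIO or vSphere type."""
    name: str
    vcpus: int
    memory_mb: int


def cluster_as_hypervisor(cluster_name, hosts):
    """Collapse every host in a cluster into one aggregate capacity record."""
    return {
        "hypervisor": cluster_name,
        "vcpus": sum(h.vcpus for h in hosts),
        "memory_mb": sum(h.memory_mb for h in hosts),
        "host_count": len(hosts),
    }


# Example values are made up for illustration.
hosts = [
    EsxiHost("esxi-01", vcpus=64, memory_mb=524288),
    EsxiHost("esxi-02", vcpus=64, memory_mb=524288),
    EsxiHost("esxi-03", vcpus=48, memory_mb=393216),
]
view = cluster_as_hypervisor("compute-cluster-01", hosts)
print(view)
```

In this model the scheduler sees only the summed record, while the choice of an individual ESXi host remains with vCenter and DRS.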
Nova Host Aggregates
A nova host aggregate is a grouping of hypervisors or nova computes. Groupings can be based on host hardware similarity. For example, all clusters with SSD storage backing can be grouped into one aggregate and clusters with magnetic storage into another. If hardware attributes are similar, grouping can also be based on the physical location of the cluster in the form of availability zones. If there are N data centers, all ESXi clusters within a data center can be grouped into a single aggregate. An ESXi cluster can be in more than one host aggregate.
Host aggregates provide a mechanism that allows administrators to assign key-value pairs, also called metadata, to compute groups. The nova scheduler can use this metadata to pick the hardware that matches the client request. Host aggregates are visible only to administrators. Users consume aggregates based on the VM flavor definition and availability zone.
Nova Scheduler
VMware Integrated OpenStack uses the nova-scheduler service to determine where to place a new workload or a modification to an existing workload, for example, during a live migration or when a new VM starts up. The nova-scheduler can be described simply as a filter. Based on the type of request, it eliminates the nova-computes that cannot satisfy the workload request and returns those that can.
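The filtering behavior can be sketched in a few lines. This is a simplified model, not the Nova implementation: the dictionary fields (`free_ram_mb`, `free_disk_gb`), the two filter functions, and the cluster names are assumptions made for the example.

```python
def ram_filter(compute, request):
    """Pass only computes with enough free RAM for the request."""
    return compute["free_ram_mb"] >= request["ram_mb"]


def disk_filter(compute, request):
    """Pass only computes with enough free disk for the request."""
    return compute["free_disk_gb"] >= request["disk_gb"]


def filter_computes(computes, request, filters):
    """Return the computes that pass every filter for this request."""
    return [c for c in computes if all(f(c, request) for f in filters)]


# Hypothetical nova-computes, one per vSphere cluster.
computes = [
    {"name": "cluster-a", "free_ram_mb": 8192, "free_disk_gb": 100},
    {"name": "cluster-b", "free_ram_mb": 2048, "free_disk_gb": 500},
    {"name": "cluster-c", "free_ram_mb": 16384, "free_disk_gb": 20},
]
request = {"ram_mb": 4096, "disk_gb": 40}

candidates = filter_computes(computes, request, [ram_filter, disk_filter])
print([c["name"] for c in candidates])
```

Here `cluster-b` is eliminated for lack of RAM and `cluster-c` for lack of disk, so only `cluster-a` is returned.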
The nova scheduler also controls host CPU, memory, and disk over-subscription. Over-subscription maps multiple virtual devices to the same physical resource to optimize usage. Over-subscription can be defined per host aggregate. The following filters, when enabled, control aggregate-level over-subscription:
AggregateCoreFilter: Filters hosts by CPU core numbers with a per-aggregate cpu_allocation_ratio value.
AggregateDiskFilter: Filters hosts by disk allocation with a per-aggregate disk_allocation_ratio value.
AggregateRamFilter: Filters hosts by RAM allocation of instances with a per-aggregate ram_allocation_ratio value.
If the per-aggregate value is not found, the value falls back to the global setting. If the host is in more than one aggregate and thus more than one value is found, the minimum value is used.
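The fallback and minimum rules can be expressed as a short sketch. The function names, the aggregate dictionaries, and the global ratio value below are assumptions for illustration; they are not the Nova filter code.

```python
def effective_ratio(host_aggregates, key, global_default):
    """Resolve an allocation ratio for one host.

    Uses the minimum per-aggregate value when any aggregate sets `key`,
    otherwise falls back to the global setting.
    """
    values = [
        float(agg["metadata"][key])
        for agg in host_aggregates
        if key in agg.get("metadata", {})
    ]
    return min(values) if values else global_default


def schedulable_vcpus(physical_cores, ratio):
    """Number of vCPUs that can be allocated at a given ratio."""
    return int(physical_cores * ratio)


# Illustrative global setting; use the value configured in your deployment.
GLOBAL_CPU_ALLOCATION_RATIO = 16.0

# A host (vSphere cluster) that belongs to two aggregates.
aggregates = [
    {"name": "ssd", "metadata": {"cpu_allocation_ratio": "4.0"}},
    {"name": "dc1", "metadata": {"cpu_allocation_ratio": "2.0"}},
]

ratio = effective_ratio(aggregates, "cpu_allocation_ratio", GLOBAL_CPU_ALLOCATION_RATIO)
fallback = effective_ratio([], "cpu_allocation_ratio", GLOBAL_CPU_ALLOCATION_RATIO)
print(ratio, fallback, schedulable_vcpus(64, ratio))
```

With two aggregates setting 4.0 and 2.0, the lower value wins. A host in no aggregate that sets the key uses the global value.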
At the opposite end of the spectrum, instead of over-subscription, cloud administrators can dedicate compute hosts to OpenStack tenants. You can use the AggregateMultiTenancyIsolation filter to control VM placement based on OpenStack tenant. In this context, a tenant is an OpenStack project. If an aggregate has the filter_tenant_id metadata key, the hosts in the aggregate accept instances only from that tenant or list of tenants. No other tenant is allowed on these hosts.
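A minimal sketch of the tenant isolation check follows. It assumes that `filter_tenant_id` holds a comma-separated list of project IDs and that hosts outside any such aggregate accept all tenants, which is how the upstream Nova documentation describes the filter. The function name and project IDs are hypothetical.

```python
def host_passes(host_aggregates, project_id):
    """Return True when the host may run instances for `project_id`."""
    allowed = set()
    for agg in host_aggregates:
        raw = agg.get("metadata", {}).get("filter_tenant_id")
        if raw:
            # Collect every tenant listed on any aggregate of this host.
            allowed.update(t.strip() for t in raw.split(",") if t.strip())
    # No isolation metadata means the host is open to every tenant.
    return not allowed or project_id in allowed


isolated = [{"name": "reserved", "metadata": {"filter_tenant_id": "proj-a, proj-b"}}]
shared = [{"name": "general", "metadata": {}}]

print(host_passes(isolated, "proj-a"))
print(host_passes(isolated, "proj-c"))
print(host_passes(shared, "proj-c"))
```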
As stated in the Nova Compute section, individual ESXi hosts are not exposed to the nova scheduler. Therefore, ensure that the nova scheduler filters align with the underlying vSphere resource allocation.
Nova Compute Scaling
As workloads increase, your cluster must scale to meet new capacity demands. While vCenter Server supports a maximum cluster size of 64 ESXi hosts, VIO cluster scaling can differ depending on the use case. You can add new capacity to a VIO deployment in two ways:
Vertical scaling: Increase the number of hosts in a Cluster.
Horizontal scaling: Deploy a new vCenter Server Cluster and add the Cluster as a new nova-compute Pod to an existing or new nova compute aggregate.
The choice between vertical and horizontal scaling must be based on the use case and the number of concurrent operations against the OpenStack API. The following table outlines the most frequently used deployment scenarios:
Use Case | Expected Parallel OpenStack Operations | Scaling Model |
---|---|---|
Traditional Enterprise | Low | Horizontal or Vertical |
Direct API access to the infrastructure. Example: CI/CD workflow | High | Horizontal |
Direct API access to the infrastructure. Example: Terraform automation workflow | Low | Horizontal or Vertical |
NFV deployment | Low | Horizontal or Vertical |
Cloud Native workload running on top of Kubernetes | Low | Horizontal or Vertical |
Nova Flavors and Compute Performance Tuning
To consume a nova host aggregate, cloud administrators must create and expose VM offerings so that users can request VMs that match their application vCPU, memory, and disk requirements. In OpenStack, a flavor represents a type of VM offering. An OpenStack flavor defines the compute, memory, and storage capacity of the computing instances. In addition to capacity, a flavor can also indicate a hardware profile, such as SSD, spinning disks, CPU type, CPU family, and so on. Hardware profiles are often implemented through flavor extra-specs. Nova flavor extra-specs are key-value pairs that define which compute or host aggregate a flavor can run on. Based on the extra-specs, the nova-scheduler locates the hardware that matches the corresponding key-value pairs on the compute node.
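The matching of flavor extra-specs against host aggregate metadata can be sketched as follows. This is a simplification that uses plain keys and exact equality. The aggregate names, metadata keys, and the `aggregate_matches` helper are hypothetical, and the real scheduler filter supports more than this.

```python
def aggregate_matches(extra_specs, aggregate_metadata):
    """True when every extra-spec pair is present in the aggregate metadata."""
    return all(aggregate_metadata.get(k) == v for k, v in extra_specs.items())


# Hypothetical aggregates grouped by storage type and site.
aggregates = {
    "ssd-clusters": {"storage": "ssd", "site": "dc1"},
    "hdd-clusters": {"storage": "magnetic", "site": "dc1"},
}

# Hypothetical flavor that asks for SSD-backed clusters.
flavor_extra_specs = {"storage": "ssd"}

matches = [
    name
    for name, metadata in aggregates.items()
    if aggregate_matches(flavor_extra_specs, metadata)
]
print(matches)
```

A flavor with no extra-specs matches every aggregate in this model, because there is nothing to rule a host out.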
Data plane intensive workloads require VM-level parameters set for maximum performance. VM-level parameters can also be handled using nova flavors. The vCloud NFV Performance Tuning Guide outlines the recommended VM-level parameters to be set when deploying data plane intensive workloads:
Virtual CPU Pinning
NUMA alignment
CPU/Memory reservation setting
Selective vCPU Pinning
Tenant vDC
Huge Page
Passthrough Networking
VIO fully supports configuring the VM-level parameters required for VNF performance tuning. The parameters are summarized in the following table:
VM Parameter | Flavor Metadata Category | Metadata Values |
---|---|---|
CPU Pinning | CPU Pinning policy | hw:cpu_policy=dedicated |
CPU Pinning | VMware Policies | vmware:latency_sensitivity_level=high |
CPU Pinning | VMware Quota | quota:cpu_reservation_percent and quota:memory_reservation_percent=100 |
Selective vCPU Pinning | Custom | vmware:latency_sensitivity_per_cpu_high="<cpu-id1>,<cpu-id2>" |
Selective vCPU Pinning | CPU Pinning policy | hw:cpu_policy=dedicated |
Selective vCPU Pinning | VMware Quota | quota:cpu_reservation_percent and quota:memory_reservation_percent=100 |
Selective vCPU Pinning | VMware Policies | vmware:latency_sensitivity_level=high |
NUMA | VMware Policies | numa.nodeAffinity="numa id" |
Huge Pages | Guest Memory Backing | hw:mem_page_size="size" |
Huge Pages | VMware Quota | quota:memory_reservation_percent=100 |
Tenant vDC | VMware Policies | vmware:tenant_vdc=UUID |
CPU Memory Reservation | VMware Quota | For details, see Supported Flavor Extra Spec. |
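As a sketch of how the CPU Pinning row might be applied, the following builds an `openstack flavor set` command from the extra-specs in the table. The flavor name `dataplane.large` and the helper function are hypothetical, and the sketch reads the quota entry as setting both reservation percentages to 100. Verify the keys against Supported Flavor Extra Spec before use.

```python
import shlex

# Extra-specs taken from the CPU Pinning rows of the table above.
CPU_PINNING_EXTRA_SPECS = {
    "hw:cpu_policy": "dedicated",
    "vmware:latency_sensitivity_level": "high",
    "quota:cpu_reservation_percent": "100",
    "quota:memory_reservation_percent": "100",
}


def flavor_set_command(flavor_name, extra_specs):
    """Build the argument list for `openstack flavor set --property ...`."""
    args = ["openstack", "flavor", "set"]
    for key, value in extra_specs.items():
        args += ["--property", f"{key}={value}"]
    args.append(flavor_name)
    return args


# "dataplane.large" is a made-up flavor name for this example.
cmd = flavor_set_command("dataplane.large", CPU_PINNING_EXTRA_SPECS)
print(shlex.join(cmd))
```

The script only prints the command, so it can be reviewed before it is run against a deployment.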
Nova Compute Design Recommendations
Design Recommendation | Design Justification | Design Implication |
---|---|---|
Create nova host aggregates to group vSphere clusters sharing similar characteristics. | Host aggregates provide a mechanism that allows administrators to group compute clusters. | None |
Assign key-value pairs to host aggregates based on hardware profile and data center affinity. | The nova scheduler uses the key-value pairs (metadata) to pick the hardware that matches the client request. | None |
Use the aggregate filters (AggregateCoreFilter, AggregateDiskFilter, AggregateRamFilter) to control nova compute (vSphere cluster) over-subscription. | Over-subscription leads to greater resource utilization efficiency. | You must ensure that the over-subscription ratio does not conflict with the Tenant vDC reservation. Over-subscription can lead to degraded SLA. |
Use the AggregateMultiTenancyIsolation filter to place VMs based on Tenant ID. | Used when an entire cluster needs to be reserved for a specific OpenStack Tenant. | Unbalanced resource consumption. |
When using Tenant vDC, create a new default vDC with no reservation and map all default flavors to the new vDC. | From a vSphere resource hierarchy perspective, VMs created using default flavors belong to the same resource hierarchy as Tenant vDC. If Tenant vDCs and VMs share the parent resource pool, there is no guarantee that VMs without a resource reservation will not consume CPU shares from a Tenant vDC with full reservation. Mapping default VIO flavors to a child vDC without resource reservation alleviates this issue. | None |
Use OpenStack metadata extra specs to set VM-level parameters for data plane workloads in the compute flavor. | Metadata extra specs translate to VM settings on vSphere. | None |
When adding more ESXi resources in an environment with high churn, consider building new vSphere clusters instead of adding hosts to an existing cluster. | New vSphere clusters introduce more parallelism when supporting a large number of concurrent API requests. | New clusters introduce new objects to manage in the OpenStack database. |