Troubleshoot issues with host profiles and transport node profiles (TNPs) when they are used to auto deploy stateless clusters.

Scenario Description

When multiple VMkernel adapters that are enabled for Management, vMotion, and other traffic are migrated to the same logical switch, the adapters are migrated to the logical switch after reboot, but the service enabled on one VMkernel adapter ends up enabled on a different adapter.

For example, before migration, vmk0 is enabled to support Management traffic and vmk1 is enabled for vMotion traffic. After host reboot, vmk0 supports vMotion traffic and vmk1 supports Management traffic. This results in a non-compliance error after reboot.

Workaround: None. There is no impact as both VMkernel adapters are on the same logical switch.
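
To confirm that only the service assignment moved, you can compare the services enabled on each VMkernel adapter before and after the reboot. The following Python is a minimal sketch, not an NSX-T or vSphere API: the function name `find_service_changes` and the dictionary snapshots are hypothetical, and it assumes the per-adapter service lists were already collected by some other means.

```python
def find_service_changes(before, after):
    """Return the adapters whose enabled services differ between two snapshots.

    Both arguments are hypothetical dicts mapping an adapter name
    (for example "vmk0") to the list of services enabled on it.
    """
    changed = {}
    for vmk in sorted(set(before) | set(after)):
        old = sorted(set(before.get(vmk, [])))
        new = sorted(set(after.get(vmk, [])))
        if old != new:
            changed[vmk] = (old, new)
    return changed


# Snapshots matching the example in the text above.
before = {"vmk0": ["Management"], "vmk1": ["vMotion"]}
after = {"vmk0": ["vMotion"], "vmk1": ["Management"]}

changes = find_service_changes(before, after)
for vmk, (old, new) in changes.items():
    print(f"{vmk}: {old} -> {new}")
```

If every changed adapter is on the same logical switch, the swap matches the scenario described here and has no functional impact.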

Host preparation progress is stuck at 60% while the node status displays UP.

Issue: When a TNP is applied on a cluster, NSX-T is successfully installed on the host and node status displays UP, but GUI still shows 60% progress.

Workaround: Reapply the TNP or transport node (TN) configuration without changing it. The GUI then shows the progress as 100%.

Even though VMkernel migration is successful, a validation error is reported on the TN before host switches are removed.

Issue: When you migrate vmk0, the management interface, from a vSwitch to a logical switch, NSX-T is successfully installed on the host. VMkernel migration is successful, but the TN status shows Partial Success with the following error:

Validation before host switches removal failed: [error: No management vmk will have PNIC after ['vmk1'] in ['9a bb eb c1 04 81 40 e2-bc 3f 3e aa bd 14 62 1e'] lose all PNICs.]; LogicalSwitch full-sync: LogicalSwitch full-sync realization query skipped.

Workaround: None. Ignore the error message as VMkernel migration is successful.

Reapplying a TNP where the Network Mapping for Install lists vmk0 results in the host losing connectivity.

Issue: When a TNP configuration includes vmk0 in the Network Mapping for Install, the host loses connectivity.

Workaround: Instead of reapplying the TNP, reboot the host with the necessary configuration in the TNP.

Cannot apply the host profile because MUX user password policy and password were not reset.

Issue: This issue occurs only on hosts running versions earlier than vSphere 6.7 U3. Host remediation and host profile application might fail unless the mux_user password is reset.

Workaround: Under Policies & Profiles, edit the host profile to modify the mux_user password policy and reset the mux_user password.

Host Profile is not portable.

Issue: None of the vCenter servers can use the host profile containing NSX-T configuration.

Workaround: None.

Auto Deploy Rule Engine

Issue: Host profile cannot be used in auto deploy rules to deploy new clusters. If new clusters are deployed, the hosts get deployed with basic networking and remain in maintenance mode.

Workaround: Prepare each cluster from the NSX-T GUI. See Apply TNP on Stateless Cluster.

Check compliance errors.

Issue: Host profile remediation cannot fix the compliance errors related to the NSX-T configuration.

  • Physical NICs configured on Host Profile and TNP are different.
  • vNIC to logical switch (LS) mapping. The host profile finds a mismatch between its logical switch to vNIC mapping and the mapping in the TNP.
  • VMkernel connected to N-VDS mismatch on Host Profile and TNP.
  • Opaque switch mismatch on Host Profile and TNP.

Workaround: Ensure the NSX-T configuration matches on Host Profile and TNP. Reboot the host to realize the configuration changes. The host comes up.
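
Before rebooting, it can help to list which of the four categories above actually differ. The following Python is a minimal sketch under stated assumptions: the dictionary keys (`pnics`, `vnic_to_ls`, `vmk_on_nvds`, `opaque_switch`) and the function `compliance_mismatches` are hypothetical names chosen for illustration, not fields of the host profile or TNP objects, and the sample values are made up.

```python
# Hypothetical keys, one per mismatch category listed above.
CHECKS = {
    "pnics": "Physical NICs differ",
    "vnic_to_ls": "vNIC to logical switch mapping differs",
    "vmk_on_nvds": "VMkernel adapters connected to N-VDS differ",
    "opaque_switch": "Opaque switch differs",
}


def compliance_mismatches(host_profile, tnp):
    """Return a message for every category whose values are not equal."""
    return [
        message
        for key, message in CHECKS.items()
        if host_profile.get(key) != tnp.get(key)
    ]


# Made-up sample data: only the physical NICs disagree.
host_profile = {
    "pnics": ["vmnic0", "vmnic1"],
    "vnic_to_ls": {"vmk1": "ls-vmotion"},
    "vmk_on_nvds": ["vmk1"],
    "opaque_switch": "nvds-1",
}
tnp = {
    "pnics": ["vmnic1", "vmnic2"],
    "vnic_to_ls": {"vmk1": "ls-vmotion"},
    "vmk_on_nvds": ["vmk1"],
    "opaque_switch": "nvds-1",
}

mismatches = compliance_mismatches(host_profile, tnp)
for message in mismatches:
    print(message)
```

An empty result means the two sides agree in this simplified model. It does not replace the compliance check in vCenter.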

Remediation

Issue: If there are any NSX-T specific compliance errors, host profile remediation on that cluster is blocked.

Incorrect configuration:

  • vNIC to LS mapping
  • Mapping of physical NICs

Workaround: Ensure that the NSX-T configuration matches on Host Profile and TNP. Reboot the host to realize the configuration changes. The host comes up.

Attach

Issue: In a cluster configured with NSX-T, host profile cannot be attached at the host-level.

Workaround: None.

Detach

Issue: Detaching a host profile and attaching a new one in a cluster configured with NSX-T does not remove the NSX-T configuration. Even though the cluster is compliant with the newly attached host profile, it still has the NSX-T configuration from the previous profile.

Workaround: None.

Update

Issue: If you change the NSX-T configuration in the cluster, extract a new host profile, and then manually update the host profile for all the settings that were lost.

Workaround: None.

Host-level transport node configuration

Issue: After a transport node is auto-deployed, it acts as an individual entity. Any update to that transport node might not match the TNP.

Workaround: Update the cluster. An update made on a standalone transport node cannot persist its migration specification, so the migration might fail after the reboot.

Peer-DNS configuration is not supported on the VMkernel adapter selected for migration to the N-VDS.

Issue: If a VMkernel adapter selected for migration to the N-VDS is peer-DNS enabled, then host profile application fails.

Workaround: Edit the extracted host profile and disable the peer-DNS setting on the VMkernel adapter that must be migrated to an N-VDS. Alternatively, ensure that you do not migrate peer-DNS enabled VMkernel adapters to an N-VDS.
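
A quick pre-check can catch this before the host profile is applied. The Python below is only an illustrative sketch: the `peer_dns` flag, the adapter dictionary, and the function `peer_dns_conflicts` are hypothetical and do not reflect the actual host profile schema.

```python
def peer_dns_conflicts(adapters, to_migrate):
    """Return the adapters selected for migration that have peer-DNS enabled.

    `adapters` is a hypothetical dict of adapter name -> settings dict,
    where the assumed "peer_dns" key is True when peer-DNS is enabled.
    """
    return sorted(
        name
        for name in to_migrate
        if adapters.get(name, {}).get("peer_dns", False)
    )


# Made-up sample data: vmk0 has peer-DNS enabled and is selected for migration.
adapters = {
    "vmk0": {"peer_dns": True},
    "vmk1": {"peer_dns": False},
    "vmk2": {"peer_dns": True},
}
to_migrate = ["vmk0", "vmk1"]

conflicts = peer_dns_conflicts(adapters, to_migrate)
print(conflicts)
```

Any adapter in the result either needs peer-DNS disabled in the extracted host profile or must be left out of the migration.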

DHCP address of the VMkernel NIC is not retained.

Issue: If the reference host is stateful, then stateless hosts that use a profile extracted from that host cannot retain the VMkernel management MAC address derived from the MAC address from which the system was PXE started. This results in DHCP addressing issues.

Workaround: Edit the extracted host profile of the stateful host and change the 'Determine how MAC address for vmknic should be decided' setting to 'Use the MAC address from which the system was PXE started'.

Host Profile application failure in vCenter can lead to NSX configuration errors on the host.

Issue: If host profile application fails in vCenter, NSX configuration might also fail.

Workaround: In vCenter, verify that host profile was successfully applied. Fix the errors and try again.

LAGs are not supported on stateless ESXi hosts.

Issue: The uplink profile configured as LAGs in NSX is not supported in a stateless ESXi host managed by a vCenter Server or in NSX.

Workaround: None.

A stateless host does not boot up with the MAC address of the PXE NIC when a host profile extracted from a stateful host is applied to it.

Issue: If a host profile extracted from a stateful host is attached to a stateless host, then the VMkernel adapter (vmknic) of the stateless host does not boot up with the MAC address of the PXE NIC of the host, because a stateful host does not boot up as a PXE-enabled system.

Workaround: When you set up auto deployment of stateless hosts, ensure that the host profile is extracted from a host that boots up as a PXE-enabled system.