vSphere ESX Agent Manager deploys VIBs onto ESXi hosts.
The deployment on hosts requires that DNS be configured on the hosts, vCenter Server, and NSX Manager. Deployment does not require an ESXi host reboot, but any update or removal of VIBs requires an ESXi host reboot.
VIBs are hosted on NSX Manager and are also available as a zip file.
The file can be accessed from https://<NSX-Manager-IP>/bin/vdn/nwfabric.properties. The downloadable zip file differs based on NSX and ESXi version. For example, in NSX 6.3.0, vSphere 6.0 hosts use the file https://<NSX-Manager-IP>/bin/vdn/vibs-6.3.0/6.0-buildNumber/vxlan.zip.
# 5.5 VDN EAM Info
VDN_VIB_PATH.1=/bin/vdn/vibs-6.3.0/5.5-4744075/vxlan.zip
VDN_VIB_VERSION.1=4744075
VDN_HOST_PRODUCT_LINE.1=embeddedEsx
VDN_HOST_VERSION.1=5.5.*

# 6.0 VDN EAM Info
VDN_VIB_PATH.2=/bin/vdn/vibs-6.3.0/6.0-4744062/vxlan.zip
VDN_VIB_VERSION.2=4744062
VDN_HOST_PRODUCT_LINE.2=embeddedEsx
VDN_HOST_VERSION.2=6.0.*

# 6.5 VDN EAM Info
VDN_VIB_PATH.3=/bin/vdn/vibs-6.3.0/6.5-4744074/vxlan.zip
VDN_VIB_VERSION.3=4744074
VDN_HOST_PRODUCT_LINE.3=embeddedEsx
VDN_HOST_VERSION.3=6.5.*

# Single Version associated with all the VIBs pointed by above VDN_VIB_PATH(s)
VDN_VIB_VERSION=18.104.22.16844320

# Legacy vib location. Used by code to discover avaialble legacy vibs.
LEGACY_VDN_VIB_PATH_FS=/common/em/components/vdn/vibs/legacy/
LEGACY_VDN_VIB_PATH_WEB_ROOT=/bin/vdn/vibs/legacy/
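The mapping from ESXi version to zip file can be read directly out of nwfabric.properties. The following is a minimal Python sketch, not an NSX tool: it assumes the numeric suffix (.1, .2, .3) ties each VDN_HOST_VERSION entry to the VDN_VIB_PATH entry with the same suffix, and that the host version value can be treated as a shell-style wildcard. The function names are hypothetical.

```python
import fnmatch

# Excerpt of the properties shown above, used as sample input.
SAMPLE_PROPERTIES = """\
# 5.5 VDN EAM Info
VDN_VIB_PATH.1=/bin/vdn/vibs-6.3.0/5.5-4744075/vxlan.zip
VDN_HOST_VERSION.1=5.5.*
# 6.0 VDN EAM Info
VDN_VIB_PATH.2=/bin/vdn/vibs-6.3.0/6.0-4744062/vxlan.zip
VDN_HOST_VERSION.2=6.0.*
"""


def parse_properties(text):
    """Return key/value pairs, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def vib_path_for_host(props, host_version):
    """Find the VDN_VIB_PATH whose VDN_HOST_VERSION pattern matches.

    Assumption: the pattern (for example 6.0.*) behaves like a glob.
    """
    for key, pattern in props.items():
        if key.startswith("VDN_HOST_VERSION."):
            index = key.rsplit(".", 1)[1]
            if fnmatch.fnmatchcase(host_version, pattern):
                return props.get("VDN_VIB_PATH." + index)
    return None
```

For example, vib_path_for_host(parse_properties(SAMPLE_PROPERTIES), "6.0.0") selects the 6.0 entry; appending the returned path to https://<NSX-Manager-IP> gives the download URL.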
The VIB names are:
[root@esx-01a:~] esxcli software vib list | grep -e vsip -e vxlan
esx-vsip   6.0.0-0.0.3771165  VMware  VMwareCertified  2016-04-20
esx-vxlan  6.0.0-0.0.3771165  VMware  VMwareCertified  2016-04-20
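When many hosts need checking, the same listing can be verified by script. The sketch below is illustrative only: it assumes the output of esxcli software vib list has already been captured as text, that the first two whitespace-separated columns are the VIB name and version as in the sample above, and that esx-vsip and esx-vxlan are the VIBs to look for. The function names are hypothetical.

```python
# Captured output in the column layout shown above.
SAMPLE_VIB_LIST = """\
esx-vsip   6.0.0-0.0.3771165  VMware  VMwareCertified  2016-04-20
esx-vxlan  6.0.0-0.0.3771165  VMware  VMwareCertified  2016-04-20
"""

EXPECTED_VIBS = ("esx-vsip", "esx-vxlan")


def installed_vibs(output):
    """Map VIB name to version using the first two columns of each line."""
    vibs = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            vibs[fields[0]] = fields[1]
    return vibs


def missing_vibs(output, expected=EXPECTED_VIBS):
    """Return the expected VIB names that do not appear in the output."""
    found = installed_vibs(output)
    return [name for name in expected if name not in found]
```

An empty list from missing_vibs means both VIBs are present; comparing the versions returned by installed_vibs across hosts can also reveal a host that still carries an older VIB.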
Common Issues During Host Preparation
The following issues are typically encountered during host preparation:
EAM fails to deploy VIBs.
Might be due to misconfigured DNS on hosts.
Might be due to a firewall blocking required ports between ESXi, NSX Manager, and vCenter Server.
A previous VIB of an older version is already installed. This requires user intervention to reboot hosts.
NSX Manager and vCenter Server experience communication issues:
The Host Preparation tab in the Networking and Security Plug-in does not show all hosts properly.
Check if vCenter Server can enumerate all hosts and clusters.
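Because misconfigured DNS is a common cause of EAM deployment failures, a quick name-resolution check can rule it out early. The following Python sketch is a rough aid, not a VMware utility. It only tests forward resolution from the machine where it runs, so it does not prove that the ESXi hosts themselves resolve the same names. The function name and the host names in the usage note are hypothetical.

```python
import socket


def unresolved_names(names, resolve=socket.gethostbyname):
    """Return the names that fail forward DNS resolution.

    The resolver is injectable so the check can be exercised without a
    live DNS server; by default the system resolver is used.
    """
    failures = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failures.append(name)
    return failures
```

A call such as unresolved_names(["esx-01a.example.local", "vcenter.example.local", "nsxmgr.example.local"]) returns the subset of names that could not be resolved; an empty list means all of them resolved.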
Host Preparation (VIBs) Troubleshooting
Check communication channel health for the host. See Checking Communication Channel Health.
Check vSphere ESX Agent Manager for errors.
vCenter home > Administration > vCenter Server Extensions > vSphere ESX Agent Manager
On vSphere ESX Agent Manager, check the status of agencies that are prefixed with “VCNS160”. If an agency has a bad status, select the agency and view its issues.
On the host that is having an issue, run the tail /var/log/esxupdate.log command.
Host Preparation (UWA) Troubleshooting
NSX Manager configures two user world agents on all hosts in a cluster:
Messaging bus UWA (vsfwd)
Control plane UWA (netcpa)
In rare cases, the VIBs install successfully but one or both of the user world agents do not function correctly. This can manifest as:
The firewall showing a bad status.
The control plane between hypervisors and the Controllers being down. Check NSX Manager System Events.
If more than one ESXi host is affected, check the status of the message bus service in the NSX Manager Appliance web UI under the Summary tab. If RabbitMQ is stopped, restart it.
If the message bus service is active on NSX Manager:
Check the messaging bus user world agent status on the hosts by running the /etc/init.d/vShield-Stateful-Firewall status command on ESXi hosts.
[root@esx-01a:~] /etc/init.d/vShield-Stateful-Firewall status
vShield-Stateful-Firewall is running
Check the message bus user world logs on hosts at /var/log/vsfwd.log.
Run the esxcfg-advcfg -l | grep Rmq command on ESXi hosts to show all Rmq variables. There should be 16 Rmq variables.
[root@esx-01a:~] esxcfg-advcfg -l | grep Rmq
/UserVars/RmqIpAddress [String] : Connection info for RMQ Broker
/UserVars/RmqUsername [String] : RMQ Broker Username
/UserVars/RmqPassword [String] : RMQ Broker Password
/UserVars/RmqVHost [String] : RMQ Broker VHost
/UserVars/RmqVsmRequestQueue [String] : RMQ Broker VSM Request Queue
/UserVars/RmqPort [String] : RMQ Broker Port
/UserVars/RmqVsmExchange [String] : RMQ Broker VSM Exchange
/UserVars/RmqClientPeerName [String] : RMQ Broker Client Peer Name
/UserVars/RmqHostId [String] : RMQ Broker Client HostId
/UserVars/RmqHostVer [String] : RMQ Broker Client HostVer
/UserVars/RmqClientId [String] : RMQ Broker Client Id
/UserVars/RmqClientToken [String] : RMQ Broker Client Token
/UserVars/RmqClientRequestQueue [String] : RMQ Broker Client Request Queue
/UserVars/RmqClientResponseQueue [String] : RMQ Broker Client Response Queue
/UserVars/RmqClientExchange [String] : RMQ Broker Client Exchange
/UserVars/RmqSslCertSha1ThumbprintBase64 [String] : RMQ Broker Server Certificate base64 Encoded Sha1 Hash
Run the esxcfg-advcfg -g /UserVars/RmqIpAddress command on ESXi hosts. The output should display the NSX Manager IP address.
[root@esx-01a:~] esxcfg-advcfg -g /UserVars/RmqIpAddress
Value of RmqIpAddress is 192.168.110.15
Run the esxcli network ip connection list | grep 5671 command on ESXi hosts to check for an active messaging bus connection.
[root@esx-01a:~] esxcli network ip connection list | grep 5671
tcp  0  0  192.168.110.51:29969  192.168.110.15:5671  ESTABLISHED  35505  newreno  vsfwd
tcp  0  0  192.168.110.51:29968  192.168.110.15:5671  ESTABLISHED  35505  newreno  vsfwd
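The message bus checks above can be combined into one script once the command output has been collected from a host. The sketch below is a hedged illustration: it parses captured text only and assumes the output layouts match the samples shown in this section (variable path in the first column of esxcfg-advcfg -l, foreign address and state in the fifth and sixth columns of the connection list). All function names are hypothetical.

```python
EXPECTED_RMQ_VARIABLE_COUNT = 16  # per the guidance above

SAMPLE_ADVCFG_LIST = """\
/UserVars/RmqIpAddress [String] : Connection info for RMQ Broker
/UserVars/RmqPort [String] : RMQ Broker Port
"""
SAMPLE_ADVCFG_GET = "Value of RmqIpAddress is 192.168.110.15"
SAMPLE_CONNECTIONS = """\
tcp 0 0 192.168.110.51:29969 192.168.110.15:5671 ESTABLISHED 35505 newreno vsfwd
tcp 0 0 192.168.110.51:29968 192.168.110.15:5671 ESTABLISHED 35505 newreno vsfwd
"""


def rmq_variable_names(advcfg_list_output):
    """Collect the /UserVars/Rmq* paths from 'esxcfg-advcfg -l' output."""
    names = []
    for line in advcfg_list_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("/UserVars/Rmq"):
            names.append(fields[0])
    return names


def rmq_ip_address(advcfg_get_output):
    """Extract the address from 'Value of RmqIpAddress is <ip>'."""
    prefix = "Value of RmqIpAddress is"
    for line in advcfg_get_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    return None


def established_connections(connection_list_output, remote_ip, remote_port):
    """Return local endpoints with an ESTABLISHED session to remote_ip:port."""
    target = f"{remote_ip}:{remote_port}"
    matches = []
    for line in connection_list_output.splitlines():
        fields = line.split()
        # Columns: proto, recv-q, send-q, local, foreign, state, ...
        if len(fields) >= 6 and fields[4] == target and fields[5] == "ESTABLISHED":
            matches.append(fields[3])
    return matches
```

A host looks healthy when rmq_variable_names returns EXPECTED_RMQ_VARIABLE_COUNT entries, rmq_ip_address matches the NSX Manager IP address, and established_connections for that address and port 5671 is not empty.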
To determine the reason for the netcpa user world agent being down:
Check the netcpa user world agent status on hosts by running the /etc/init.d/netcpad status command on ESXi hosts.
[root@esx-01a:~] /etc/init.d/netcpad status
netCP agent service is running
Check the netcpa user world agent configuration in /etc/vmware/netcpa/config-by-vsm.xml. The IP addresses of the NSX Controllers should be listed.
[root@esx-01a:~] more /etc/vmware/netcpa/config-by-vsm.xml
<config>
  <connectionList>
    <connection id="0000">
      <port>1234</port>
      <server>192.168.110.31</server>
      <sslEnabled>true</sslEnabled>
      <thumbprint>A5:C6:A2:B2:57:97:36:F0:7C:13:DB:64:9B:86:E6:EF:1A:7E:5C:36</thumbprint>
    </connection>
    <connection id="0001">
      <port>1234</port>
      <server>192.168.110.32</server>
      <sslEnabled>true</sslEnabled>
      <thumbprint>12:E0:25:B2:E0:35:D7:84:90:71:CF:C7:53:97:FD:96:EE:ED:7C:DD</thumbprint>
    </connection>
    <connection id="0002">
      <port>1234</port>
      <server>192.168.110.33</server>
      <sslEnabled>true</sslEnabled>
      <thumbprint>BD:DB:BA:B0:DC:61:AD:94:C6:0F:7E:F5:80:19:44:51:BA:90:2C:8D</thumbprint>
    </connection>
  </connectionList>
  ...
Run the esxcli network ip connection list | grep 1234 command to verify the Controller TCP connections.
[root@esx-01a:~] esxcli network ip connection list | grep 1234
tcp  0  0  192.168.110.51:16594  192.168.110.31:1234  ESTABLISHED  36754  newreno  netcpa-worker
tcp  0  0  192.168.110.51:46917  192.168.110.33:1234  ESTABLISHED  36754  newreno  netcpa-worker
tcp  0  0  192.168.110.51:47891  192.168.110.32:1234  ESTABLISHED  36752  newreno  netcpa-worker
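The last two checks can be cross-referenced: every Controller listed in config-by-vsm.xml should have an established TCP session. The Python sketch below illustrates one way to do that on captured text. It assumes the configuration is well-formed XML with the element names shown above; the sample is trimmed and closes the config element itself, whereas the listing above is truncated and its remaining content is not reproduced here. Function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed sample; the closing </config> is an assumption of this sketch.
SAMPLE_CONFIG = """\
<config>
  <connectionList>
    <connection id="0000">
      <port>1234</port>
      <server>192.168.110.31</server>
      <sslEnabled>true</sslEnabled>
    </connection>
    <connection id="0001">
      <port>1234</port>
      <server>192.168.110.32</server>
      <sslEnabled>true</sslEnabled>
    </connection>
  </connectionList>
</config>
"""

# Only one Controller session is present in this sample.
SAMPLE_CONNECTIONS = """\
tcp 0 0 192.168.110.51:16594 192.168.110.31:1234 ESTABLISHED 36754 newreno netcpa-worker
"""


def configured_controllers(config_xml):
    """Return 'server:port' for each <connection> element in the config."""
    root = ET.fromstring(config_xml)
    endpoints = []
    for conn in root.iter("connection"):
        server = conn.findtext("server")
        port = conn.findtext("port")
        if server and port:
            endpoints.append(f"{server.strip()}:{port.strip()}")
    return endpoints


def controllers_without_session(config_xml, connection_list_output):
    """Return configured Controllers with no ESTABLISHED connection."""
    established = set()
    for line in connection_list_output.splitlines():
        fields = line.split()
        # Columns: proto, recv-q, send-q, local, foreign, state, ...
        if len(fields) >= 6 and fields[5] == "ESTABLISHED":
            established.add(fields[4])
    return [ep for ep in configured_controllers(config_xml)
            if ep not in established]
```

With the sample data, controllers_without_session reports 192.168.110.32:1234 because no session to that Controller appears in the connection list. An empty result means every configured Controller has a session.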