Use the information in this topic to debug issues that are associated with NSX Distributed Malware Prevention service deployment, health status of service instances, ESX Agencies, and other issues.

Verify ESX Agent Manager Health Status

To verify whether the health status of vSphere ESX Agent Manager (EAM) is normal, do these steps:
  1. In the vSphere Client, navigate to Administration > vCenter Server Extensions. Click vSphere ESX Agent Manager.
  2. Click the Configure tab.

    This page shows the health status of ESX Agencies on the hosts for the NSX Malware Prevention solution and issues (if any) that are detected for the Agencies.

The vSphere ESX Agent Manager (EAM) service must be up and running. To confirm this, verify that the following URL is accessible:

https://vCenter_Server_IP_Address/eam/mob

Replace vCenter_Server_IP_Address with the IP address of the VMware vCenter in your network.
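
The URL check above can also be run from a command line. The following is a minimal sketch, assuming curl is installed on the workstation. The function names eam_mob_url and check_eam_mob and the address 192.0.2.10 are illustrative only and are not part of the product. The -k option skips certificate validation, which is an assumption that suits lab setups with self-signed certificates.

```shell
#!/bin/sh
# Build the EAM managed object browser (MOB) URL for a vCenter Server address.
eam_mob_url() {
    printf 'https://%s/eam/mob\n' "$1"
}

# Print the HTTP status code returned by the EAM MOB URL.
# curl prints 000 when no HTTP response was received at all.
check_eam_mob() {
    curl -k -s -o /dev/null -w '%{http_code}\n' "$(eam_mob_url "$1")"
}

# Example (hypothetical address):
#   check_eam_mob 192.0.2.10
```

Any HTTP status code other than 000 indicates that a web service answered at that URL. A browser check is still the way to confirm that the EAM page itself loads.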

Verify Connectivity of Port Groups, Interfaces, and Context Multiplexer

Do the following steps in the vSphere Client:
  • Select the name of the service virtual machine, and then click the Networks tab. Verify that the vmservice-vshield-pg Port Group is listed.
  • Right-click the service virtual machine name, and click Edit Settings. On the Virtual Hardware page, verify that network adapter 1 and network adapter 2 are connected. Network adapter 1 connects the SVM to the Management network. Network adapter 2 connects the SVM to the vmservice-vshield-pg Port Group, which NSX creates automatically during service deployment. Network adapter 2 is the control interface of the SVM and is used for communication between the Context Multiplexer (MUX) and the SVM. For the NSX Malware Prevention SVM, the control interface IP address is 169.254.1.22.

The Context Multiplexer service must be running on each ESXi host. To verify whether the nsx-context-mux service is running on a host, log in to the CLI of each ESXi host as the root user and run the following CLI command:

/etc/init.d/nsx-context-mux status

If the service is not running, start or restart the service with the following CLI command:

/etc/init.d/nsx-context-mux start

Or

/etc/init.d/nsx-context-mux restart

Note: It is safe to restart this service during production hours because restarting the service does not have a significant impact. The service restarts in a couple of seconds.
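
The status check and the conditional start described above can be combined in a small script. This is a sketch only. It assumes that the status action of the init script exits with a non-zero code when the service is not running, which this topic does not confirm. If that assumption does not hold on your host, read the printed status instead. The function name ensure_mux_running and the MUX_INIT variable are illustrative.

```shell
#!/bin/sh
# Path of the init script named in this topic; can be overridden for testing.
MUX_INIT="${MUX_INIT:-/etc/init.d/nsx-context-mux}"

# Start the Context Multiplexer service only if the status check fails.
# Assumption: "status" exits non-zero when the service is not running.
ensure_mux_running() {
    if "$MUX_INIT" status >/dev/null 2>&1; then
        echo "nsx-context-mux is already running"
    else
        echo "nsx-context-mux is not running; starting it"
        "$MUX_INIT" start
    fi
}

# Example, run as root on the ESXi host:
#   ensure_mux_running
```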

Resolve ESX Agent Manager Issues

The ESX Agent Manager notifies NSX Manager about error details when it detects issues in the ESX Agencies. You can click Resolve in the NSX Manager UI to resolve the issues. The following table describes the ESX Agent Manager issues.

| Issue | Category | Description | Resolution |
|---|---|---|---|
| Cannot Access Agent OVF | VM Not Deployed | An agent virtual machine is expected to be deployed on a host, but it cannot be deployed because ESX Agent Manager is unable to access the OVF package for the agent. This can happen when the web server that provides the OVF package is down. The web server is often internal to the solution that created the Agency. | Click Resolve. The EAM service retries the OVF download operation. |
| Incompatible Host Version | VM Not Deployed | An agent virtual machine is expected to be deployed on a host. However, because of compatibility issues, the agent was not deployed on the host. | Check the compatibility of the SVM, and upgrade either the host or the solution to make the agent compatible with the host. Then click Resolve. |
| Insufficient Resources | VM Not Deployed | An agent virtual machine is expected to be deployed on a host. However, the EAM service did not deploy the agent virtual machine because the host has insufficient CPU or memory resources. | Check the host and free up CPU and memory resources, and then click Resolve. The EAM service attempts to redeploy the virtual machine. |
| Insufficient Space | VM Not Deployed | An agent virtual machine is expected to be deployed on a host. However, the agent virtual machine was not deployed because the agent datastore on the host did not have enough free space. | Free up space on the datastore, and then click Resolve. The EAM service attempts to redeploy the virtual machine. |
| No Agent VM Network | VM Not Deployed | An agent virtual machine is expected to be deployed on a host, but the agent cannot be deployed because the agent network has not been configured on the host. | Add one of the networks listed in custom agent VM network to the host. The issue resolves automatically after the network is available. |
| OVF Invalid Format | VM Not Deployed | An agent virtual machine is expected to be provisioned on a host, but provisioning of the OVF package failed. Provisioning is unlikely to succeed until the solution that provides the OVF package is upgraded or patched to provide a valid OVF package for the agent virtual machine. | Ensure that a valid OVF package is used for service deployment, and then click Resolve. The EAM service attempts to redeploy the SVM. |
| Missing Agent IP Pool | VM Powered Off | An agent virtual machine is expected to be powered on, but it is powered off because no IP addresses are defined on the agent's virtual machine network. | Define IP addresses on the virtual machine network, and then click Resolve. |
| No Agent VM Datastore | VM Powered Off | An agent virtual machine is expected to be deployed on a host, but the agent cannot be deployed because the agent datastore has not been configured on the host. | Add one of the datastores listed in custom agent VM datastore to the host. The issue resolves automatically after the datastore is available. |
| No Custom Agent VM Network | No Agent VM Network | An agent virtual machine is expected to be deployed on a host, but the agent cannot be deployed because the agent network has not been configured on the host. | Add the host to one of the networks listed in custom agent VM network. The issue resolves automatically after a custom VM network is available. |
| No Custom Agent VM Datastore | No Agent VM Datastore | An agent virtual machine is expected to be deployed on a host, but the agent cannot be deployed because the agent datastore has not been configured on the host. | Add the host to one of the datastores listed in custom agent VM datastore. The issue resolves automatically. |
| Orphaned DvFilter Switch | Host Issue | A dvFilter switch exists on a host, but no agents on the host depend on dvFilter. This can happen if a host is disconnected when the agency configuration changes. | Click Resolve. The EAM service attempts to connect the host before the agency configuration is updated. |
| Unknown Agent VM | Host Issue | An agent virtual machine that does not belong to any agency in this vSphere ESX Agent Manager server instance has been found in the vCenter Server inventory. | Click Resolve. The EAM service attempts to place the virtual machine in the inventory it belongs to. |
| OVF Invalid Property | VM Issue | An agent virtual machine must be powered on, but an OVF property is either missing or has an invalid value. | Click Resolve. The EAM service attempts to reconfigure the OVF property correctly. |
| VM Corrupted | VM Issue | An agent virtual machine is corrupt. | Click Resolve. The EAM service attempts to repair the virtual machine. |
| VM Orphaned | VM Issue | An agent virtual machine exists on a host, but the host is no longer part of the scope for the agency. This can happen if a host is disconnected when the agency configuration changes. | Click Resolve. The EAM service attempts to connect the host back to the agency configuration. |
| VM Deployed | VM Issue | An agent virtual machine is expected to be removed from a host, but it has not been removed. Possible reasons include the host being in maintenance mode, powered off, or in standby mode. | Click Resolve. The EAM service attempts to remove the agent virtual machine from the host. |
| VM Powered Off | VM Issue | An agent virtual machine is expected to be powered on, but it is powered off. | Click Resolve. The EAM service attempts to power on the virtual machine. |
| VM Powered On | VM Issue | An agent virtual machine is expected to be powered off, but it is powered on. | Click Resolve. The EAM service attempts to power off the virtual machine. |
| VM Suspended | VM Issue | An agent virtual machine is expected to be powered on, but it is suspended. | Click Resolve. The EAM service attempts to power on the virtual machine. |
| VM Wrong Folder | VM Issue | An agent virtual machine is expected to be in a designated agent virtual machine folder, but it is found in a different folder. | Click Resolve. The EAM service attempts to move the agent virtual machine to the designated folder. |
| VM Wrong Resource Pool | VM Issue | An agent virtual machine is expected to be in a designated agent virtual machine resource pool, but it is found in a different resource pool. | Click Resolve. The EAM service attempts to move the agent virtual machine to the designated resource pool. |
| VM Not Deployed | Agent Issue | An agent virtual machine is expected to be deployed on a host, but it has not been deployed. Possible reasons include ESX Agent Manager being unable to access the OVF package for the agent, or a missing host configuration. This issue can also happen if the agent virtual machine is explicitly deleted from the host. | Click Resolve to deploy the agent virtual machine. |

Resolve NSX Manager Issue

Issue
Unable to allocate static IP addresses from the IP Pool.
Description
The IP pool is exhausted: all of its IP addresses are already allocated.
Resolution
Add IP addresses to the IP pool or release addresses that are no longer in use, and then click Resolve.

Verify Health Status of Service Instances

NSX Manager receives the health status details of each service instance. The NSX Manager UI shows the timestamp of the most recently received health status. You might have to refresh the Service Instances page a few times to retrieve the latest health status.

Health of a service instance on an ESXi host depends on the following factors:
  • Solution status: Status of the NSX Distributed Malware Prevention solution that is running on an SVM. An Up status indicates that the solution is running correctly.
  • Connectivity between NSX Guest Introspection agent and Context engine: Status is Up when the NSX Guest Introspection agent (Context Multiplexer) is connected to the NSX Ops agent, which includes the Context engine. The Context Multiplexer forwards health information of SVMs to the Context engine. The MUX and the NSX Ops agent also exchange SVM-VM configuration so that each knows which workload VMs are protected by the SVM.
  • Service VM protocol version: Transport protocol version, used internally for troubleshooting issues.
  • NSX Guest Introspection agent information: Represents protocol version compatibility between the NSX Guest Introspection agent and the SVM.

To view the health status of service instances, do these steps in NSX Manager:
  1. Navigate to System > Service Deployments > Service Instances.
  2. In the Health Status column, click the icon next to Up or Down.

View Alarms in NSX Manager

Alarms are displayed on the Alarms page of the NSX Manager UI for the following situations:
  • Connectivity between NSX Context Multiplexer and NSX Malware Prevention SVM is down.
  • NSX Context Multiplexer is down or reboots.

You can also view alarms on the Service Instances page at System > Service Deployments > Service Instances.

To view alarms about the health of the NSX Malware Prevention feature, do these steps:

  1. In NSX Manager, navigate to the Alarm Definitions page by clicking Home > Alarms > Alarm Definitions.
  2. Click in the Filter by Name, Path, and more text area, and then click Feature.
  3. Select the Malware Prevention Health check box.

For documentation about the NSX Malware Prevention health events, see the NSX Event Catalog.

View Component Issues on Security Overview Dashboard

The Malware Prevention widget on the Security Overview dashboard shows issues when any of the components in the NSX Distributed Malware Prevention service is down or not working. To view this UI widget in the NSX Manager UI, navigate to Security > Security Overview > Configuration.

For example:
  • The bar chart shows an issue when the Security Hub on the NSX Malware Prevention service virtual machine (SVM) is down. Point to the bar to view the following details:
    • Number of NSX Malware Prevention SVMs that are impacted.
    • Number of workload VMs on the host that have lost malware security protection due to the Security Hub going down.
  • The donut chart shows the following details:
    • Number of workload VMs where the NSX File Introspection driver is running.
    • Number of workload VMs where the NSX File Introspection driver is not running.

    For both these metrics, only the workload VMs on the host clusters that are activated for NSX Distributed Malware Prevention are considered.

Name the Key Pairs Correctly for Easy Identification

SSH access to the admin user of the SVM is key-based (public-private key pair). A public key is needed when you are deploying the service on an ESXi host cluster, and a private key is needed when you want to start an SSH session to the SVM.

NSX Distributed Malware Prevention service deployment is done at the level of a host cluster. So, a key pair is tied to a host cluster. You can create either a new public-private key pair for a service deployment on each cluster, or use a single key pair for service deployments on all the clusters.

If you plan to use a different public-private key pair for service deployment on each cluster, ensure that the key pairs are named correctly for easy identification.

A good practice is to identify each service deployment with a "compute cluster id" and to include the cluster id in the name of the key pair. For example, assume that the cluster id is "1234-abcd". For this cluster, you can specify the service deployment name as "MPS-1234-abcd", and name the key pair to access this service deployment as "id_rsa_1234_abcd.pem". This practice makes it easy to maintain the keys and associate them with each service deployment.
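
The naming practice above can be applied consistently with a few helper functions. The following is a minimal sketch, assuming ssh-keygen from OpenSSH is available on the workstation. The function names are illustrative. The key type, key size, and PEM format shown here are assumptions, so check the service deployment requirements before you rely on them.

```shell
#!/bin/sh
# Derive the service deployment name from a compute cluster id.
deployment_name() {
    printf 'MPS-%s\n' "$1"
}

# Derive the private key file name; hyphens in the id become underscores.
key_file_name() {
    printf 'id_rsa_%s.pem\n' "$(printf '%s' "$1" | tr '-' '_')"
}

# Generate a key pair named after the cluster id in the given directory.
# ssh-keygen prompts for a passphrase, writes the private key to the named
# file, and writes the public key to the same name with a .pub suffix.
generate_cluster_key() {
    cluster_id=$1
    key_dir=${2:-.}
    ssh-keygen -t rsa -b 4096 -m PEM \
        -C "$(deployment_name "$cluster_id")" \
        -f "$key_dir/$(key_file_name "$cluster_id")"
}

# Example (cluster id from this topic):
#   generate_cluster_key 1234-abcd "$HOME/.ssh"
```

With the cluster id "1234-abcd", the helpers produce the deployment name "MPS-1234-abcd" and the key file name "id_rsa_1234_abcd.pem", matching the example in this topic.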

Important: Store the private key securely. Loss of the private key can lead to a loss of SSH access to the NSX Malware Prevention SVM.

If the private key is lost, the SVM continues to function without any issues, but you cannot log in to the SVM and download the log file for troubleshooting purposes.

Collect Support Bundles

For detailed troubleshooting of the following components that contribute to the NSX Distributed Malware Prevention service, you can collect support bundles for analysis or send them to VMware Support:
  • NSX Manager appliances
  • ESXi hosts
  • VMware Tools on workload VMs
  • NSX Malware Prevention SVM

To collect a support bundle that contains log files for ESXi hosts and NSX Manager appliances, use the support bundle feature in NSX. For instructions about collecting a support bundle in NSX, see Collect Support Bundles.

To collect a support bundle that contains log files for components and services running on VMware vCenter, see the vCenter Server Configuration documentation. For example, you can collect log files for VMware Tools with the VMware vCenter support bundle.

(In NSX 4.1.2 or later): To collect support bundles for NSX Malware Prevention SVMs running on vSphere host clusters that are activated for NSX Distributed Malware Prevention service, you can run CLI commands on the SVMs. For more information, see Collect Support Bundle for an NSX Malware Prevention Service Virtual Machine.

(In NSX 4.1.1 or earlier): NSX CLI commands are not supported on the NSX Malware Prevention SVM. However, you can log in to each NSX Malware Prevention SVM by using an SSH connection, and copy the syslog file from the /var/log directory on the SVM. For instance, you can use the sftp or the scp command to collect the SVM syslog file at a particular time. If multiple syslog files are available at this location, they are compressed and stored at the same path.
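
For NSX 4.1.1 or earlier, the scp copy described above could look like the following sketch. It assumes key-based SSH access as the admin user and assumes that the current log file is /var/log/syslog on the SVM. The function names, the local file naming scheme, and the example address and key path are illustrative only.

```shell
#!/bin/sh
# Build a timestamped local file name so that repeated collections
# do not overwrite each other. $1 = SVM address, $2 = timestamp string.
local_syslog_name() {
    printf 'svm_%s_syslog_%s\n' "$1" "$2"
}

# Copy the syslog file from the SVM as the admin user.
# $1 = private key path, $2 = SVM management IP, $3 = destination directory
# Assumption: the current log file is /var/log/syslog on the SVM.
collect_svm_syslog() {
    scp -i "$1" "admin@$2:/var/log/syslog" \
        "$3/$(local_syslog_name "$2" "$(date +%Y%m%d-%H%M%S)")"
}

# Example (hypothetical key path and address):
#   collect_svm_syslog ~/.ssh/id_rsa_1234_abcd.pem 192.0.2.20 /tmp
```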