Information in this section might be useful in troubleshooting issues during the migration.

Import Configuration Problems

Problem: Import configuration fails.

Solution: Click Retry to try importing again. Only the failed import steps are retried.

Host Migration Problems

Problem: Host migration fails due to a missing compute manager configuration.

The compute manager configuration is a prerequisite for migration. However, if the compute manager configuration is removed from NSX Manager after the migration is started, the migration coordinator retains the setting. The migration proceeds until the host migration step, which fails.

Solution: Add a compute manager to NSX Manager and enter the same vCenter Server details that were used for the initial NSX-V configuration import.
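
If you prefer to re-add the compute manager through the API rather than the UI, the following is only a minimal sketch, not the documented procedure. The credentials, vCenter Server address, and thumbprint are placeholder values, and the payload fields should be verified against the NSX API guide for your version.

    # Hypothetical values; replace them before running and verify the payload
    # against the NSX API guide for your version.
    curl -k -u 'admin:<nsx-password>' -X POST \
      -H 'Content-Type: application/json' \
      https://<nsxt-manager-ip>/api/v1/fabric/compute-managers \
      -d '{
            "server": "<vcenter-fqdn-or-ip>",
            "origin_type": "vCenter",
            "credential": {
              "credential_type": "UsernamePasswordLoginCredential",
              "username": "<vcenter-username>",
              "password": "<vcenter-password>",
              "thumbprint": "<vcenter-sha256-thumbprint>"
            }
          }'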

Problem: Host migration fails because stale dvFilters are present.

Example error message: Stale dvFilters present: ['port 33554463 (disconnected)', 'port 33554464 (disconnected)'] Stale dvfilters present. Aborting ]

Solution: Log in to the host that failed to migrate, identify the disconnected ports, and either reboot the affected VM or connect the disconnected ports. Then you can retry the Host Migration step.

  1. Log in to the command-line interface of the host that failed to migrate.
  2. Run summarize-dvfilter and look for the ports reported in the error message.
    world 1000057161 vmm0:2-vm_RHEL-srv5.6.0.9-32-local-258-963adcb8-ab56-41d6-bd9e-2d1c329e7745 vcUuid:'96 3a dc b8 ab 56 41 d6-bd 9e 2d 1c 32 9e 77 45'
     port 33554463 (disconnected)
      vNic slot 2
      name: nic-1000057161-eth1-vmware-sfw.2
     agentName: vmware-sfw
       state: IOChain Detached
       vmState: Detached
       failurePolicy: failClosed
       slowPathID: none
       filter source: Dynamic Filter Creation
  3. Locate the affected VM and port.
    For example, the error message says port 33554463 is disconnected.
    1. Find the section of the summarize-dvfilter output that corresponds to this port. The VM name is listed here. In this case, it is 2-vm_RHEL-srv5.6.0.9-32-local-258-963adcb8-ab56-41d6-bd9e-2d1c329e7745.
    2. Look for the name entry to determine which VM interface is disconnected. In this case, it is eth1. So the second interface of 2-vm_RHEL-srv5.6.0.9-32-local-258-963adcb8-ab56-41d6-bd9e-2d1c329e7745 is disconnected.
  4. Resolve the issue with this port. Do one of the following steps:
    • Reboot the affected VM.
    • Connect the disconnected vnic port to any network.
  5. On the Migrate Hosts page, click Retry.
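
To speed up steps 2 and 3, you can filter the summarize-dvfilter output directly on the host. This is only a minimal sketch; it assumes the host's grep supports the -B/-A context options, and the port number is taken from the example error message above.

    # List each disconnected port together with the surrounding world/VM lines.
    summarize-dvfilter | grep -B 6 "(disconnected)"

    # Or inspect one specific port reported in the error message.
    summarize-dvfilter | grep -A 10 "port 33554463"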

Problem: After host migration using vMotion, VMs might experience a traffic outage if SpoofGuard is enabled in NSX-V.

Symptoms:

The vmkernel.log file on the host at /var/run/log/ shows a drop in traffic due to SpoofGuard.

For example, the log file shows: WARNING: swsec.throttle: SpoofGuardMatchWL:296:[nsx@6876 comp="nsx-esx" subcomp="swsec"]Filter 0x8000012 [P]DROP sgType 4 vlan 0 mac 00:50:56:84:ee:db
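
To confirm this symptom, you can search the log directly on the host for SpoofGuard drop messages. A minimal sketch, based on the log path and message shown above:

    # Look for SpoofGuard drops in the current vmkernel log.
    grep SpoofGuardMatchWL /var/run/log/vmkernel.log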

Cause:

The logical switch and the logical switch port configuration are migrated through the migration coordinator, which migrates the SpoofGuard configuration. However, the discovered port bindings are not migrated through vMotion. Therefore, SpoofGuard drops the packets.

Solution: If SpoofGuard is enabled in NSX-V before migration, perform any one of these workarounds after the VMs are moved with vMotion:
  • Disable SpoofGuard policies.
  • Add the port IP and MAC address bindings as manual bindings (see the API example at the end of this entry).
  • If ARP snooping is enabled, wait for the VM IP addresses to be snooped by ARP.

In the first two options, network traffic is restored immediately.

In the third option:
  • Traffic downtime is observed until the VM sends an ARP request or reply.
  • If DHCP snooping is also enabled and the VM IP address was assigned by the DHCP server, the address is most likely snooped by ARP first and later recorded as a DHCP-snooped IP address.
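
For the manual bindings workaround, the bindings can also be set on the affected segment port through the policy API. The following is only an illustrative sketch: the segment ID, port ID, IP address, and credentials are hypothetical placeholders (the MAC is taken from the example log line above), and the address_bindings payload should be verified against the NSX API guide for your version.

    # Hypothetical segment/port IDs and addresses; look up the real ones first.
    curl -k -u 'admin:<nsx-password>' -X PATCH \
      -H 'Content-Type: application/json' \
      https://<nsxt-policy-ip>/policy/api/v1/infra/segments/<segment-id>/ports/<port-id> \
      -d '{
            "address_bindings": [
              { "ip_address": "<vm-ip-address>", "mac_address": "00:50:56:84:ee:db" }
            ]
          }'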

Problem: In the middle of a cluster migration, host migration fails due to a hardware failure in the host.

For example, let us say that a cluster has 10 hosts, and four hosts have migrated successfully. The fifth host has a hardware failure and the host migration fails.

Solution: If the host hardware failure cannot be fixed, skip the failed host and retry the host migration. Complete the following workaround steps:
  1. In the vCenter Server UI, remove the failed host from the inventory.

    Wait for a few minutes until the host is removed.

  2. Log in to the NSX Manager appliance where the migration coordinator service is running, and run the following API request (a curl example is shown after this procedure):

    GET https://{nsxt-policy-ip}/api/v1/migration/migration-unit-groups?component_type=HOST&sync=true

  3. Return to the NSX Manager UI, and refresh the browser. Observe that the failed host is no longer visible.
  4. Click Retry to restart the host migration.
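
The API request in step 2 can be issued with curl from the appliance or from any machine that can reach NSX Manager. A minimal sketch, assuming admin credentials (placeholders below):

    curl -k -u 'admin:<nsx-password>' \
      "https://<nsxt-policy-ip>/api/v1/migration/migration-unit-groups?component_type=HOST&sync=true"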

Problem: If you restart the migration coordinator service for any reason, the clusters that are already migrated to NSX become available for migration again on the Migrate Hosts page. This behavior is a known issue.

Solution: Skip the already migrated clusters by completing these steps:
  1. Open an SSH session to the NSX Manager appliance where the migration coordinator service is running.
  2. Edit the /var/log/migration-coordinator/v2t/clusters-to-migrate.json file to remove the clusters that are already migrated.

    For example, if the file has the following content and cluster-1 has been migrated, then remove the element {"modId":"domain-c9", "name":"cluster-1"}.

    "clusters":[
       {
         "modId":"domain-c9",
         "name":"cluster-1"
       },
       {
         "modId":"domain-c19",
         "name":"cluster-2"
       }
     ]
  3. Run the same API request on the NSX Manager appliance as mentioned in the earlier workaround.
  4. Return to the NSX Manager UI, and refresh the browser. Go to the Migrate Hosts page, and observe that the clusters that you removed from the clusters-to-migrate.json file are shown as Do not migrate.
  5. Click Retry to restart the host migration.
Problem: Host migration is blocked after the recommendation is accepted because the NSX-V controller VM is in a powered-off state. In the host migration step, the feedback recommends that you abort the migration. If you accept the recommendation, the migration fails.

Solution: Because the Edge cutover is already done, you can change the action to skip and continue the migration with the following steps:
  1. Make the following API call and search the result for NoNsxvControllerInRunningSate to find the feedback request and get its ID:
    GET https://$NSX_MANAGER_IP/api/v1/migration/feedback-requests?state=UNRESOLVED
  2. Accept all recommendations by making the following API call:
    POST https://$NSX_MANAGER_IP/api/v1/migration/feedback-response?action=accept-recommended
  3. Provide a feedback response with the action skip with the following API call (note that $FEEDBACK_ID is the ID you obtained in step 1):
    PUT https://$NSX_MANAGER_IP/api/v1/migration/feedback-response -d '{"response_list":[{"id": $FEEDBACK_ID, "action": "skip" }]}'
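
The calls above can be run with curl. The sketch below is illustrative only: it assumes admin credentials, that python is available to pretty-print the JSON response, and that you copy the feedback request ID by hand from the output of the first call.

    # 1. List unresolved feedback requests; find the entry that mentions
    #    NoNsxvControllerInRunningSate and note its ID.
    curl -k -s -u 'admin:<nsx-password>' \
      "https://$NSX_MANAGER_IP/api/v1/migration/feedback-requests?state=UNRESOLVED" \
      | python -m json.tool
    FEEDBACK_ID=<id-from-the-output-above>   # hypothetical placeholder

    # 2. Accept all recommendations.
    curl -k -u 'admin:<nsx-password>' -X POST \
      "https://$NSX_MANAGER_IP/api/v1/migration/feedback-response?action=accept-recommended"

    # 3. Change the action for that feedback request to skip.
    curl -k -u 'admin:<nsx-password>' -X PUT \
      -H 'Content-Type: application/json' \
      "https://$NSX_MANAGER_IP/api/v1/migration/feedback-response" \
      -d "{\"response_list\":[{\"id\": $FEEDBACK_ID, \"action\": \"skip\" }]}"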

Rolling Back a Migration

Problem: With some NSX-V OSPF deployments, if you perform a rollback after the Edge migration phase, you might see the error "Reason: NSCutover failed with '400: Configuration failed on NSX Edge VM vm-XXXX'".

Solution: Redeploy the relevant NSX-V Edge VM. After the VM is successfully redeployed, perform the rollback again.

Retrying a Migration

Problem: If a host reboots for any reason during a migration, retrying the migration fails with an error such as "The requested object : TransportNode/42178ba8-49fb-9545-2b78-5e9c64fddda7 could not be found. Object identifiers are case sensitive."

Solution: Perform the following steps:
  1. From the vCenter Server UI, remove the host from its cluster and make it a standalone host.
  2. From the NSX Manager UI, configure NSX on the standalone host using the same VDS. Make the transport node join the same overlay and VLAN transport zones that other migrated hosts join.
  3. From the NSX Manager UI, go back to the migration screen and refresh it to make sure that the host is not in the cluster being migrated. Then retry the migration of the cluster.
  4. After the migration, add the host back to the cluster.

Removing Stale VTEP Data

Problem: If the migration is aborted after migrating Edge Services Gateways, there might be stale VTEP tables in NSX. If there are transport nodes in NSX, their tunnel status remains down for these stale VTEPs.

Solution: To remove the stale VTEP data, make the following API call:

GET https://<nsx-manager-IP>/api/v1/global-configs/SwitchingGlobalConfig

If the parameter global_replication_mode_enabled in the result payload is true, take this payload, set global_replication_mode_enabled to false, and use the payload to make the following API call:

PUT https://<nsx-manager-IP>/api/v1/global-configs/SwitchingGlobalConfig
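
A minimal curl sketch of this read-modify-write sequence, assuming admin credentials and that you edit the saved payload by hand (keep the _revision field returned by the GET so the PUT is accepted):

    # 1. Fetch the current switching global configuration and save it to a file.
    curl -k -s -u 'admin:<nsx-password>' \
      https://<nsx-manager-IP>/api/v1/global-configs/SwitchingGlobalConfig \
      -o switching-config.json

    # 2. Edit switching-config.json: set "global_replication_mode_enabled" to false,
    #    leaving the other fields (including _revision) as returned by the GET.

    # 3. Send the edited payload back.
    curl -k -u 'admin:<nsx-password>' -X PUT \
      -H 'Content-Type: application/json' \
      https://<nsx-manager-IP>/api/v1/global-configs/SwitchingGlobalConfig \
      -d @switching-config.json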

Partner Service Migration Problems

Problem: Migration coordinator does not display the feedback messages for the Service Insertion category on the Resolve Configuration page even though the Security Policies in your NSX-V environment contain Network Introspection rules.

This problem occurs when you are migrating a combination of Guest Introspection and Network Introspection services from the same partner. If a service profile for the partner service is already created in NSX, migration coordinator does not initiate the migration of the Network Introspection rules.

Solution: Check whether a service profile is already created in your NSX environment. If yes, perform these steps:
  1. Roll back the migration.
  2. Delete the partner service profile and service reference in NSX.
  3. Restart the migration.

Post-Migration Issues

Problem: After a migration, and after ESGs are removed from the network, NSX raises alarms about OSPF neighbors being down for these ESGs. If you resolve the alarms, they are raised again.

Solution: Acknowledge the alarms but do not resolve them. Acknowledging keeps the alarms from being raised again.