The import process consists of four phases.
Phase 1
Retrieve all the resources from the Manager API. Filter the resources based on the Kubernetes cluster tag (ncp/cluster) or the shared resources specified in user-spec.yaml. Start building the request bodies to be sent to the migration server. If any request cannot be generated, NCP does not migrate the cluster and exits.
- Resources cannot be retrieved from the Manager API because of a connectivity issue
Solution: Retry after fixing the connectivity issue
- Kubernetes does not contain a resource that is retrieved from Manager API
Solution: Run NCP in Manager mode again until it reaches an idle state, meaning that it is not performing any CRUD (create, read, update, delete) operations. Wait at least 10 minutes, because that is the maximum interval before NCP sends retry requests. If there are no errors in the NCP logs, the issue should be fixed.
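The selection and request-building steps of Phase 1 can be sketched as follows. This is a minimal illustration, not the importer's actual code: it assumes each resource is a dict whose tags are a list of scope/tag pairs, and the names select_resources, build_requests, shared_ids, and make_body are hypothetical.

```python
CLUSTER_TAG_SCOPE = "ncp/cluster"  # tag scope named in the text above


def belongs_to_cluster(resource, cluster_name):
    """Return True if the resource carries an ncp/cluster tag for this cluster."""
    return any(
        tag.get("scope") == CLUSTER_TAG_SCOPE and tag.get("tag") == cluster_name
        for tag in resource.get("tags", [])
    )


def select_resources(resources, cluster_name, shared_ids):
    """Keep cluster-tagged resources plus shared ones listed by the user.

    shared_ids stands in for the shared resources from user-spec.yaml
    (assumed here to be a plain list of resource IDs).
    """
    shared = set(shared_ids)
    return [
        r for r in resources
        if belongs_to_cluster(r, cluster_name) or r.get("id") in shared
    ]


def build_requests(resources, make_body):
    """Build every request body before anything is sent.

    make_body is a hypothetical builder. If it raises for any resource,
    the exception propagates and no request is produced, mirroring the
    all-or-nothing behavior described for this phase.
    """
    return [make_body(r) for r in resources]
```

Building all bodies before sending any of them means a generation failure is detected while the Manager objects are still untouched.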
Phase 2
Start sending the import requests created in Phase 1 to the migration API. Once a request is processed successfully, record the manager_ids contained in the request on the local disk of the client. If any request fails, roll back the already imported resources using their manager_ids stored on the local disk. If the migration API reports a duplicate request, the importer removes the duplicate manager_id from the request body and sends the request again.
- Connectivity issue
Solution: Retry after fixing the connectivity issue
- Migration API returns error
Solution: Retry after some time, because the fault can lie with either the Policy API or the migration API. By default, the importer rolls back if any issue occurs in this phase. If the issue persists and the importer stopped unexpectedly, roll back all the imported resources using the rollback_imported_resources option in config.yaml. If an issue occurs during rollback, you must retry the rollback manually. If the rollback using mp_to_policy_importer is unsuccessful, you must restore the NSX Manager from a backup to the state before importing the Kubernetes cluster.
Note: If DFW Sections and Rules have been imported and the import requests fail for resources that come after them, you must restore the Manager from the backup you created, and only then initiate the cluster import again.
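The send, record, and roll back behavior of Phase 2 can be sketched as follows. This is an illustration under stated assumptions, not the importer's implementation: the request shape (a dict with a manager_ids list), the DuplicateRequest exception, the send and rollback callables, and the journal file are all hypothetical.

```python
import json
from pathlib import Path


class DuplicateRequest(Exception):
    """Hypothetical signal that one manager_id was already imported."""

    def __init__(self, manager_id):
        super().__init__(f"duplicate manager_id: {manager_id}")
        self.manager_id = manager_id


def send_with_dedup(request, send):
    """Send a request, dropping each manager_id reported as a duplicate."""
    ids = list(request["manager_ids"])
    while ids:
        try:
            send({**request, "manager_ids": ids})
            return ids  # IDs imported by this request
        except DuplicateRequest as dup:
            if dup.manager_id not in ids:
                raise  # unexpected ID; treat as a real failure
            ids.remove(dup.manager_id)
    return []  # everything in the request was already imported


def run_phase_2(requests, send, rollback, journal_path):
    """Send all requests, journaling imported IDs to local disk."""
    journal = Path(journal_path)
    imported = []
    for request in requests:
        try:
            done = send_with_dedup(request, send)
        except Exception:
            # Roll back what was imported so far. If rollback itself
            # raises, the journal is left in place for a manual retry.
            rollback(imported)
            journal.write_text("[]")
            raise
        imported.extend(done)
        journal.write_text(json.dumps(imported))
    return imported
```

Writing the journal after every successful request keeps the on-disk list of manager_ids current, so a rollback can still be attempted after an unexpected stop.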
Phase 3
Infer the tags that need to be added to or removed from the Policy resources for all the imported resources. If any tag cannot be inferred (for example, because the corresponding Kubernetes resource is missing), the importer rolls back the already imported resources using their manager_ids stored on the local disk. This can happen when NCP in Manager mode was stopped in the middle of a transaction. In that case, start NCP in Manager mode again and wait for it to reach an idle state.
- Kubernetes does not contain a resource that is retrieved from Manager API
Solution: After rollback, run NCP in Manager mode again until it reaches an idle state. Wait at least 10 minutes, because that is the maximum interval before NCP sends retry requests. If there are no errors in the NCP logs, the issue should be fixed.
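The tag inference of Phase 3 can be pictured as a set difference between the tags a resource has and the tags it should have. The sketch below is only an illustration: how the importer actually derives the desired tags is not described here, so desired_for, k8s_index, and InferenceError are hypothetical names, and tags are assumed to be scope/tag pairs.

```python
class InferenceError(Exception):
    """Hypothetical error: no Kubernetes resource matches an imported one."""


def tag_changes(current_tags, desired_tags):
    """Return (tags_to_add, tags_to_remove) as sorted (scope, tag) pairs."""
    current = {(t["scope"], t["tag"]) for t in current_tags}
    desired = {(t["scope"], t["tag"]) for t in desired_tags}
    return sorted(desired - current), sorted(current - desired)


def infer_tag_changes(imported, k8s_index, desired_for):
    """Compute tag changes for every imported resource, or fail as a whole.

    k8s_index maps a resource ID to its Kubernetes object (assumed lookup).
    desired_for(resource, k8s_obj) returns the tags the resource should carry.
    A missing Kubernetes object raises InferenceError, which the caller
    would treat as the trigger for rolling back the imported resources.
    """
    plan = {}
    for res in imported:
        k8s_obj = k8s_index.get(res["id"])
        if k8s_obj is None:
            raise InferenceError(f"no Kubernetes resource for {res['id']}")
        plan[res["id"]] = tag_changes(res.get("tags", []), desired_for(res, k8s_obj))
    return plan
```

Computing the full plan before changing anything means a missing Kubernetes resource is discovered while a clean rollback is still possible.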
Phase 4
This is the most critical phase, and you should take every precaution to avoid an unexpected failure during it. In this phase, the importer updates the resources on Policy with new tags and/or additional information (for example, the importer updates the display_name of segments). If a resource cannot be updated at that time, the importer stores the updated policy resource body and policy resource URL on the local disk of the client and asks you to try again after fixing the issue (which is either in the Policy API or a connectivity issue).
- Connectivity issue
Solution: Retry after fixing the connectivity issue
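The store-and-retry behavior of Phase 4 can be sketched as follows. This is a minimal illustration, assuming a hypothetical put(url, body) callable and a hypothetical JSON file for the pending updates; the importer's real file layout and API calls are not described here.

```python
import json
from pathlib import Path


def apply_updates(updates, put, pending_path):
    """Apply (url, body) updates; persist the ones that fail.

    put is a hypothetical callable that sends one update to Policy and
    raises on failure (Policy API error or connectivity issue).
    """
    pending = []
    for url, body in updates:
        try:
            put(url, body)
        except Exception:
            # Keep the resource URL and updated body for a later retry.
            pending.append({"url": url, "body": body})
    path = Path(pending_path)
    if pending:
        path.write_text(json.dumps(pending, indent=2))
    elif path.exists():
        path.unlink()  # nothing left to retry
    return pending


def retry_pending(put, pending_path):
    """Re-send updates saved by an earlier run; return what still fails."""
    path = Path(pending_path)
    if not path.exists():
        return []
    saved = json.loads(path.read_text())
    updates = [(item["url"], item["body"]) for item in saved]
    return apply_updates(updates, put, pending_path)
```

Because both the URL and the full updated body are saved, a retry does not need to repeat the tag inference from Phase 3.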
In all four phases, there is also a risk of unexpected power failure and other issues, which are handled as discussed in Failure and Recovery.