VMware Cloud Foundation 2.1.3 Release Notes

VMware Cloud Foundation 2.1.3 | 25 MAY 2017 | Build 5600653

Release notes last updated: 25 MAY 2017
Check for additions and updates to these release notes.

What's in the Release Notes

The release notes cover the following topics:

  • What's New
  • VMware Software Versions and Build Numbers
  • VMware Software Edition License Information
  • Supported Hardware
  • Documentation
  • Browser Compatibility and Screen Resolutions for the Cloud Foundation Web-Based User Interfaces
  • Installation and Upgrades Information
  • Known Issues
  • Issues Resolved in Version 2.1.3

What's New

The VMware Cloud Foundation 2.1.3 release includes the following:

  • Fresh install support that includes updated versions of ESXi 6.0 U3, vCenter Server 6.0 U3b, and NSX 6.2.6
  • Full support for Cisco UCS 240C rack servers
  • Security upgrades to address potential security vulnerabilities in third-party components, including Java Runtime Environment, Kerberos, libxml2, OpenSSL, and Python
  • Serviceability enhancements, including the new VMware Cloud Foundation Pre-Upgrade Check utility. This utility audits the system to identify associated actions that the user needs to take prior to running LCM upgrades, thereby reducing LCM domain upgrade interruptions. For details, see Knowledge Base Article 2150030
  • Critical SDDC Manager bug fixes

Cloud Foundation is a software stack that deploys the VMware SDDC software stack. Earlier releases are 2.1, 2.1.1, and 2.1.2. For information about what is new in those releases, as well as their known and resolved issues, see the release notes for those software versions, which you can locate from their documentation landing pages at pubs.vmware.com.

VMware Software Versions and Build Numbers

You can install Cloud Foundation 2.1.3 either directly or by upgrading from your existing 2.1.2 deployment. See the Installation and Upgrades Information section.

The Cloud Foundation 2.1.3 software product is installed and deployed by completing two phases: Imaging phase (phase one) and Bring-Up with Automated Deployment phase (phase two). The following sections list the VMware software versions and builds that are involved in each phase.

Phase One: Imaging with VIA

In this phase, hardware is imaged using the following VMware software build:

Software Component | Version | Date | Build Number
VIA (the imaging appliance) | 2.1.3 | 25 MAY 2017 | 5578536

Phase Two: Bring-Up With Automated Deployment

In this phase, the Cloud Foundation software product enables automated deployment of the following software Bill-of-Materials (BOM). The components in this BOM are interoperable and compatible with one another.

As shown below, the Cloud Foundation 2.1.3 software BOM is identical to the Cloud Foundation 2.1.1 software BOM, except for the VMware SDDC Manager, VMware vCenter Server, VMware vSphere (ESXi), VMware NSX for vSphere, and VIA components, which have been updated to the versions listed in the following table.

Software Component | Version | Date | Build Number
VMware Cloud Foundation Bundle | 2.1.3 | 25 MAY 2017 | 5600653
VMware SDDC Manager | 2.1.3 | 25 MAY 2017 | 5577884
VMware Platform Services Controller | 6.0 Update 3b | 13 APR 2017 | 5326177
VMware vCenter Server on vCenter Server Appliance | 6.0 Update 3b | 13 APR 2017 | 5326177
VMware vSphere (ESXi) | 6.0 Update 3 | 24 FEB 2017 | 5050593
VMware Virtual SAN (as shipped with ESXi 6.0 Update 3) | 6.2 | 24 FEB 2017 | 5050593
VMware NSX for vSphere | 6.2.6 | 02 FEB 2017 | 4977495
VMware vRealize Operations | 6.2.1 | 25 APR 2016 | 3774215
VMware vRealize Log Insight | 3.3.1 | 14 MAR 2016 | 3644329
VMware vRealize Log Insight Agent | 3.3.1 | 17 MAR 2016 | 3669972
VMware NSX content pack for vRealize Log Insight | 3.3 | 18 APR 2016 |
VMware Virtual SAN content pack for vRealize Log Insight | 2.0 | 18 APR 2016 |
VMware Tools | 10.1.5 | 23 FEB 2017 | 5055683
VMware Horizon 6 | 6.2 | 08 SEP 2015 | 3005627
View management pack for vRealize Operations | 6.2 | 21 AUG 2015 | 3005627
View content pack for vRealize Log Insight | 1.0 | |
VMware App Volumes | 2.10 | 24 NOV 2015 |

VMware Software Edition License Information

The VIA and SDDC Manager software is licensed under the Cloud Foundation license. As part of this product, the SDDC Manager software deploys specific VMware software products.

The following VMware software deployed by SDDC Manager is licensed under the Cloud Foundation license:

  • VMware vSphere
  • VMware Virtual SAN
  • VMware NSX

The following VMware software deployed by SDDC Manager is licensed separately:

  • VMware vCenter Server
  • VMware vRealize Log Insight
  • VMware vRealize Operations
  • Content packs for Log Insight
  • Management packs for vRealize Operations
  • VMware Horizon 6
  • VMware App Volumes

For details about the specific VMware software editions that are licensed under the licenses you have purchased, see the VMware Software Versions and Build Numbers section above.

For more general information, see the Cloud Foundation product page.

Supported Hardware

For details on the hardware requirements for a Cloud Foundation environment, including manufacturers and model numbers, see the VMware Cloud Foundation Compatibility Guide.

Network Switch Operating System Versions

The network operating systems for the networking switches are:

Network Switch | Network Operating System
Management switches | Cumulus Linux 2.5.8
Cisco Nexus 9372 (ToR) | NX-OS 7.0(3)I2(4)
Cisco Nexus 9332 (Spine) | NX-OS 7.0(3)I2(4)
Cisco Nexus 93180 (ToR) | NX-OS 7.0(3)I4(2)
All other Cisco ToR and spine switches | NX-OS 7.0(3)I2(4)
Arista ToR and spine switches | Arista EOS 4.17.0F

Documentation

To access the Cloud Foundation 2.1.3 documentation, go to the VMware Cloud Foundation documentation landing page.

To access the documentation for VMware software products that SDDC Manager can deploy, see their documentation landing pages and use the drop-down menus on each page to choose the appropriate version.

Browser Compatibility and Screen Resolutions for the Cloud Foundation Web-Based User Interfaces

The following Web browsers can be used to view the Cloud Foundation Web-based user interfaces:

  • Mozilla Firefox: Basic Version 39 and Stable Version 43.0.4
  • Google Chrome: Basic Version 42 and Stable Versions 47.0.2526.70 (iOS) and 47.0.2526.106 (Windows, Linux)
  • Internet Explorer: 11.0.25 for Windows systems, with all security updates installed
  • Safari: Basic Version 9.0.3 on Mac only

For the Web-based user interfaces, the supported standard resolution is 1024 by 768 pixels. For best results, use a screen resolution within these tested resolutions:

  • 1024 by 768 pixels (standard)
  • 1366 by 768 pixels
  • 1280 by 1024 pixels
  • 1680 by 1050 pixels

Resolutions below 1024 by 768, such as 640 by 960 or 480 by 800, are not supported.

Installation and Upgrades Information

You can install Cloud Foundation 2.1.3 directly. For a fresh installation of this release:

  1. Read the VIA User's Guide for guidance on setting up your environment, deploying VIA, and imaging an entire rack.
  2. Read the Cloud Foundation Overview and Bring-Up Guide for guidance on deploying the VMware Cloud Foundation software Bill-of-Materials (BOM) stack.

For instructions on upgrading to Cloud Foundation 2.1.3, see Lifecycle Management in the Administering VMware Cloud Foundation guide.

Supported Upgrade Paths

A Cloud Foundation upgrade is a sequential upgrade. To upgrade to a specific version, your datacenter must be at a Cloud Foundation release that is one version before the target release.

If you are upgrading from an existing deployment, only the following upgrade paths are supported:

  • 2.1.2 to 2.1.3
  • 2.1 to 2.1.1 to 2.1.2 to 2.1.3

Upgrade procedures are documented in the Administering VMware Cloud Foundation guide.

LCM Upgrade Bundles

The LCM upgrade bundles are hosted on the VMware Depot site and available only via the Lifecycle Management feature in the SDDC Manager. See Lifecycle Management in the Administering VMware Cloud Foundation guide.

Software Component | Version | Date
VMware Cloud Foundation Bundle | 2.1.3 | 25 MAY 2017
VMware Software Update Bundle | 2.1.3 | 25 MAY 2017

Security Update

A security patch for the Apache Struts vulnerability in VMware vRealize Operations 6.2.1 is available and supported for manual application. Because there are no upgrade dependencies, the patch can be applied at any time. See Knowledge Base article 2149591.

Known Issues

The known issues are grouped as follows:

Imaging Known Issues

  • Vendor drop-down in Imaging tab does not include HDS
    On the Imaging tab, the Vendor drop-down list does not display HDS, preventing you from selecting it as an option.

    Workaround: For HDS servers, make the following selections on the Imaging tab:

    • For Vendor, select Quanta
    • For Model, select D51B-2U
  • When retrying a stopped imaging run in VIA, failed tasks may incorrectly display as completed. The imaging process continues with subsequent tasks but later shows the earlier tasks as failed
    This issue can occur when you use the Stop button to stop an in-progress imaging run and then click Retry within a short period to start imaging again. Clicking Retry shortly after stopping imaging can result in two running threads for the same imaging job. Even though the second thread completes successfully, briefly showing the Completed status, the first thread is still running and eventually fails because it was superseded by the second thread, so the Failed status is displayed on the screen.

    Workaround: If you need to stop an in-progress imaging run, wait a few minutes before attempting to retry.

Bring-Up Known Issues

  • If the time sync process fails on a host but POSV (Power On System Validation) passes with no issues, you are not prevented from continuing the system bring-up process even though tasks later on in the bring-up process might fail
    If the time sync process indicates failure on a host, you can continue in the UI to the POSV screen and run the POSV process to help identify issues on the host that might have caused the failure. However, due to this issue, if the POSV process subsequently passes, the Continue button is available and allows you to proceed in the bring-up process even though the time was not synchronized on that host. If you click Continue to proceed instead of Retry to rerun the time sync process, bring-up tasks that are performed later might fail, such as setting the NTP or deploying the PSC and vCenter Server appliances.

    Workaround: To avoid any unexpected issues, if the time sync process indicates a host has failed but the POSV process passes, click Retry after the POSV is done to ensure the time synchronization is rerun.

  • Alerts raised during POSV do not contain a rack name in their description
    Because the rack name is not specified by the user until after the IP allocation step in the system configuration wizard, a rack name is not available to display in alerts raised prior to that step. Alerts that are raised subsequent to that step do contain the user-specified rack name.

    Workaround: None.

  • After clicking Next in the bring-up wizard's Review step, the Component IP Allocation step fails to appear and refreshing the browser results in the wizard returning to the Create Superuser step and does not retain the previously entered information
    After you have progressed through the bring-up wizard and entered the requested information for the superuser, rack name, domain, networking information, and so on, the Review step displays the information you entered. The next step of the bring-up wizard is the Component IP Allocation step. However, when you click Next at the Review step, the display changes to a blue spinner that continues spinning and fails to display the Component IP Allocation screen or continue to that wizard step. When you refresh the browser to clear the display, instead of displaying the Component IP Allocation step, the bring-up wizard returns to the Create Superuser step, without retaining the previously entered information. You must re-enter the information into the wizard screens from that step onward.

    Workaround: None. If you cannot proceed in the bring-up wizard from clicking Next at the Review step, refresh the browser and re-enter the information in the wizard.

  • System bring-up process might fail at task VC: Apply vCenter License
    Due to intermittent timing issues, sometimes the system bring-up process might fail at the VC: Apply vCenter License task with the following exception
    java.lang.RuntimeException: Can't create VsphereClient instance

    Workaround: In the bring-up user interface, click the Retry button to perform the task and proceed with the bring-up process.

  • System bring-up process might fail at task Network: Assign Public Management IP address to Switches
    If intermittent connectivity to the switches occurs during the bring-up process, the process might fail at the Network: Assign Public Management IP address to Switches task. The vrm log file will have error messages stating the updates of the default route configuration on the switches did not receive responses from the HMS Aggregator, and the HMS log will have lines showing response code 500 for the PUT API request making the route update on the switches.

    Workaround: In the bring-up user interface, click the Retry button to perform the task and proceed with the bring-up process.

  • System bring-up process might fail at task ESX: Configure Power Management
    If intermittent connectivity to an ESXi host occurs during the bring-up process, the process might fail at the ESX: Configure Power Management task with the following exception
    com.vmware.vrack.vrm.core.error.EvoWorkflowException: Unable to access the ESXi host

    Workaround: In the bring-up user interface, click the Retry button to perform the task and proceed with the bring-up process.

Multi-Rack Bring-Up Known Issues

  • If the first rack’s ESXi host N1 is down when you start the bring-up process on an added rack, the starting Time Sync user interface screen for the new rack’s bring-up process appears blank and no progress is visible
    At the start of the bring-up process on the new rack, the VRM virtual machine on the new rack connects to the N1 host in the first rack and reads that host's time in order to run the time sync process on the new rack. If the first rack's N1 host is down, the system continues trying to connect to that host for more than 10 minutes before attempting to obtain the time from another host in the first rack. During this retry period, the Time Sync screen appears blank, the bring-up process is delayed, and you cannot tell whether progress is being made.

    Workaround: If the user interface appears blank when you start the bring-up process on an additional rack, first wait at least 15 minutes and then reopen the added rack's bring-up starting URL, https://192.168.100.40:8443/vrm-ui. If the screen shows the time set successfully for the listed components, you can proceed as usual.

Post Bring-Up Known Issues

  • When you use the vSphere Web Client to view the vCenter Server clusters associated with the management domains or workload domains, you might see alarms related to Virtual SAN HCL
    As described in KB article 2109262, the Virtual SAN Health Service has a built-in Hardware Compatibility List (HCL) health check that uses a JSON file as its HCL database to inform the service of the hardware and firmware supported for Virtual SAN. These alarms are raised if the HCL health check fails. Because the set of supported hardware and firmware is constantly updated, if the check fails, the first step is to obtain the most recent Virtual SAN HCL data and use the vSphere Web Client to update the HCL database.

    Workaround: The steps to update the Virtual SAN HCL database are described in KB article 2145116.

    1. Download the latest Virtual SAN HCL data by opening a browser to https://partnerweb.vmware.com/service/vsan/all.json (or download it from a command line, as sketched after these steps).
    2. Save the data to a file named all.json.
    3. Copy the all.json file to a location that is reachable by your Cloud Foundation installation.
    4. Log in to the vCenter Server instance using the vSphere Web Client and select the vRack-Cluster for the management domain or workload domain in which you are seeing the HCL health alarms.
    5. Navigate to the Manage tab, click Settings, and select Health and Performance in the Virtual SAN section.
    6. In the HCL Database area, click Update from file, browse to and select the all.json file you saved in step 2.
    7. Retest the health by navigating from the Manage tab to the Monitor tab and clicking Virtual SAN > Health > Retest.
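
    If you prefer to perform steps 1 through 3 from a command line rather than a browser, the following is a minimal sketch, assuming curl and standard shell tools are available on a machine with Internet access:

      # Download the latest Virtual SAN HCL database and save it as all.json (steps 1 and 2)
      curl -o all.json https://partnerweb.vmware.com/service/vsan/all.json

      # Confirm the file was downloaded and is non-empty before copying it
      # to a location reachable by your Cloud Foundation installation (step 3)
      ls -lh all.json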

  • The standard vCenter Server alarm named "License inventory monitoring" is raised for overprovisioning of the ESXi hosts, even though the ESXi hosts have the appropriate license key applied
    Under the standard licensing terms for the VMware Cloud Foundation product, all of the ESXi hosts in a Cloud Foundation installation are licensed using the same key. In the vCenter Server Licenses pane in the vSphere Web Client, in the Product column for this key, you see the associated product name is VMware vSphere 6 Enterprise for Embedded OEMs (CPUs). Under the VMware licensing terms, that type of key is allowed to be overprovisioned. However, due to this issue, when the vCenter Server sees this key as overprovisioned, it incorrectly raises the standard vSphere "License inventory monitoring" alarm. You can view the alarm definition in the vSphere Web Client by selecting the vCenter Server object in the left-hand navigation area, clicking the Manage tab > Alarm Definitions, and clicking License inventory monitoring in the list.

    Workaround: None. Ignore these vCenter Server license inventory monitoring alarms about the overprovisioning of license capacity of the ESXi hosts in your Cloud Foundation installation.

  • When you log in to the vSphere Web Client using the superuser SSO account defined during bring-up, you do not see all of the NSX Manager features in the Networking & Security area
    During the bring-up process, in the Create a Superuser Account screen, you entered a user name and password. Using that input, the system created a superuser account as your main login account for your system. Due to this issue, the system does not automatically assign the NSX Manager Enterprise Administrator role to that superuser account. The NSX Manager Enterprise Administrator role grants permissions for NSX operations and security to a vCenter Server user.

    Workaround:

    1. Log in to the SDDC Manager client using the system-managed SSO administrator account credentials, and then open the vSphere Web Client using the vCenter launch link on the management domain’s details page.
    2. In the vSphere Web Client, navigate to the Networking & Security > NSX Managers page, click the listed NSX Manager, click Manage, and click Users.
    3. Click the green + icon and use the Assign Role dialog to assign the Enterprise Administrator role to your system’s superuser account.

    After granting the NSX Manager Enterprise Administrator role to the superuser account, you can now log in to the vSphere Web Client with that account and the Networking & Security features are available to that user.

    For more details about NSX Manager roles, see the NSX 6 Documentation Center.

  • Host sometimes hangs after doing a normal reboot
    Due to a known Virtual SAN issue, in a Cloud Foundation installation that has Dell R630 or Dell R730xd servers with certain storage controllers, sometimes a host hangs after doing a normal reboot. For a complete list of affected controllers, see VMware Knowledge Base article 2144936.

    Workaround:

    1. Ensure the storage controller drivers and firmware are updated on each host according to the information in the VMware Compatibility Guide for Virtual SAN for the PERC H730: http://www.vmware.com/resources/compatibility/detail.php?deviceCategory=vsanio&productid=34853&deviceCategory=vsanio&details=1&vsan_type=vsanio&keyword=h730 (a command-line sketch for checking the installed driver follows this procedure).
    2. Apply the settings as described in the VMware Knowledge Base article 2144936.
    3. Log into the host through its iDRAC web interface and reset the server.
    4. If the issue persists, run the Hardware Diagnostics, also from the iDRAC web interface:
      1. From the menu bar, select Next Boot > Lifecycle Controller.
      2. When prompted, click OK to confirm the selection.
      3. From the menu bar, select Power > Power Cycle System (cold boot).
      4. When prompted, click Yes to confirm the action.
        The system is power cycled and rebooted to the Lifecycle Controller. If Settings or any other page displays, click Back or Cancel to go to the Hardware Diagnostics page.
      5. From the Lifecycle Controller, select Hardware Diagnostics in the navigation panel and click Run Hardware Diagnostics.
      6. When prompted, click Yes to confirm the action.
        The hardware diagnostics process starts and lasts for an hour or longer.
      7. When the hardware diagnostics process completes, repeat the Power > Power Cycle System (cold boot) command.
        The system should now boot without any hang time.
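
    As a supplement to step 1, the following is a minimal sketch for checking which storage controller driver a host is currently running, from the ESXi Shell or an SSH session to that host. The lsi-mr3 package name is only an example commonly associated with the PERC H730; match whatever driver and version the VMware Compatibility Guide lists for your controller.

      # List the storage adapters and the driver each one is using
      esxcli storage core adapter list

      # Show the installed driver package version (example: lsi-mr3) and
      # compare it against the VMware Compatibility Guide for Virtual SAN
      esxcli software vib list | grep -i lsi-mr3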

SDDC Manager Known Issues

  • VI workload domain deletion fails with hostkey error
    During the physical host restore task, the VI workload domain deletion workflow may fail with the following error message "Exception while taking SSH connection com.jcraft.jsch.JSchException: reject HostKey: rack-1-backuplcm-1.vxrsddc.prci.com".

    Workaround: Refresh the SSH keys.

    • Log in to the SDDC Manager VM as root.
    • Run the ssh-keyscan command.

      ssh-keyscan -t ECDSA -4 rack-1-backuplcm-1.vxrsddc.prci.com >> ~vrack/known_hosts

      This ensures the SSH connection succeeds. A quick check to confirm that the key was added is sketched after these steps.

    • Retry the VI workload domain deletion.
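
    Before retrying the deletion, you can optionally confirm that the host key was recorded, for example with this minimal check using standard shell tools on the SDDC Manager VM:

      # Confirm that an entry for the backup host now exists in the vrack user's known_hosts file
      grep rack-1-backuplcm-1.vxrsddc.prci.com ~vrack/known_hosts
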
  • After the SDDC Manager client times out in your browser session and displays the login screen, when you try to log in after a few hours, an error message about the required user name and password is displayed instead of the expected message about the expired SAML request
    Authentication to the SDDC Manager client uses SAML (Security Assertion Markup Language). When the SDDC Manager client is idle for a period of time, it automatically logs you out and displays the login screen. The URL in the browser holds the original SAML authentication request. After a longer period of time, on the order of hours, the SAML authentication request expires by design. As a result, if you return to the screen without refreshing the browser session to get a new SAML authentication request, the request fails. However, instead of an error message informing you of the expired SAML request, an error message stating "User name and password are required" is displayed.

    Workaround: If you encounter this issue, open a new browser session to the virtual IP address of your SDDC Manager, such as https://vrm.subdomain.root-domain:8443/vrm-ui, as described in the Administering VMware Cloud Foundation Guide.

  • The Rack Details screen might display a warning with code VRM040901E and message "Error loading rack details" when backend services are restarted in the VRM virtual machine
    Some administrative operations, such as rotating passwords, require you to stop the vrm-watchdogserver and vrm-tcserver services in a rack’s VRM virtual machine and then restart those services after you have completed the administrative operation. During the restart process, the system gathers hardware inventory information, which can take up to 15 minutes. If you navigate to the Rack Details screen within 15 minutes of restarting the services, the warning message might appear because the system is still gathering the up-to-date hardware information.

    Workaround: After restarting those services in the VRM VM, wait at least 15 minutes before navigating to the Rack Details screen.

  • In the Workflows screens, for a workload domain deletion workflow, the "Destroy vms on hosts" subtask does not display the details of the VMs and ESXi hosts affected by that subtask
    For a workflow, you can navigate to see the details of the workflow's subtasks from the System Status > Workflows screen in the SDDC Manager client, and when you expand a subtask, the screen usually displays details about the system artifacts that are being operated on by that subtask. For the workflow in which a workload domain is being deleted, the user interface does not display any details about the VMs or ESXi hosts on which the "Destroy vms on hosts" subtask is operating, such as the number of VMs that are deleted from which ESXi host.

    Workaround: None. To determine which VMs and ESXi hosts are involved in the subtask, examine the /home/vrack/vrm/logs/vrack-vrm-debug.log file, which has details about the VMs and ESXi hosts (for example, as sketched below).
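
    A minimal way to pull the relevant entries out of that log from the VRM virtual machine (the search string is the subtask name shown in the Workflows screen):

      # Show log entries for the "Destroy vms on hosts" subtask, with a few lines of context
      grep -i -A 5 "Destroy vms on hosts" /home/vrack/vrm/logs/vrack-vrm-debug.log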

  • An expansion workflow that involves adding more than one ESXi host to a management or workload domain is marked successful, even though when the hosts were added to the domain's vCenter Server cluster, the NSX Manager Host Preparation process failed to complete on one or more hosts
    During an expansion workflow, the hosts are added to the vCenter Server cluster that underlies the management or workload domain. When hosts are added to a vCenter Server cluster that has NSX enabled on the cluster, one of the tasks involves preparing the newly added hosts, as described in the Prepare Hosts on the Primary NSX Manager topic in the NSX 6.2 documentation. Part of this host preparation process involves a scan of each added ESXi host prior to installing the required NSX software on that host. If the scan on a particular host fails for some transient reason, the NSX Manager host preparation process fails for that host. However, this failure condition is not reported to the expansion workflow and the workflow appears as successful in the SDDC Manager client.

    Workaround: When performing an expansion workflow that involves multiple hosts and when the SDDC Manager client indicates the workflow has completed, perform the following steps to verify the NSX host preparation was successful for each added host, and if not, resolve the issues reported by NSX.

    1. Using the vSphere Web Client, log in to the vCenter Server instance for the management or workload domain that was expanded.
    2. In the vSphere Web Client, examine the NSX Manager host preparation state by navigating to Networking & Security > Installation and clicking the Host Preparation tab.
    3. On the Host Preparation tab, expand the cluster if it is not already expanded, and examine the data reported for each host in the Installation Status column and VXLAN column:
      • If the Installation Status column reports green checkmarks and "Configured" in the VXLAN column for all hosts, the added hosts were successfully prepared.
      • If the Installation Status column displays "Not Ready" and the corresponding VXLAN column displays "Error" for a host, resolve the error by right-clicking on the VXLAN column's "Error" and clicking Resolve. This action also applies the VXLAN distributed switch port group to that host.
  • Because no unique identifier is used to identify a Cloud Foundation installation, when you deploy more than one installation in your networking environment, you cannot use the same level of Active Directory (AD) integration for both systems
    For the first Cloud Foundation installation deployed in your environment, you would configure AD authentication by adding your AD as an identity source to the Platform Services Controller instances using the Active Directory (Integrated Windows Authentication) option and joining the vCenter Single Sign-On server to the AD domain. Due to this issue, for additional installations, you cannot do that same configuration.

    Workaround: For additional installations, use the Active Directory as an LDAP Server option to add your AD as an identity source to the Platform Services Controller instances in those installations.

  • A workload domain’s workflow can fail if a VM in the management domain on which the workflow depends is in a non-operational state
    Workflows to deploy, delete, and expand workload domains can fail if some of the management domain’s virtual machines are in an invalid state, down, or temporarily inaccessible. SDDC Manager does not prevent you from initiating and submitting a workflow when one of the VMs is in an invalid state. These virtual machines include the PSC VMs, vCenter Server VMs, Infrastructure Services Manager VMs, vRealize Operations Manager VM, vRealize Log Insight VM, NSX Manager VM, and, in a multi-rack system, the VRM VM. If you submit the workflow and one of those virtual machines becomes temporarily inaccessible as the workflow is performed, the workflow will fail.

    Workaround: Before initiating a workflow, review the state of the management domain’s virtual machines to see that they are all in a valid (green) state. You can see the virtual machines by launching the vSphere Web Client from the domain details of the management domain.

  • When using the SDDC Manager client’s Uplink screen to update L3 connectivity settings, the Uplink screen does not indicate which of the ToR switches has the L3 uplink configured on it
    When an uplink is configured to L3 mode, only one of the two ToR switches has an uplink port. The SDDC Manager client does not indicate which ToR switch is connected to the upstream router.

    Workaround: When you use the Uplink screen to change uplink connectivity settings, perform the following steps.
    Note: Changing the settings triggers uplink reconfiguration on the switches. Because the reconfiguration process might take a few minutes to complete, connectivity to the corporate network might be lost during the process. To avoid losing connectivity with SDDC Manager, it is strongly recommended that you are connected to port 48 on the management switch when updating the settings using the Uplink screen.

    1. Connect to port 48 on the management switch and log in to the SDDC Manager client using that connection.
    2. On the Uplink screen, configure the L3 uplink and click SAVE EDITS.
    3. Re-configure your upstream router to use the new network settings that you specified in step 2.
    4. Wait at least 3 minutes.
    5. Try connecting the upstream router to the top ToR switch.
    6. Test the new uplink connectivity by disconnecting from port 48 on the management switch and connecting to the virtual rack with the new uplink configuration.
    7. If you are unable to reconnect to the virtual rack, try connecting the upstream router to the bottom ToR switch.
    8. If you are unable to connect to the virtual rack, reconnect using port 48 on the management switch and try reconfiguring your network to the original configuration.
    9. If you cannot connect to the virtual rack with either configuration, contact VMware Support.
  • Existing trunk ports on ToR Cisco Nexus 9K switches are assigned to new VLANs when a VI or VDI workload domain is created
    During imaging with the VIA, port-channels 29, 45, 100, 110, and 120 are created on the ToR Cisco Nexus 9K switches and are set to belong to all VLANs. As a result, when new VLANs are entered during creation of a VI or VDI workload domain, these port-channels become part of the new VLANs, and the external VLAN and other VLANs created specifically for the new workload domain are assigned to all existing trunk ports on the ToR Cisco Nexus 9K switches, including the uplink and management cluster bonds.

    Workaround: None

  • When a server commissioning or decommissioning operation is in progress, the Rack Details screen and subscreens display an error message "Error loading rack details VRM 040901E"
    In the SDDC Manager client, you navigate to the Rack Details screen from the Physical Resources area in the Dashboard or from the links in a management domain or workload domain’s Domain Details screen. Due to this issue, the Rack Details screen displays that error message because the backend data that populates that screen is changing as a result of the server commissioning or decommissioning activities.

    Workaround: Wait until the server commissioning or decommissioning operation is completed before attempting to view the Rack Details screen.

  • Log files and some locations in the SDDC Manager client might contain occurrences of "IaaS"
    The phrase "IaaS" might appear in the system's log files and in some places in the SDDC Manager client instead of "Virtual Infrastructure" or "VI".

    Workaround: None. If you see occurrences of "IaaS" in the log files or the user interface, they are referring to the system's Virtual Infrastructure (VI) features.

Virtual Infrastructure Workload Domain Known Issues

  • In a Cloud Foundation environment configured with L3 uplinks, when you try to create a workload domain with a data center (external) connection using the same subnet but a different VLAN as a workload domain that was previously created and deleted, the workload domain creation fails
    When a workload domain is deleted and your environment’s ToR switch uplinks are configured with L3, the Switched Virtual Interfaces (SVIs) that were originally created on the ToR switches for that workload domain are not deleted. Due to this issue, if you subsequently try to create a workload domain using a different VLAN ID but same subnet as the deleted one, the workload domain creation fails because the switches do not allow two VLAN IDs with the same subnet.

    Workaround: When creating a VI or VDI workload domain, in the data center connection’s configuration, do not combine a different VLAN ID with a subnet that was previously used for a deleted workload domain. You can reuse the same VLAN with the same subnet or reuse the same VLAN with a different subnet.

  • The VI workload domain creation and expansion workflows might fail at task "ConfigureVCenterForLogInsightTask" due to a failure to connect to the deployed vRealize Log Insight instance
    During the VI workload domain creation and expansion workflows, if the system cannot connect to the deployed vRealize Log Insight instance, the workflow fails at the "ConfigureVCenterForLogInsightTask" task and you see an exception in the log with a 500 HTTP error code:
    [com.vmware.vrack.vrm.workflow.tasks.loginsight.ConfigureVCenterForLogInsightTask] Exception while doing the integration: Create session to LogInsight Failed : HTTP error code : 500

    Workaround: Use the management domain's vCenter Server launch link to open the vSphere Web Client, and then use the vSphere Web Client to restart the vRealize Log Insight virtual machine. Then restart the failed workflow.

  • The VI workload domain creation workflow might fail at task "VC: Deploy vCenter" due to a failure to connect to the system's Platform Services Controller instances
    During the VI workload domain creation workflow, if the system cannot connect to the integrated Platform Services Controller instances, the workflow fails at the "VC: Deploy vCenter" task and you see errors in the log such as:
    Unexpected error while verifying Single Sign-On credentials: [Errno 111]
    Connection refused
    Cannot get a security token with the specified vCenter Single Sign-On configuration.

    Workaround: Restart the system's PSC-2 virtual appliance, then the PSC-1 virtual appliance, then the vCenter Server virtual appliance. Wait until each virtual appliance is up and running before restarting the next one. Then restart the failed workflow.

  • On the Review page of the VI workload domain creation wizard, the Download and Print buttons are not operational
    Due to this issue, when you reach the Review step of the VI workload domain creation wizard, you cannot use the Download or Print buttons to create a printable file of the displayed information for future reference.

    Workaround: None. At the Review step of the wizard, you must manually capture the information for future reference, for example by taking screen captures of the displayed information.

VDI Workload Domain Known Issues

  • After a workload domain has been deleted, it still appears in the interface for an extended time
    This issue is self-correcting; the workload domain disappears only after the deletion process completes.

    Workaround: None.

  • Unable to restart failed VDI workflow
    Sometimes if a VDI workflow fails, the user may be unable to restart it. The system returns the error message "Plan is expired. Please start a new request" and shows that the failure occurred during the creation of a VI or VDI workload domain. This issue has been fast-tracked for resolution.

    Workaround: None. However, this issue may in part be due to the VI reservation being incorrectly marked as consumed and therefore no longer available.

  • The VDI workload domain creation workflow might fail at task "Instantiate Horizon View Adapter"
    Due to intermittent timing issues, the VDI workload domain creation workflow sometimes fails at the Instantiate Horizon View Adapter task with the following exception in the log:

    com.vmware.vrack.vdi.deployment.tools.tasks.VDIWorkflowException: "Unable to create vROps REST client"

    As a result, the pairing credential between the vRealize Operations Manager instance and the VDI environment is in a partially instantiated state and must be deleted before restarting the workflow.

    Workaround: Manually delete the pairing credential that is associated with the workload domain's Horizon Connection server, and then restart the failed workflow using the Restart Workflow action in the workflow's status screen. Use these steps:

    1. Verify that you have the IP address for the first Horizon Connection server that was deployed for this VDI workload domain, such as 10.11.39.51. You will use that IP address to identify which pairing credential to delete.
    2. Log in to the vRealize Operations Manager Web interface. You can use the launch link in the management domain's details screen to open the login screen.
    3. From the vRealize Operations Manager Web interface's Home screen, navigate to the Credentials screen by clicking Administration > Credentials.
    4. Locate the pairing credential having a name in the form of vdi-view-adapter-IPaddress, where the IP address matches the one you obtained in step 1. For example, if the Horizon Connection server has IP address 10.11.39.51, the displayed pairing credential name is vdi-view-adapter-10.11.39.51.
    5. Select that pairing credential and delete it.
    6. In the workflow's status screen, restart the failed workflow using the Restart Workflow action.
  • In a Cloud Foundation environment configured with L3 uplinks, when you try to create a workload domain with a data center (external) connection using the same subnet but a different VLAN as a workload domain that was previously created and deleted, the workload domain creation fails
    When a workload domain is deleted and your environment’s ToR switch uplinks are configured with L3, the Switched Virtual Interfaces (SVIs) that were originally created on the ToR switches for that workload domain are not deleted. Due to this issue, if you subsequently try to create a workload domain using a different VLAN ID but same subnet as the deleted one, the workload domain creation fails because the switches do not allow two VLAN IDs with the same subnet.

    Workaround: When creating a VI or VDI workload domain, in the data center connection’s configuration, do not combine a different VLAN ID with a subnet that was previously used for a deleted workload domain. You can reuse the same VLAN with the same subnet or reuse the same VLAN with a different subnet.

  • The VDI workload domain deletion workflow fails at the task "VDIRemovalWaitForDesktopPoolDeletionTask"
    Due to an underlying issue in Horizon version 6.2, if a virtual desktop is undergoing customization when deletion of a desktop pool is initiated, deletion of the desktop pool halts. As a result, the VDI workload domain deletion workflow fails at the task "VDIRemovalWaitForDesktopPoolDeletionTask".

    Workaround: Before deleting a VDI workload domain, log in to the View Administrator and ensure that no virtual desktops are undergoing customization. If you started a VDI workload domain deletion workflow and encounter this issue, manually remove the desktop pool in View Administrator and then restart the deletion workflow in the System Status > Workflow Details screen in the SDDC Manager client.

  • When creating a VDI workload domain with specified settings that results in the system deploying two vCenter Server instances, the creation workflow might fail at the "ESXI: Incremental LI Integration" task
    Depending on the value for the Max Desktops [per vCenter Server] setting in the VDI Infrastructure screen and your choice for the number of desktops in the VDI workload domain creation wizard, the system might need to deploy more than one vCenter Server instance to support the desired VDI workload domain. As part of this deployment, the system starts two VI workload domain creation workflows. One of the VI workload domain creation workflows might fail at the task "ESXI: Incremental LI Integration" with an error message about failure to connect ESXi hosts to vRealize Log Insight:

    hosts to LogInsight failed : HTTP error code : 404 : Response :
    

    Workaround: Use the Physical Resources screens to verify that the ESXi hosts that the failed workflow is trying to use are all up and running. Use the vSphere Web Client to verify that the vRealize Log Insight VMs in the system are all up and running. Ensure that the ESXi hosts involved in the failed workflow and the vRealize Log Insight VMs are in a healthy state, and then click Retry in the failed workflow.

Life Cycle Management (LCM) Known Issues

  • SSO customization page is lost after PSC upgrade
    A PSC upgrade removes the customized SSO login page; after the upgrade, accessing the SDDC Manager shows the default login page.

    Workaround: None.

  • LCM update status erroneously displays as FAILED after auto-recovery
    If the ESXi update fails during the ESX HOST UPGRADE STAGE REBOOT stage due to connectivity issues, and the update is automatically recovered after the connectivity issue is resolved, the update might still display as FAILED in the LCM update history even though it was successful. (The respective domains are upgraded to the target version.)

    Workaround: None. Ignore the LCM update status.
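
    To confirm that the hosts actually reached the target version despite the FAILED status, a minimal check is to query each ESXi host's version and build, for example from an SSH session to the host:

      # Report the ESXi product, version, and build number on this host
      esxcli system version get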

  • ESXi and vCenter update on a host might fail in the task of exiting maintenance mode
    Sometimes during an ESXi and vCenter update process, a host might fail to exit maintenance mode, which results in a failed update status. During an update, the system puts a host into maintenance mode to perform the update on that host, and then tells the host to exit maintenance mode after its update is completed. At that point in time, a separate issue on the host might prevent the host from exiting maintenance mode.

    Workaround: Attempt to exit the host from maintenance mode through the vSphere Web Client (or from the host's command line, as sketched after these steps).

    • Locate the host in vSphere and right-click it.
    • Select Maintenance Mode > Exit Maintenance Mode.

      This action will list any issues preventing the host from exiting maintenance mode.

    • Address the issues until you can successfully bring the host out of maintenance mode.
    • Return to the SDDC Manager client and retry the update.
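
    As an alternative to the vSphere Web Client, the following is a minimal sketch for checking and exiting maintenance mode directly from the host's ESXi Shell or an SSH session; note that this path does not list the blocking issues the way the Web Client does:

      # Check whether the host is currently in maintenance mode
      esxcli system maintenanceMode get

      # Take the host out of maintenance mode
      esxcli system maintenanceMode set --enable false
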
  • NSX upgrade fails with RuntimeException
    During an NSX upgrade, the update process fails after the NSX Manager upgrade, skipping the update of Controllers and Edge. The upgrade log shows the following message:
    Upgrade element resourceType: NSX_CONTROLLER resourceId: 3b4e23c4-7177-4444-9901-9fe7a02a30ae:controller-cluster status changed to SKIPPED

    Workaround: Go to the LCM Inventory page (SDDC Manager > Lifecycle Management > Inventory) and check if the domain state has a status of Failed. If so, click Resolve and re-apply the same update.

  • LCM Inventory page shows a failed domain, but no failed components
    The LCM Inventory page shows a failed domain, but does not show any failed components.

    Workaround: Log in to vCenter for the domain and check that all hosts in the domain have the lcm-bundle-repo available. Add the lcm-bundle-repo if necessary.
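
    A minimal way to check for the repository from each host's ESXi Shell or an SSH session, assuming the lcm-bundle-repo is presented to the hosts as a mounted datastore (the volume name comes from the workaround above):

      # List mounted datastores and look for the lcm-bundle-repo volume
      esxcli storage filesystem list | grep -i lcm-bundle-repo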

  • Lifecycle Management page shows all available update bundles independent of the Cloud Foundation release in your environment
    The Lifecycle Management Repository page displays all available updates, regardless of your specific release.

    Workaround: None. Proceed to download the bundles as indicated. Lifecycle Management evaluates and determines the necessary update bundles after they have been downloaded and will apply only the bundles appropriate for your product versions.

  • After using the LCM capability to update the vCenter Server software, one or more hosts in an NSX host prep cluster might have 'Not Ready' status, which results in the NSX audit failing and prevents future updates from being scheduled
    Due to an underlying issue with ESX Agent Manager (EAM), after updating the vCenter Server software, the installation status of one or more hosts in the NSX host prep cluster associated with the updated vCenter Server cluster might be in 'Not Ready' status. You can examine the NSX host prep status of the hosts by using the vSphere Web Client to log into the vCenter Server and navigating to the Networking & Security > Installation > Host Preparation tab, and seeing the status for each host in the Installation Status column on that tab.

    Workaround: On the Host Preparation tab in the vSphere Web Client, use the Resolve action in the Installation Status column's menu to manually resolve the cluster. The displayed status will change as the operation proceeds on each host in the cluster. If any host continues to show 'Not Ready' status, use the Resolve action again. You might need to perform this operation a few times.

  • On the Lifecycle Management Update screen, when you expand the section for a failed VMware Software upgrade to see the status of the underlying tasks, the task at which the process failed has a green check mark icon
    The VMware Software upgrade process involves performing a number of tasks. When one task fails, the screen shows that the overall VMware Software upgrade process failed. Due to this issue, when you expand the section in the user interface to view the list of tasks, the task at which the process failed has a green check mark icon next to it instead of the red failure icon.

    Workaround: None. You can tell which task caused the failure of the overall process because all of the tasks in the list after the failed task have gray icons, indicating the process failed before reaching those subsequent tasks.

  • LCM update logs saved in two folders
    LCM update logs are being saved in two similarly named folders:

    • /home/vrack/lcm/upgrades
    • /home/vrack/lcm/upgrade

    Workaround: Review logs in both folders.
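
    For example, a minimal sketch for reviewing both locations at once from the SDDC Manager VM, using standard shell tools:

      # Show the most recently modified files in both upgrade log folders
      ls -lt /home/vrack/lcm/upgrades /home/vrack/lcm/upgrade

      # Search both folders for errors in one pass
      grep -ri "error" /home/vrack/lcm/upgrades /home/vrack/lcm/upgrade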

  • NSX hostprep resolve required after a vCenter or ESXi update
    After a vCenter or ESXi update, the Installation Status of one or more hosts in an NSX host prep cluster that is associated with the vCenter's domain may show up as "Not Ready."

    Workaround: You can manually restore the cluster to "Ready" status.

    • Navigate to the Host Preparation tab (Networking & Security > Installation).
    • Click Resolve.

    You may need to repeat this procedure several times.

  • ESXi upgrade fails with message Cluster not in good condition
    Virtual SAN clustering was not enabled on the host, resulting in the upgrade failure.

    Workaround:

    1. Open the vCenter Web Client.
    2. Navigate to Hosts and Clusters > ClusterName > Monitor > Issues.
    3. Fix the Virtual SAN issue.
    4. Reschedule the update.

Monitoring Known Issues

  • In the Events and Audit Events screens in the SDDC Manager client, as you scroll down to the end of the event lists, you see the last entry repeated at the bottom
    Due to this issue, when you scroll to the bottom of the event lists, the list repeatedly displays the last entry instead of displaying a message to use the Analysis button to view older events.

    Workaround: None. Ignore these extraneous entries.

  • The icon displayed for the VMware Cloud Foundation content pack in vRealize Log Insight references the prior product name
    Prior to this release, the product name was EVO SDDC. In the vRealize Log Insight web interface, when you navigate to the Installed Content Packs list and select the one labeled VMware - VCF to see its information screen, the displayed icon references that prior name. The small icon next to the VMware - VCF label in the Dashboards view also references the prior product name.

    Workaround: None

  • When a Cisco ToR switch is powered down or up, the TOR_SWITCH_DOWN event is not listed in the System Status - Events screen in the SDDC Manager client, even though the event is visible in the vRealize Log Insight instance
    For systems that have Cisco ToR switches, when the switch is powered down, the TOR_SWITCH_DOWN event is not being written to the software location that populates the System Status - Events screen, even though the event is sent as expected to the vRealize Log Insight instance that SDDC Manager deploys in the system. As a result, even though you can see the reported events using the vRealize Log Insight Web interface, the TOR_SWITCH_DOWN events from Cisco ToR switches will not appear in the SDDC Manager client's Events screen.

    Workaround: Use the vRealize Log Insight Web interface as the reporting source for any TOR_SWITCH_DOWN events.

  • When a Quanta server is powered down (off), the CPU_CAT_ERROR (CPU Catastrophic Error) event is generated
    When a Quanta server is powered down, the SERVER_DOWN event is generated, which is the expected behavior. The issue is the CPU_CAT_ERROR event is also generated when a Quanta server is powered down.

    Workaround: None

Hardware and FRU Replacement Known Issues

  • Server decommissioning workflow might fail at task "Enter hosts maintenance mode" when a server being decommissioned has many VMs running on it
    This issue might occur when the servers are part of a workload domain in which many VMs are running on the servers, such as a VDI workload domain with many virtual desktops. When a server is decommissioned, the server’s ESXi host is put into maintenance mode so that it can be removed from the workload domain’s vCenter Server cluster. Prior to an ESXi host entering maintenance mode, the VMs running on the host are migrated to another ESXi host in the workload domain using vMotion. The system waits until all ESXi hosts participating in the decommissioning workflow have entered maintenance mode before beginning the next workflow task. If there are many VMs running on one of the hosts, the migration process can take longer than for the others. After a number of attempts in which the system finds that the host has not yet entered maintenance mode, the system flags the workflow task as failed.

    Workaround: Restart the failed workflow using the Restart Workflow action in the workflow’s status screen.

  • ToR-Spine Link (40G) not coming up after imaging
    After a rack is imaged, the ToR and spine ports show as not connected. This issue is caused by the port-channel speed being set to Auto while the speed on the network transceiver does not automatically default to 40 Gbps. As a result, the ports on the ToR and spine switches operate at different speeds, causing this connectivity issue.

    Workaround: Perform the following procedure on both ToR switches.

    1. Log in to the ToR switch as an administrative user and run the following commands.
      conf t
      int po 120
      speed 40000
      end
    2. Check the port status of ports 49 and 50.
      sh int br <press spacebar to scroll down>
    3. If the port status is administratively down, run the following commands to bring it administratively up.
      conf t
      int e1/49-50
      no shut
      end
    4. Run the sh int br command to verify that ports 49 and 50 are up.

Issues Resolved in Version 2.1.3

  • Bundle versions in interface do not match product version
    The Bundle details page within SDDC Manager shows bundle version numbers 2.1.2, 2.1.3, 2.1.4. These values reflect internal versioning and not the Cloud Foundation product release version.

    This issue is resolved in this release.

  • Incorrect BMC alerts on the dashboard
    Out-of-band IPs might incorrectly trigger BMC alerts on the dashboard.

    This issue is resolved in this release.

  • VDI Creation failed at task Horizon View Event Configuration
    This issue occurs when a VI workload domain is deleted but an IP address previously assigned to an NSX Controller in that domain remains in the unbound.conf file. During a subsequent VDI workload domain creation, SDDC Manager can assign that same IP address, which was freed during the VI deletion, to the first Connection server. Because the stale DNS record for the IP address remains in the unbound.conf file, the IP address still resolves to the NSX Controller's DNS name rather than to the Connection server's DNS name, so hostname verification during the SSL handshake fails. The root cause is this stale DNS-to-IP mapping, which points to a machine that no longer owns the IP address.

    This issue is resolved in this release.

  • The system's bring-up process creates a distributed port group vRack-DPortGroup-VXLAN in the management domain that does not get used in the steady-state system
    When you look at the distributed port groups that are created for the management domain, you see a distributed port group named vRack-DPortGroup-VXLAN, and the management domain's ESXi hosts have access to it. However, that distributed port group is not used by the system. The VXLAN distributed port group that the system uses is the one that is automatically created by the NSX Manager during its VXLAN preparation process; it is named vxw-vmknicPg-dvs-nnnn, where nnnn is a system-generated identifier.

    This issue is resolved in this release.