VMware Cloud Foundation 2.2 Release Notes


VMware Cloud Foundation 2.2 | 24 AUGUST 2017 | Build 6383997

Check for additions and updates to these release notes.

What's in the Release Notes

The release notes cover the following topics:

  • What's New
  • VMware Software Versions and Build Numbers
  • VMware Cloud Foundation Component License Information
  • Supported Hardware
  • Network Switch Operating System Versions
  • Documentation
  • Browser Compatibility and Screen Resolutions for the Cloud Foundation Web-Based User Interfaces
  • Installation and Upgrade Information
  • Security Update
  • Resolved Issues
  • Known Issues

What's New

The VMware Cloud Foundation 2.2 release includes the following:

  • Improved system architecture using a single management domain model that provides full deployment and configuration during bring-up, and reduced demand on resources.
  • vSphere 6.5 Update 1, vSAN 6.6.1, NSX 6.3.3, and Horizon 7.2.
  • Supported hardware extended to include servers from Fujitsu, Lenovo, and HDS. See the VMware Compatibility Guide for details.
  • Optimized SDDC Manager now consolidated to two VMs (controller and utility), providing a smaller footprint while improving scalability and performance, and incorporating hardware management for improved network accessibility.
  • Improved interface for bring-up operations.
  • Improved password management with password rotation, which randomizes passwords. Passwords for built-in accounts are automatically set during bring-up and can be updated or rotated on demand. Passwords can be rotated directly in SDDC Manager.
  • Support for third-party vSphere Installation Bundles (VIBs) during imaging in VIA.
  • Improved vRealize Log Insight support. Now deployed during bring-up as a three-host cluster.
  • Automated implementation of signed certificates for vCenter, PSC, NSX, SDDC Manager, and vRealize Log Insight.
  • Ability to replace the certificates of externally accessible Cloud Foundation components with signed certificates.
  • Expanded SoS Tool Options to include health status checks for various components or services, including connectivity, compute, storage, database, domains, and networks, among others.
  • Improved and expanded functionality for image-level and file-level backup and restore for management domain VMs.
  • Improved and expanded VDI configuration options and display. You can now save VDI workload domains as templates, configure relay agents, and review physical resources for workload domains.
  • Out-of-band (OOB) network is now a Layer 2 network scaled across multiple racks.
  • Expanded alert and event reporting.

NOTE: vRealize Operations is not included in Cloud Foundation for the 2.2 release. However, you can deploy and configure a standalone installation of vRealize Operations to work with Cloud Foundation 2.2. See the Knowledge Base article How to configure a standalone instance of vRealize Operations Manager in a VMware Cloud Foundation environment.

Cloud Foundation is a software stack that deploys the VMware SDDC software stack. Earlier releases are 2.1, 2.1.1, 2.1.2, 2.1.3, and 2.1.3a. For information about what is new in those releases, as well as their known issues and resolved issues, see the release notes for those software versions. You can locate their release notes from their documentation landing pages at docs.vmware.com.

VMware Software Versions and Build Numbers

You can install Cloud Foundation 2.2 directly or upgrade from a previous version. See the Installation and Upgrade Information section.

The Cloud Foundation 2.2 software product is installed and deployed by completing two phases: Imaging phase (phase one) and Bring-Up with Automated Deployment phase (phase two). The following sections list the VMware software versions and builds that are involved in each phase.

Phase One: Imaging with VIA

In this phase, hardware is imaged using the following VMware software build:

Software Component Version Date Build Number
VIA (the imaging appliance) 2.2 21 AUG 2017 6376914

Phase Two: Bring-Up With Automated Deployment

In this phase, the Cloud Foundation software product enables automated deployment of the following software Bill-of-Materials (BOM). The software components in this BOM are interoperable and compatible with each other.

The Cloud Foundation 2.2 software BOM.

Software Component Version Date Build Number
VMware Cloud Foundation Bundle 2.2 24 AUG 2017 6383997
VMware SDDC Manager 2.2 24 AUG 2017 6382873
VMware Platform Services Controller 6.5 Update 1 27 JUL 2017 5973321
VMware vCenter Server on vCenter Server Appliance 6.5 Update 1 27 JUL 2017 5973321
VMware vSphere (ESXi) 6.5 Update 1 27 JUL 2017 5969303
VMware vSAN 6.6.1 27 JUL 2017 5969303
VMware NSX for vSphere 6.3.3 08 AUG 2017 6276725
VMware vRealize Log Insight 4.3 03 JUN 2017 5084751
VMware vRealize Log Insight Agent 4.3 03 MAR 2017 5052904
VMware NSX content pack for vRealize Log Insight 3.6 08 AUG 2017 n/a
VMware vSAN content pack for vRealize Log Insight 2.0 18 APR 2016 n/a
VMware Tools 10.1.10 27 JUL 2017 6082533
VMware Horizon 7 7.2 26 JUN 2017 5748532
VMware Horizon View content pack for vRealize Log Insight 3.0 n/a n/a
VMware App Volumes 2.12 08 DEC 2016  

VMware Cloud Foundation Component License Information

The VIA and SDDC Manager software is licensed under the Cloud Foundation license. As part of this product, the SDDC Manager software deploys specific VMware software products.

The following VMware software deployed by SDDC Manager is licensed under the Cloud Foundation license:

  • VMware vSphere
  • VMware vSAN
  • VMware NSX

The following VMware software deployed by SDDC Manager is licensed separately:

  • VMware vCenter Server
  • VMware vRealize Log Insight
  • Content packs for Log Insight
  • VMware Horizon
  • VMware App Volumes

NOTE Only one vCenter Server license is required for all vCenter Servers deployed in a Cloud Foundation system.

NOTE The use of vRealize Log Insight for the management workload domains in a Cloud Foundation system is permitted without purchasing Log Insight licenses.

For details about the specific VMware software editions that are licensed under the licenses you have purchased, see the VMware Software Versions and Build Numbers section above.

For more general information, see the Cloud Foundation product page.

Supported Hardware

For details on the hardware requirements for a Cloud Foundation environment, including manufacturers and model numbers, see the VMware Cloud Foundation Compatibility Guide.

Network Switch Operating System Versions

For details on network operating systems for the networking switches in Cloud Foundation, see the VMware Cloud Foundation Compatibility Guide.

Documentation

To access the Cloud Foundation 2.2 documentation, go to the VMware Cloud Foundation documentation landing page.

To access the documentation for the VMware software products that SDDC Manager can deploy, see their documentation landing pages at docs.vmware.com and use the drop-down menus on each page to choose the appropriate version.

Browser Compatibility and Screen Resolutions for the Cloud Foundation Web-Based User Interfaces

The following Web browsers can be used to view the Cloud Foundation Web-based user interfaces:

  • Mozilla Firefox: Version 55.x or 54.x
  • Google Chrome: Version 60.x or 59.x
  • Internet Explorer: Version 11.x or 10.x on Windows systems, with all security updates installed
  • Safari: Version 9.x or 10.x on Mac systems only (basic support)

For the Web-based user interfaces, the supported standard resolution is 1024 by 768 pixels. For best results, use a screen resolution within these tested resolutions:

  • 1024 by 768 pixels (standard)
  • 1366 by 768 pixels
  • 1280 by 1024 pixels
  • 1680 by 1050 pixels

Resolutions below 1024 by 768, such as 640 by 960 or 480 by 800, are not supported.

Installation and Upgrade Information

You install Cloud Foundation 2.2 directly as a new release. For a fresh installation of this release:

  1. Read the VIA User's Guide for guidance on setting up your environment, deploying VIA, and imaging an entire rack.
  2. Read the Cloud Foundation Overview and Bring-Up Guide for guidance on deploying the VMware Cloud Foundation software Bill-of-Materials (BOM) stack.

For information on upgrading to Cloud Foundation 2.2 from an earlier version, contact VMware Support.

Security Update

On 27 July 2017, VMware issued advisory VMSA-2017-0012 regarding a potential vulnerability in which VMware VIX API allows for direct access to Guest Operating Systems (Guest OSs) by vSphere users with limited privileges.

The workaround is documented in Knowledge Base Article 2151027.

Resolved Issues

  • Vendor drop-down in Imaging tab does not include HDS
    On the Imaging tab, the Vendor drop-down list does not display HDS, preventing you from selecting it as an option.

    This issue is resolved in this release.

  • When retrying a stopped imaging run in VIA, failed tasks may incorrectly display as completed. The imaging process continues with subsequent tasks but later shows the earlier tasks as failed

    This issue can occur when you use the Stop button to stop an in-progress imaging run and then click Retry within a short period to start imaging again. Sometimes clicking Retry shortly after stopping imaging results in two running threads for the same imaging job. Even though the second thread successfully completes, resulting in the Completed status, the first thread is still running and fails because it was superseded by the second thread, resulting in the Failed status displaying on the screen.

    This issue is resolved in this release.

  • System bring-up process might fail at task Network: Assign Public Management IP address to Switches

    If intermittent connectivity to the switches occurs during the bring-up process, the process might fail at the Network: Assign Public Management IP address to Switches task. The vrm log file will have error messages stating the updates of the default route configuration on the switches did not receive responses from the HMS Aggregator, and the HMS log will have lines showing response code 500 for the PUT API request making the route update on the switches.

    This issue is resolved in this release.

  • If the first rack’s ESXi host N1 is down when you start the bring-up process on an added rack, the starting Time Sync user interface screen for the new rack’s bring-up process appears blank and no progress is visible

    At the start of the bring-up process on the new rack, the SDDC Manager VM on the new rack tries to connect to the N1 host in the first rack to obtain that host's time. To ensure that time is synchronized, the system reads the time of a host in the first rack to set the time sync process on the new rack. If the first rack's N1 host is down, the system continues to try to connect to that host for more than 10 minutes before attempting to obtain the time from another host in the first rack. During this retry time, the Time Sync screen appears blank, the bring-up process is delayed, and you cannot tell whether progress is being made.

    This issue is resolved in this release.

  • When you log in to the vSphere Web Client using the superuser SSO account defined during bring-up, you do not see all of the NSX Manager features in the Networking & Security area

    During the bring-up process, in the Create a Superuser Account screen, you entered a user name and password. Using that input, the system created a superuser account as your main login account for your system. Due to this issue, the system does not automatically assign the NSX Manager Enterprise Administrator role to that superuser account. The NSX Manager Enterprise Administrator role grants permissions for NSX operations and security to a vCenter Server user.

    This issue is resolved in this release.

  • In the Workflows screens, for a workload domain deletion workflow, the "Destroy vms on hosts" subtask does not display the details of the VMs and ESXi hosts affected by that subtask
    For a workflow, you can navigate to see the details of the workflow's subtasks from the System Status > Workflows screen in the SDDC Manager, and when you expand a subtask, the screen usually displays details about the system artifacts that are being operated on by that subtask. For the workflow in which a workload domain is being deleted, the user interface does not display any details about the VMs or ESXi hosts on which the "Destroy vms on hosts" subtask is operating, such as the number of VMs that are deleted from which ESXi host.

    This issue is resolved in this release.

  • Log files and some locations in the SDDC Manager might contain occurrences of "IaaS"
    The phrase "IaaS" might appear in the system's log files and in some places in the SDDC Manager instead of "Virtual Infrastructure" or "VI".

    This issue is resolved in this release.

  • After a workload has been deleted, it still appears in the interface for an extended time

    This issue is self-correcting. The workload disappears only after the deletion process is completed.

    This issue is resolved in this release.

  • SSO customization page is lost after PSC upgrade

    The PSC upgrade removes the customized SSO login page; after the upgrade, accessing SDDC Manager displays the default login page.

    This issue is resolved in this release.

  • NSX upgrade fails with RuntimeException

    During an NSX upgrade, the update process fails after the NSX Manager upgrade, skipping the update of Controllers and Edge. The upgrade log shows the following message:
    Upgrade element resourceType: NSX_CONTROLLER resourceId: 3b4e23c4-7177-4444-9901-9fe7a02a30ae:controller-cluster status changed to SKIPPED

    This issue is resolved in this release.

  • On the Lifecycle Management Update screen, when you expand the section for a failed VMware Software upgrade to see the status of the underlying tasks, the task at which the process failed has a green check mark icon

    The VMware Software upgrade process involves performing a number of tasks. When one task fails, the screen shows that the overall VMware Software upgrade process failed. Due to this issue, when you expand the section in the user interface to view the list of tasks, the task at which the process failed has a green check mark icon next to it instead of the red failure icon.

    This issue is resolved in this release.

  • LCM update logs saved in two folders
    LCM update logs are being saved in two similarly named folders:

    • /home/vrack/lcm/upgrades
    • /home/vrack/lcm/upgrade

    This issue is resolved in this release.

  • The icon displayed for the VMware Cloud Foundation content pack in vRealize Log Insight references the prior product name

    Prior to this release, the product name was EVO SDDC. In the vRealize Log Insight web interface, when you navigate to the Installed Content Packs list and select the one labeled VMware - VCF to see its information screen, the displayed icon references that prior name. The small icon next to the VMware - VCF label in the Dashboards view also references the prior product name.

    This issue is resolved in this release.

  • When a Cisco ToR switch is powered down or up, the TOR_SWITCH_DOWN event is not listed in the System Status - Events screen in the SDDC Manager, even though the event is visible in the vRealize Log Insight instance

    For systems that have Cisco ToR switches, when the switch is powered down, the TOR_SWITCH_DOWN event is not being written to the software location that populates the System Status - Events screen, even though the event is sent as expected to the vRealize Log Insight instance that SDDC Manager deploys in the system. As a result, even though you can see the reported events using the vRealize Log Insight Web interface, the TOR_SWITCH_DOWN events from Cisco ToR switches will not appear in the SDDC Manager's Events screen.

    This issue is resolved in this release.

  • ToR-Spine Link (40G) not coming up after imaging
    After a rack is imaged, the ToR and spine ports show as not connected. This issue occurs because the port-channel speed is set to Auto but the network transceiver speed does not automatically default to 40 Gbps. As a result, the ports on the ToR and spine switches operate at different speeds, causing this connectivity issue.

    This issue is resolved in this release.

Known Issues

The known issues are grouped as follows.

Imaging Known Issues
  • VDI configuration: FQDN setting requires specific formatting

    When configuring the network for a new VDI, if you select Active Directory, an error will result if you do not use the correct formatting for the FQDN setting.

    Workaround: Use the following format when specifying the FQDN:

    csload.horizon-[x].local
    apload.horizon-[x].local

  • Modified bundle section in VIA interface not displaying correctly

    Observed in the Firefox browser. The interface elements of the Bundle tab in VIA do not display correctly. This is caused by older JavaScript files cached in the browser.

    Workaround: Use Ctrl+F5 to force a refresh. The Bundle tab should display correctly.

Bring-Up Known Issues
  • If the time sync process fails on a host but POSV (Power On System Validation) passes with no issues, you are not prevented from continuing the system bring-up process even though tasks later on in the bring-up process might fail
    If the time sync process indicates failure on a host, you can continue in the UI to the POSV screen and run the POSV process to help identify issues on the host that might have caused the failure. However, due to this issue, if the POSV process subsequently passes, the Continue button is available and allows you to proceed in the bring-up process even though the time was not synchronized on that host. If you click Continue to proceed instead of Retry to rerun the time sync process, bring-up tasks that are performed later might fail, such as setting the NTP or deploying the PSC and vCenter Server appliances.

    Workaround: To avoid any unexpected issues, if the time sync process indicates a host has failed but the POSV process passes, click Retry after the POSV is done to ensure the time synchronization is rerun.

  • Alerts raised during POSV do not contain a rack name in their description
    Because the rack name is not specified by the user until after the IP allocation step in the system configuration wizard, a rack name is not available to display in alerts raised prior to that step. Alerts that are raised subsequent to that step do contain the user-specified rack name.

    Workaround: None.

  • The bring-up process on the first rack fails at the "NSX: Register vCenter" task with the error "NSX did not power on on time"

    The bring-up process fails because the NSX Controller virtual machines did not power on during the wait time set in the NSX: Register vCenter task.

    Workaround: On the Add Host Bring-Up Status page, click Retry to proceed with the bring-up process.

  • System bring-up process might fail at task ESX: Configure Power Management
    If intermittent connectivity to an ESXi host occurs during the bring-up process, the process might fail at the ESX: Configure Power Management task with the following exception
    com.vmware.vrack.vrm.core.error.EvoWorkflowException: Unable to access the ESXi host

    Workaround: In the bring-up user interface, click the Retry button to perform the task and proceed with the bring-up process.

  • POSV Failed as N0 was missing from prm_host tables

    Power On System Validation (POSV) failed because N0 was missing from the prm_host tables. This issue was observed in a deployment with a Quanta S210 server and a Quanta ToR switch, a combination which is not supported.

    Workaround: Use the following workaround to resolve this issue.

    1. Shut down the SDDC Controller VM.
    2. Obtain the IP address of the ESXi host (the first ESXi host) on which the SDDC Controller VM is deployed.
    3. Log in to the first ESXi host using the VI Client.
    4. Revert the snapshot of the SDDC Controller VM to the one created by VIA.
    5. Power on the SDDC Controller VM but do not initiate bring-up.
    6. Log in to the SDDC Controller VM and open the /home/vrack/vrm/vRack/vcf-imaging-details.json file.
    7. Locate the entry for the first ESXi host in the file and change the Node-ID for the host to N0.
    8. Verify that there is no duplicate entry for N0 in the file (a verification sketch follows these steps).
    9. Save the file contents and restart the SDDC Controller VM.
    10. After restart is complete, initiate bring-up.
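    If you want to double-check step 8 from the command line, the following is a minimal sketch run on the SDDC Controller VM. It assumes the node IDs appear as quoted string values in the JSON file; adjust the pattern to match the actual file contents.

      # Count occurrences of the N0 node ID in the imaging details file.
      # Exactly one match is expected after the edit (the quoted pattern is an assumption).
      grep -c '"N0"' /home/vrack/vrm/vRack/vcf-imaging-details.json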
  • Google Chrome browser crashes for no known reason during bring-up

    The Chrome browser sometimes crashes when left open during bring-up, displaying the "Aw Snap something went wrong while displaying the webpage" message. Bring-up is unaffected. This is presumed to be a browser issue, not a Cloud Foundation issue.

    Workaround: Reload the web page.

Multi-Rack Bring-Up Known Issues
  • Add Host/Add Rack interface prevents user from adding host/rack when bring-up of any previous add rack/add host task is failed.

    The Add Host and Add Rack interfaces do not allow you to add a host or a rack if a previous Add Host or Add Rack bring-up is currently in a failed state. Adding a rack, or adding hosts to a rack, should be independent of the state of other racks.

    Workaround: None. This issue may be addressed as a design issue in a future release.

Post Bring-Up Known Issues
  • User is not prevented from marking bootstrapped host as "Ineligible"

    Users are able to configure bring-up to skip hosts during the add-rack and add-host workflows by running a script that marks the hosts as ineligible.
    However, if a user inadvertently marks an already bootstrapped host as ineligible, it generates an event and alert that requires the host to be decommissioned and reimaged, which prevents the host from being added to a domain. The user is also prevented from running add-host or add-rack until this is resolved.

    Workaround: You can undo this error by clearing the Alerts for this host from the interface. Afterward, access the host and modify the prm_host table by changing the bootstrap_status to COMPLETE.
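    The following is a hypothetical sketch of the prm_host update. It assumes the table lives in a SQL database on the SDDC Manager Controller VM that can be reached with a local client; the client command, database name, and host-identifier column are all assumptions, so adapt them to your environment or contact VMware Support if you are unsure.

      # All names below are assumptions: the database engine, database name,
      # and host-identifier column may differ in your environment.
      psql -d vrm -c "UPDATE prm_host SET bootstrap_status = 'COMPLETE' WHERE host_id = '<host id>';"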

  • When you use the vSphere Web Client to view the vCenter Server clusters associated with the management domains or workload domains, you might see alarms related to vSAN HCL
    As described in KB article 2109262, the vSAN Health Service has a built-in Hardware Compatibility List (HCL) health check that uses a JSON file as its HCL database to inform the service of the hardware and firmware that are supported for vSAN. These alarms are raised if the HCL health check fails. However, because the set of supported hardware and firmware is constantly being updated as support for new hardware and firmware is added, if the check fails, the first step is to obtain the most recent vSAN HCL data and use the vSphere Web Client to update the HCL database.

    Workaround: The steps to update the vSAN HCL database are described in KB article 2145116.

    1. Download the latest vSAN HCL data by opening a browser to https://partnerweb.vmware.com/service/vsan/all.json (or download it from the command line, as shown in the sketch after these steps).
    2. Save the data to a file named all.json.
    3. Copy the all.json file to a location that is reachable by your Cloud Foundation installation.
    4. Log in to the vCenter Server instance using the vSphere Web Client and select the vRack-Cluster for the management domain or workload domain in which you are seeing the HCL health alarms.
    5. Navigate to the Manage tab, click Settings, and select Health and Performance in the vSAN section.
    6. In the HCL Database area, click Update from file, browse to and select the all.json file you saved in step 2.
    7. Retest the health by navigating from the Manage tab to the Monitor tab and clicking vSAN > Health > Retest.
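    If you prefer to fetch the HCL data from the command line instead of a browser (steps 1 through 3 above), the following is a minimal sketch using standard tools; the destination host and path are placeholders.

      # Download the latest vSAN HCL database to a local file named all.json,
      # then copy it to a location reachable by your Cloud Foundation installation.
      curl -o all.json https://partnerweb.vmware.com/service/vsan/all.json
      scp all.json <user>@<reachable-host>:/tmp/all.json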


  • The standard vCenter Server alarm named "License inventory monitoring" is raised for overprovisioning of the ESXi hosts, even though the ESXi hosts have the appropriate license key applied
    Under the standard licensing terms for the VMware Cloud Foundation product, all of the ESXi hosts in a Cloud Foundation installation are licensed using the same key. In the vCenter Server Licenses pane in the vSphere Web Client, in the Product column for this key, you see the associated product name is VMware vSphere 6 Enterprise for Embedded OEMs (CPUs). Under the VMware licensing terms, that type of key is allowed to be overprovisioned. However, due to this issue, when the vCenter Server sees this key as overprovisioned, it is incorrectly raising the standard vSphere "License inventory monitoring" alarm. You can use the vSphere Web Client to see the alarm definition for that alarm, by selecting the vCenter Server object in the left hand navigation area, and clicking the Manage tab > Alarm Definitions and clicking License inventory monitoring in the list.

    Workaround: None. Ignore these vCenter Server license inventory monitoring alarms about the overprovisioning of license capacity of the ESXi hosts in your Cloud Foundation installation.

  • Decommissioning of all hosts from second rack results in PRM exception

    If you decommission all hosts from the second rack, this breaks the logical inventory data fetch. As a result, SDDC Manager cannot display any data from the rack.

    Workaround: Do not decommission all the hosts from the add-on rack. Retain at least one host in this rack.

  • Host sometimes hangs after doing a normal reboot
    Due to a known vSAN issue, in a Cloud Foundation installation that has Dell R630 or Dell R730xd servers with certain storage controllers, sometimes a host hangs after a normal reboot. For a complete list of affected controllers, see VMware Knowledge Base article 2144936.

    Workaround:

    1. Ensure the storage-controller drivers and firmware are updated on each host according to the information in the VMware Compatibility Guide for vSAN for the PERC H730: http://www.vmware.com/resources/compatibility/detail.php?deviceCategory=vsanio&productid=34853&deviceCategory=vsanio&details=1&vsan_type=vsanio&keyword=h730
    2. Apply the settings as described in the VMware Knowledge Base article 2144936.
    3. Log into the host through its iDRAC web interface and reset the server.
    4. If the issue persists, run the Hardware Diagnostics, also from iDRAC web interface:
      1. From the menu bar, select Next Boot > Lifecycle Controller.
      2. When prompted, click OK to confirm the selection.
      3. From the menu bar, select Power > Power Cycle System (cold boot).
      4. When prompted, click Yes to confirm the action.
        The system is power cycled and rebooted to the Lifecycle Controller. If Settings or any other page displays, click Back or Cancel to go to the Hardware Diagnostics page.
      5. From the Lifecycle Controller, select Hardware Diagnostics in the navigation panel and click Run Hardware Diagnostics.
      6. When prompted, click Yes to confirm the action.
        The hardware diagnostics process starts and lasts for an hour or longer.
      7. When the hardware diagnostics process completes, repeat the Power > Power Cycle System (cold boot) command.
        The system should now boot without any hang time.
  • Update/upgrade attempt on unmanaged host fails with error: "UPGRADE_SPEC_INVALID_DATA; The ESX host is managed by vCenter server with IP: x.x.x.x"

    If an update or upgrade of a selected unmanaged host fails with the above message, retrying also fails because the API expects unmanaged hosts not to be managed by vCenter Server. If the initial update fails after the host has been added to vCenter Server, the host remains attached to vCenter Server, which causes subsequent attempts to fail.

    Workaround: Before retrying, remove the host from the vCenter inventory. Verify that the host is present in the SDDC Manager free-pool capacity and displays a healthy status.

  • Datacenter added as part of bring-up is not displayed in the Network Settings: Datacenter page

    The datacenter subnet named PUBLIC is provided as part of the bring-up process and is associated with the management domain. However, this subnet is not listed on the Datacenter page, which allows a user to inadvertently create the same subnet on the Datacenter page. Doing so triggers a validation rule and displays a "Subnet in use" message.

    Workaround: Do not duplicate the PUBLIC subnet.

  • The Add Host workflow fails at the reconfiguring Host OOB IP task

    The Add Host workflow fails at the reconfiguring Host OOB IP task.

    Workaround: Use the following procedure to work around this issue.

    1. On the host, set a static IP for iDRAC in the range of 192.168.0.50 to 192.168.0.99 (one way to do this is sketched after these steps).
    2. Make sure the host uses the default iDRAC user name and password.
    3. Image the node using VIA individual node imaging.
    4. Once imaging completes on the host, follow the procedure for adding host in the VCF product documentation.
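    If you prefer to set the iDRAC address from the host's console rather than through the iDRAC web interface, the following is a minimal sketch using ipmitool; the LAN channel number (1) is an assumption and the address values are examples only.

      # Configure a static BMC/iDRAC address in the required range (example values).
      ipmitool lan set 1 ipsrc static
      ipmitool lan set 1 ipaddr 192.168.0.51
      ipmitool lan set 1 netmask 255.255.255.0
      # Verify the settings.
      ipmitool lan print 1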

  • Recreate VI workload domain or VDI workload domain blocked

    Sometimes you cannot recreate a VI workload domain or VDI workload domain after it has been deleted post-bring-up.

    Workaround: Clear the error Alerts, and retry.

  • During imaging, the host loses network connectivity but the Imaging tab indicates that it has been successfully imaged.

    If DHCP is not enabled on the server BMC before imaging, the server is not accessible on the network.

    Workaround: If DHCP was not enabled on the server BMC before imaging, follow the steps below.

    1. Log in to the management switch.
    2. Open the /etc/network/interfaces file.
    3. Scan the configuration settings for the auto iface entries.

      All the entries should show bridge-access 4. For example:

      auto swp6
      iface swp6
          bridge-access 4
      
    4. If any entries show bridge-access 3, change the bridge-access value to 4 (a sketch for making this change follows these steps).
    5. Save and close the /etc/network/interfaces file.

      This should restore network access to the host and its iDRAC.
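      The following is a minimal sketch for locating and changing any remaining bridge-access 3 entries from the management switch shell (steps 3 and 4 above). Back up the file first; the sed pattern assumes the entries appear exactly as shown in the example above.

        # List any interfaces still set to bridge-access 3.
        grep -n "bridge-access 3" /etc/network/interfaces
        # Back up the file, then change those entries to bridge-access 4.
        cp /etc/network/interfaces /etc/network/interfaces.bak
        sed -i 's/bridge-access 3/bridge-access 4/g' /etc/network/interfaces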

SDDC Manager Known Issues
  • In SDDC Manager, the Dashboard: Physical Resource: Rack Details page is taking a long time to respond

    This delay is a result of SDDC Manager calls to the Physical Resource Manager (PRM) taking up to five times longer to complete than the previous baseline.

    Workaround: This issue occurs only when the SDDC Manager VM and management vCenter VM are on the same host. To prevent this issue, it is recommended you create a DRS Anti-Affinity Rule that prevents the SDDC Manager VM and management vCenter VM from being on the same host.
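    You can create the rule in the vSphere Web Client (select the management cluster, then Configure > VM/Host Rules > Add, and choose the Separate Virtual Machines rule type). If you prefer the command line, the following is a minimal sketch using the open-source govc CLI; govc is not part of the Cloud Foundation product, and the cluster and VM names are placeholders.

      # Assumes govc is installed and configured (GOVC_URL and credentials) to point
      # at the management vCenter Server; cluster and VM names below are placeholders.
      govc cluster.rule.create -cluster <management-cluster> -name sddc-vc-anti-affinity \
          -enable -anti-affinity <sddc-manager-vm-name> <management-vcenter-vm-name>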

  • SDDC Manager considers hosts for workloads whose bootstrap has failed and status is "Eligible"

    When creating a workload domain, SDDC Manager should not consider hosts with an "Eligible" status, only those with "Complete" status. Otherwise, it may add a host even though the Add Host workflow has failed.

    Workaround: Do not create new workload domains if the Add Host workflow is in a failed state.

  • Unable to trigger password rotation if there is a failed SDDC Manager Configurations Backup workflow

    The workflow for triggering password rotation fails if the SDDC Manager Configuration Backup workflow has a failed status. This behavior is by design; you cannot rotate passwords if there is a failed backup workflow.

    Workaround: Use the following procedure to remove the failed workflow.

    1. Log in as root to the SDDC controller VM.
    2. Get all the workflows and note the workflow Id of the failed backup workflow (a sketch for filtering the output follows these steps).
      /home/vrack/bin/vrm-rest GET /core/activity/vrack/workflows
    3. Log in as root to the SDDC Manager VM and delete the specified failed workflow using zkCli.sh.

      To connect to the Zookeeper service:

      root@sddc-manager [ / ]# /opt/vmware/zookeeper/bin/zkCli.sh

      To remove the failed workflow:

      rmr /Workloads/Workflows/<FAILED WORKFLOW Id>
    4. Retry the password rotation workflow.

      It should now succeed.
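      To narrow the output of step 2 to failed workflows, the following is a minimal sketch; it assumes the workflow listing contains the word FAILED near each failed entry, so adjust the filter to the actual output format.

        # Show entries that mention FAILED, with surrounding context so the workflow Id is visible.
        /home/vrack/bin/vrm-rest GET /core/activity/vrack/workflows | grep -i -B 3 -A 3 "FAILED"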

  • SDDC Manager has no interface for verifying and modifying the configured DNS forwarders

    SDDC Manager currently provides no user interface where you can view, verify, or modify the DNS forwarder configuration. Ideally, the SDDC Manager (VRM) UI would expose these settings, probably somewhere in the settings area, so that you could see and change the configured DNS forwarders.

    Workaround: You can access and modify these configurations in the unbound.conf file.

    1. Login to the SDDC Manager Controller VM.
    2. Open the /etc/unbound/unbound.conf file.
    3. Verify or modify the configured DNS forwarders (an example forwarder entry follows these steps).
    4. Save and close the /etc/unbound/unbound.conf file.
    5. Reboot the SDDC Manager to verify configuration persistence.
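    For reference, DNS forwarders in unbound are typically defined in a forward-zone stanza similar to the example below; the zone name and addresses are examples only, and your unbound.conf may organize these settings differently.

      forward-zone:
          name: "."
          forward-addr: 10.0.0.250
          forward-addr: 10.0.0.251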

  • After the SDDC Manager times out in your browser session and displays the login screen, when you try to log in after a few hours, an error message about the required user name and password is displayed instead of the expected message about the expired SAML request

    Authentication to the SDDC Manager uses SAML (Security Assertion Markup Language). When the SDDC Manager is idle for a period of time, it automatically logs you out and displays the login screen. The URL in the browser holds the original SAML authentication request. After a longer period of time, on the order of hours, the SAML authentication request expires, by design. As a result, if you return to the screen without refreshing the browser session to get a new SAML authentication request, the request fails by design. However, instead of an error message informing you of the expired SAML request, an error message stating "User name and password are required" is displayed.

    Workaround: If you encounter this issue, open a new browser session to the virtual IP address of your SDDC Manager, such as https://vrm.subdomain.root-domain:8443/vrm-ui, as described in the Administering VMware Cloud Foundation Guide.

  • An expansion workflow that involves adding more than one ESXi host to a management or workload domain is marked successful, even though when the hosts were added to the domain's vCenter Server cluster, the NSX Manager Host Preparation process failed to complete on one or more hosts
    During an expansion workflow, the hosts are added to the vCenter Server cluster that underlies the management or workload domain. When hosts are added to a vCenter Server cluster that has NSX enabled on the cluster, one of the tasks involves preparing the newly added hosts, as described in the Prepare Hosts on the Primary NSX Manager topic in the NSX 6.2 documentation. Part of this host preparation process involves a scan of each added ESXi host prior to installing the required NSX software on that host. If the scan on a particular host fails for some transient reason, the NSX Manager host preparation process fails for that host. However, this failure condition is not reported to the expansion workflow and the workflow appears as successful in the SDDC Manager.

    Workaround: When performing an expansion workflow that involves multiple hosts and the SDDC Manager indicates the workflow has completed, perform the following steps to verify that the NSX host preparation was successful for each added host, and if not, resolve the issues reported by NSX.

    1. Using the vSphere Web Client, log in to the vCenter Server instance for the management or workload domain that was expanded.
    2. In the vSphere Web Client, examine the NSX Manager host preparation state by navigating to Networking & Security > Installation and clicking the Host Preparation tab.
    3. On the Host Preparation tab, expand the cluster if it is not already expanded, and examine the data reported for each host in the Installation Status column and VXLAN column:
      • If the Installation Status column reports green checkmarks and "Configured" in the VXLAN column for all hosts, the added hosts were successfully prepared.
      • If the Installation Status column displays "Not Ready" and the corresponding VXLAN column displays "Error" for a host, resolve the error by right-clicking on the VXLAN column's "Error" and clicking Resolve. This action also applies the VXLAN distributed switch port group to that host.
  • Because no unique identifier is used to identify a Cloud Foundation system, when you deploy more than one system in your networking environment, you cannot use the same level of Active Directory (AD) integration for both systems

    For the first Cloud Foundation system deployed in your environment, you would configure AD authentication by adding your AD as an identity source to the Platform Security Controller instances using the Active Directory (Integrated Windows Authentication) option and joining the vCenter Single Sign-On server to the AD domain. Due to this issue, for additional systems, you cannot do that same configuration.

    Workaround: For additional systems, the Active Directory as an LDAP Server option can be used to add your AD as an identity source to the Platform Security Controller instances in those systems.

  • A workload domain’s workflow can fail if a VM in the management domain on which the workflow depends is in non-operational state
    Workflows to deploy, delete, and expand workload domains can fail if some of the management domain’s virtual machines are in an invalid state, down, or temporarily inaccessible. SDDC Manager does not prevent you from initiating and submitting a workflow when one of the VMs is in an invalid state. These virtual machines include the PSC VMs, vCenter Server VMs, Infrastructure Services Manager VMs, vRealize Operations Manager VM, vRealize Log Insight VM, NSX Manager VM, and, in a multi-rack system, the SDDC Manager VM. If you submit the workflow and one of those virtual machines becomes temporarily inaccessible as the workflow is performed, the workflow will fail.

    Workaround: Before initiating a workflow, review the state of the management domain’s virtual machines to see that they are all in a valid (green) state. You can see the virtual machines by launching the vSphere Web Client from the domain details of the management domain.

  • When using the SDDC Manager’s Uplink screen to update L3 connectivity settings, the Uplink screen does not indicate which of the ToR switches has the L3 uplink configured on it
    When an uplink is configured to L3 mode, only one of the two ToR switches has an uplink port. The SDDC Manager does not indicate which ToR switch is connected to the upstream router.

    Workaround: When you use the Uplink screen to change uplink connectivity settings, perform the following steps.
    Note: Changing the settings triggers uplink reconfiguration on the switches. Because the reconfiguration process might take a few minutes to complete, connectivity to the corporate network might be lost during the process. To avoid losing connectivity with SDDC Manager, it is strongly recommended that you are connected to port 48 on the management switch when updating the settings using the Uplink screen.

    1. Connect to port 48 on the management switch and log in to the SDDC Manager using that connection.
    2. On the Uplink screen, configure the L3 uplink and click SAVE EDITS.
    3. Re-configure your upstream router to use the new network settings that you specified in step 2.
    4. Wait at least 3 minutes.
    5. Try connecting the upstream router to the top ToR switch.
    6. Test the new uplink connectivity by disconnecting from port 48 on the management switch and connecting to the virtual rack with the new uplink configuration.
    7. If you are unable to reconnect to the virtual rack, try connecting the upstream router to the bottom ToR switch.
    8. If you are unable to connect to the virtual rack, reconnect using port 48 on the management switch and try reconfiguring your network to the original configuration.
    9. If you cannot connect to the virtual rack with either configuration, contact VMware Support.
  • Existing trunk ports on ToR Cisco Nexus 9K switches are assigned to new VLANs when a VI or VDI workload domain is created
    During imaging with the VIA, port-channels 29, 45, 100, 110, and 120 are created on the ToR Cisco Nexus 9K switches and are set to belong to all VLANs. As a result, when new VLANs are entered during creation of a VI or VDI workload domain, these port-channels become part of the new VLANs, and the external VLAN and other VLANs created specifically for the new workload domain are assigned to all existing trunk ports on the ToR Cisco Nexus 9K switches, including the uplink and management cluster bonds.

    Workaround: None

  • Rebooting Switches Can Crash SDDC Manager Server

    After the ToR and inter-rack switches are rebooted, the SDDC Manager interface might become inaccessible.

    Workaround: Restart the Tomcat server to restore the SDDC Manager server.

  • Unable to change the Datacenter Connection name on racks from the SDDC Manager interface

    If a user wants to change the value of the Datacenter Connection name on a rack, there is no apparent way to do so from the SDDC Manager interface.

    Workaround: Delete the datacenter network using the following SoS command:
    /opt/vmware/sddc-support/sos --delete-dc-nw --dc-nw-name <datacenter_name>

    You can now create a new datacenter network with the desired name.

  • In SDDC Manager, the Rack Details page fails during VI domain deletion task

    If you navigate to and refresh the Rack Details page while a domain deletion task is running, the page does not display and eventually times out.

    Workaround: Navigate away from the Rack Details page and return after the domain deletion task completes.

Virtual Infrastructure Workload Domain Known Issues
  • Cisco plugin throws WARNING instead of ERROR if <subnet> is already configured on interface

    This is caused by an IP address conflict. When the uplink is over Layer 3 (IP routing based) instead of VLAN switching, the management VLAN and VI VLAN are configured to have a Switch Virtual Interface (SVI) on both ToR switches, with individual IP addresses and a common VRRP IP for each VLAN. When the VI workload domain is deleted, the IP addresses are not removed from the SVIs. As a result, if any subsequent VI workload domain has the same subnet, and therefore the same IP for the SVI, the SVI creation fails.

    Workaround: You can prevent the IP address conflict as follows:

    1. After a VI workload is deleted through UI on a setup with Layer 3 uplink, log in to both ToR switches.
    2. Run the following commands.
      configure terminal
      no interface vlan <VLAN id>

      Where <VLAN id> is the VLAN id that was given when the VI workload was created.

  • In a Cloud Foundation environment configured with L3 uplinks, when you try to create a workload domain with a data center (external) connection using the same subnet but a different VLAN as a workload domain that was previously created and deleted, the workload domain creation fails
    When a workload domain is deleted and your environment’s ToR switch uplinks are configured with L3, the Switched Virtual Interfaces (SVIs) that were originally created on the ToR switches for that workload domain are not deleted. Due to this issue, if you subsequently try to create a workload domain using a different VLAN ID but same subnet as the deleted one, the workload domain creation fails because the switches do not allow two VLAN IDs with the same subnet.

    Workaround: When creating a VI or VDI workload domain, in the data center connection’s configuration, do not combine a different VLAN ID with a subnet that was previously used for a deleted workload domain. You can reuse the same VLAN with the same subnet or reuse the same VLAN with a different subnet.

  • The VI workload domain creation and expansion workflows might fail at task "ConfigureVCenterForLogInsightTask" due to a failure to connect to the deployed vRealize Log Insight instance
    During the VI workload domain creation and expansion workflows, if the system cannot connect to the deployed vRealize Log Insight instance, the workflow fails at the "ConfigureVCenterForLogInsightTask" task and you see an exception in the log with a 500 HTTP error code:
    [com.vmware.vrack.vrm.workflow.tasks.loginsight.ConfigureVCenterForLogInsightTask] Exception while doing the integration: Create session to LogInsight Failed : HTTP error code : 500

    Workaround: Restart the vRealize Log Insight's virtual machine by using the management domain's vCenter Server launch link to open the vSphere Web Client and using the vSphere Web Client user interface to restart the vRealize Log Insight's virtual machine. Then restart the failed workflow.

  • The VI workload domain creation workflow might fail at task "VC: Deploy vCenter" due to a failure to connect to the system's Platform Services Controller instances
    During the VI workload domain creation workflow, if the system cannot connect to the integrated Platform Services Controller instances, the workflow fails at the "VC: Deploy vCenter" task and you see errors in the log such as:
    Unexpected error while verifying Single Sign-On credentials: [Errno 111]
    Connection refused
    Cannot get a security token with the specified vCenter Single Sign-On configuration.

    Workaround: Restart the system's PSC-2 virtual appliance, then the PSC-1 virtual appliance, then the vCenter Server virtual appliance. Wait until each virtual appliance is up and running before restarting the next one. Then restart the failed workflow.

  • On the Review page of the VI workload domain creation wizard, the Download and Print buttons are not operational
    Due to this issue, when you reach the Review step of the VI workload domain creation wizard, you cannot use the Download or Print buttons to create a printable file of the displayed information for future reference.

    Workaround: None. At the Review step of the wizard, you must manually capture the information for future reference, for example by taking screen captures of the displayed information.

  • Dual-rack VI creation fails during the vCenter: Deploy vCenter workflow

    When you create a dual-rack VI workload domain, the creation fails during the vCenter: Deploy vCenter workflow. The log might show that there is no available space on the vSAN datastore. For example:

    2017-08-17 10:40:18.146 [Thread-6839] DEBUG [com.vmware.vrack.vrm.core.local.InMemoryLogger] 
    The free space of datastore 'vsanDatastore' (0.0 GB) in host

    Workaround: Restart the VMware vCenter Server by restarting the vmware-vpxd service and retry. The process should succeed. For information about restarting this service, see the Knowledge Base Article 2109887 Stopping, starting, or restarting VMware vCenter Server Appliance 6.x services.
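    The following is a minimal sketch of restarting the vpxd service from the vCenter Server Appliance shell; see Knowledge Base article 2109887 for the full supported procedure.

      # On the vCenter Server Appliance for the management domain:
      service-control --stop vmware-vpxd
      service-control --start vmware-vpxd
      # Confirm the service is running before retrying the workflow.
      service-control --status vmware-vpxd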

VDI Workload Domain Known Issues
  • Host resources display as zero after VI deletion

    After a VI workload domain is deleted, some of the hosts show zero resources (CPU and memory).

    Workaround: Wait some time. The interface update is delayed due to an unrelated bug. Alternatively, you can restart HMS to force a refresh.

  • VDI Creation failing at task vCenter: Enable vSAN

    While creating a new VDI, the workflow fails during the vCenter: Enable vSAN task.

    Workaround: Disable and then re-enable vSAN. Retry the workflow. It should continue.

  • Unable to restart failed VDI workflow
    Sometimes if a VDI workflow fails, the user may be unable to restart it. The system returns the error message "Plan is expired. Please start a new request" and shows that the failure occurred during the creation of a VI or VDI workload domain. This issue has been fast-tracked for resolution.

    Workaround: None. However, this issue may in part be due to the VI reservation being incorrectly marked as consumed and therefore no longer available.

  • The VDI workload domain creation workflow might fail at task "Instantiate Horizon View Adapter"
    Due to intermittent timing issues, the VDI workload domain creation workflow sometimes fails at the Instantiate Horizon View Adapter task with the following exception error in the log: com.vmware.vrack.vdi.deployment.tools.tasks.VDIWorkflowException: "Unable to create vROps REST client". As a result, the pairing credential between the vRealize Operations Manager instance and the VDI environment is in a partially instantiated state and must be deleted before restarting the workflow.

    Workaround: Manually delete the pairing credential that is associated with the workload domain's Horizon Connection server and then restart the failed workflow using the Restart Workflow action in the workflow's status screen using these steps:

    1. Verify that you have the IP address for the first Horizon Connection server that was deployed for this VDI workload domain, such as 10.11.39.51. You will use that IP address to identify which pairing credential to delete.
    2. Log in to the vRealize Operations Manager Web interface. You can use the launch link in the management domain's details screen to open the login screen.
    3. From the vRealize Operations Manager Web interface's Home screen, navigate to the Credentials screen by clicking Administration > Credentials.
    4. Locate the pairing credential having a name in the form of vdi-view-adapter-IPaddress, where the IP address matches the one you obtained in step 1. For example, if the Horizon Connection server has IP address 10.11.39.51, the displayed pairing credential name is vdi-view-adapter-10.11.39.51.
    5. Select that pairing credential and delete it.
    6. In the workflow's status screen, restart the failed workflow using the Restart Workflow action.
  • In a Cloud Foundation environment configured with L3 uplinks, when you try to create a workload domain with a data center (external) connection using the same subnet but a different VLAN as a workload domain that was previously created and deleted, the workload domain creation fails
    When a workload domain is deleted and your environment’s ToR switch uplinks are configured with L3, the Switched Virtual Interfaces (SVIs) that were originally created on the ToR switches for that workload domain are not deleted. Due to this issue, if you subsequently try to create a workload domain using a different VLAN ID but same subnet as the deleted one, the workload domain creation fails because the switches do not allow two VLAN IDs with the same subnet.

    Workaround: When creating a VI or VDI workload domain, in the data center connection’s configuration, do not combine a different VLAN ID with a subnet that was previously used for a deleted workload domain. You can reuse the same VLAN with the same subnet or reuse the same VLAN with a different subnet.

  • When creating a VDI workload domain with specified settings that results in the system deploying two vCenter Server instances, the creation workflow might fail at the "ESXI: Incremental LI Integration" task
    Depending on the value for the Max Desktops [per vCenter Server] setting in the VDI Infrastructure screen and your choice for the number of desktops in the VDI workload domain creation wizard, the system might need to deploy more than one vCenter Server instance to support the desired VDI workload domain. As part of this deployment, the system starts two VI workload domain creation workflows. One of the VI workload domain creation workflows might fail at the task "ESXI: Incremental LI Integration" with an error message about failure to connect ESXi hosts to vRealize Log Insight:

    hosts to LogInsight failed : HTTP error code : 404 : Response :
    

    Workaround: Use the Physical Resources screens to verify that the ESXi hosts that the failed workflow is trying to use are all up and running. Use the vSphere Web Client to verify that the vRealize Log Insight VMs in the system are all up and running. Ensure that the ESXi hosts involved in the failed workflow and the vRealize Log Insight VMs are in a healthy state, and then click Retry in the failed workflow.

  • UEM agents are not installed as part of the VMware UEM installation from Windows template

    The VDI workflow installs VMware UEM but UEM agents are not installed in that process.

    Workaround: Power on the Windows template, install UEM, then power off the template. Redeploy the desktop. The agents should be successfully installed.

  • Desktop VM creation fails with error "Cloning of VM vm-8-1-45 has failed..."

    During VDI workload creation, the VDI VM fails with error "Cloning of VM vm-8-1-45 has failed: Fault type is INVALID_CONFIGURATION_FATAL - Failed to generate proposed link for specified host: host-122 because we cannot find a viable datastore".

    Workaround: As an administrator, re-enable provisioning on the affected desktop pool through the Horizon console.

  • DHCP Relay Agent IP field missing from the VDI Expansion wizard

    The DHCP Relay Agent IP field is not required for some domain network expansions, such as management and VI domains. In addition, this is an optional setting; you are not required to complete it for VDI expansions. The field will be added to the UI in a future release.

    Workaround: None.

Life Cycle Management (LCM) Known Issues
  • While a bundle download is in progress, the error icon displays next to the My VMware login icon

    Sometimes, while an update bundle download is in progress, the error icon (an exclamation point in a yellow triangle) displays next to the My VMware login icon at the top of the Lifecycle Management: Repository page.

    Workaround: None. In this case, you can ignore the icon.

  • Management vCenter certificate replacement fails

    The Management vCenter SSL certificate replacement fails with the message "Error while reverting certificate for store : MACHINE_SSL_CERT". The certificate manager fails to register the service on the vcenter-1 instance. This is due to a rare vCenter Server issue.

    Workaround: Wait and retry until the process succeeds. It may require several tries.

  • LCM update status erroneously displays as FAILED after auto-recovery
    If the ESXi update fails during the ESX HOST UPGRADE STAGE REBOOT stage due to connectivity issues, and the update is automatically recovered after the connectivity issue is resolved, the update might still display as FAILED in the LCM update history even though it was successful. (The respective domains are upgraded to the target version.)

    Workaround: None. Ignore the LCM update status.

  • ESXi and vCenter update on a host might fail in the task of exiting maintenance mode
    Sometimes during an ESXi and vCenter update process, a host might fail to exit maintenance mode, which results in a failed update status. During an update, the system puts a host into maintenance mode to perform the update on that host, and then tells the host to exit maintenance mode after its update is completed. At that point in time, a separate issue on the host might prevent the host from exiting maintenance mode.

    Workaround: Attempt to exit the host from maintenance mode through the vSphere Web Client; a command-line alternative is sketched after the steps below.

    • Locate the host in vSphere and right-click it.
    • Select Maintenance Mode > Exit Maintenance Mode.

      This action will list any issues preventing the host from exiting maintenance mode.

    • Address the issues until you can successfully bring the host out of maintenance mode.
    • Return to the SDDC Manager and retry the update.
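    If the vSphere Web Client is not available, the following is a minimal sketch for taking the host out of maintenance mode directly over SSH. This is an alternative approach, not the documented workaround, and it does not list the blocking issues the way the vSphere Web Client does.

      # Run on the affected ESXi host (requires SSH access to the host).
      esxcli system maintenanceMode get
      esxcli system maintenanceMode set --enable false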
  • LCM Inventory page shows a failed domain, but no failed components
    The LCM Inventory page shows a failed domain, but does not show any failed components.

    Workaround: Log in to vCenter for the domain and check that all hosts in the domain have the lcm-bundle-repo available. Add the lcm-bundle-repo if necessary.

  • Lifecycle Management page shows all available update bundles independent of the Cloud Foundation release in your environment
    The Lifecycle Management Repository page displays all available updates, regardless of your specific release.

    Workaround: None. Proceed to download the bundles as indicated. Lifecycle Management evaluates and determines the necessary update bundles after they have been downloaded and will apply only the bundles appropriate for your product versions.

  • Misleading error message for failures that occur before the VUM upgrade stage

    The error message that is returned when an ESXi update fails includes the misleading text: "LCM will bring the domain back online once problems found in above steps are fixed manually."

    Workaround: This particular sentence is not relevant to the ESXi update and can be ignored.

  • ESXi VUM-based update failure

    ESXi VUM-based update fails with "Failed tasks during remediation" and "Failed VUM tasks" messages. This failure is most likely due to a collision between simultaneous VUM-based tasks. This issue will soon be resolved by improved VUM integration.

    Workaround: Retry.

  • Skipped host(s) during ESXi upgrade prevent users from continuing the upgrade process 

    If one or more hosts are skipped during the ESXi upgrade process, the user is unable to continue the rest of the upgrade. However, this is expected behavior because the entire domain must be upgraded against the available update bundle.

    Workaround: Retry and complete all skipped host upgrades to unblock the remaining upgrades.

  • Non-applicable LCM update bundles show status Pending

    LCM update bundles that were released for earlier versions of VMware Cloud Foundation appear in the LCM Repository page with a status of PENDING. They should not appear at all.

    Workaround: None. Please ignore the bundles and the status.

Monitoring Known Issues
  • When a Quanta server is powered down (off), the CPU_CAT_ERROR (CPU Catastrophic Error) event is generated
    When a Quanta server is powered down, the SERVER_DOWN event is generated, which is the expected behavior. The issue is that the CPU_CAT_ERROR event is also generated when a Quanta server is powered down.

    Workaround: None