Learn how to troubleshoot issues with VMware Tanzu Application Service for VMs [Windows] (TAS for VMs [Windows]).

Installation issues

This section describes the issues that might occur during installation.

Missing local certificates for Windows FS Injector

Symptom

You run the winfs-injector and see this certificate error:

Get https://auth.docker.io/token?service=registry.docker.io&
scope=repository:cloudfoundry/windows2016fs:pull: x509:
failed to load system roots and no roots provided

Explanation

Local certificates are needed to communicate with Docker Hub.

Solution

Install the necessary certificates on your local machine. On Ubuntu, you can install certificates with the ca-certificates package.
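
As a minimal sketch of the Ubuntu case, assuming a Debian-based workstation with sudo access, the following installs the package only when no CA bundle is present. The helper names has_ca_bundle and install_ca_certificates are hypothetical, and /etc/ssl/certs/ca-certificates.crt is assumed to be the bundle path that the ca-certificates package generates on Ubuntu.

```shell
# Hypothetical helper: succeeds if the given CA bundle file exists and is non-empty.
# Defaults to the bundle path assumed for Ubuntu.
has_ca_bundle() {
  [ -s "${1:-/etc/ssl/certs/ca-certificates.crt}" ]
}

# Hypothetical helper: install and refresh the system roots only if the bundle is missing.
install_ca_certificates() {
  has_ca_bundle && return 0
  sudo apt-get update &&
    sudo apt-get install -y ca-certificates &&
    sudo update-ca-certificates
}
```

After running install_ca_certificates, run the winfs-injector again to confirm that the x509 error no longer appears.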

Outdated version for Windows FS Injector

Symptom

You run the winfs-injector and see this missing file or directory error:

open ...windows2016fs-release/VERSION: no such file or directory

Explanation

You are using an outdated version of the winfs-injector.

Solution

From the VMware Tanzu Application Service for VMs [Windows] page on Broadcom Support, download the recommended version of Windows FS Injector for the tile.

Missing container image

Symptom

You clicked the + icon in Operations Manager to add the TAS for VMs [Windows] tile to the Installation Dashboard and see this error:

Error message: Has an invalid release (windows 2016fs) specified for job type Windows Diego Cell.

Explanation

The product file that you are trying to upload does not contain the Windows Server container base image.

Solution

  1. Delete the product file listing from Operations Manager by clicking the Trash can icon under Import a Product.

  2. Follow the TAS for VMs [Windows] installation instructions to run the winfs-injector tool locally on the product file. This step adds the Windows Server container base image to the product file, requires Internet access, and can take up to 20 minutes. For more information, see Installing and configuring TAS for VMs.

  3. Click Import a Product to upload the injected product file.

  4. Click the + icon next to the product listing to add the TAS for VMs [Windows] tile to the Installation Dashboard.
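
Step 2 of this procedure might look like the following sketch. The helper names injected_name and inject_tile are hypothetical, and the --input-tile and --output-tile flag names are an assumption that you can confirm against the help output of your winfs-injector version.

```shell
# Hypothetical helper: derive the injected file name from the downloaded product file name.
injected_name() {
  printf '%s\n' "${1%.pivotal}-injected.pivotal"
}

# Hypothetical helper: run the injector on a product file.
# The flag names are an assumption; confirm them for your winfs-injector version.
inject_tile() {
  winfs-injector --input-tile "$1" --output-tile "$(injected_name "$1")"
}
```

For example, inject_tile pas-windows.pivotal would write pas-windows-injected.pivotal next to the original file, which is the file to upload in step 3.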

Upgrade issues

This section describes issues that might occur during upgrade.

Failure to create containers when upgrading with shared Microsoft base image

Symptom

The pre-start script for the windowsfs job fails, and the upgrade fails:

Task 308031 | 13:47:04 | Preparing deployment: Preparing deployment (00:00:03)

Task 308031 | 13:47:11 | Preparing package compilation: Finding packages to compile (00:00:00)

Task 308031 | 13:47:21 | Updating instance windows_diego_cell: windows_diego_cell/44c5841f-7580-4e9c-9856-89fcbe08ab0d (2) (canary) (00:00:35)

L Error: Action Failed get_task: Task 59ba76d1-14c5-4d7b-681c-08b9ec4bd64d result: 1 of 10 pre-start scripts failed. Failed Jobs: windows1803fs. Successful Jobs: set_kms_host, groot, loggregator_agent_windows, bosh-dns-windows, rep_windows, winc-network-1803, set_password, enable_ssh, enable_rdp.

Task 308031 | 13:47:56 | Error: Action Failed get_task: Task 59ba76d1-14c5-4d7b-681c-08b9ec4bd64d result: 1 of 10 pre-start scripts failed. Failed Jobs: windows1803fs. Successful Jobs: set_kms_host, groot, loggregator_agent_windows, bosh-dns-windows, rep_windows, winc-network-1803, set_password, enable_ssh, enable_rdp.

Alternatively, the post-start script for the rep_windows job fails, and the upgrade fails:

Task 8192 | 21:12:30 | Updating instance windows2019-cell: windows2019-cell/bd6d70b9-ed1f-412f-9d49-8045627f4ab3 (0) (canary) (00:17:24)
                     L Error: Action Failed get_task: Task a9555020-1a3b-40c7-677c-d6fc392ce135 result: 1 of 3 post-start scripts failed. Failed Jobs: rep_windows. Successful Jobs: route_emitter_windows, bosh-dns-windows.
Task 8192 | 21:29:55 | Error: Action Failed get_task: Task a9555020-1a3b-40c7-677c-d6fc392ce135 result: 1 of 3 post-start scripts failed. Failed Jobs: rep_windows. Successful Jobs: route_emitter_windows, bosh-dns-windows.

Explanation

When you upgrade between versions of Windows rootfs that have a shared Microsoft base layer, TAS for VMs [Windows] might fail to create containers.

Solution

For available workarounds, see Failure to create containers when upgrading with shared Microsoft base image.

Forwarding logs to a syslog server

You can use Windows Diego Cell logs to troubleshoot Windows Diego Cells. Windows Diego Cells generate several types of logs:

  • BOSH job logs, such as rep_windows and consul_agent_windows. These logs stream to the syslog server configured in the System Logging pane of the TAS for VMs [Windows] tile. The names of these BOSH job logs correspond to the names of the logs emitted by Linux Diego Cells.

  • Windows event logs. These logs stream to the syslog server configured in the System Logging pane of the TAS for VMs [Windows] tile.

To forward Windows logs to an external syslog server:

  1. Go to the Operations Manager Installation Dashboard.

  2. Click the TAS for VMs [Windows] tile.

  3. Click System Logging.

  4. For VM log forwarding, select Configure.

  5. In Syslog server address, enter the host name or IP address of your syslog server.

  6. Under Syslog server port, enter the port of your syslog server. The default port is 514. The host must be reachable from the TAS for VMs network. Ensure that your syslog server listens on external interfaces.

  7. Under Transport protocol, select the transport protocol for forwarding logs.

  8. If you are using a TCP protocol and want to allow TLS communication:

    1. Activate the Send logs over TLS check box.
    2. For CA certificate, provide the CA certificate that validates connections to the external syslog server. This certificate is used when sending logs to your syslog server over TLS.
  9. Under ca_cert, add the certificate that validates connections to the external server, if you are using TLS.

  10. Select the Enable system metrics check box. For a list of the VM metrics that the System Metrics Agent emits, see System Metrics Agent in the System Metrics repository on GitHub.

  11. Click Save.
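
Before saving, it can help to confirm that the syslog server accepts connections. The following is a minimal sketch, assuming a Linux host on the same network as the deployment, a TCP transport, and an installed nc (netcat) that supports the -z and -w options. The helper name syslog_tcp_reachable is hypothetical. A UDP transport cannot be verified this way.

```shell
# Hypothetical helper: succeeds if a TCP connection to HOST:PORT opens within 3 seconds.
# PORT defaults to 514, the default port named in the procedure above.
syslog_tcp_reachable() {
  host=$1
  port=${2:-514}
  nc -z -w 3 "$host" "$port"
}
```

For example, syslog_tcp_reachable SYSLOG-HOST && echo reachable, where SYSLOG-HOST is the address you enter in Syslog server address.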

Downloading Windows Diego Cell logs

To download Windows Diego Cell logs:

  1. Go to the Operations Manager Installation Dashboard.

  2. Click the TAS for VMs [Windows] tile.

  3. Click the Status tab.

  4. In the Logs column, click the Download icon for the Windows Diego Cell for which you want to retrieve logs.

  5. Click the Logs tab.

  6. When the logs are ready, click the filename to download them.

  7. Unzip the file and examine the contents. Each component on the Diego Cell has its own log folder:

    • /consul_agent_windows/
    • /garden-windows/
    • /metron_agent_windows/
    • /rep_windows/

Troubleshooting Windows compilation VMs

BOSH deletes a compilation VM after the compilation VM fails. In a vSphere environment, use one of these procedures to troubleshoot compilation VM issues on Windows stemcell v2019.7:

Troubleshooting a slowly deleted Windows compilation VM

The easiest method to troubleshoot a Windows compilation VM is to SSH into the VM before BOSH deletes it.

To troubleshoot a compilation VM from an SSH session:

  1. Open the vSphere UI.

  2. Open two different BOSH CLI terminal sessions.

  3. Open Tanzu Operations Manager.
  4. Configure these settings in Tanzu Operations Manager:
    • In the BOSH Director tile, click Director config, then select Keep Unreachable Director VMs.
    • In the TAS for VMs [Windows] tile, click VM options, then select Access VMs with BOSH SSH.
  5. Click Apply Changes for the TAS for VMs [Windows] tile.
  6. From the first BOSH CLI terminal, monitor the BOSH task:

    watch -n 5 "bosh -d TAS-WINDOWS-DEPLOYMENT is --details | grep compilation"
    

    Where TAS-WINDOWS-DEPLOYMENT is the name of your TAS for VMs [Windows] deployment.

  7. Wait until the compilation VM CID appears in the output.

  8. From the second BOSH CLI terminal window, SSH to the Windows compilation VM:

    bosh -d TAS-WINDOWS-DEPLOYMENT ssh COMPILATION-NAME
    

    Where:

    • TAS-WINDOWS-DEPLOYMENT is the name of your TAS for VMs [Windows] deployment.
    • COMPILATION-NAME is the name of your Windows compilation VM.

  9. To prevent BOSH from deleting the compilation VM after the compilation VM fails, search for the compilation VM CID in the vSphere UI and rename the VM. You can now troubleshoot within the SSH session.

  10. After troubleshooting, delete the VM manually.

Troubleshooting a quickly deleted Windows compilation VM

In some situations, BOSH deletes the Windows compilation VM too quickly for you to SSH into it.

To troubleshoot a deleted compilation VM:

  1. Download an Ubuntu desktop image from Ubuntu Releases Xenial.

  2. Upload the Ubuntu desktop image into your vSphere datastore.

  3. Open the vSphere UI.

  4. Open a BOSH CLI terminal session.

  5. Open Tanzu Operations Manager.
  6. Configure these settings in Tanzu Operations Manager:
    • In the BOSH Director tile, click Director config, then select Keep Unreachable Director VMs.
    • In the TAS for VMs [Windows] tile, click VM options, then select Access VMs with BOSH SSH.
  7. Click Apply Changes in Tanzu Operations Manager.

  8. From the BOSH CLI terminal window, monitor the BOSH task:

    watch -n 5 "bosh -d TAS-WINDOWS-DEPLOYMENT is --details | grep compilation"
    

    Where TAS-WINDOWS-DEPLOYMENT is the name of your TAS for VMs [Windows] deployment.

  9. Wait until the compilation VM CID appears in the output.

  10. From the vSphere UI:

    1. Locate the compilation VM CID in the vSphere UI.
    2. To prevent BOSH from deleting the compilation VM after the compilation VM fails, rename the compilation VM.
    3. On the Windows compilation VM, go to Edit settings and add a CD/DVD drive device. Browse to the datastore ISO file, select the Ubuntu desktop ISO, then select Connect at Power On.
    4. Go to Edit settings -> VM options tab -> Boot Options.
    5. Increase the Boot Delay to 10000 milliseconds.
    6. Click Force BIOS Setup.
    7. Click Start/Restart to start the VM again.
  11. On the BIOS setup screen, boot with the CD-ROM Drive.

  12. After Ubuntu desktop starts, click Try Ubuntu and open a terminal window.

  13. In the terminal window, run:

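    sudo fdisk -l                      # List disks and partitions to find the Windows partition
    sudo mkdir /mnt/windows            # Create a mount point
    sudo mount /dev/sda1 /mnt/windows  # Mount it; use a different device if the Windows partition is not /dev/sda1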
    
  14. You can now troubleshoot within this session by exploring the contents of the Windows VM’s file system within /mnt/windows.

  15. After troubleshooting, delete the VM manually.
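
If the fdisk output in step 13 shows several partitions, a filter such as the following can help identify which one to mount. This is a minimal sketch; the helper name ntfs_partitions is hypothetical, and it assumes that the Windows file system is reported as ntfs by lsblk.

```shell
# Hypothetical helper: read "NAME FSTYPE" pairs on standard input and
# print the device path of each partition whose file system type is ntfs.
ntfs_partitions() {
  awk '$2 == "ntfs" { print "/dev/" $1 }'
}

# Usage on the live Ubuntu session:
#   lsblk -rno NAME,FSTYPE | ntfs_partitions
```

Mount the device that the filter prints in place of /dev/sda1 if the two differ.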

Windows authentication

For the basics, start with the Microsoft documentation.

The Windows Authentication feature uses the Windows event log system. To access a log from the Windows Diego Cell, SSH into the cell, then start PowerShell. To get the events in an event log, use the Get-WinEvent command:

> bosh ssh -d TAS-WINDOWS-DEPLOYMENT windows_diego_cell/INDEX
> powershell
PS> Get-WinEvent -LogName LOG-NAME

Where TAS-WINDOWS-DEPLOYMENT is the name of your TAS for VMs [Windows] deployment, INDEX is the VM you want to access, and LOG-NAME is the name of the log for which you want to view events.

TAS for VMs gMSA plug-in logs

TAS for VMs gMSA plug-in logs are in the event log Cloudfoundry-CCG-Plugin. They tell you whether the plug-in ran successfully and whether the input to the plug-in is correct. Success logs look like this:

PS>  Get-WinEvent -LogName Cloudfoundry-CCG-Plugin
   ProviderName: CfCcgPlugin

TimeCreated                     Id LevelDisplayName Message
-----------                     -- ---------------- -------
10/28/2022 6:06:58 PM            0 Information      Successfully got password credentials
10/28/2022 6:06:58 PM            0 Information      Plugin invoked
10/28/2022 6:06:58 PM            0 Information      Plugin instantiated

Windows compute service logs

If Windows containers do not start, you can check the logs for the Host Compute Service, which is the Windows component that administers Windows containers.

Success logs look like this:

PS> Get-WinEvent -LogName Microsoft-Windows-Hyper-V-Compute-Admin

ProviderName: Microsoft-Windows-Hyper-V-Compute

TimeCreated                     Id LevelDisplayName Message
-----------                     -- ---------------- -------
10/28/2022 5:57:30 PM         1001 Information      The Host Compute Service started successfully.

Windows container credential guard logs

If Windows containers do not start, or Windows Authentication does not work through an app, you can check the logs for the CCG.exe process. This process runs the TAS for VMs plug-in and establishes connectivity to Active Directory. Success logs look like this:

PS>  Get-WinEvent -LogName Microsoft-Windows-Containers-CCG/Admin


   ProviderName: Microsoft-Windows-Containers-CCG

TimeCreated                     Id LevelDisplayName Message
-----------                     -- ---------------- -------
10/28/2022 6:35:34 AM            2 Information      Container Credential Guard fetched gmsa credentials for GMSA$ using plugin: {8019A64C-3F4E-4DE3-AD2B-9A544290E2C3}.

Where GMSA$ is the name of your gMSA service account.

For a list of possible Microsoft-Windows-Containers-CCG events, see the Microsoft Troubleshooting documentation.

Note: You might need to follow additional steps to have events show up in the Microsoft-Windows-Containers-CCG log. If you have additional questions, you can contact Support.

Testing connectivity to Active Directory

Both the Windows Diego Cell and the application container must have network connectivity to the Active Directory instance, and firewall rules must allow that traffic. If you have connectivity from the Windows Diego Cell but not from the application container, verify that no App Security Groups are preventing access from application containers.

If the network connectivity is set up correctly, these commands succeed when run from inside an application container:

nslookup AD.DOMAIN          # Make sure DNS is properly set
ping AD.DOMAIN              # Make sure basic connectivity to the Domain Controllers is working
nltest /sc_query:AD.DOMAIN  # Make sure the secure channel to the Domain Controllers is working

Where AD.DOMAIN is the fully-qualified domain name of the Active Directory instance.

If the COMPUTERNAME environment variable is not set to GMSA$ or the USERDNSDOMAIN environment variable is not set to AD.DOMAIN, verify your tile setup.
