This document describes common issues encountered when using the airgap appliance and their solutions.

Troubleshooting

This section focuses on common issues that users encounter when using the appliance.

How to check whether all images synced successfully after a sync operation completes

Since the TCA 3.0.0 release, images are synced through a new approach based on the "tanzu isolated-cluster plugin". The new approach introduces additional logs and checkpoints. As a result, the ansible scripts sometimes cannot catch every exception that occurs during the sync jobs. This topic lists the checkpoints to verify manually after a sync operation completes, to confirm that all images synced successfully as reported.
  • Check local bom sync logs under /usr/local/airgap/logs/
    Several logs under /usr/local/airgap/logs/ can be checked.
    1. publish-images.log & publish-helm.log
      These logs record the sync status of all images and helm charts from the local bom files. A summary is generally written at the bottom of both log files and looks like the example below:
      !!! Image synchronizing done !!!
      ====== Summary ======
      Total processed images:               115
      Total failed images:                  0
      Total passed images:                  115
      ======== end ========
      
      ======= Failed images list ======
      
      =====  End image porcess log =====
      

      With the above results, if "Total failed images" is 0 and no image is listed under "Failed images list", then all data should have been synced successfully.
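The summary check above can be scripted. The following is a minimal sketch, assuming the summary line has the format shown in the example ("Total failed images:" followed by a number). The function names are hypothetical, and the wording in publish-helm.log may differ, so adjust the pattern if needed. It does not inspect the "Failed images list" section, which should still be reviewed by eye.

```shell
# Print the number from the last "Total failed images:" line in a log file.
# Prints nothing if the log has no such summary line.
failed_image_count() {
    sed -n 's/.*Total failed images:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Report whether the log's summary shows zero failed images.
check_sync_log() {
    count=$(failed_image_count "$1")
    if [ "$count" = "0" ]; then
        echo "OK: $1 reports no failed images"
    else
        echo "CHECK: $1 reports '${count:-no summary found}' failed images"
        return 1
    fi
}

# Example usage on the appliance (paths from the section above):
#   check_sync_log /usr/local/airgap/logs/publish-images.log
#   check_sync_log /usr/local/airgap/logs/publish-helm.log
```

A missing summary is treated as a failure on purpose, because a log without a summary usually means the job did not run to completion.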

  • Check ansible related logs

    An ansible job log named "ansible_sync_[timestamp].log" is generated under the folder /usr/local/airgap/logs/. The log contains several checkpoints. Open the log file with the vi editor and check the following:

    1. Go to the bottom of the log and check whether the overall job succeeded. If all tasks succeeded, the value of "failed" should be 0. Below is an example of the final job result:
      PLAY RECAP *********************************************************************
      localhost                  : ok=81   changed=40   unreachable=0    failed=0    skipped=9    rescued=0    ignored=1
      
    2. "Upload tanzu plugin bundle"

      Check whether the status of every task under this section is "changed". If so, all tasks completed successfully.

    3. "Wait for tkg bundle download completion"
      Go to the bottom of this section and check whether every task result has the "changed" status. Below is an example of the results:
      changed: [localhost] => (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '44610320391.767196', 'results_file': '/root/.ansible_async/44610320391.767196', 'changed': True, 'item': 'v2.2.0', 'ansible_loop_var': 'item'})
      changed: [localhost] => (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '649226982409.767214', 'results_file': '/root/.ansible_async/649226982409.767214', 'changed': True, 'item': 'v2.3.0', 'ansible_loop_var': 'item'})
      

      In the above example, a "results_file" is listed for each sub-task. Open each result file with the vi editor and search for the key "stderr". If its value is "", all images should have been downloaded successfully.

    4. "Wait for tkg bundle upload completion"

      Go to the bottom of this section and check whether every task result has the "changed" status. As with the "Wait for tkg bundle download completion" checkpoint, a "results_file" is listed for each sub-task. Open each file with the vi editor and search for "stderr" to confirm that all data was uploaded successfully.

Finally, if all of the above checkpoints pass, the sync operation completed successfully. Otherwise, a failure might have occurred during the operation, and the job needs to be repeated.
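Parts of the ansible log checkpoints above can also be scripted. The sketch below is illustrative only. The function names are hypothetical, and it assumes that the results files are JSON with the "stderr" key written as "stderr": "", which is an assumption about the file layout rather than something stated in this document. The per-section "changed" status checks still need to be done by reading the log.

```shell
# Succeed only if the log contains PLAY RECAP style lines and none of
# them shows a non-zero failed= count.
recap_ok() {
    grep -Eq '[[:space:]]failed=[0-9]+' "$1" || return 1
    ! grep -Eq '[[:space:]]failed=[1-9]' "$1"
}

# Print each distinct results_file path found in the log's item lines.
list_results_files() {
    grep -o "'results_file': '[^']*'" "$1" | cut -d"'" -f4 | sort -u
}

# Succeed if the given results file records an empty "stderr" value.
stderr_empty() {
    grep -Eq '"stderr"[[:space:]]*:[[:space:]]*""' "$1"
}

# Example usage on the appliance (replace the log name with the real one):
#   log=/usr/local/airgap/logs/ansible_sync_[timestamp].log
#   recap_ok "$log" || echo "PLAY RECAP reports failures"
#   for f in $(list_results_files "$log"); do
#       stderr_empty "$f" || echo "check stderr in $f"
#   done
```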

Issues and solutions

  1. Issue: Resize appliance disks

    Cause: While the airgap server is in use, the disks can run low on available space.

    Solution: Follow the steps below to increase the disk size:
    1. Open VM Edit settings wizard.
    2. Specify a new size for the selected disk and click OK.
    3. Open VM Edit settings wizard again.
    4. Note the total provisioned disk size shown in the Hard disks line.
    5. Expand the Hard disks list and resize Hard disk 7 to 15% of the provisioned disk size noted in step 4.

      For example, if the provisioned disk size is 1000 GB, then the size for Hard disk 7 is 150 GB (1000 * 15%).

    6. Log in to the airgap appliance console over SSH.
    7. Check the list of disks by running the command fdisk -l.
    8. Rescan the extended disk: echo 1 > /sys/class/block/<disk-name>/device/rescan.

      For example: echo 1 > /sys/class/block/sdc/device/rescan.

    9. Perform pvresize to resize physical volume: pvresize /dev/<disk-name>.

      For example: pvresize /dev/sdc.

    10. Check the mount points and logical volume mapping with df -h |grep ^/dev, and identify the logical volume to extend.
    11. Extend logical volume: lvextend -l +100%FREE <LV-name>.

      For example: lvextend -l +100%FREE /dev/VGOS/LV_OS.

    12. Perform a filesystem resize of logical volume: resize2fs <LV-name>. For example: resize2fs /dev/VGOS/LV_OS.
    13. Execute df -h |grep ^/dev to validate the disk resize is completed.
    14. Rescan the snapshot disk: echo 1 > /sys/class/block/sdg/device/rescan.
    15. Perform pvresize on the snapshot disk: pvresize /dev/sdg.
    16. Extend snapshot logical volume: lvextend -l +100%FREE /dev/mapper/vg_lvm_snapshot-lv_lvm_snapshot.
    17. Perform a filesystem resize of snapshot volume: resize2fs /dev/mapper/vg_lvm_snapshot-lv_lvm_snapshot.
    18. Execute df -h |grep ^/dev to validate the disk resize is completed.
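The sizing rule in step 5 and the command sequence in steps 8 to 12 can be sketched as a small script. This is an illustration under assumptions, not a supported tool: the function names and the DRY_RUN variable are hypothetical, and the disk and volume names must be taken from your own fdisk -l and df -h output. By default the script only prints the commands.

```shell
# Size for Hard disk 7: 15% of the total provisioned size, in whole GB.
# Integer arithmetic rounds down, so round up by hand if needed.
snapshot_disk_size_gb() {
    echo $(( $1 * 15 / 100 ))
}

# Print the command when DRY_RUN is 1 (the default); run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        sh -c "$*"
    fi
}

# Steps 8 to 12 for one disk and the logical volume that lives on it.
extend_volume() {
    disk=$1   # for example sdc
    lv=$2     # for example /dev/VGOS/LV_OS
    run "echo 1 > /sys/class/block/$disk/device/rescan"
    run "pvresize /dev/$disk"
    run "lvextend -l +100%FREE $lv"
    run "resize2fs $lv"
}

# Example usage:
#   snapshot_disk_size_gb 1000      # prints 150
#   extend_volume sdc /dev/VGOS/LV_OS
#   extend_volume sdg /dev/mapper/vg_lvm_snapshot-lv_lvm_snapshot
#   DRY_RUN=0 extend_volume sdc /dev/VGOS/LV_OS   # actually run the commands
```

Reviewing the printed commands before setting DRY_RUN=0 helps avoid resizing the wrong disk.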
  2. Issue: Failed to execute command: umount -l /storage/alt_root/boot/efi

    Cause: This is a temporary file system issue.

    Solution: Redo agctl upgrade.

  3. Issue: One of the mandatory disks (lvm_snapshot) required for update is not present

    Cause: This might occur if the snapshot volume does not automatically mount after reboot/power on.

    Solution:

    1. Check whether /etc/fstab contains an entry for “lvm_snapshot”: grep lvm_snapshot /etc/fstab.

    2. If the entry is present, run the command mount -a.

    3. Check whether the volume is mounted: mount |grep lvm_snapshot.

    4. If the entry is not present in /etc/fstab and the volume is not mounted, run the command mount /dev/mapper/vg_lvm_snapshot-lv_lvm_snapshot /storage/lvm_snapshot/ to mount the volume manually.

    5. Rerun upgrade: agctl upgrade.
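The decision between steps 2 and 4 above can be expressed as a short sketch. The function names are hypothetical, and the fstab path is a parameter only so that the logic can be tried against a copy of the file. The check uses the same plain grep as step 1, so a commented-out fstab entry would also match.

```shell
# Print the mount command that applies: "mount -a" when fstab has an
# lvm_snapshot entry, otherwise the manual mount command from step 4.
snapshot_mount_command() {
    fstab=${1:-/etc/fstab}
    if grep -q lvm_snapshot "$fstab"; then
        echo "mount -a"
    else
        echo "mount /dev/mapper/vg_lvm_snapshot-lv_lvm_snapshot /storage/lvm_snapshot/"
    fi
}

# Succeed if the snapshot volume is currently mounted (step 3).
snapshot_is_mounted() {
    mount | grep -q lvm_snapshot
}

# Example usage on the appliance:
#   snapshot_is_mounted || eval "$(snapshot_mount_command)"
#   snapshot_is_mounted && agctl upgrade
```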

  4. Issue: agctl upgrade sometimes fails with an error that the iso does not exist, even though the correct ISO path is provided. This happens especially when the ISO keeps its original file name, which contains version numbering.

    Cause: This is a known issue caused by combining source and target user-inputs.yml files during upgrade.

    Solution:

    1. Rename the upgrade ISO file to a simple name such as update.iso.

    2. Edit /usr/local/airgap/scripts/vars/user-inputs.yml, go to the field local_iso_path: and set its value to the path of the renamed ISO file.

      For example, if the ISO is located under the /tmp folder, then the value should be /tmp/update.iso.

    3. Save the user-inputs.yml and rerun agctl upgrade.
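Step 2 above can also be done without an editor. The following is a minimal sketch, assuming local_iso_path appears once in the file as a plain "local_iso_path: <value>" line and that the new path contains no | or & characters. The function name is hypothetical. Back up user-inputs.yml before trying it.

```shell
# Replace the value of local_iso_path in a YAML file, keeping any
# leading indentation and leaving all other lines untouched.
set_local_iso_path() {
    file=$1
    iso=$2
    tmp=$(mktemp) || return 1
    # Write through a temp file, then copy the content back so the
    # original file keeps its owner and permissions.
    sed "s|^\([[:space:]]*local_iso_path:\).*|\1 $iso|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example usage on the appliance:
#   mv /tmp/<original-iso-name> /tmp/update.iso
#   set_local_iso_path /usr/local/airgap/scripts/vars/user-inputs.yml /tmp/update.iso
#   agctl upgrade
```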

  5. Issue: Failed to execute rsync operation when airgap appliance configured with proxy

    Cause: This is a known issue. The source host and target host must be on the same network and able to connect to each other directly. Accessing a remote airgap server through a proxy is not supported.

    Solution: Follow the steps below to resolve this issue:

    1. SSH to the target airgap host.
    2. Clear the proxy info: bash /usr/local/airgap/scripts/bin/clear-proxy.sh.
    3. Log out of the SSH connection.
    4. Log in to the target host again via SSH.
    5. Start rsync: agctl rsync.

Best Practices

  1. Encoding certificate using command line interface

    In some environments, the online Base64 encode/decode tool https://www.base64encode.org/ may not be accessible. To encode or decode certificates for deploying the airgap OVA or for a proxy server, run the following commands from the command line:

    Encode certificate:
    #base64 -w0 <certificate>

    Decode certificate:

    #base64 -d <encoded-certificate>
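As a quick sanity check, an encoded certificate can be decoded again and compared with the original file. The sketch below assumes GNU coreutils base64 (the -w0 option is GNU specific). The function names are hypothetical.

```shell
# Encode a file as a single Base64 line with no wrapping.
encode_cert() {
    base64 -w0 "$1"
}

# Succeed if decoding the encoded form reproduces the original file.
roundtrip_ok() {
    encoded=$(encode_cert "$1") || return 1
    printf '%s' "$encoded" | base64 -d | cmp -s - "$1"
}

# Example usage:
#   encode_cert <certificate>
#   roundtrip_ok <certificate> && echo "encoding verified"
```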
  2. Correct way of using chained certificates in airgap server

    When you have a chained certificate that includes a server certificate, an intermediate CA, and a root CA, follow the steps below to apply the certificates correctly on the airgap server:

    1. Create a folder named certificate on the airgap server.
    2. Copy the three certificates into that folder.
    3. Merge the server certificate and intermediate CA certificate into one file:
      cat <server-certificate> <intermediate-CA> > <new-chained-certificate>
    4. After the new chained certificate is created, convert it into Base64 format and apply it as follows:
      1. While deploying the OVA, provide the merged certificate as server certificate and Root CA certificate as CA certificate.
      2. While registering the airgap server in the TCA partner system, provide the root CA certificate in the certificate field.
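The merge in step 3 and the encoding in step 4 can be sketched as follows. The function names and the file names in the usage comments are hypothetical. The count check assumes PEM files, where each certificate starts with a -----BEGIN CERTIFICATE----- line.

```shell
# Concatenate the server certificate and the intermediate CA, in that
# order, into one chained certificate file.
build_chain() {
    # $1 = server certificate, $2 = intermediate CA, $3 = output file
    cat "$1" "$2" > "$3"
}

# Count certificates whose BEGIN marker starts a line. A merged file
# should report 2. A lower count suggests the first file lacked a
# trailing newline and the markers ran together on one line.
cert_count() {
    grep -c '^-----BEGIN CERTIFICATE-----' "$1"
}

# Example usage (file names are placeholders):
#   build_chain server.crt intermediate.crt chained.crt
#   cert_count chained.crt
#   base64 -w0 chained.crt    # value for the server certificate
#   base64 -w0 rootca.crt     # value for the CA certificate
```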