The Airgap Appliance keeps data flow operations consistent with previous releases. After the appliance is deployed, you can run data flow operations to sync data onto the server. The currently supported data flow operations are "sync", "export", "import", and "rsync". Note that the sync and export operations require internet access, while the import and rsync operations can be executed in an internet-restricted (air-gapped) environment.

Log in to the Airgap Appliance console

The Airgap Appliance is built with security in mind. Unlike previous releases, where the setup allowed logging in as the root user, now only the admin user has permission to log in to the appliance remotely via SSH. However, because the tdnf tool in Photon OS requires root permission and our scripts rely on it, you must also run our CLI tool with root permission, either by prefixing the command with "sudo" or by switching to the root user with "su". Another change in the appliance is that the previous "run.sh" has been renamed to "agctl" and added to the system path, so you can run the command from anywhere in the system without switching to the bin folder under the airgap scripts.
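
For example, a typical session might look like the sketch below (the appliance FQDN is a placeholder):

# log in as the admin user; direct root SSH login is not permitted
ssh admin@<airgap-FQDN>
# run the CLI tool with root permission, for example via sudo
sudo agctl status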

Use "sync" to fetch data from remote registries

The "sync" operation can be executed after modifying the "user-inputs.yml" file. In the TCA 3.0 release, you can specify the TCA build number in the YAML file to download images for that build. The steps are as follows:

  • Standard sync operation

    1. Open the user-inputs.yml with vi editor

      vi /usr/local/airgap/scripts/vars/user-inputs.yml
    2. Replace TCA build number in build_sync field

      Next, modify the "build_sync" field to replace the default build number "12345678" with the TCA build number that you intend to install:

      build_sync: "3.0.0-<tca-build-number>"

    3. Then execute the sync command

      agctl sync

  • Sync images from local bom file(s) only

    A new parameter, "local_only", was introduced in the TCA 3.0.0 release; it enables syncing images from local bom file(s) only. This is helpful when you want to sync a specific TCA build's images without syncing TKG images at the same time. The steps to sync local bom images only are as follows:

    1. Open the user-inputs.yml with vi editor

      vi /usr/local/airgap/scripts/vars/user-inputs.yml

    2. Set parameter "local_only" to "True"

      The default value of this parameter is False. To enable it, set the value to True:

      local_only: True

      Save the file and exit the vi editor.

    3. Then execute the sync command

      agctl sync

  • Set customized retry_times in user-inputs.yml

    Another new parameter, "retry_times", was introduced in the TCA 3.0.0 release; it defines how many times to retry when an image download from the remote repo fails. The default value of the parameter is 3. The example below shows the steps to set a user-defined retry count:

    1. Open the user-inputs.yml with vi editor

      vi /usr/local/airgap/scripts/vars/user-inputs.yml

    2. Set a user-defined value for "retry_times"

      retry_times: 8

    3. Then execute the sync command

      agctl sync

The process will commence, and you will be prompted to enter Harbor's password for security purposes. All logs can be found under the "/usr/local/airgap/logs" folder. The "status" operation can also be used to check the image sync progress:

agctl status
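
To follow the sync in real time, one option (a sketch; the actual file name contains a timestamp) is to tail the latest ansible sync log:

tail -f /usr/local/airgap/logs/ansible_sync_*.log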

Use "export" to download data from remote registries to a local folder as a tar package

  • Export full size bundle

    Prerequisite: The full-size bundle exports all local bom images, TKG images, and 4 entire Photon OS repos, which requires at least 500GB of free space on the "/photon-reps" disk. Please make sure there is sufficient space before running the export operation.
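
    One way to confirm the available space before exporting (a simple check with standard tooling):

      df -h /photon-reps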

    "Export" operation shares similar steps with "sync". After appliance OVA deploy, edit user-inputs.yml with vi editor, modify "build_sync" field to input target TCA build number, save the file and exit vi editor. Then run export.

    Running export directly exports the full-size bundle, which includes all images of the specified TCA release together with the full-size Photon repos. The steps are as follows:

    1. Open the user-inputs.yml with vi editor

      vi /usr/local/airgap/scripts/vars/user-inputs.yml

    2. Replace TCA build number in build_sync field

      Next, modify the "build_sync" field to replace the default build number "12345678" with the TCA build number that you intend to install:

      build_sync: "3.0.0-12345678"

      Save the file and exit the vi editor.

    3. Run export command

      agctl export

  • Export incremental bundle

    1. If you would like to export an incremental bundle, the baseline needs to be set in user-inputs.yml:

      vi /usr/local/airgap/scripts/vars/user-inputs.yml

    2. Besides entering the target TCA build number, you also need to uncomment the lines below and set the baseline release. The currently supported baselines are 2.1.0, 2.2.0, and 2.3.0:

      #  - name: "photon"
      #    baseline: "2.3.0"
    3. Save the user inputs yaml file and then run export:

      agctl export

The export bundle is generated as airgap-export-bundle*.tar.gz in the /photon-reps/export-bundle/ folder. Please prepare a USB thumb drive or portable hard disk to copy the bundle to the airgap server's "/photon-reps" folder for import usage. As of the TCA 3.0 release, the full-size bundle is about 250GB and the incremental bundle based on the TCA 2.3 release is about 150GB, so please make sure there is sufficient space on the portable disk.
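
For example, to locate the bundle and copy it onto a mounted external disk (the /mnt/usb mount point is only an example):

ls -lh /photon-reps/export-bundle/airgap-export-bundle*.tar.gz
cp /photon-reps/export-bundle/airgap-export-bundle*.tar.gz /mnt/usb/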

  • Export adhoc RAN BOM images

    If you would like to export RAN BOM images, 'ran_bom_images' needs to be set in user-inputs.yml:

    vi /usr/local/airgap/scripts/vars/user-inputs.yml

    then change the 'ran_bom_images' to true:

    ran_bom_images: true

    Save the user inputs yaml file and then run export:

    agctl export

    The export bundle is generated as airgap-ranbom-export-bundle*.tar.gz in the /photon-reps/export-bundle/ran-bom folder. Please prepare a USB thumb drive or portable hard disk to copy the bundle to the airgap server's "/photon-reps" folder for import usage.

Use "import" to import data from a tar bundle into the airgap server

Before running the import operation, copy the tar bundle generated by the export operation into the airgap server's "/photon-reps" folder, then execute import. No user-inputs modification is needed. You will be asked to enter Harbor's password after kicking off the operation, and then the process starts.
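
For example, assuming the bundle was copied onto an external disk mounted at /mnt/usb (the mount point is only an example), copy it into place first:

cp /mnt/usb/airgap-export-bundle*.tar.gz /photon-reps/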

agctl import

  1. Import adhoc RAN BOM images

    Before running the import operation for RAN BOM images, copy the tar bundle generated by the export operation for RAN BOM (see section 4.3.3, Export adhoc RAN BOM images) into the airgap server's "/photon-reps" folder, and make sure 'ran_bom_images' is set to 'true' in user-inputs.yml. You will be asked to enter Harbor's password after kicking off the operation, and then the process starts.

    vi /usr/local/airgap/scripts/vars/user-inputs.yml

    then change the flag 'ran_bom_images' to true:

    ran_bom_images: true

    Save the user inputs yaml file and then run import:

    agctl import

Use "rsync" to replicate data from another airgap server

Since the TCA 3.0.0 release, a new operation, "rsync", has been added to replicate data directly from another airgap server. The user-inputs YAML file under "/usr/local/airgap/scripts/vars" needs to be modified before running the rsync operation.

vi /usr/local/airgap/scripts/vars/user-inputs.yml

Below is an example of the parameters for the rsync operation in user-inputs.yml; modify the values per your environment settings.

remote_server_fqdn: testrsync001.example.com
endpoint_name: remote_registry_001
username: admin
secret: Harbor12345
remote_server_cert_file: /usr/local/airgap/certs/remote_registry_001_ca.crt
reg_des: remote harbor registry as source
policy_des: new policy for replication

policy_name: policy1
cron: 0 */30 * * * *

Note: save the remote airgap server's ca.crt into the file specified in the "remote_server_cert_file" field before running the rsync operation; otherwise the process will fail with an error that the cert file cannot be found. In the example above it is "/usr/local/airgap/certs/remote_registry_001_ca.crt". Currently the file is mandatory even when using a public cert procured from an external vendor.
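
For example, once the remote appliance's CA certificate has been obtained (for instance from the remote appliance's administrator), placing it at the path referenced above might look like this (the source file name ca.crt is only an example):

mkdir -p /usr/local/airgap/certs
cp ca.crt /usr/local/airgap/certs/remote_registry_001_ca.crt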

After the user-inputs.yml file is saved, kick off the rsync operation:

agctl rsync

The sync log can be found in the ansible output logs in the /usr/local/airgap/logs/ folder.
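
To locate the most recent log file, a simple check with standard tooling:

ls -lt /usr/local/airgap/logs/ | head -5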

Upgrade the appliance

This document focuses on upgrade approaches for the new CAP-based airgap appliance in the TCA 3.0.0 release and beyond.

Upgrading an existing TCA 2.3.x airgap server via the legacy airgap scripts still follows the "export-->import-->upgrade" workflow used in previous releases. That approach is not covered in this document.

Steps to upgrade airgap appliance:

  1. Download the airgap appliance upgrade bundle ISO image from the Customer Connect website

    Since the TCA 3.0.0 release, the airgap appliance upgrade bundle is delivered as an ISO image. Download the image from the customer support website and save it to a folder on the airgap appliance. We generally recommend saving the image to the /data folder on the airgap appliance.

  2. Modify the airgap appliance user-inputs.yml config file with the path to the ISO image

    The parameter "local_iso_path" defines the path to the airgap appliance upgrade bundle ISO image. After logging in to the airgap appliance, edit the config file "/usr/local/airgap/scripts/vars/user-inputs.yml" with the correct path to the upgrade bundle ISO image. Sample of the user-inputs.yml ISO image path parameter:

    # local_iso_path defines the path to the upgrade bundle ISO
    local_iso_path: /data/update.iso

    Another optional parameter, "skip_snapshot", can be set here. When this parameter is set to "yes", no disk snapshot is taken during the upgrade, so the system cannot be reverted to its original state after the upgrade. By default, the value of this parameter is "no". To skip the snapshot, set the value to "yes":

    skip_snapshot: yes

    Save the file and exit editor.

  3. Kick off upgrade process

    After saving the config file, start the upgrade by executing the command below:

    agctl upgrade

    Once the upgrade session starts, you can monitor the ansible output logs in the /usr/local/airgap/logs/ folder and check the CAP upgrade log at "/var/log/vmware/capengine/cap-update/workflow.log" for more details.
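
    For example, to follow the CAP upgrade log in real time:

    tail -f /var/log/vmware/capengine/cap-update/workflow.log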

  4. After the upgrade completes, reboot the system for the upgrade to take effect

    reboot

    After reboot, check whether the build number has been upgraded as expected:

    cat /etc/vmware/cap/product.info

Troubleshooting

This section mainly focuses on common issues hit by users when using the appliance.

  1. How to check whether all images synced successfully after the sync operation is done

    Since the TCA 3.0.0 release, a new approach of syncing images via the "tanzu isolated-cluster plugin" has been introduced, along with additional logs and checkpoints. As a result, the ansible scripts sometimes cannot catch all exceptions that occur during the syncing jobs. This topic therefore introduces checkpoints that can be checked manually after the sync operation is done, to validate whether all images synced successfully as reported.

    1. Check local bom sync logs under /usr/local/airgap/logs/

      There are several logs that can be checked under /usr/local/airgap/logs/:

      1. publish-images.log & publish-helm.log

        These logs record the sync status of all images and helm charts from the local bom files. Generally there is a summary at the bottom of both log files, which looks like the example below:

        !!! Image synchronizing done !!!
        ====== Summary ======
        Total processed images:               115
        Total failed images:                  0
        Total passed images:                  115
        ======== end ==========
        ===== Failed images list ======
        =====  End image porcess log =====

        With the above results, if "Total failed images" is 0 and no image is listed under "Failed images list", then all data should have been synced successfully.

    2. Check ansible related logs

      An ansible job log named "ansible_sync_[timestamp].log" is generated under the /usr/local/airgap/logs/ folder. There are several checkpoints to verify in this log. Open the log file with the vi editor and search for the following:

      1. Go to the bottom of the log and check whether the overall job succeeded. If all tasks succeeded, the value of "failed" should be 0. Below is an example of the final job result:

        PLAY RECAP *********************************************************************
        localhost                  : ok=81   changed=40   unreachable=0    failed=0    skipped=9    rescued=0    ignored=1
      2. "Upload tanzu plugin bundle"

        Check whether all tasks under this section have a status of "changed". If so, all tasks completed successfully.

      3. "Wait for tkg bundle download completion"

        Go to the bottom of this section and check whether all tasks result in the "changed" status. Below is an example of the results:

        changed: [localhost] => (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '44610320391.767196', 'results_file': '/root/.ansible_async/44610320391.767196', 'changed': True, 'item': 'v2.2.0', 'ansible_loop_var': 'item'})
        changed: [localhost] => (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '649226982409.767214', 'results_file': '/root/.ansible_async/649226982409.767214', 'changed': True, 'item': 'v2.3.1', 'ansible_loop_var': 'item'})

        In the above example, a "results_file" is listed for each sub-task. Use the vi editor to open the result files and search for the key "stderr"; if the related value is "", then all images should have been downloaded successfully.

      4. "Wait for tkg bundle upload completion"

        Go to the bottom of this section and check whether all tasks result in the "changed" status. As in the previous checkpoint, a "results_file" is listed for each sub-task; use the vi editor to open the files and search for "stderr" to see whether all data has been uploaded successfully.

      Finally, if all the above checkpoints pass, the sync operation completed successfully. Otherwise, a failure may have occurred during the operation and the job needs to be retried.

      The "skipped" items in the output are expected; they are normal ansible output for tasks whose conditions were not met and were therefore skipped.
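
      As a quick sanity check, the commands below (a sketch; adjust the log file names to match your environment) scan the logs for failures. The first prints the image sync summary; no output from the second means no ansible task reported failures:

        grep -A 4 "====== Summary ======" /usr/local/airgap/logs/publish-images.log
        grep -E "failed=[1-9]" /usr/local/airgap/logs/ansible_sync_*.log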

  2. Clean up caches before rerunning the sync operation

    The "agctl sync" operation may fail for unexpected reasons, in which case you need to rerun the job to get all data synced successfully. Before rerunning, some caches need to be cleaned up to avoid unexpected failures. To clean up the caches:

    rm -rf /photon-reps/tkg_temp 
    rm -rf /tmp/imgpkg-*

    Then rerun the sync operation:

    agctl sync

    and check the logs under the /usr/local/airgap/logs/ folder to monitor the real-time data sync status.

  3. Harbor service failed to start after OVA deploy

    Sometimes the harbor service may not be brought up successfully during the first boot after OVA deployment.

    In this case, retry the setup after logging in to the VM through SSH or the VM console.

    To check whether the harbor service is running properly, after logging in to the VM, run:

    cd /opt/harbor 
    docker-compose ps

    If not all containers are running in "healthy" status (in most cases, the "harbor-core" and "harbor-jobservice" containers are stuck in "restarting" status), apply one of the workarounds below.

    1. Option-1: Rerun the init script

      An init script runs on first boot after OVA deployment, and its output logs are placed under the /var/log/airgap/ folder. You can rerun the script manually:

      /etc/vmware/cap/cap-firstboot.sh

      and check the log created for this job under the /var/log/airgap/ folder.

    2. Option-2: Manually set up harbor, nginx, and Photon OS

      If errors persist after executing Option-1, or if you prefer not to rerun the init script, the following manual steps can be performed to set up harbor, nginx, and the system.
      1. Fix harbor issue

        When the "harbor-core" and "harbor-jobservice" containers are in "restarting" status, a workaround can be applied to recreate the harbor database and restart the harbor service:

        cd /opt/harbor/
        docker-compose ps
        docker exec -i harbor-db sh -c "psql -U postgres -c \"DROP DATABASE registry;\""
        docker exec -i harbor-db sh -c "psql -U postgres -c \"CREATE DATABASE registry;\""
        docker-compose down -v && docker-compose up -d
        docker login <airgap-FQDN>:<harbor https port>

        Note: with this approach the harbor service is no longer managed by systemd; you need to manage the harbor service yourself via the docker-compose command.
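
        For example, managing the service manually with standard docker-compose commands might look like this (a sketch):

        cd /opt/harbor
        # stop the harbor containers
        docker-compose stop
        # start the harbor containers in the background
        docker-compose up -d
        # check container status
        docker-compose ps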

      2. Reconfig nginx

        After a harbor deployment failure, the nginx service is still left with the default factory configuration, which also needs to be reconfigured. To reconfigure nginx:

        ansible-playbook /usr/local/airgap/scripts/playbooks/setup-web-server.yml

      3. Reconfigure the system after all services are up

        Some OS configuration jobs need to be done after all services are up and running:

        # link agctl in system path
        ln -s /usr/local/airgap/scripts/bin/agctl /usr/local/bin/agctl
        # Chown of airgap files after deploy
        chown -R admin:users /usr/local/airgap
        chown -R admin:users /usr/local/bin
        # redirect /etc/resolv.conf to user defined dns
        rm /etc/resolv.conf
        ln -s /run/systemd/resolve/resolv.conf /etc/
        echo "127.0.0.1 <replace-with-airgap-FQDN>" >> /etc/hosts
        # Clean unused harbor installer from system
        rm -f /opt/*.tgz || true
  4. Sync operation failed with error of "unexpected status code 500 Internal Server Error"

    Sometimes the sync operation may fail while pushing images to harbor with the error "unexpected status code 500 Internal Server Error". This is a known intermittent harbor issue. You can manually push the images onto the airgap server by executing the commands directly:

    tanzu isolated-cluster upload-bundle --destination-repo <airgap-FQDN>:<harbor_https_port>/registry --source-directory /photon-reps/tkg_temp/v2.2.0 -v 9 
    tanzu isolated-cluster upload-bundle --destination-repo <airgap-FQDN>:<harbor_https_port>/registry --source-directory /photon-reps/tkg_temp/v2.3.1 -v 9

    If manual upload succeeded, then clean up the cache files:

    rm -rf /photon-reps/tkg_temp 
    rm -rf /tmp/imgpkg-*
  5. Image path of the multi-level image "tca-repo" may be modified after the remote sync operation

    After executing "agctl rsync" to sync data from an existing remote airgap server to a newly deployed airgap appliance, the image path of "tca-repo" may be modified by the replication policy's flatten rule. To fix this problem, modify the rule as in the steps below and manually execute the replication.

    1. Log in to the Harbor UI

    2. Go to "Replications" in left panel

    3. Click the radio button of the policy and select "EDIT" from "ACTIONS"

    4. Change value of "Flattening" from "Flatten All Levels" to "Flatten 1 Level" and click "SAVE"

    5. Click "REPLICATE" button to manually trigger the replication

  6. Manual deploy rerun fails when configuring the harbor password

    You might rerun "agctl deploy" manually after OVA deployment to revise the appliance's configuration. An error occurs when configuring the harbor password, because you need to manually provide the harbor password in a temporary file before rerunning deploy.

    1. Edit harbor credential file to input harbor password temporarily

      vi /usr/local/airgap/scripts/vars/harbor-credential.yml

      and add the line below to the file, replacing <harbor_password> with the real harbor password:

      harbor_password: <harbor_password>

    2. Run "agctl deploy"

      agctl deploy