The Airgap Appliance keeps its data flow operations consistent with previous releases. After the appliance is deployed, you can run data flow operations to sync data onto the server. The currently supported data flow operations are "sync", "export", "import", and "rsync". Note that the sync and export operations require internet access, while the import and rsync operations can be executed in an internet-restricted (air-gapped) environment.
Log in to the Airgap Appliance console
The Airgap Appliance is built with security in mind. Unlike previous releases, where setup allowed logging in as the root user, now only the admin user has permission to log in to the appliance remotely via SSH. However, because the tdnf tool in Photon OS requires root permission and our scripts use it, the CLI tool still needs to be run with root permission. This can be done either with "sudo" or by switching to the root user with "su" before triggering the command. Another change in the appliance is that the previous "run.sh" has been renamed to "agctl" and added to the system path, so the command can be run from anywhere on the system; there is no need to switch to the bin folder under the airgap scripts any more.
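For example, a minimal login-and-run workflow could look like the following (the FQDN is a placeholder for your appliance's address):
ssh admin@<airgap-appliance-FQDN>
sudo agctl sync
Alternatively, switch to the root user first with "su -" and then run agctl directly.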
Use "sync" to fetch data from remote registries
The "Sync" operation can be executed after modifying the "user-inputs.yml" file. In the TCA 3.0 release, we support the input of the TCA build number in the YAML file to download images for the specified build. The steps are as follows:
Standard sync operation
Open user-inputs.yml with the vi editor
vi /usr/local/airgap/scripts/vars/user-inputs.yml
Replace the TCA build number in the build_sync field
Next, modify the "build_sync" field to replace the default build number "12345678" with the TCA build number that you intend to install:
build_sync: "3.0.0-<tca-build-number>"
Then execute the sync command
agctl sync
Sync images from local BOM file(s) only
A new parameter, "local_only", was introduced in the TCA 3.0.0 release; it enables syncing images from the local BOM file(s) only. This is very helpful when you want to sync a specific TCA build's images without syncing the TKG images at the same time. The steps to enable syncing local BOM images are as follows:
Open user-inputs.yml with the vi editor
vi /usr/local/airgap/scripts/vars/user-inputs.yml
Set parameter "local_only" to "True"
The default value of this parameter is False. To enable it, set the value to True:
local_only: True
Save the file and exit the vi editor.
Then execute the sync command
agctl sync
Set customized retry_times in user-inputs.yml
Another new parameter, "retry_times", was introduced in the TCA 3.0.0 release; it defines how many times to retry when an image download from the remote repo fails. The default value of the parameter is 3. The example below shows the steps to set a user-defined retry count:
Open user-inputs.yml with the vi editor
vi /usr/local/airgap/scripts/vars/user-inputs.yml
Set a user-defined value for "retry_times"
retry_times: 8
Then execute the sync command
agctl sync
The process will start, and you will be prompted to enter Harbor's password for security purposes. All logs can be found under the "/usr/local/airgap/logs" folder. The "status" operation can also be used to check the image sync progress:
agctl status
Use "export" to download data from remote registries to local folder into tar package
Export full size bundle
Prerequisite: A full-size bundle exports all local BOM images, TKG images, and four entire Photon OS repos, which requires at least 500 GB of free space on the "/photon-reps" disk. Make sure there is sufficient space before running the export operation.
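For example, the available space on the disk can be checked before the export with:
df -h /photon-reps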
"Export" operation shares similar steps with "sync". After appliance OVA deploy, edit user-inputs.yml with vi editor, modify "build_sync" field to input target TCA build number, save the file and exit vi editor. Then run export.
Running export directly produces a full-size bundle that includes all images of the specified TCA release together with the full Photon repos. The steps are as follows:
Open user-inputs.yml with the vi editor
vi /usr/local/airgap/scripts/vars/user-inputs.yml
Replace the TCA build number in the build_sync field
Next, modify the "build_sync" field to replace the default build number "12345678" with the TCA build number that you intend to install:
build_sync: "3.0.0-12345678"
Save the file and exit the vi editor.
Run the export command
agctl export
Export incremental bundle
To export an incremental bundle, a baseline needs to be set in user-inputs.yml:
vi /usr/local/airgap/scripts/vars/user-inputs.yml
Besides entering the target TCA build number, you also need to uncomment the lines below and set the baseline release. The currently supported baselines are 2.1.0, 2.2.0, and 2.3.0:
# - name: "photon"
#   baseline: "2.3.0"
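For example, to generate an incremental bundle on top of the TCA 2.3.0 release, the uncommented block would look similar to:
- name: "photon"
  baseline: "2.3.0"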
Save the user-inputs.yml file and then run export:
agctl export
The export bundle is generated as airgap-export-bundle*.tar.gz in the /photon-reps/export-bundle/ folder. Prepare a USB thumb drive or mobile hard disk to copy the bundle to the airgap server's "/photon-reps" folder for import. As of the TCA 3.0 release, the full-size bundle is about 250 GB and the incremental bundle based on the TCA 2.3 release is about 150 GB, so make sure there is sufficient space on the mobile disk.
Export adhoc RAN BOM images
To export RAN BOM images, 'ran_bom_images' needs to be set in user-inputs.yml:
vi /usr/local/airgap/scripts/vars/user-inputs.yml
Then change 'ran_bom_images' to true:
ran_bom_images: true
Save the user-inputs.yml file and then run export:
agctl export
The export bundle is generated as airgap-ranbom-export-bundle*.tar.gz in the /photon-reps/export-bundle/ran-bom folder. Prepare a USB thumb drive or mobile hard disk to copy the bundle to the airgap server's "/photon-reps" folder for import.
Use "import" to import data from tar bundle into airgap server
Before running the import operation, copy the tar bundle generated by the export operation into the airgap server's "/photon-reps" folder, then execute import. No modification of the user inputs is needed. You will be asked to enter Harbor's password after the operation is kicked off, and then the process starts.
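For example, if the export bundle is on a USB disk mounted at /mnt/usb (the mount point is a placeholder for your environment), it could be copied with:
cp /mnt/usb/airgap-export-bundle*.tar.gz /photon-reps/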
agctl import
Import adhoc RAN BOM images
Before running the import operation for RAN BOM images, copy the tar bundle generated by the export operation for RAN BOM (see section 4.3.3, Export adhoc RAN BOM images) into the airgap server's "/photon-reps" folder, and make sure 'ran_bom_images' is set to 'true' in user-inputs.yml. You will be asked to enter Harbor's password after the operation is kicked off, and then the process starts.
vi /usr/local/airgap/scripts/vars/user-inputs.yml
Then change the 'ran_bom_images' flag to true:
ran_bom_images: true
Save the user-inputs.yml file and then run import:
agctl import
Use "rsync" to replicate data from another airgap server
Since the TCA 3.0.0 release, a new operation, "rsync", has been added to replicate data directly from another airgap server. The user-inputs.yml file under "/usr/local/airgap/scripts/vars" needs to be modified before running the rsync operation.
vi /usr/local/airgap/scripts/vars/user-inputs.yml
Below is an example of the parameters for the rsync operation in user-inputs.yml; modify the values per your environment settings.
remote_server_fqdn: testrsync001.example.com
endpoint_name: remote_registry_001
username: admin
secret: Harbor12345
remote_server_cert_file: /usr/local/airgap/certs/remote_registry_001_ca.crt
reg_des: remote harbor registry as source
policy_des: new policy for replication
policy_name: policy1
cron: 0 */30 * * * *
Note: you need to save the remote airgap server's ca.crt into the file specified in the "remote_server_cert_file" field before running the rsync operation; otherwise the process will fail with an error that the cert file cannot be found. In the example above it is "/usr/local/airgap/certs/remote_registry_001_ca.crt". Currently this file is required even if you are using a public certificate procured from an external vendor.
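For example, assuming SSH access to the remote appliance as the admin user, the certificate could be copied over with scp (the remote path is a placeholder for wherever the remote server's CA certificate is stored):
scp admin@testrsync001.example.com:<path-to-remote-ca.crt> /usr/local/airgap/certs/remote_registry_001_ca.crt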
After the user-inputs.yml file is saved, kick off the rsync operation:
agctl rsync
The sync log can be found in the ansible output logs under the /usr/local/airgap/logs/ folder.
Upgrade the appliance
This document focuses on the upgrade approach for the new CAP-based airgap appliance in the TCA 3.0.0 release and beyond.
To upgrade an existing TCA 2.3.x airgap server via the legacy airgap scripts, continue to follow the "export --> import --> upgrade" workflow used in previous releases. That approach is not part of this document.
Steps to upgrade airgap appliance:
Download the airgap appliance upgrade bundle ISO image from the Customer Connect website
Since the TCA 3.0.0 release, the airgap appliance upgrade bundle is delivered as an ISO image. Download the image from the customer support website and save it to a folder on the airgap appliance. We generally recommend saving the image to the /data folder on the appliance.
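For example, the downloaded image could be copied onto the appliance with scp (the ISO file name and FQDN are placeholders):
scp <airgap-upgrade-bundle>.iso admin@<airgap-appliance-FQDN>:/data/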
Modify the airgap appliance user-inputs.yml config file with the path to the ISO image
The "local_iso_path" parameter defines the path to the ISO image of the airgap appliance upgrade bundle. After logging in to the airgap appliance, edit the config file "/usr/local/airgap/scripts/vars/user-inputs.yml" with the correct path to the upgrade bundle ISO image. A sample of the ISO image path parameter in user-inputs.yml:
# local_iso_path define the path to upgrade bundle iso
local_iso_path: /data/update.iso
Another optional parameter, "skip_snapshot", can be set here. When this parameter is set to "yes", no disk snapshot is taken during the upgrade, and the system cannot be reverted to its original state after the upgrade. By default, the value of this parameter is "no". To skip the snapshot, set the value to "yes":
skip_snapshot: yes
Save the file and exit the editor.
Kick off the upgrade process
After saving the config file, start the upgrade by executing the command below:
agctl upgrade
Once the upgrade session starts, you can monitor the ansible output logs in the /usr/local/airgap/logs/ folder and check the CAP upgrade log at "/var/log/vmware/capengine/cap-update/workflow.log" for more details.
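For example, the CAP upgrade log can be followed in real time with:
tail -f /var/log/vmware/capengine/cap-update/workflow.log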
After the upgrade completes, reboot the system for the upgrade to take effect:
reboot
After the reboot, check whether the build number has been upgraded to the expected value:
cat /etc/vmware/cap/product.info
Troubleshooting
This section addresses common issues encountered when using the appliance.
How to check whether all images synced successfully after the sync operation is done
Since the TCA 3.0.0 release, a new approach of syncing images via the "tanzu isolated-cluster plugin" has been introduced, along with additional logs and checkpoints. Because the ansible scripts sometimes cannot catch every exception that occurs during the sync jobs, this topic introduces checkpoints that can be verified manually after the sync operation is done, to validate whether all images actually synced successfully as reported.
Check the local BOM sync logs under /usr/local/airgap/logs/
There are several logs that can be checked under /usr/local/airgap/logs/
publish-images.log & publish-helm.log
These logs record the sync status of all images and helm charts from the local BOM files. Generally a summary is generated at the bottom of both log files, which looks like the example below:
!!! Image synchronizing done !!!
====== Summary ======
Total processed images: 115
Total failed images: 0
Total passed images: 115
======== end ==========
===== Failed images list ======
===== End image porcess log =====
In the results above, if "Total failed images" is 0 and no image is listed under "Failed images list", then all data should have been synced successfully.
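For example, instead of scrolling through the full files, the summary sections could be pulled out with grep:
grep -A 4 "====== Summary ======" /usr/local/airgap/logs/publish-images.log /usr/local/airgap/logs/publish-helm.log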
Check ansible related logs
An ansible job log named "ansible_sync_[timestamp].log" is generated under the /usr/local/airgap/logs/ folder. There are several checkpoints to verify in this log file. Open the log file with the vi editor and search for the following:
Go to the bottom of the log and check whether the overall job succeeded. If all tasks succeeded, the value of "failed" should be 0. Below is an example of the final job result:
PLAY RECAP *********************************************************************
localhost : ok=81 changed=40 unreachable=0 failed=0 skipped=9 rescued=0 ignored=1
"Upload tanzu plugin bundle"
Check whether all tasks under this section have the status "changed". If so, all tasks completed successfully.
"Wait for tkg bundle download completion"
Go to the bottom of this section and check whether all tasks' results are in "changed" status. Below is an example of the results:
changed: [localhost] => (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '44610320391.767196', 'results_file': '/root/.ansible_async/44610320391.767196', 'changed': True, 'item': 'v2.2.0', 'ansible_loop_var': 'item'})
changed: [localhost] => (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '649226982409.767214', 'results_file': '/root/.ansible_async/649226982409.767214', 'changed': True, 'item': 'v2.3.1', 'ansible_loop_var': 'item'})
In the example above, a "results_file" is listed for each sub-task. Open each results file with the vi editor and search for the key "stderr"; if the corresponding value is "", then all images should have been downloaded successfully.
"Wait for tkg bundle upload completion"
Go to the bottom of this section and check whether all tasks' results are in "changed" status. As with the previous checkpoint, a "results_file" is listed for each sub-task; open the files with the vi editor and search for "stderr" to see whether all data was uploaded successfully.
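For example, instead of opening each file in vi, the stderr values could be checked with grep (the results_file path is taken from the sample output above):
grep '"stderr"' /root/.ansible_async/44610320391.767196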
Finally, if all of the above checkpoints pass, the sync operation completed successfully. Otherwise, a failure may have occurred during the operation and the job needs to be retried.
Regarding the "skipped" items in output they are expected as that's general ansible output and some negative conditions not meet then skip the task.
Clean up caches before rerunning the sync operation
The "agctl sync" operation may fail due to unexpected reasons. Then user need to rerun the job to try to get all data synced successfully. Before rerun, some caches need to be cleaned up to avoid unexpected failures. To clean up the caches:
rm -rf /photon-reps/tkg_temp
rm -rf /tmp/imgpkg-*
Then rerun the sync operation:
agctl sync
and check the logs under the /usr/local/airgap/logs/ folder to monitor the real-time data sync status.
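For example, the ansible sync logs can be followed in real time with:
tail -f /usr/local/airgap/logs/ansible_sync_*.log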
Harbor service failed to start after OVA deploy
Sometimes the Harbor service is not brought up successfully during the first boot after the OVA deployment. In that case, you can retry the setup after logging in to the VM through SSH or the VM console. To check whether the Harbor service is running properly, log in to the VM and run:
cd /opt/harbor
docker-compose ps
If not all the containers are running in "healthy" status (in most cases the "harbor-core" and "harbor-jobservice" containers are stuck in "restarting" status), a workaround is needed, as described below.
Option-1: Rerun the init script
An init script runs on first boot after the OVA deployment, and its output logs are placed under the /var/log/airgap/ folder. You can rerun the script manually:
/etc/vmware/cap/cap-firstboot.sh
and check the log created for this job under the /var/log/airgap/ folder.
Option-2: Manually set up Harbor, nginx, and Photon OS
If errors persist after executing Option-1, or if you prefer not to rerun the init script, the manual steps below can be used to set up Harbor, nginx, and the system.
Fix the Harbor issue
When "harbor-core" and "harbor-jobservice" containers are in "restarting" status, a workaround can be applied to recreate harbor database and restart harbor service:
cd /opt/harbor/
docker-compose ps
docker exec -i harbor-db sh -c "psql -U postgres -c \"DROP DATABASE registry;\""
docker exec -i harbor-db sh -c "psql -U postgres -c \"CREATE DATABASE registry;\""
docker-compose down -v && docker-compose up -d
docker login <airgap-FQDN>:<harbor https port>
Note: with this approach the Harbor service is no longer managed by systemd; you need to manage the Harbor service yourself via the docker-compose command.
Reconfigure nginx
When the Harbor deployment fails, the nginx service is still left with the default factory configuration and needs to be reconfigured as well. To reconfigure nginx:
ansible-playbook /usr/local/airgap/scripts/playbooks/setup-web-server.yml
Reconfigure the system after all services are brought up
Some OS configuration jobs need to be done after all services are up and running:
# link agctl in system path
ln -s /usr/local/airgap/scripts/bin/agctl /usr/local/bin/agctl
# Chown of airgap files after deploy
chown -R admin:users /usr/local/airgap
chown -R admin:users /usr/local/bin
# redirect /etc/resolv.conf to user defined dns
rm /etc/resolv.conf
ln -s /run/systemd/resolve/resolv.conf /etc/
echo "127.0.0.1 <replace-with-airgap-FQDN>" >> /etc/hosts
# Clean unused harbor installer from system
rm -f /opt/*.tgz || true
Sync operation failed with "unexpected status code 500 Internal Server Error"
Sometimes the sync operation may fail while pushing images to Harbor with the error "unexpected status code 500 Internal Server Error". This is a known intermittent Harbor issue. You can manually push the images onto the airgap server by executing the commands directly:
tanzu isolated-cluster upload-bundle --destination-repo <airgap-FQDN>:<harbor_https_port>/registry --source-directory /photon-reps/tkg_temp/v2.2.0 -v 9
tanzu isolated-cluster upload-bundle --destination-repo <airgap-FQDN>:<harbor_https_port>/registry --source-directory /photon-reps/tkg_temp/v2.3.1 -v 9
If the manual upload succeeds, clean up the cache files:
rm -rf /photon-reps/tkg_temp
rm -rf /tmp/imgpkg-*
Image path of the multi-level image "tca-repo" may be modified after a remote sync operation
After executing "agctl rsync" to sync data from an existing remote airgap server to a newly deployed airgap appliance, the image path of "tca-repo" may be modified by the replication policy's flatten rule. To fix this problem, modify the rule with the steps below and manually execute the replication.
Log in to the Harbor UI
Go to "Replications" in left panel
Click the radio button of the policy and select "EDIT" from "ACTIONS"
Change value of "Flattening" from "Flatten All Levels" to "Flatten 1 Level" and click "SAVE"
Click "REPLICATE" button to manually trigger the replication
Manually rerunning deploy fails when configuring the Harbor password
You might rerun "agctl deploy" manually after the OVA deployment to revise the appliance's configuration. An error occurs when configuring the Harbor password, because the Harbor password must be provided manually in a temporary file before rerunning deploy.
Edit the Harbor credential file to provide the Harbor password temporarily
vi /usr/local/airgap/scripts/vars/harbor-credential.yml
and add the line below to the file, replacing <harbor_password> with the real Harbor password:
harbor_password: <harbor_password>
Run "agctl deploy"
agctl deploy