This topic explains how to deploy a STIG-hardened Tanzu Kubernetes Grid (TKG) management cluster to an airgapped Virtual Private Cloud (VPC) on AWS. The management cluster can then create and manage STIG-hardened Tanzu Kubernetes (workload) clusters.
For STIG compliance scan results and NSA/CISA Kubernetes Hardening Guidance for Tanzu Kubernetes Grid, see STIG and NSA/CISA Hardening.
You have the following options when deploying hardened TKG to the airgapped VPC:
You can enable FIPS compliance for the clusters, based on Canonical FIPS for the node OS and BoringCrypto for Kubernetes.
The TKG deployment can use your own, existing image registry. Otherwise, by default, the 1-Click script installs a new instance of Harbor on an Amazon Linux 2 AMI.
To deploy a STIG-hardened management cluster to an airgapped environment on AWS, you need:
An AWS user account with permissions to create, list, and delete:
Some of these permissions are required to fulfill prerequisites below.
An airgapped AWS Virtual Private Cloud (VPC) with VPC endpoints for the following services (see the example after the list):

- sts
- ssm
- ec2
- ec2messages
- elasticloadbalancing
- secretsmanager
- ssmmessages
- cloudformation
- s3
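For illustration, an interface endpoint for one of these services could be created with the AWS CLI; all IDs below are hypothetical placeholders, and note that the s3 endpoint is typically a Gateway endpoint created with --route-table-ids instead:

```
# Hypothetical IDs; creates an interface endpoint for ssm in the airgapped VPC
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.ssm \
  --subnet-ids subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0
```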
An S3 bucket in the region where you will run TKG, accessible from within the airgapped VPC. Attach a bucket policy like the following, which restricts read access to your VPC endpoint:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Access-to-specific-VPCE-only",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::<MY BUCKET NAME>/*",
      "Condition": {
        "StringEquals": {
          "aws:sourceVpce": "<MY VPC ENDPOINT ID>"
        }
      }
    }
  ]
}
```
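For example, assuming the policy above is saved locally as bucket-policy.json, it could be attached from a machine with AWS access:

```
# Attach the bucket policy; the bucket name should match the Resource ARN above
aws s3api put-bucket-policy \
  --bucket MY-BUCKET-NAME \
  --policy file://bucket-policy.json
```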
An ssh key pair for the AWS region where you will run TKG, created via the Amazon EC2 console.
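As an alternative to the console, a sketch of creating the key pair with the AWS CLI; my-tkg-key is a hypothetical name, and note that the startup options table below states the key must be RSA for STIG deployments:

```
# Hypothetical key name; requires an AWS CLI version that supports --key-type
aws ec2 create-key-pair \
  --key-name my-tkg-key \
  --key-type rsa \
  --query 'KeyMaterial' \
  --output text > my-tkg-key.pem
chmod 400 my-tkg-key.pem
```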
A removable storage device, such as a USB thumb drive, with at least 20GB of free space.
A Linux bastion host (jumpbox) in the airgapped VPC. This is your bootstrap machine.
An online machine, outside of the airgapped environment, that can download content from a shared S3 bucket and code repository onto the removable storage device.
To deploy the management cluster, you do the following as described in the sections below:
From the online machine, load the removable storage device with the following content:
The Tanzu Compliance TKG 1-Click repository and its submodules:

```
git clone REPO --recursive
```

The TKG dependencies. These images are stored in a shared S3 bucket called tkg-1click-dependencies:
Retrieve read-only access credentials for the S3 bucket.
Copy the airgapped dependencies to the storage device:
```
export TKG_VERSION=TKG-VERSION
export TKR_VERSION=TKR-VERSION
make download-deps
```
Where TKG-VERSION and TKR-VERSION are the respective versions of TKG and TKR you wish to use.
Note: Currently, only TKG_VERSION 1.4.0 and TKR_VERSION 1.21.2 are supported, which pull TKG v1.4.0-fips.1 and TKR v1.21.2_vmware.1-fips.1-tkg.1.
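For example, with the currently supported versions (assuming the environment variables take the version numbers as written in the Note above):

```
# Currently supported versions per the Note above
export TKG_VERSION=1.4.0
export TKR_VERSION=1.21.2
make download-deps
```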
Plug the storage device into the bootstrap machine and copy its content to your own S3 bucket:
```
export BUCKET_NAME=MY-BUCKET
export DEPS_DIR=MY-DEPENDENCY-DIRECTORY
make upload-deps
```
Where MY-BUCKET is your own S3 bucket, accessible from within the airgapped VPC, and MY-DEPENDENCY-DIRECTORY is the path to the directory where your airgapped dependencies are located.
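As an optional sanity check, you can list what landed in the bucket with the AWS CLI:

```
# Optional: confirm the dependencies were uploaded to your bucket
aws s3 ls "s3://${BUCKET_NAME}" --recursive --human-readable --summarize
```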
(Optional) To use your own, existing image registry instead of having a registry automatically created, set up your registry as follows:
Create a publicly-readable project in the registry called tkg, for TKG to access at REGISTRY-NAME/tkg.
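If your registry is Harbor, a minimal sketch of creating such a public project through the Harbor v2 API; MY-REGISTRY and ADMIN-PASSWORD are placeholders for your registry's DNS name and admin credentials:

```
# Hypothetical values; creates a public project named "tkg" on a Harbor registry
curl -X POST "https://MY-REGISTRY/api/v2.0/projects" \
  -H 'Content-Type: application/json' \
  -u admin:ADMIN-PASSWORD \
  -d '{"project_name": "tkg", "metadata": {"public": "true"}}'
```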
Export the following environment variables:
```
export REGISTRY=MY-REGISTRY
export REGISTRY_CA_PATH=/PATH/TO/REGISTRY/CA
export BUCKET_NAME=MY-BUCKET
export TKG_VERSION=TKG-VERSION
export TKR_VERSION=TKR-VERSION
export IMGPKG_USERNAME=REGISTRY-USERNAME
export IMGPKG_PASSWORD=REGISTRY-PASSWORD
```
Where:

- MY-REGISTRY is your registry's DNS name.
- /PATH/TO/REGISTRY/CA is a full path that points to a local file containing your registry's CA.
- MY-BUCKET is your own S3 bucket.
- TKG-VERSION is the TKG version (for example, v1.4.0).
- TKR-VERSION is the TKR version (for example, v1.21.2).
- REGISTRY-USERNAME and REGISTRY-PASSWORD are the username and password to use to access your registry.
Upload the TKG and image-builder images into your registry's tkg project by running the following:

```
make upload-images
```
Set the environment variables that the 1-Click script uses to deploy the management cluster:
Set the required variables, based on your AWS account and the values above:
```
export BUCKET_NAME=MY-BUCKET
export VPC_ID=MY-VPC
export SUBNET_ID=MY-SUBNET-ID
export SSH_KEY_NAME=AWS-RSA-SSH-KEY
export AWS_AZ_ZONE=MY-AZ
export AWS_ACCESS_KEY_ID=KEYPAIR-ACCESS-KEY-ID
export AWS_SECRET_ACCESS_KEY=KEYPAIR-SECRET-ACCESS-KEY
export AWS_DEFAULT_REGION=MY-AWS-REGION
export TKR_VERSION=TKR-VERSION
export TKG_VERSION=TKG-VERSION
```
Note: Currently, only TKG_VERSION 1.4.0 and TKR_VERSION 1.21.2 are supported, which pull TKG v1.4.0-fips.1 and TKR v1.21.2_vmware.1-fips.1-tkg.1.
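For illustration, a filled-in example for a us-east-1 deployment; every value below is hypothetical (the access keys are AWS's documented example values), not a recommendation:

```
# Hypothetical values for illustration only
export BUCKET_NAME=my-tkg-bucket
export VPC_ID=vpc-0123456789abcdef0
export SUBNET_ID=subnet-0123456789abcdef0
export SSH_KEY_NAME=my-tkg-key
export AWS_AZ_ZONE=us-east-1a
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_DEFAULT_REGION=us-east-1
export TKR_VERSION=1.21.2
export TKG_VERSION=1.4.0
```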
Image Registry: Set variables to configure the image registry, depending on whether TKG will use an existing registry:
If you are using your own image registry, set the following:

```
export REGISTRY=MY-REGISTRY
export USE_EXISTING_REGISTRY=true
export REGISTRY_CA_FILENAME=/PATH/TO/REGISTRY/CA
```
Where /PATH/TO/REGISTRY/CA is a full path that points to a local file containing your registry's CA. The file should have the extension .crt.
To customize the Harbor registry that the 1-Click script creates, optionally set the following:
By default, the 1-Click script sets the admin password for Harbor to the value that Terraform writes into the HARBOR_ADMIN_PWD field in the file air-gapped/airgapped.env. To use a different password:

```
export TF_VAR_harbor_pwd=CUSTOM-HARBOR-PASSWORD
```
To configure registry access via either certs that you specify or generated certs:

```
export TF_VAR_create_certs=true   # Default is true; set to false to supply your own certs
```
If you set TF_VAR_create_certs to true or let it default, set the following:

```
export TF_VAR_cert_l=CERT-LOCALITY      # L (Locality) in the cert subject; default Minneapolis
export TF_VAR_cert_st=CERT-STATE        # ST (State) in the cert subject; default Minnesota
export TF_VAR_cert_o=CERT-ORG           # O (Organization) in the cert subject; default VMware
export TF_VAR_cert_ou=CERT-ORG-UNIT     # OU (Organizational Unit) in the cert subject; default VMware R&D
```
If you set TF_VAR_create_certs to false, set the following:

```
export TF_VAR_cert_path=CERT-PATH           # Path to the certificate on the Harbor AMI
export TF_VAR_cert_key_path=CERT-KEY-PATH   # Path to the private key on the Harbor AMI
export TF_VAR_cert_ca_path=CA-CERT-PATH     # Path to the CA certificate on the Harbor AMI
```
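Before pointing the script at existing certificate files, you may want to confirm that the certificate and private key actually match; a minimal sketch using standard openssl commands, assuming RSA keys and hypothetical file paths:

```
# Both commands should print the same digest if the cert and key match (RSA assumed)
openssl x509 -noout -modulus -in /path/to/harbor.crt | openssl md5
openssl rsa  -noout -modulus -in /path/to/harbor.key | openssl md5
```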
In addition to environment variables, the 1-Click script uses configuration options that you set within your local copy of the script repository itself:
To disable FIPS, set install_fips to no in this repo submodule file: ami/stig/roles/canonical-ubuntu-18.04-lts-stig-hardening/vars/main.yml
To add CA certificates to the AMI, copy them in PEM format into this submodule folder: ami/stig/roles/canonical-ubuntu-18.04-lts-stig-hardening/files/ca
Additional options: the terraform/startup.sh file contains the following configurable options that you can set within the file, listed here with their defaults:

| Name | Default | Description |
| --- | --- | --- |
| AMI_ID | tkg_ami_id value from Terraform | The AMI ID to deploy |
| REGISTRY_CA_FILENAME | ca.crt | The name of the CA file for the private registry |
| AWS_NODE_AZ | az_zone value from Terraform | The first AWS Availability Zone to deploy to |
| AWS_SSH_KEY_NAME | set in tfvars | The ssh key to use for the TKG cluster; must be RSA if STIG |
| AWS_REGION | set in tfvars | The AWS region to deploy TKG in |
| AWS_VPC_ID | VPC ID of bootstrap machine | The VPC ID to deploy TKG into |
| OFFLINE_REGISTRY | Registry DNS name of the Harbor instance | The DNS name of the Docker registry; only modify if using a user-provided registry |
| REGISTRY_IP | IP of the Harbor instance | The IP address of the Docker registry; only modify if using a user-provided registry |
| TKG_CUSTOM_IMAGE_REPOSITORY | $OFFLINE_REGISTRY/tkg | The full Docker registry project path to use for TKG images |
| CLUSTER_NAME | airgapped-mgmnt | The name of the TKG management cluster to deploy |
| TKG_CUSTOM_COMPATIBILITY_PATH | fips/tkg-compatibility | The compatibility path to use; set to "" for a non-FIPS deploy |
| COMPLIANCE | stig | The compliance standard to follow; set to stig, cis, or none |
| ENABLE_AUDIT_LOGGING | true | Whether or not auditing is enabled on Kubernetes |
| ENABLE_SERVING_CERTS | false | Whether or not to enable serving certificates on Kubernetes |
| PROTECT_KERNEL_DEFAULTS | true | Whether or not to set --protect-kernel-defaults on kubelet; only set to true with AMIs that allow it |
| CLUSTER_PLAN | dev | The cluster plan for TKG; set to dev or prod |
| AWS_PRIVATE_SUBNET_ID | subnet ID of bootstrap machine | Used for dev plan clusters; set to the private subnet ID to deploy TKG into |
| AWS_NODE_AZ_1 | none | Required for prod plan clusters; set to node Availability Zone 1 |
| AWS_NODE_AZ_2 | none | Required for prod plan clusters; set to node Availability Zone 2 |
| AWS_PRIVATE_SUBNET_ID_1 | none | Required for prod plan clusters; set to private subnet 1 |
| AWS_PRIVATE_SUBNET_ID_2 | none | Required for prod plan clusters; set to private subnet 2 |
| CONTROL_PLANE_MACHINE_TYPE | none | Required for prod plan clusters; the AWS machine type to use for control plane nodes |
| NODE_MACHINE_TYPE | none | Required for prod plan clusters; the AWS machine type to use for worker nodes |
| SERVICE_CIDR | none | Required for prod plan clusters; set to the Kubernetes services CIDR |
| CLUSTER_CIDR | none | Required for prod plan clusters; set to the cluster CIDR |
To deploy the management cluster:
Log in as root to the bootstrap machine.
Depending on your IAM privileges in AWS, do one of the following:

If you can create IAM policies and roles, run:

```
make all
```
If you cannot create IAM policies and roles:

- Have someone with IAM privileges run the CloudFormation template 1clickiamtemplate in the TKG 1-Click repository. This creates the roles, policies, and instance profiles needed to deploy a TKG management cluster on AWS.
- Then run:

```
make all-no-iam
```
To view a list of supported commands, run:

```
make
```
You can track the deployment progress by accessing the newly-created image registry and TKG cluster as follows:
Harbor
Once Terraform finishes, set up VPC peering between your airgapped VPC and another, non-airgapped VPC.
Within the non-airgapped VPC, modify the security group on an EC2 instance to allow it to ssh over to the airgapped bootstrap machine.
ssh into the bootstrap machine and then again into the Harbor instance.
Run the following to track the progress of your Harbor installation and subsequent loading of TKG images:
```
sudo tail -f /var/log/cloud-init-output.log
```
TKG
Set up VPC peering and ssh access to your bootstrap machine as described for Harbor above.
ssh into the bootstrap machine and run the following to track the progress of your management cluster deployment:

```
sudo tail -f /var/log/cloud-init-output.log
```
Once you see a message about the security group of your bootstrap being modified, the script has finished.
After the script has finished, you can:
- Run kubectl get pods -A to see all the pods running on your management cluster.
- Run kubectl get nodes to retrieve an IP address of one of the cluster nodes, then ssh into it using the ssh_key you provided to Terraform.
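For example, assuming a hypothetical node IP and the ubuntu login user typical of Ubuntu-based node AMIs:

```
kubectl get nodes -o wide                  # shows node IP addresses
# 10.0.1.25 is a hypothetical IP; user "ubuntu" assumed for Ubuntu 18.04 node AMIs
ssh -i /path/to/ssh_key ubuntu@10.0.1.25
```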
Update the Harbor admin Password

To update the admin password for the Harbor instance created by the 1-Click script, run the following from your bootstrap machine:
```
curl -XPUT -H 'Content-Type: application/json' -u admin:$HARBOR_ADMIN_PWD "https://$REGISTRY/api/v2.0/users/1/password" --cacert /etc/docker/certs.d/$REGISTRY/ca.crt -d '{
  "new_password": "NEW-PASSWORD",
  "old_password": "OLD-PASSWORD"
}'
```
Where REGISTRY, OLD-PASSWORD, and NEW-PASSWORD are the registry name and the old and new passwords.
To delete the TKG management cluster deployed by the 1-Click script, run the following from the bootstrap machine:
```
sudo su
cd air-gapped
./delete-airgapped.sh
```
To delete the bootstrap server:
Save the TKG management cluster's kubeconfig, or delete the management cluster via Delete the Management Cluster above.
From the TKG 1-Click repository, run:

```
make destroy
```
To delete the Harbor server:
Make sure no TKG clusters are using the images hosted on the Harbor server.
From the TKG 1-Click repository, run:

```
make destroy-harbor
```
If the management cluster deployment does not succeed, try the following from the bootstrap machine:
Export KUBECONFIG to the one used by the temporary local bootstrap kind cluster when TKG starts deploying:

```
export KUBECONFIG=~/.kube-tkg/tmp/config_UID
```
Where UID is the UID of the kind cluster created by tanzu management-cluster create.
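If only one temporary kubeconfig exists, a shortcut that avoids looking up the UID (a sketch, assuming a single config_* file is present):

```
# Assumes exactly one temporary kubeconfig under ~/.kube-tkg/tmp/
export KUBECONFIG="$(ls ~/.kube-tkg/tmp/config_* | head -n 1)"
```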
Try the following commands to examine the deployed Kubernetes objects:
```
kubectl get events -A --sort-by='.metadata.creationTimestamp'
kubectl get clusters -n tkg-system -o yaml
kubectl get machinedeployments -n tkg-system -o yaml
kubectl get awsclusters -n tkg-system -o yaml
kubectl get kcp -n tkg-system -o yaml
kubectl get machines -n tkg-system -o yaml
```
If you are sure about a change that you need to make to one of the .yaml object specifications above, run kubectl edit <apiobject> -n tkg-system <object name> to edit the object. Check its OwnerReferences section to ensure that it does not have a controller that will revert your changes.

The following explanations, diagrams, and CLI output examples explain the deployment process, following the default option of creating an image registry rather than using an existing one:
In Step 1: Transfer the 1-Click Script and TKG Dependencies, you:
Populate the portable storage device with the TKG dependencies and the 1-Click installer repo.
Copy the contents of the portable device to the bastion VM.
Copy the dependencies to the AWS S3 bucket.
In Step 2: Set Environment Variables, you export variables used by the 1-Click script.
In Step 3: Edit Script Files, you configure more options within the 1-Click script repository and its submodules.
In Step 4: Deploy the Management Cluster, you run the 1-Click script, 1click.sh.
Using the TKG dependencies, the 1-Click script creates an Amazon Linux 2 AMI, launches it inside the VPC to host an instance of the Harbor registry, and populates the registry with TKG images.
After the Harbor instance is created, you can monitor the Harbor logs as described in Monitor the Deployment, above.
Using the TKG dependencies, the 1-Click script creates an AMI for the bootstrap VM in the VPC.
When the bootstrap AMI is ready, you see output similar to:
==> Builds finished. The artifacts of successful builds are:
--> aws-tkg-bootstrap-builder: AMIs were created:
us-east-1: ami-05cf054a1ecf64784
--> aws-tkg-bootstrap-builder: AMIs were created:
us-east-1: ami-05cf054a1ecf64784
Using the TKG dependencies, the 1-Click script creates a STIG AMI with FIPS enabled, for creating cluster nodes.
When the STIG node AMI is ready, you see output similar to:
```
==> Builds finished. The artifacts of successful builds are:
--> ubuntu-18.04: AMIs were created:
us-east-1: ami-00b79c841eeef51c2
```
Using the bootstrap AMI, the 1-Click script creates the bootstrap VM in the VPC.
After the bootstrap VM is created, you should be able to monitor the status of your management cluster deployment as described in Monitor the Deployment, above.
The 1-Click script runs tanzu management-cluster create on the bootstrap VM to deploy the STIG-compliant management cluster in the airgapped VPC.
From the management cluster, you can create and manage STIG-compliant workload clusters.
If you configure the 1-Click script to use an existing registry rather than create a new one, the deployment end state looks like this: