STIG and NSA/CISA Hardening

By default, Tanzu Kubernetes Grid (TKG) workload clusters deployed by standalone management clusters are hardened to the levels shown in STIG Results and Exceptions and CIS Results and Exceptions.

In TKG v2.5, Ubuntu 22.04 and Photon 5 OS nodes are hardened to CIS and STIG standards without requiring additional steps.

  • Photon 5 nodes are CIS- and STIG-hardened.
  • Ubuntu 22.04 nodes are CIS-hardened, and are STIG-hardened per the Ubuntu 20.04 controls, because STIG specifications for Ubuntu 22.04 had not yet been released as of the TKG v2.5.0 release date.

This topic explains how to further harden clusters that are not based on Ubuntu 22.04 or Photon 5 OS.

The methods depend on whether the cluster is class-based or plan-based, as described in Workload Cluster Types.

TKG releases are continuously validated against the Defense Information Systems Agency (DISA) Kubernetes Security Technical Implementation Guide (STIG) and NSA/CISA Kubernetes Hardening Guide.

Additional Kubernetes Hardening in Class-Based Workload Clusters

To increase the STIG- and CIS-compliance of Kubernetes in class-based workload clusters, configure them as described in the sections below.

For more granular exception handling, you can follow the workarounds listed in the exceptions tables in STIG Results and Exceptions and CIS Results and Exceptions.

STIG Hardening

To harden Kubernetes in class-based workload clusters to STIG standards, do either of the following before creating the cluster:

  • Include the variable settings below in the cluster configuration file
  • Set the variables in your local environment, for example with export (a shell sketch follows the settings below)

    ETCD_EXTRA_ARGS: "auto-tls=false;peer-auto-tls=false;cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384"
    KUBE_CONTROLLER_MANAGER_EXTRA_ARGS: "tls-min-version=VersionTLS12;profiling=false;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384"
    WORKER_KUBELET_EXTRA_ARGS: "streaming-connection-idle-timeout=5m;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384;protect-kernel-defaults=true"
    APISERVER_EXTRA_ARGS: "tls-min-version=VersionTLS12;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384"
    KUBE_SCHEDULER_EXTRA_ARGS: "tls-min-version=VersionTLS12;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384"
    CONTROLPLANE_KUBELET_EXTRA_ARGS: "streaming-connection-idle-timeout=5m;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384;protect-kernel-defaults=true"
    ENABLE_AUDIT_LOGGING: true
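
    If you set the variables in your local environment instead, export each one in the shell where you run the Tanzu CLI before creating the cluster. A minimal sketch for a Linux shell, using the values above:

    export ENABLE_AUDIT_LOGGING=true
    export ETCD_EXTRA_ARGS="auto-tls=false;peer-auto-tls=false;cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384"
    # Export the remaining *_EXTRA_ARGS variables the same way, quoting each value.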
    
    

NSA/CISA Hardening

To harden Kubernetes in class-based workload clusters to NSA/CISA standards, do the following before creating the cluster:

  1. Review the Event Rate Limit configuration below. If you want to change any settings, save the code below to event-rate-config.yaml and edit it as desired:

    apiVersion: eventratelimit.admission.k8s.io/v1alpha1
    kind: Configuration
    limits:
    - type: Namespace
      qps: 50
      burst: 100
      cacheSize: 2000
    - type: User
      qps: 10
      burst: 50
    
  2. If you created event-rate-config.yaml with custom settings, base64-encode the file by running the following and recording the output string:

    • Linux: base64 -w 0 event-rate-config.yaml
    • Mac: base64 -b 0 event-rate-config.yaml
  3. Do either of the following before creating the cluster:

    • Include the variable settings below in the cluster configuration file
    • Set the variables in your local environment, for example with export

      ETCD_EXTRA_ARGS: "auto-tls=false;peer-auto-tls=false;cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256"
      KUBE_CONTROLLER_MANAGER_EXTRA_ARGS: "profiling=false;terminated-pod-gc-threshold=500;tls-min-version=VersionTLS12;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256"
      WORKER_KUBELET_EXTRA_ARGS: "read-only-port=0;authorization-mode=Webhook;client-ca-file=/etc/kubernetes/pki/ca.crt;event-qps=0;make-iptables-util-chains=true;streaming-connection-idle-timeout=5m;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256;protect-kernel-defaults=true"
      APISERVER_EXTRA_ARGS: "enable-admission-plugins=AlwaysPullImages,NodeRestriction;profiling=false;service-account-lookup=true;tls-min-version=VersionTLS12;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256"
      KUBE_SCHEDULER_EXTRA_ARGS: "profiling=false;tls-min-version=VersionTLS12;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256"
      CONTROLPLANE_KUBELET_EXTRA_ARGS: "read-only-port=0;authorization-mode=Webhook;client-ca-file=/etc/kubernetes/pki/ca.crt;event-qps=0;make-iptables-util-chains=true;streaming-connection-idle-timeout=5m;tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256;protect-kernel-defaults=true"
      APISERVER_EVENT_RATE_LIMIT_CONF_BASE64: "<EVENT-RATE-CONFIG>"
      ENABLE_AUDIT_LOGGING: true
      

    Where <EVENT-RATE-CONFIG> is the base64-encoded value from step 2, or the following default if you did not change the Event Rate Limit configuration:

    • YXBpVmVyc2lvbjogZXZlbnRyYXRlbGltaXQuYWRtaXNzaW9uLms4cy5pby92MWFscGhhMQpraW5kOiBDb25maWd1cmF0aW9uCmxpbWl0czoKLSB0eXBlOiBOYW1lc3BhY2UKICBxcHM6IDUwCiAgYnVyc3Q6IDEwMAogIGNhY2hlU2l6ZTogMjAwMAotIHR5cGU6IFVzZXIKICBxcHM6IDEwCiAgYnVyc3Q6IDUwCg==
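
    As a combined sketch for Linux, you can encode the file and export the variable in one step. This assumes the event-rate-config.yaml from step 1 and a shell environment that the Tanzu CLI reads:

      export APISERVER_EVENT_RATE_LIMIT_CONF_BASE64="$(base64 -w 0 event-rate-config.yaml)"
      export ENABLE_AUDIT_LOGGING=true

    To verify the encoded value, decode it and compare it with the file, for example with base64 -d <<< "$APISERVER_EVENT_RATE_LIMIT_CONF_BASE64".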

Additional OS Hardening in Class-Based Workload Clusters

To harden the Ubuntu 20.04 OS in class-based workload clusters to STIG or CIS standards, create custom hardened VM images for the clusters by running Image Builder with the ansible_user_vars settings for STIG or CIS hardening, as described in Build a Linux Image.
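
The hardening variables are passed through the ansible_user_vars field in the Image Builder configuration. The sketch below only shows where the field goes; the placeholder is not a real value, and the supported settings are listed in Build a Linux Image:

    {
      "ansible_user_vars": "<STIG_OR_CIS_HARDENING_VARS>"
    }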

Hardening Plan-Based Workload Clusters

Legacy, non-class-based TKG workload clusters deployed by standalone management clusters can be hardened by using ytt overlays. For information about customizing plan-based TKG clusters with ytt, see Legacy Cluster Configuration with ytt.

You can create legacy clusters by setting allow-legacy-cluster to true in your CLI configuration, as described in Features in Tanzu CLI Architecture and Configuration.
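
For example, assuming the feature flag is scoped under the cluster plugin as described in that topic, the command is a sketch like:

    tanzu config set features.cluster.allow-legacy-cluster true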

STIG Hardening

To further harden plan-based TKG clusters, VMware provides a STIG hardening ytt overlay.

The following snippet is a ytt overlay that sets tls-min-version (STIG: V-242378) on the kube-apiserver.

#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")
#@overlay/match missing_ok=True,by=overlay.subset({"kind":"KubeadmControlPlane"})
---
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          #@overlay/match missing_ok=True
          tls-min-version: VersionTLS12
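
A similar overlay can harden worker nodes, which are configured through KubeadmConfigTemplate objects rather than KubeadmControlPlane. The following companion sketch is an illustrative example rather than a VMware-provided overlay; it sets the kubelet streaming-connection-idle-timeout flag, mirroring the WORKER_KUBELET_EXTRA_ARGS setting used for class-based clusters:

#@ load("@ytt:overlay", "overlay")
#@overlay/match missing_ok=True,by=overlay.subset({"kind":"KubeadmConfigTemplate"})
---
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            #! Illustrative flag; mirrors the class-based WORKER_KUBELET_EXTRA_ARGS value
            #@overlay/match missing_ok=True
            streaming-connection-idle-timeout: "5m"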

NSA/CISA Hardening

To further harden plan-based TKG clusters for NSA/CISA, VMware provides the following Antrea object specifications:

  • NSA/CISA hardening: Antrea ClusterNetworkPolicies:

    The following Antrea ClusterNetworkPolicy specification sets a default policy for all pods that denies all ingress and egress traffic, ensuring that any unselected pods are isolated. Because it blocks all traffic by default, companion allow rules are typically needed; see the allow-dns sketch after this list.

    apiVersion: security.antrea.tanzu.vmware.com/v1alpha1
    kind: ClusterNetworkPolicy
    metadata:
      name: default-deny
    spec:
      priority: 150
      tier: baseline
      appliedTo:
        - namespaceSelector: {}
      ingress:
        - action: Drop              # For all Pods in every namespace, drop and log all ingress traffic from anywhere
          name: drop-all-ingress
          enableLogging: true
      egress:
        - action: Drop              # For all Pods in every namespace, drop and log all egress traffic to anywhere
          name: drop-all-egress
          enableLogging: true
    
  • NSA/CISA hardening: Antrea network policy:

    The following Antrea NetworkPolicy allows the tanzu-capabilities-manager pod egress to the kube-apiserver on ports 443 and 6443.

    apiVersion: security.antrea.tanzu.vmware.com/v1alpha1
    kind: NetworkPolicy
    metadata:
      name: tanzu-cm-apiserver
      namespace: tkg-system
    spec:
      priority: 5
      tier: securityops
      appliedTo:
        - podSelector:
            matchLabels:
              app: tanzu-capabilities-manager
      egress:
        - action: Allow
          to:
          - podSelector:
              matchLabels:
                component: kube-apiserver
            namespaceSelector:
              matchLabels:
                kubernetes.io/metadata.name: kube-system
          ports:
          - port: 443
            protocol: TCP
          - port: 6443
            protocol: TCP
          name: AllowToKubeAPI
    
  • NSA/CISA hardening: OPA template and constraints:

    The following example uses OPA Gatekeeper to restrict the allowed image repositories.

    • OPA template:

      apiVersion: templates.gatekeeper.sh/v1beta1
      kind: ConstraintTemplate
      metadata:
        name: k8sallowedrepos
        annotations:
          description: Requires container images to begin with a repo string from a specified
            list.
      spec:
        crd:
          spec:
            names:
              kind: K8sAllowedRepos
            validation:
              # Schema for the `parameters` field
              openAPIV3Schema:
                type: object
                properties:
                  repos:
                    type: array
                    items:
                      type: string
        targets:
          - target: admission.k8s.gatekeeper.sh
            rego: |
              package k8sallowedrepos
      
              violation[{"msg": msg}] {
                container := input.review.object.spec.containers[_]
                satisfied := [good| repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
                not any(satisfied)
                msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
              }
      
              violation[{"msg": msg}] {
                container := input.review.object.spec.initContainers[_]
                satisfied := [good| repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
                not any(satisfied)
                msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
              }
      
    • OPA constraints:

      apiVersion: constraints.gatekeeper.sh/v1beta1
      kind: K8sAllowedRepos
      metadata:
        name: repo-is-openpolicyagent
      spec:
        match:
          kinds:
            - apiGroups: [""]
              kinds: ["Pod"]
        parameters:
          repos:
            - "<ALLOWED_IMAGE_REPO>"
      
  • NSA/CISA hardening: OPA mutations:

    The following example uses an OPA Gatekeeper mutation to set allowPrivilegeEscalation to false if it is missing from the pod spec.

    apiVersion: mutations.gatekeeper.sh/v1alpha1
    kind: Assign
    metadata:
      name: allow-privilege-escalation
    spec:
      match:
        scope: Namespaced
        kinds:
          - apiGroups: ["*"]
            kinds: ["Pod"]
        excludedNamespaces:
          - kube-system
      applyTo:
        - groups: [""]
          kinds: ["Pod"]
          versions: ["v1"]
      location: "spec.containers[name:*].securityContext.allowPrivilegeEscalation"
      parameters:
        pathTests:
          - subPath: "spec.containers[name:*].securityContext.allowPrivilegeEscalation"
            condition: MustNotExist
        assign:
          value: false
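
With the default-deny ClusterNetworkPolicy in place, cluster DNS must be explicitly allowed, or most workloads will fail name resolution. The following companion policy is an illustrative sketch, not one of the VMware-provided specifications; its name, priority, and tier are assumptions you can adjust:

    apiVersion: security.antrea.tanzu.vmware.com/v1alpha1
    kind: ClusterNetworkPolicy
    metadata:
      name: allow-dns-egress
    spec:
      priority: 10
      tier: securityops
      appliedTo:
        - namespaceSelector: {}
      egress:
        - action: Allow              # Let all Pods reach cluster DNS in kube-system
          name: allow-dns
          to:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: kube-system
          ports:
            - protocol: UDP
              port: 53
            - protocol: TCP
              port: 53

When applying the OPA examples, apply the ConstraintTemplate before any K8sAllowedRepos constraint, because the constraint's CRD is created by the template.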
    

Antrea CNI is used in this guide for network hardening as it provides fine-grained control of network policies using tiers and the ability to apply a cluster-wide security policy using ClusterNetworkPolicy.

Open Policy Agent (OPA) Gatekeeper is used instead of pod security policies, which were deprecated in Kubernetes v1.21.

The ytt overlays, network policies, and OPA policies are written so that cluster admins can easily opt out of hardening controls for specific workloads. We suggest not opting out of the hardening practices entirely; instead, isolate those workloads in namespaces where the hardening controls are not applied, as sketched below. How far you opt out also depends on the risk appetite of the TKG deployment.
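
For example, to exempt a namespace from the allowed-repos control, list it under excludedNamespaces in the constraint. This is a sketch; the namespace name unhardened-workloads is hypothetical:

    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sAllowedRepos
    metadata:
      name: repo-is-openpolicyagent
    spec:
      match:
        kinds:
          - apiGroups: [""]
            kinds: ["Pod"]
        excludedNamespaces:
          - unhardened-workloads   # Hypothetical namespace opted out of this control
      parameters:
        repos:
          - "<ALLOWED_IMAGE_REPO>"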

NSA/CISA Kubernetes Hardening Guidance

| Title | Compliant by Default? | Can Be Resolved? | Explanation/Exception |
|---|---|---|---|
| Allow privilege escalation | No | Yes | Resolved with OPA Gatekeeper policy as well as mutations. |
| Non-root containers | No | Yes | Resolved with OPA Gatekeeper policy as well as mutations. Exception: Some pods, such as contour/envoy, need root in order to function; Tanzu System Ingress needs to interact with the network. |
| Automatic mapping of service account | No | Yes | Resolved with OPA Gatekeeper mutation. Exception: Gatekeeper needs access to the API server, so its service accounts are automounted. |
| Application credentials in configuration files | No | No | Exception: All of the detected credentials in config files were false positives, as they were public keys. |
| Linux hardening | No | Yes | Resolved with OPA Gatekeeper constraint as well as a mutation to drop all capabilities. Exception: Some pods, such as contour/envoy, need advanced privileges in order to function; Tanzu System Ingress needs to interact with the network. |
| Seccomp enabled | No | Yes | Resolved with OPA Gatekeeper mutation to set a seccomp profile for all pods. |
| Host PID/IPC privileges | No | Yes | A Gatekeeper constraint has been added to prohibit all pods from running with host PID/IPC. |
| Dangerous capabilities | No | Yes | A Gatekeeper constraint has been added to prohibit dangerous capabilities, and a mutation has been added to set a default. Exception: Some pods, such as contour/envoy, need advanced privileges in order to function; Tanzu System Ingress needs to interact with the network. |
| Exec into container | No | No | Kubernetes ships with accounts that have exec access to pods; admins likely need this, so a customer-facing solution, such as removing exec in RBAC for normal end users, is advised. |
| Allowed hostPath | No | Yes | A Gatekeeper constraint has been added to prevent the host path from being mounted. |
| hostNetwork access | No | Yes | A Gatekeeper constraint has been added to prevent the host network from being used. Exception: The kapp-controller needs access to the host for tanzu to function and is the only pod outside the control plane allowed host network access. |
| Exposed dashboard | Yes | | |
| Cluster-admin binding | No | No | A cluster-admin binding is needed for Kubernetes to start and should be the only one in the cluster. |
| Resource policies | No | Yes | Fixed by setting a default for all pods via a Gatekeeper mutation. |
| Control plane hardening | Yes | | |
| Insecure capabilities | No | Yes | A Gatekeeper constraint has been added to prohibit dangerous capabilities, and a mutation has been added to set a default. Exception: Some pods, such as contour/envoy, need advanced privileges in order to function; Tanzu System Ingress needs to interact with the network. |
| Immutable container filesystem | No | Yes | A Gatekeeper constraint has been added to prevent readOnlyRootFilesystem from being disabled. Exception: Pods created by contour/envoy, fluentd, the kapp-controller, telemetry agents, and all other data services that need to run on Kubernetes. Caution: This mutation can cause issues within the cluster and may not be the wisest to implement. |
| Privileged container | No | Yes | By default, all pods have privileged set to false, but a constraint has been added to enforce that a user does not enable it. |
| Ingress and egress blocked | No | Yes | A default-deny cluster network policy can be implemented in Antrea. |
| Container hostPort | No | Yes | A Gatekeeper constraint has been added to ensure users do not use hostPorts. Exception: The kapp-controller needs access to the host for tanzu to function and is the only pod outside the control plane allowed host network access. |
| Network policies | No | Yes | A suite of network policies can be installed to ensure all namespaces have a network policy. |
| Fluent Bit forwarding to SIEM | No | Yes | Fluent Bit needs to be installed and pointed at a valid output location. |
| Fluent Bit retry enabled | No | Yes | Fluent Bit needs to be installed by the user with retries enabled in the config. |
| IaaS metadata endpoint blocked | No | Yes | A cluster network policy can be implemented to restrict all pods from reaching the endpoint. |