vSphere IaaS control plane 使用传输层安全 (Transport Layer Security, TLS) 加密来保护组件之间的通信。主管 上的 TKG 包括多个支持此加密基础架构的 TLS 证书。主管 证书轮换需手动执行。TKG 证书轮换自动执行,但如有必要,可以手动执行。

关于 TKG Service 集群的 TLS 证书

vSphere IaaS control plane 使用 TLS 证书保护以下组件之间的通信:
  • vCenter Server
  • 主管 控制平面节点
  • 用作 vSphere Pod 的工作节点的 ESXi 主机
  • TKG 集群节点(控制平面节点和工作节点)
vSphere IaaS control plane 在以下信任域中运行以对系统用户进行身份验证。
信任域 描述
vCenter 信任域 此信任域中 TLS 证书的默认签名者是 vCenter Server 中内置的 VMware Certificate Authority (VMCA)。
Kubernetes 信任域 此信任域中 TLS 证书的默认签名者是 Kubernetes 证书颁发机构 (CA)
有关其他详细信息和 vSphere IaaS control plane 环境中使用的每个 TLS 证书的完整列表,请参阅知识库文章 vSphere with Tanzu 证书指南

TLS 证书轮换

TLS 证书轮换过程有所不同,具体取决于证书是用于 主管 还是 TKG Service 集群。
主管证书轮换

主管 的 TLS 证书派生自 VMCA 证书。有关 主管 证书的详细信息,请参阅知识库文章 vSphere with Tanzu 证书指南

主管 的证书轮换手动执行。有关使用 WCP 证书管理器工具替换 主管 证书的说明,请参阅知识库文章替换 vSphere with Tanzu 主管证书

TKG 2.0 集群证书轮换

通常,您无需手动轮换 TKG 集群的 TLS 证书,因为更新 TKG 集群时,滚动更新过程会自动为您轮换 TLS 证书。

如果 TKG 集群的 TLS 证书尚未过期,并且您需要手动轮换这些证书,可以通过完成下一部分中的步骤进行轮换。

手动轮换 TKG Service 集群的 TLS 证书

这些说明假定您具备 TKG 集群管理的高级知识和经验。此外,这些说明假定 TLS 证书未过期。如果证书已过期,请不要完成以下步骤。
  1. 要运行这些步骤,请通过 SSH 访问 主管 节点之一。请参见以 Kubernetes 管理员和系统用户身份连接到 TKG 服务 集群
  2. 获取 TKG 集群名称。
    export CLUSTER_NAMESPACE="tkg-cluster-ns"
    
    kubectl get clusters -n $CLUSTER_NAMESPACE
    NAME                    PHASE         AGE   VERSION
    tkg-cluster             Provisioned   43h
  3. 获取 TKG 集群 kubeconfig。
    export CLUSTER_NAME="tkg-cluster"
    
    kubectl get secrets -n $CLUSTER_NAMESPACE $CLUSTER_NAME-kubeconfig -o jsonpath='{.data.value}' | base64 -d > $CLUSTER_NAME-kubeconfig
    
  4. 获取 TKG 集群 SSH 密钥。
    kubectl get secrets -n $CLUSTER_NAMESPACE $CLUSTER_NAME-ssh -o jsonpath='{.data.ssh-privatekey}' | base64 -d > $CLUSTER_NAME-ssh-privatekey
    chmod 600 $CLUSTER_NAME-ssh-privatekey
  5. 在证书轮换之前检查环境。
    export KUBECONFIG=$CLUSTER_NAME-kubeconfig
    kubectl get nodes -o wide
    kubectl get nodes \
    -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}' \
    -l node-role.kubernetes.io/master= > nodes
    for i in `cat nodes`; do
        printf "\n######\n"
        ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i hostname
        ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i sudo kubeadm certs check-expiration
    done;
    

    先前命令的示例结果:

    tkg-cluster-control-plane-k8bqh
    [check-expiration] Reading configuration from the cluster...
    [check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
    
    CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
    admin.conf                 Oct 04, 2023 23:00 UTC   363d                                    no
    apiserver                  Oct 04, 2023 23:00 UTC   363d            ca                      no
    apiserver-etcd-client      Oct 04, 2023 23:00 UTC   363d            etcd-ca                 no
    apiserver-kubelet-client   Oct 04, 2023 23:00 UTC   363d            ca                      no
    controller-manager.conf    Oct 04, 2023 23:00 UTC   363d                                    no
    etcd-healthcheck-client    Oct 04, 2023 23:00 UTC   363d            etcd-ca                 no
    etcd-peer                  Oct 04, 2023 23:00 UTC   363d            etcd-ca                 no
    etcd-server                Oct 04, 2023 23:00 UTC   363d            etcd-ca                 no
    front-proxy-client         Oct 04, 2023 23:00 UTC   363d            front-proxy-ca          no
    scheduler.conf             Oct 04, 2023 23:00 UTC   363d                                    no
    
    CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
    ca                      Oct 01, 2032 22:56 UTC   9y              no
    etcd-ca                 Oct 01, 2032 22:56 UTC   9y              no
    front-proxy-ca          Oct 01, 2032 22:56 UTC   9y              no
  6. 轮换 TKG 2.0 集群的 TLS 证书。
    将上下文切换回 主管,然后再继续执行以下步骤。
    unset KUBECONFIG
    kubectl config current-context
    kubernetes-admin@kubernetes
    kubectl get kcp  -n $CLUSTER_NAMESPACE $CLUSTER_NAME-control-plane -o jsonpath='{.apiVersion}{"\n"}'
    controlplane.cluster.x-k8s.io/v1beta1
    kubectl get kcp -n $CLUSTER_NAMESPACE $CLUSTER_NAME-control-plane
    NAME                        CLUSTER       INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE   VERSION
    tkg-cluster-control-plane   tkg-cluster   true          true                   3          3       3         0             43h   v1.21.6+vmware.1
    
    kubectl patch kcp $CLUSTER_NAME-control-plane -n $CLUSTER_NAMESPACE --type merge -p "{\"spec\":{\"rolloutAfter\":\"`date +'%Y-%m-%dT%TZ'`\"}}"
    kubeadmcontrolplane.controlplane.cluster.x-k8s.io/tkg-cluster-control-plane patched
    计算机部署已启动:
    kubectl get machines -n $CLUSTER_NAMESPACE
    NAME                                              CLUSTER       NODENAME                                          PROVIDERID                                       PHASE          AGE   VERSION
    tkg-cluster-control-plane-k8bqh                   tkg-cluster   tkg-cluster-control-plane-k8bqh                   vsphere://420a2e04-cf75-9b43-f5b6-23ec4df612eb   Running        43h   v1.21.6+vmware.1
    tkg-cluster-control-plane-l7hwd                   tkg-cluster   tkg-cluster-control-plane-l7hwd                   vsphere://420a57cd-a1a0-fec6-a741-19909854feb6   Running        43h   v1.21.6+vmware.1
    tkg-cluster-control-plane-mm6xj                   tkg-cluster   tkg-cluster-control-plane-mm6xj                   vsphere://420a67c2-ce1c-aacc-4f4c-0564daad4efa   Running        43h   v1.21.6+vmware.1
    tkg-cluster-control-plane-nqdv6                   tkg-cluster                                                                                                      Provisioning   25s   v1.21.6+vmware.1
    tkg-cluster-workers-v8575-59c6645b4-wvnlz         tkg-cluster   tkg-cluster-workers-v8575-59c6645b4-wvnlz         vsphere://420aa071-9ac2-02ea-6530-eb59ceabf87b   Running        43h   v1.21.6+vmware.1
    计算机部署已完成:
    kubectl get machines -n $CLUSTER_NAMESPACE
    NAME                                              CLUSTER       NODENAME                                          PROVIDERID                                       PHASE     AGE   VERSION
    tkg-cluster-control-plane-m9745                   tkg-cluster   tkg-cluster-control-plane-m9745                   vsphere://420a5758-50c4-3172-7caf-0bbacaf882d3   Running   17m   v1.21.6+vmware.1
    tkg-cluster-control-plane-nqdv6                   tkg-cluster   tkg-cluster-control-plane-nqdv6                   vsphere://420ad908-00c2-4b9b-74d8-8d197442e767   Running   22m   v1.21.6+vmware.1
    tkg-cluster-control-plane-wdmph                   tkg-cluster   tkg-cluster-control-plane-wdmph                   vsphere://420af38a-f9f8-cb21-e05d-c1bcb6840a93   Running   10m   v1.21.6+vmware.1
    tkg-cluster-workers-v8575-59c6645b4-wvnlz         tkg-cluster   tkg-cluster-workers-v8575-59c6645b4-wvnlz         vsphere://420aa071-9ac2-02ea-6530-eb59ceabf87b   Running   43h   v1.21.6+vmware.1
  7. 验证 TKG 2.0 集群的手动证书轮换。
    运行以下命令以验证证书轮换:
    export KUBECONFIG=$CLUSTER_NAME-kubeconfig
    
    kubectl get nodes -o wide
    NAME                                        STATUS   ROLES                  AGE     VERSION            INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                 KERNEL-VERSION       CONTAINER-RUNTIME
    tkg-cluster-control-plane-m9745             Ready    control-plane,master   15m     v1.21.6+vmware.1   10.244.0.55   <none>        VMware Photon OS/Linux   4.19.198-1.ph3-esx   containerd://1.4.11
    tkg-cluster-control-plane-nqdv6             Ready    control-plane,master   21m     v1.21.6+vmware.1   10.244.0.54   <none>        VMware Photon OS/Linux   4.19.198-1.ph3-esx   containerd://1.4.11
    tkg-cluster-control-plane-wdmph             Ready    control-plane,master   9m22s   v1.21.6+vmware.1   10.244.0.56   <none>        VMware Photon OS/Linux   4.19.198-1.ph3-esx   containerd://1.4.11
    tkg-cluster-workers-v8575-59c6645b4-wvnlz   Ready    <none>                 43h     v1.21.6+vmware.1   10.244.0.51   <none>        VMware Photon OS/Linux   4.19.198-1.ph3-esx   containerd://1.4.11
    
    kubectl get nodes \
    -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}' \
    -l node-role.kubernetes.io/master= > nodes
    
    for i in `cat nodes`; do
        printf "\n######\n"
        ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i hostname
        ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i sudo kubeadm certs check-expiration
    done;
    显示已更新的过期日期的示例结果。
    ######
    tkg-cluster-control-plane-m9745
    [check-expiration] Reading configuration from the cluster...
    [check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
    
    CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
    admin.conf                 Oct 06, 2023 18:18 UTC   364d                                    no
    apiserver                  Oct 06, 2023 18:18 UTC   364d            ca                      no
    apiserver-etcd-client      Oct 06, 2023 18:18 UTC   364d            etcd-ca                 no
    apiserver-kubelet-client   Oct 06, 2023 18:18 UTC   364d            ca                      no
    controller-manager.conf    Oct 06, 2023 18:18 UTC   364d                                    no
    etcd-healthcheck-client    Oct 06, 2023 18:18 UTC   364d            etcd-ca                 no
    etcd-peer                  Oct 06, 2023 18:18 UTC   364d            etcd-ca                 no
    etcd-server                Oct 06, 2023 18:18 UTC   364d            etcd-ca                 no
    front-proxy-client         Oct 06, 2023 18:18 UTC   364d            front-proxy-ca          no
    scheduler.conf             Oct 06, 2023 18:18 UTC   364d                                    no
    
    CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
    ca                      Oct 01, 2032 22:56 UTC   9y              no
    etcd-ca                 Oct 01, 2032 22:56 UTC   9y              no
    front-proxy-ca          Oct 01, 2032 22:56 UTC   9y              no
  8. 验证 Kubelet 证书。

    假设 kubelet 配置中的参数 rotateCertificates 设置为 true(默认配置),无需轮换 Kubelet 证书。

    可以使用以下命令验证此配置:
    kubectl get nodes \
    -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}' \
    -l node-role.kubernetes.io/master!= > workernodes
    
    for i in `cat workernodes`; do
        printf "\n######\n"
        ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i hostname
        ssh -o "StrictHostKeyChecking=no" -i $CLUSTER_NAME-ssh-privatekey -q vmware-system-user@$i sudo grep rotate /var/lib/kubelet/config.yaml
    done;
    示例结果:
    ######
    tkg-cluster-workers-v8575-59c6645b4-wvnlz
    rotateCertificates: true