The Update CA tool hangs on the ansible playbook that updates the node CA certificate when running the update-all command. User interaction is required to terminate the hung process and run the command again.
Problem
The Update CA tool hangs on the ansible playbook that updates the node CA certificate when running the update-all command.
The logging output stops for several minutes. For example:
TASK [test curl] *********************************************************************************************************************************
changed: [172.16.68.129]
changed: [172.16.68.231]
changed: [172.16.69.68]
TASK [test pull tkr-compatibility] ***************************************************************************************************************
changed: [172.16.68.231]
changed: [172.16.69.68]
In this example, the operation hangs on the "test pull tkr-compatibility" task for node 172.16.68.129, which never reports a result.
Cause
A node was disconnected while the node update tasks were running. This may be caused by redeployment of a control plane node.
Solution
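- Check the ansible processes and kill the relevant one. In the example below, this is the ssh process connected to the hung node 172.16.68.129 (PID 754314).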
[root@hxu-tcacp-2 ~]# ps -ef | grep ansible
root 753978 753971 15 07:28 pts/0 00:00:30 /usr/bin/python3 /usr/bin/ansible-playbook -i /root/update-ca/ansible/hosts /root/update-ca/ansible/update_node_ca.yml
root 754043 1 0 07:28 ? 00:00:00 ssh: /root/.ansible/cp/d2d0af91b5 [mux]
root 754309 753978 0 07:29 pts/0 00:00:00 /usr/bin/python3 /usr/bin/ansible-playbook -i /root/update-ca/ansible/hosts /root/update-ca/ansible/update_node_ca.yml
root 754314 754309 0 07:29 pts/0 00:00:00 ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User="capv" -o ConnectTimeout=60 -o ControlPath=/root/.ansible/cp/d2d0af91b5 172.16.68.129 /bin/sh -c 'sudo -H -S -n -u root /bin/sh -c '"'"'echo BECOME-SUCCESS-eyjcipkrfnnbjzybxqencwxtdgulalmp ; /usr/bin/python'"'"' && sleep 0'
root 756460 755805 0 07:32 pts/1 00:00:00 grep ansible
[root@hxu-tcacp-2 ~]# kill 754314
After the process is killed, the script continues to run, but it reports some errors at the end.
- Run the update-all command again.
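
When many ansible processes are listed, finding the right PID by eye can be error-prone. The following is a minimal sketch, not part of the Update CA tool, of a helper that filters `ps -ef` output for the ssh session opened to a given node. The function name `find_hung_ssh_pids` is hypothetical, and it assumes the standard `ps -ef` column layout shown above (PID in column 2, command starting in column 8). It only prints candidate PIDs and does not kill anything.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -ef` style output on stdin and prints the
# PIDs of ssh client processes whose arguments include the given node IP.
# The persistent "ssh: ... [mux]" master process is skipped because its
# command field is "ssh:" rather than "ssh".
find_hung_ssh_pids() {
    host="$1"
    awk -v host="$host" '
        # Column 8 is the first word of the command, column 2 is the PID.
        $8 == "ssh" {
            for (i = 9; i <= NF; i++) {
                if ($i == host) { print $2; break }
            }
        }
    '
}

# Example usage (review the printed PIDs before killing anything):
#   ps -ef | find_hung_ssh_pids 172.16.68.129
#   kill <PID printed above>
```

Run against the process list in this article, the helper would select only the line for PID 754314. Verify the PID against the full `ps -ef` output before running `kill`.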