Recovering a Cluster Node

Restoring an Automation Orchestrator node can cause problems with the Kubernetes service.

To recover a problematic node in an Automation Orchestrator cluster, you must locate the node, remove it from the cluster, and then add it to the cluster again.

Procedure

Identify the primary node of the Automation Orchestrator cluster.
1. Log in to the Automation Orchestrator Appliance command line of any node over SSH as root.
2. Run the kubectl -n prelude exec postgres-0 command to find the node with the primary role.
```
kubectl -n prelude exec postgres-0 -- chpst -u postgres repmgr cluster show --terse --compact
```
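3. Retrieve the name of the pod in which the primary node is located.
  In most cases, the name of the pod is postgres-0.postgres.prelude.svc.cluster.local.
4. Run the kubectl -n prelude get pods command with the -o wide option to list each pod together with the node that hosts it.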
```
kubectl -n prelude get pods -o wide
```
5. In the output, find the database pod with the name you retrieved and note the FQDN address of the corresponding node.
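The lookup in sub-steps 3 to 5 can also be scripted. The following is a minimal sketch, not part of the product: pod_name_from_dns and node_for_pod are hypothetical helper names, and it assumes that the pod name is the first label of the DNS-style name reported by repmgr and that the Kubernetes node name matches the node FQDN on your appliance.

```shell
# Hypothetical helper: strip the service-domain suffix to get the pod name,
# e.g. postgres-0.postgres.prelude.svc.cluster.local -> postgres-0.
pod_name_from_dns() {
  printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: print the name of the node that hosts the given pod.
# Assumes kubectl access to the prelude namespace.
node_for_pod() {
  kubectl -n prelude get pod "$1" -o jsonpath='{.spec.nodeName}'
}

# Usage on an appliance:
#   node_for_pod "$(pod_name_from_dns postgres-0.postgres.prelude.svc.cluster.local)"
```

Compare the result with the NODE column of the kubectl -n prelude get pods -o wide output before relying on it.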
Run the kubectl -n prelude get node command to locate the problematic node.
The problematic node has a NotReady status.
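On clusters with several nodes, filtering the output can make the NotReady entry easier to spot. This is a minimal sketch under the assumption that the output uses the standard kubectl column layout, with the node name in the first column and the status in the second; not_ready_nodes is a hypothetical helper name.

```shell
# Hypothetical helper: read `kubectl get node` output from stdin and print
# the names of nodes whose STATUS column starts with NotReady.
# The header row (line 1) is skipped.
not_ready_nodes() {
  awk 'NR > 1 && $2 ~ /^NotReady/ { print $1 }'
}

# Usage on an appliance:
#   kubectl -n prelude get node | not_ready_nodes
```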
Log in to the Automation Orchestrator Appliance command line of the primary node over SSH as root.
Run the vracli cluster remove <NODE-FQDN> command to remove the problematic node from the cluster.
Log in to the Automation Orchestrator Appliance command line of the problematic node over SSH as root.
Run the vracli cluster join <MASTER-DB-NODE-FQDN> command to add the node to the cluster again.
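Because the remove and join commands run on different hosts, it can help to write out the plan before starting. The sketch below only prints the two commands with the host each one belongs to and executes nothing; print_recovery_plan is a hypothetical helper name and the host names in the usage line are placeholders.

```shell
# Hypothetical helper: print the recovery commands and the host each runs on.
# Nothing is executed. Arguments: <problem node FQDN> <primary node FQDN>.
print_recovery_plan() {
  problem_fqdn="$1"
  primary_fqdn="$2"
  printf 'on %s: vracli cluster remove %s\n' "$primary_fqdn" "$problem_fqdn"
  printf 'on %s: vracli cluster join %s\n' "$problem_fqdn" "$primary_fqdn"
}

# Example with placeholder host names:
#   print_recovery_plan node-b.example.com node-a.example.com
```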