After 1.2 CLI upgrade, the Keycloak customizations are not retained
If one Postgres replica gets into an inconsistent state, follow these steps:
- Identify which replica is not running correctly
kubectl -n vmware-smarts logs -f tcops-postgres-0
kubectl -n vmware-smarts logs -f tcops-postgres-1
- Identify which node the affected replica is running on (assuming it is "tcops-postgres-0") and get that node's IP address (the NODE column in the first command, the INTERNAL-IP column in the second)
kubectl -n vmware-smarts get pods -o wide | grep tcops-postgres-0
kubectl get nodes -o wide
- Delete the data under "/var/vmware/postgres/" on the node where the affected replica is running
ssh root@NODE_IP_ADDRESS rm -rf /var/vmware/postgres/*
- Delete the affected pod so it can be re-created
kubectl -n vmware-smarts delete pods tcops-postgres-0 --force --grace-period=0
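The single-replica recovery above can be wrapped in a small script. This is a sketch only, assuming passwordless SSH as root to the worker nodes and a working kubeconfig; "reinit_replica" is an illustrative helper name, not a product command:

```shell
#!/usr/bin/env bash
# Sketch: automates the manual single-replica recovery steps above.
# Assumes passwordless root SSH to the node and kubectl access.
set -euo pipefail

reinit_replica() {
  local pod="$1"
  local node_ip
  # .status.hostIP is the IP address of the node running the pod.
  node_ip=$(kubectl -n vmware-smarts get pod "$pod" -o jsonpath='{.status.hostIP}')
  # Wipe the stale Postgres data directory on that node.
  ssh "root@${node_ip}" 'rm -rf /var/vmware/postgres/*'
  # Force-delete the pod; the operator re-creates it and it re-syncs its data.
  kubectl -n vmware-smarts delete pod "$pod" --force --grace-period=0
}

if [ "$#" -ge 1 ]; then
  reinit_replica "$1"
fi
```

Invoked as, for example, `./reinit-replica.sh tcops-postgres-0`, this performs the three steps in order; the re-created pod should then re-sync from the healthy replica.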
If all Postgres replicas get into an inconsistent state, back up and restore as follows:
- Take backups of the Keycloak and Grafana databases:
kubectl exec -it tcops-postgres-0 -n vmware-smarts -- /bin/bash -c 'runuser -u postgres -- pg_dump keycloak > /home/postgres/pg_keycloak.bak'
kubectl exec -it tcops-postgres-0 -n vmware-smarts -- /bin/bash -c 'runuser -u postgres -- pg_dump grafana > /home/postgres/pg_grafana.bak'
- Copy backup from pod to host:
kubectl cp vmware-smarts/tcops-postgres-0:/home/postgres/pg_keycloak.bak /opt/pg_keycloak.bak -n vmware-smarts
kubectl cp vmware-smarts/tcops-postgres-0:/home/postgres/pg_grafana.bak /opt/pg_grafana.bak -n vmware-smarts
- SSH into every "elasticworker" node and delete data under "/var/vmware/postgres/":
rm -rf /var/vmware/postgres/*
- Recreate the cluster (some commands may fail if the corresponding resources have already been deleted)
cd /home/clusteradmin/kubernetes
kubectl -n vmware-smarts delete postgresql tcops-postgres
kubectl -n vmware-smarts delete customresourcedefinition postgresqls.acid.zalan.do
kubectl -n vmware-smarts delete -f postgres-cluster-final.yaml
kubectl -n vmware-smarts delete deployments postgres-operator tcops-postgres-pooler
kubectl -n vmware-smarts delete pods tcops-postgres-0 tcops-postgres-1 --force --grace-period=0
kubectl -n vmware-smarts delete ep tcops-postgres tcops-postgres-pooler
kubectl -n vmware-smarts delete secrets -l application=spilo
kubectl -n vmware-smarts delete statefulsets tcops-postgres
kubectl -n vmware-smarts delete svc tcops-postgres tcops-postgres-repl tcops-postgres-config
kubectl -n vmware-smarts delete pvc pgdata-tcops-postgres-0 pgdata-tcops-postgres-1
kubectl -n vmware-smarts delete pv postgres-pv-0 postgres-pv-1
kubectl -n vmware-smarts delete poddisruptionbudgets.policy postgres-tcops-postgres-pdb
kubectl -n vmware-smarts delete -f postgres-operator.yaml
kubectl -n vmware-smarts apply -f postgres-pv0.yaml -f postgres-pv1.yaml
kubectl -n vmware-smarts apply -f postgres-operator.yaml && sleep 10 && kubectl -n vmware-smarts apply -f postgres-cluster-final.yaml
- Remove the Keycloak and Grafana pods so they are re-created:
kubectl -n vmware-smarts delete pod,rs -l run=keycloakserver --force --grace-period=0
kubectl -n vmware-smarts delete pod,rs -l run=grafana --force --grace-period=0
- Check which Postgres pod is the leader by looking at the "SPILO-ROLE" column (the leader shows "master"):
kubectl -n vmware-smarts get pods -o wide -l application=spilo -L spilo-role
- Copy the backups from the host to the pod where the leader is running (in this case "tcops-postgres-0")
kubectl cp /opt/pg_keycloak.bak vmware-smarts/tcops-postgres-0:/home/postgres/pg_keycloak.bak -n vmware-smarts
kubectl cp /opt/pg_grafana.bak vmware-smarts/tcops-postgres-0:/home/postgres/pg_grafana.bak -n vmware-smarts
- Restore the backups
kubectl exec -it tcops-postgres-0 -n vmware-smarts -- /bin/bash -c 'runuser -u postgres -- psql -d keycloak -f /home/postgres/pg_keycloak.bak postgres'
kubectl exec -it tcops-postgres-0 -n vmware-smarts -- /bin/bash -c 'runuser -u postgres -- psql -d grafana -f /home/postgres/pg_grafana.bak postgres'
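As a post-restore sanity check, the leader pod can be picked out of the "SPILO-ROLE" column and each database probed with a trivial query. A minimal sketch; "leader_pod" and "check_db" are illustrative helper names, not product commands:

```shell
#!/usr/bin/env bash
# Sketch: find the Spilo leader and verify the restored databases respond.
set -euo pipefail

# Reads `kubectl get pods -L spilo-role` output on stdin and prints the pod
# whose last column (SPILO-ROLE) is "master". Assumes the role label is the
# final column, as in the `get pods` command shown above.
leader_pod() {
  awk 'NR > 1 && $NF == "master" { print $1 }'
}

# Runs a trivial query against one database inside the given pod.
check_db() {
  local pod="$1" db="$2"
  kubectl -n vmware-smarts exec "$pod" -- \
    runuser -u postgres -- psql -d "$db" -c 'SELECT 1;'
}

# Example usage (commented out so the sketch can be sourced safely):
# leader=$(kubectl -n vmware-smarts get pods -l application=spilo -L spilo-role | leader_pod)
# check_db "$leader" keycloak
# check_db "$leader" grafana
```

A failing `check_db` call would indicate the restore did not complete cleanly and the corresponding pg_dump should be re-applied.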