This topic provides information to help you troubleshoot problems you might encounter when using Postgres for Kubernetes.
Use `watch kubectl get all` to monitor the progress of the Postgres operator deployment. The deployment is complete when the postgres-operator pod is in the `Running` state. For example:

```
watch kubectl get all

NAME                                     READY   STATUS    RESTARTS   AGE
pod/postgres-operator-567dbc67b9-nrq5t   1/1     Running   0          57s

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   2d4h

NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/postgres-operator   1/1     1            1           57s

NAME                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/postgres-operator-567dbc67b9   1         1         1       57s
```
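The readiness check can also be scripted. A minimal sketch that extracts the operator pod's status column with `awk`; the listing below is the captured sample output from above, and in a live cluster you would pipe `kubectl get pods` instead:

```shell
# Captured sample listing; in a live cluster, replace with: kubectl get pods
listing='NAME                                     READY   STATUS    RESTARTS   AGE
pod/postgres-operator-567dbc67b9-nrq5t   1/1     Running   0          57s'

# Print the STATUS column of the operator pod's row.
status=$(printf '%s\n' "$listing" | awk '/postgres-operator/ {print $3}')
echo "$status"   # Running
```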
Check the logs of the operator to ensure that it is running properly:

```
kubectl logs -l app=postgres-operator

2019-08-05T17:24:16.182Z INFO controller-runtime.controller Starting EventSource {"controller": "postgres", "source": "kind source: /, Kind="}
2019-08-05T17:24:16.182Z INFO setup starting manager
2019-08-05T17:24:16.285Z INFO controller-runtime.controller Starting Controller {"controller": "postgres"}
2019-08-05T17:24:16.386Z INFO controller-runtime.controller Starting workers {"controller": "postgres", "worker count": 1}
```
When you create Postgres instances, each instance is created in its own namespace. To see all Postgres instances in the cluster, add the `--all-namespaces` option to the `kubectl get` command:

```
kubectl get postgres --all-namespaces

NAMESPACE   NAME               STATUS    AGE
default     postgres-sample    Running   19d
default     postgres-sample2   Running   15d
test        my-postgres        Pending   15d
test        my-postgres3       Pending   15d
```
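To surface only the instances that are not in the `Running` state, you can filter the listing with `awk`. A sketch using a captured listing in place of a live `kubectl get postgres --all-namespaces` call:

```shell
# Captured sample listing; live equivalent: kubectl get postgres --all-namespaces
listing='NAMESPACE   NAME               STATUS    AGE
default     postgres-sample    Running   19d
test        my-postgres        Pending   15d'

# Skip the header row and print NAMESPACE/NAME for non-Running instances.
printf '%s\n' "$listing" | awk 'NR > 1 && $3 != "Running" {print $1 "/" $2}'
# prints: test/my-postgres
```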
If a Postgres instance is not running because of misconfiguration, insufficient resources, or another error, you can investigate by running `kubectl describe postgres <postgres-instance-name> -n <namespace>`. The `Message` field of the `Status` section contains the error message encountered. For example:

```
Status:
  Backups:
    Last Created:
    Last Successful:
  Binding:
    Name:  postgres-sample-app-user-db-secret
  Current State:  Created
  Db Version:     14.5
  Message:        PostgresBackupLocation.sql.tanzu.vmware.com "nonexistent-location" not found
```

If there are multiple errors, they appear in a list separated by semicolons.
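When several errors are packed into one `Message` value, splitting on the semicolons makes them easier to read. A sketch with a hypothetical two-error message (the second error text is illustrative, not actual operator output):

```shell
# Hypothetical Message value containing two semicolon-separated errors.
message='PostgresBackupLocation.sql.tanzu.vmware.com "nonexistent-location" not found; second illustrative error'

# One error per line, trimming the leading space after each semicolon.
printf '%s\n' "$message" | tr ';' '\n' | sed 's/^ *//'
```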
To find the currently deployed version of the Postgres operator, use the `helm` command:

```
helm ls

NAME                NAMESPACE   REVISION   UPDATED                                STATUS     CHART                      APP VERSION
postgres-operator   default     1          2022-08-11 13:26:00.769535 -0500 CDT   deployed   postgres-operator-v2.0.0   v2.0.0
```

The version appears in the chart name and in the `APP VERSION` column.
To find the version of a Postgres instance, use the `kubectl` command to describe the instance's pod:

```
kubectl get pods

NAME                                 READY   STATUS    RESTARTS   AGE
postgres-sample-0                    1/1     Running   0          9s
postgres-operator-85f777b9db-wbj9b   1/1     Running   0          4m15s

kubectl describe pod postgres-sample-0

Name:         postgres-sample-0
Namespace:    default
Priority:     0
Node:         minikube/192.168.64.32
Start Time:   Mon, 11 Oct 2021 14:10:38 -0500
Labels:       app=postgres
              controller-revision-hash=postgres-sample-5fc8fb8b4b
              headless-service=postgres-sample
              postgres-instance=postgres-sample
              role=read
              statefulset.kubernetes.io/pod-name=postgres-sample-0
              type=data
Annotations:  <none>
Status:       Running
IP:           172.17.0.8
Controlled By:  StatefulSet/my-postgres
Containers:
  pg-container:
    Container ID:  docker://6c651d690a6fdb6d1c0d3644ad8225037d31da1c33fd3f88f1625bdfd45cea3a
    Image:         postgres-instance:v2.0.0
    Image ID:      docker://sha256:00359ca344dd96eb05f2bd430430c97a6d46a40996c395fca44c209cb954a6e7
    Port:          5432/TCP
    Host Port:     0/TCP
```
The VMware Postgres Operator version can be found in the image name of the `pg-container` entry.
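Because the version is the image tag, you can also extract it from the image name with shell parameter expansion. A minimal sketch using the image name shown above:

```shell
image="postgres-instance:v2.0.0"

# Strip everything up to and including the last ':' to keep only the tag.
version="${image##*:}"
echo "$version"   # v2.0.0
```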
When you deploy an instance with a specific storage size in the instance YAML deployment file, you cannot reduce the data storage size later. For example, after creating an instance with the storage size set to 100M:

```
kubectl create -f postgres.yaml
```

Verify the storage size using a command similar to:

```
kubectl get postgres.sql.tanzu.vmware.com/postgres-sample -o jsonpath='{.spec.storageSize}'
100M
```

If you later patch the instance to decrease the storage size from 100M to 2M:

```
kubectl patch postgres.sql.tanzu.vmware.com/postgres-sample --type merge -p '{"spec":{"storageSize": "2M"}}'
```

the operation returns an error similar to:

```
Error from server (spec.storageSize: Invalid Value: "2M" spec.storageSize cannot be reduced for an existing instance
spec.storageSize: Invalid Value: "2M" spec.storageSize needs to be at least 250M): admission webhook "vpostgres.kb.io" denied the request: spec.storageSize: Invalid Value: "2M" spec.storageSize cannot be reduced for an existing instance
spec.storageSize: Invalid Value: "2M" spec.storageSize needs to be at least 250M
```

To reduce the instance data size, create a new instance and migrate the source data over. Ensure that the source data fits in the reduced data size allocation of the newly created instance.
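The rejection above amounts to two checks: the requested size must not be smaller than the current size, and it must meet the minimum. This sketch mirrors those checks for sizes given in megabytes; the 250M minimum comes from the error message, and the real admission webhook's logic may differ:

```shell
# Strip the trailing 'M' so sizes can be compared numerically (megabytes only).
to_mb() { printf '%s' "${1%M}"; }

current=$(to_mb 100M)
requested=$(to_mb 2M)
minimum=$(to_mb 250M)

if [ "$requested" -lt "$current" ]; then
  echo "rejected: storageSize cannot be reduced for an existing instance"
elif [ "$requested" -lt "$minimum" ]; then
  echo "rejected: storageSize needs to be at least 250M"
else
  echo "accepted"
fi
# prints: rejected: storageSize cannot be reduced for an existing instance
```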
When the monitor pod or data pods are out of disk space, you could receive an error similar to:

```
2022-07-25 20:02:19.028 UTC [248] LOG: could not close temporary statistics file "pg_stat_tmp/global.tmp": No space left on device
```

Resolve the issue either by increasing the storage size, or by restoring to a new instance with a larger volume.
Increase storage size
You can modify the Postgres data volume and expand it. For information about how to verify that your PVs are expandable, and how to increase them, see Expanding Storage Volume Size.
Restore to an instance with a larger volume
If your instance is backed up, you can restore it to a new instance that has a larger data volume. For restore details see Restore to a Different Instance.
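You can detect the disk-full condition by searching the instance logs for the characteristic message. A sketch that greps a captured log line; in a live cluster, the input would come from `kubectl logs` against the affected pod:

```shell
# Captured sample log line from a pod that has run out of disk space.
log_line='2022-07-25 20:02:19.028 UTC [248] LOG: could not close temporary statistics file "pg_stat_tmp/global.tmp": No space left on device'

# grep -q exits 0 when the disk-full message is present.
if printf '%s\n' "$log_line" | grep -q 'No space left on device'; then
  echo "disk full detected"
fi
# prints: disk full detected
```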
This scenario occurs when you have two separate Kubernetes clusters with matching instance and namespace names. This scenario requires the following conditions:

- Cluster 1 has a namespace called `my-namespace`, and cluster 2 has a namespace called `my-namespace`.
- Each of those namespaces contains a Postgres instance called `my-instance`.

During backup, the first Postgres instance creates a backup stanza using the format `my-instance-my-namespace`. That stanza is encrypted with a randomly generated backup cipher. During backup configuration for the second instance, the instance detects that a backup stanza with the same name already exists in the bucket. However, the second instance cannot decrypt the backup information because it uses a different cipher. The error is similar to:

```
ERROR: [043]: WAL segment to get required 2021-09-02 15:55:35.615 P00 INFO: archive-get command end: aborted with exception [043] command terminated with exit code 43
```

or

```
FormatError: key/value found outside of section at line 1: ▒▒▒H▒t=֠O@▒Y▒
```

Workaround: Use different instance names, different namespace names, or different buckets for backups.
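The collision follows directly from the stanza naming scheme described above: the stanza name is built from the instance and namespace names only, with nothing cluster-specific. A sketch:

```shell
# Both clusters derive the same stanza name, because the
# <instance>-<namespace> format carries no cluster identifier.
instance="my-instance"
namespace="my-namespace"

stanza_cluster1="${instance}-${namespace}"
stanza_cluster2="${instance}-${namespace}"

if [ "$stanza_cluster1" = "$stanza_cluster2" ]; then
  echo "stanza collision: $stanza_cluster1"
fi
# prints: stanza collision: my-instance-my-namespace
```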
If you see a different configMap value than the one you set, check for errors in the instance by running:

```
kubectl describe postgres <instance-name> -n <namespace-name>
```

An example error could be:

```
Warning ConfigFileError 24s postgrescontroller 19:28:11 642 WARN Postgres logs from "/pgsql/data/startup.log":
19:28:11 642 INFO 2022-12-07 19:28:10.875 GMT [641] LOG: invalid value for parameter "log_timezone": "PST"
19:28:11 642 INFO 2022-12-07 19:28:10.875 GMT [641] FATAL: configuration file "/etc/customconfig/postgresql.conf" contains errors
```

Look for `ConfigFileError` under Events. In this specific example, the field `log_timezone` was set to the value `PST`, which is an invalid value. Edit the ConfigMap and re-apply it, as described in Updating PostgreSQL parameters. After you fix the error, the pod restarts.
Confirm that you have provided valid field names. For details on PostgreSQL configuration parameter names, refer to PostgreSQL Server parameters. A sample error could be:

```
Warning ConfigFileError 3s postgrescontroller 19:36:28 349 WARN Postgres logs from "/pgsql/data/startup.log":
19:36:28 349 INFO 2022-12-07 19:36:28.231 GMT [331] LOG: unrecognized configuration parameter "log_timezones" in file "/etc/customconfig/postgresql.conf" line 1
19:36:28 349 INFO 2022-12-07 19:36:28.231 GMT [331] FATAL: configuration file "/etc/customconfig/postgresql.conf" contains errors
```

In this example, the field `log_timezone` has been mistakenly entered as `log_timezones` with a trailing `s`, which is not a recognized parameter name.
Check whether the field you attempted to change is part of the exception list. Certain fields have default values that cannot be overwritten, and your custom values are not applied. For details about the parameter exceptions, refer to Exceptions.
If your instance appears to be stuck in the `Pending` state, or a pod has gone into `CrashLoopBackoff`, first check the events for errors by running:

```
kubectl describe postgres <instance-name> -n <namespace-name>
```

Look for `ConfigFileError` under Events.
If the events have expired, check the logs of the `pg-container` container by running:

```
kubectl logs -l postgres-instance=<instance-name>,type=data -n <namespace> -c pg-container
```

Look for a line that begins with `FATAL`. The output could be similar to:

```
19:42:31.751 GMT [1181] FATAL: configuration file "/etc/customconfig/postgresql.conf" contains errors
```
Review the steps in PostgreSQL configuration settings not applied to verify that you have used valid field names and values in your configMap. Once you update the configMap with valid field names and values, the affected pod restarts, the updated values are applied, and the instance reaches the `Running` state.
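The `FATAL` check can be scripted as a grep over the container logs. A sketch using a captured log line in place of live `kubectl logs` output:

```shell
# Captured sample log line; the live equivalent would be:
#   kubectl logs -l postgres-instance=<instance-name>,type=data -n <namespace> -c pg-container | grep 'FATAL'
log='19:42:31.751 GMT [1181] FATAL: configuration file "/etc/customconfig/postgresql.conf" contains errors'

printf '%s\n' "$log" | grep 'FATAL'
```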
There are many scenarios in which indexes may need to be rebuilt. Refer to the information in PostgreSQL REINDEX.
Use the `reindexdb` utility to reindex all databases, a single database, a single schema, a single table, or a single index, depending on your use case. A standard index rebuild allows table reads but blocks table writes until the action completes.
Before rebuilding affected indexes, review the reindex documentation that matches the Postgres database major version of the affected Postgres instance. For PostgreSQL 15, for example, refer to reindexdb.
Procedure
Determine the primary pod name by running:

```
kubectl get pod -l postgres-instance=<INSTANCE-NAME>,role=read-write -n <NAMESPACE-NAME>
```

where:

- `INSTANCE-NAME` is the name of the Postgres instance
- `NAMESPACE-NAME` is the name of the namespace

Sample output:

```
NAME            READY   STATUS    RESTARTS   AGE
my-postgres-0   5/5     Running   0          25s
```
Connect to a container shell on the pod by using:

```
kubectl -n <NAMESPACE-NAME> exec -it <POD-NAME> -- bash
```

where:

- `POD-NAME` is the name of the pod returned in the previous output
- `NAMESPACE-NAME` is the name of the namespace

To reindex all databases, run:

```
reindexdb --all
```

The output should be similar to:

```
reindexdb: reindexing database "my-postgres"
reindexdb: reindexing database "postgres"
reindexdb: reindexing database "template1"
```
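Besides `--all`, `reindexdb` accepts narrower scopes via its `--schema`, `--table`, and `--index` options. The sketch below only assembles the command strings so it can run without a live database; the database, schema, table, and index names are illustrative placeholders:

```shell
# Narrower reindexdb scopes (all names below are placeholders).
db_cmd='reindexdb my-postgres'                           # one database
schema_cmd='reindexdb --schema=public my-postgres'       # one schema
table_cmd='reindexdb --table=my_table my-postgres'       # one table
index_cmd='reindexdb --index=my_table_pkey my-postgres'  # one index

# Run the desired variant from the container shell, for example:
echo "$table_cmd"   # reindexdb --table=my_table my-postgres
```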