Upgrading a PostgreSQL pod to the next major version on Kubernetes
Situation
Use this guide when you need to upgrade a PostgreSQL pod / statefulset from 9.6 to 11, and you manage your own statefulset either through Helm or plain Kubernetes manifests. This is probably not the right approach when you use a PostgreSQL operator.
The same approach works between any two PostgreSQL major versions (such as 11 to 14).
Goals:
- Upgrade from postgres 9.6 to postgres 11 (or from 11 to 12) with minimal risk and downtime.
- Repeatable.
- Easy to automate at large scale.
- Works on Kubernetes
Non-goals:
- Data validation scripts
- Providing production-ready automation scripts to run this for large-scale database deployments
Before you start
- Make sure you have back-ups of your data. Check how to create a persistent volume snapshot.
- This needs to be tweaked to work for your postgres set-up (mostly the correct names of the `pvc`, the correct `subPath`, and the postgres versions).
- Please test this procedure in a test environment before updating your production PostgreSQL pods!
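If your cluster runs a CSI driver with snapshot support, a pre-upgrade snapshot of the data volume can be requested declaratively. This is a sketch only: the `VolumeSnapshotClass` name `csi-snapclass` is a placeholder, and the PVC name must match your statefulset's claim.

```yaml
# Sketch: requires the external-snapshotter CRDs and a CSI driver with
# snapshot support. "csi-snapclass" is a placeholder VolumeSnapshotClass name.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: pg-data-pre-upgrade
  namespace: default
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: pg-data-gitlab-database-0
```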
Step by step
This is based on prior work by tianon. Please note that you will need to verify this approach on your own set-up before attempting to upgrade production databases. Also make sure you have a working back-up before starting.
Preparation
You will need two separate volumes:
- The old data directory for postgres
- A new data directory for the new version of postgres
!! Pay attention to the logging output during an upgrade. It may generate a `post-upgrade.sql` file that needs to be run. !!
> If any post-upgrade processing is required, pg_upgrade will issue warnings as it completes. It will also generate script files that must be run by the administrator. The script files will connect to each database that needs post-upgrade processing. (See step 13, Post-Upgrade processing, in the pg_upgrade docs.)
The upgrade step should handle this for you, but be sure to pay attention to its output. Example:
```
upgrade_1 | fixing permissions on existing directory /var/lib/postgresql/11/data ... ok
upgrade_1 | creating subdirectories ... ok
upgrade_1 | selecting default max_connections ... 100
upgrade_1 | selecting default shared_buffers ... 128MB
upgrade_1 | selecting default timezone ... Etc/UTC
upgrade_1 | selecting dynamic shared memory implementation ... posix
upgrade_1 | creating configuration files ... ok
upgrade_1 | running bootstrap script ... ok
upgrade_1 | performing post-bootstrap initialization ... ok
upgrade_1 | syncing data to disk ... ok
```
Demo set-up
This section describes a demo set-up for upgrading a Postgres 9.6 deployment to Postgres 11 on Kubernetes. Adjust it to your own deployment; this is environment specific. You can probably adapt the upgrade and upgrade-post-migration jobs to aid you in this.
First, create the demo database. For this, we assume a GitLab PostgreSQL database. Save the following as `k8s/database.yaml`:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gitlab-database
  namespace: default
  labels:
    app.kubernetes.io/name: gitlab
    app.kubernetes.io/component: database
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: gitlab
      app.kubernetes.io/component: database
  serviceName: gitlab-database
  replicas: 1
  template:
    metadata:
      annotations:
        backup.velero.io/backup-volumes: pg-data
      labels:
        app.kubernetes.io/name: gitlab
        app.kubernetes.io/component: database
    spec:
      containers:
        - name: pg
          image: postgres:9.6
          ports:
            - name: pg
              containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: gitlab
            - name: POSTGRES_USER
              value: 'gitlab'
            - name: POSTGRES_PASSWORD
              value: gitlab
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: pg-data
              subPath: data
  volumeClaimTemplates:
    - metadata:
        name: pg-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: gitlab-database
  namespace: default
spec:
  selector:
    app.kubernetes.io/name: gitlab
    app.kubernetes.io/component: database
  type: ClusterIP
  ports:
    - name: pg
      port: 5432
      targetPort: pg
```
Apply the manifests from above:

```shell
kubectl apply -f k8s/database.yaml
kubectl rollout status sts/gitlab-database -n default
```

This should initialize a new database.
Next, scale the database down to zero:

```shell
kubectl scale -n default sts/gitlab-database --replicas 0
```

Wait until the database has shut down before moving to the next step.
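Rather than polling by hand, you can block until the pod is actually gone; the pod name follows the statefulset's `<name>-0` convention:

```shell
# Block until the single replica's pod has terminated (times out after 2 min).
kubectl wait --for=delete pod/gitlab-database-0 -n default --timeout=120s
```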
Upgrade job
Save the following into `upgrade-job.yaml`:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: upgrade-postgres
  namespace: default
spec:
  template:
    metadata:
      labels:
        app: upgrade-postgres
    spec:
      containers:
        - image: tianon/postgres-upgrade:9.6-to-11
          name: upgrade-postgres
          env:
            - name: PGUSER
              value: gitlab
            - name: POSTGRES_INITDB_ARGS
              value: '-U gitlab'
          volumeMounts:
            - mountPath: /var/lib/postgresql/9.6/data
              name: pg-data
              subPath: data
            - mountPath: /var/lib/postgresql/11/data
              name: pg-data-temp
              subPath: data
      restartPolicy: Never
      volumes:
        - name: pg-data
          persistentVolumeClaim:
            claimName: pg-data-gitlab-database-0
        - name: pg-data-temp
          persistentVolumeClaim:
            claimName: pg-data-temp
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pg-data-temp
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  # Note: the reclaim policy is a PV field, not a PVC field. If you plan to
  # switch this volume to a new PVC later (option 2 below), patch the bound
  # PV to persistentVolumeReclaimPolicy: Retain once it has been provisioned.
  resources:
    requests:
      storage: 10Gi
```
Start the upgrade job:

```shell
kubectl apply -f k8s/upgrade-job.yaml
```
Example expected output:
```
$ k logs -f upgrade-postgres-mtgtd
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/postgresql/11/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... Etc/UTC
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /var/lib/postgresql/11/data -l logfile start

Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for invalid "unknown" user columns                 ok
Creating dump of global objects                             ok
Creating dump of database schemas
  gitlab
  postgres
  template1
                                                            ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows in the new cluster                        ok
Deleting files from new pg_xact                             ok
Copying old pg_clog to new server                           ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok
Setting frozenxid and minmxid counters in new cluster       ok
Restoring global objects in the new cluster                 ok
Restoring database schemas in the new cluster
  template1
  gitlab
  postgres
                                                            ok
Copying user relation files
  /var/lib/postgresql/9.6/data/base/16384/2613
  /var/lib/postgresql/9.6/data/base/16384/2683
  /var/lib/postgresql/9.6/data/base/16384/2995
  /var/lib/postgresql/9.6/data/base/16384/2996
  /var/lib/postgresql/9.6/data/base/12407/2613
  /var/lib/postgresql/9.6/data/base/12407/2683
  /var/lib/postgresql/9.6/data/base/12407/2995
  /var/lib/postgresql/9.6/data/base/12407/2996
  /var/lib/postgresql/9.6/data/base/1/2613
  /var/lib/postgresql/9.6/data/base/1/2683
  /var/lib/postgresql/9.6/data/base/1/2995
  /var/lib/postgresql/9.6/data/base/1/2996
                                                            ok
Setting next OID for new cluster                            ok
Sync data directory to disk                                 ok
Creating script to analyze new cluster                      ok
Creating script to delete old cluster                       ok
Checking for hash indexes                                   ok

Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    ./analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    ./delete_old_cluster.sh
```
Next, we can validate that the Postgres 11 upgrade succeeded. Use this step to run any validation queries before moving on and removing the temp volume and the old PG 9.6 data.
Save the following to `validation-database.yaml`:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gitlab-database-validation
  namespace: default
  labels:
    app.kubernetes.io/name: gitlab
    app.kubernetes.io/component: database-validation
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: gitlab
      app.kubernetes.io/component: database-validation
  serviceName: gitlab-database
  replicas: 1
  template:
    metadata:
      annotations:
        backup.velero.io/backup-volumes: pg-data
      labels:
        app.kubernetes.io/name: gitlab
        app.kubernetes.io/component: database-validation
    spec:
      containers:
        - name: pg
          image: postgres:11
          ports:
            - name: pg
              containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: gitlab
            - name: POSTGRES_USER
              value: 'gitlab'
            - name: POSTGRES_PASSWORD
              value: gitlab
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: pg-data-temp
              subPath: data
      volumes:
        - name: pg-data-temp
          persistentVolumeClaim:
            claimName: pg-data-temp
---
apiVersion: v1
kind: Service
metadata:
  name: gitlab-database-validation
  namespace: default
spec:
  selector:
    app.kubernetes.io/name: gitlab
    app.kubernetes.io/component: database-validation
  type: ClusterIP
  ports:
    - name: pg
      port: 5432
      targetPort: pg
```
Apply it to the cluster:

```shell
kubectl apply -f k8s/validation-database.yaml
```
This will start a separate statefulset with a PostgreSQL container running PG 11 on the temp data directory.
Use `kubectl logs` to validate that the database started successfully and did not initialize a new database.
Run any validation queries you see fit, but note that any modifications made in this database will end up becoming live after the next step.
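As a minimal smoke test (these queries are examples, not a full validation suite), you can run `psql` inside the validation pod:

```shell
# Confirm the server reports version 11.x and the expected tables survived.
kubectl exec -n default gitlab-database-validation-0 -- psql -U gitlab -d gitlab -c 'SELECT version();'
kubectl exec -n default gitlab-database-validation-0 -- psql -U gitlab -d gitlab -c '\dt'
```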
Once done, tear down the validation database:

```shell
kubectl delete -f k8s/validation-database.yaml
```

Wait until the database has shut down before moving on to the next step.
Switching the persistent volume
Here we have two options:
- Move the data from one PVC to the other.
- Switch the Persistent Volumes used by the PVCs.

Alternatively, simply start up the Postgres 11 database on the new volume and continue to use that instead, removing the old PVC once you go live. In that case you are done after the previous step (`validation-database`).
Option 1) Move data between PVCs
This method will destroy the old data. Save the following to `upgrade-post-migration.yaml`:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: upgrade-post-migration
  namespace: default
spec:
  template:
    metadata:
      labels:
        app: upgrade-post-migration
    spec:
      containers:
        - image: postgres:11
          name: transfer-directory
          # Both data paths are volume mountpoints, so we clear and copy their
          # contents; rm/cp of the directories themselves would fail with EBUSY.
          command:
            - /bin/bash
            - -c
            - |
              ls -la /var/lib/postgresql/9.6/data /var/lib/postgresql/11/data && \
              rm -rf /var/lib/postgresql/9.6/data/* && \
              cp -a /var/lib/postgresql/11/data/. /var/lib/postgresql/9.6/data/ && \
              chown -R postgres:postgres /var/lib/postgresql/
          volumeMounts:
            - mountPath: /var/lib/postgresql/9.6/data
              name: pg-data
              subPath: data
            - mountPath: /var/lib/postgresql/11/data
              name: pg-data-temp
              subPath: data
      restartPolicy: Never
      volumes:
        - name: pg-data
          persistentVolumeClaim:
            claimName: pg-data-gitlab-database-0
        - name: pg-data-temp
          persistentVolumeClaim:
            claimName: pg-data-temp
```
Start the post-migration job:

```shell
kubectl apply -f k8s/upgrade-post-migration.yaml
```

Wait until this job has run to completion.
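One way to block until the job finishes, and to inspect its output afterwards (job name as in the manifest above):

```shell
kubectl wait --for=condition=complete job/upgrade-post-migration -n default --timeout=10m
kubectl logs job/upgrade-post-migration -n default
```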
Option 2) Switch PV used by PVC
This method will preserve the old data, but requires manual work and is more complex to execute.
- Edit the PV of the Postgres 11 PVC to have `persistentVolumeReclaimPolicy` with a value of `Retain`.
- Make sure you have a copy of the references between PVCs and PVs (`kubectl get pvc` should be sufficient).
- Delete the PVC `pg-data-temp` holding the PV.
- The PV should now have the status `Released`.
- In order to not nuke the old PV holding the Postgres 9.6 data, also set its `persistentVolumeReclaimPolicy` to `Retain`.
- Remove the `claimRef` from the Postgres 11 `PV`, so the `PV` will move to status `Available`.
- Remove the old Postgres 9.6 `PVC`.
- Create a new PVC with the exact same name, using a `volumeName` reference to the Postgres 11 `PV`.
You can also follow along with our guide on how to fix a PVC binding to the wrong PV, as you will be doing the same thing here.
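The steps above can be sketched with `kubectl patch`; the PV names in angle brackets are placeholders you must look up first (`kubectl get pvc` shows which PV backs each claim):

```shell
# Retain both PVs so neither is deleted when its claim goes away.
kubectl patch pv <pg11-pv> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
kubectl patch pv <pg96-pv> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
# Delete the temp claim; its PV moves to status Released.
kubectl delete pvc pg-data-temp -n default
# Drop the stale claimRef so the PV becomes Available again.
kubectl patch pv <pg11-pv> --type json -p '[{"op":"remove","path":"/spec/claimRef"}]'
# Remove the old 9.6 claim, then recreate it under the same name with
# spec.volumeName pointing at <pg11-pv>.
kubectl delete pvc pg-data-gitlab-database-0 -n default
```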
Starting Postgres 11
Now we can start postgres with the `postgres:11` image. Save the following as `k8s/database11.yaml`:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gitlab-database
  namespace: default
  labels:
    app.kubernetes.io/name: gitlab
    app.kubernetes.io/component: database
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: gitlab
      app.kubernetes.io/component: database
  serviceName: gitlab-database
  replicas: 1
  template:
    metadata:
      annotations:
        backup.velero.io/backup-volumes: pg-data
      labels:
        app.kubernetes.io/name: gitlab
        app.kubernetes.io/component: database
    spec:
      containers:
        - name: pg
          image: postgres:11
          ports:
            - name: pg
              containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: gitlab
            - name: POSTGRES_USER
              value: 'gitlab'
            - name: POSTGRES_PASSWORD
              value: gitlab
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: pg-data
              subPath: data
  volumeClaimTemplates:
    - metadata:
        name: pg-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
```
```shell
kubectl apply -f k8s/database11.yaml
```
The postgres database should now be running PG 11.
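As the pg_upgrade output noted, optimizer statistics are not carried over. `analyze_new_cluster.sh` is essentially a wrapper around `vacuumdb`, so an equivalent step (pod name per the statefulset above) is:

```shell
# Regenerate optimizer statistics for all databases in the new cluster.
kubectl exec -n default gitlab-database-0 -- vacuumdb -U gitlab --all --analyze-in-stages
```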
To clean up, remove the following resources:

```shell
kubectl delete -f k8s/upgrade-job.yaml
kubectl delete -f k8s/upgrade-post-migration.yaml
```
A note on pg_hba
The `pg_hba.conf` configuration file is not properly transferred when performing an upgrade job. You will most likely need to validate and/or reconfigure it.
When running with the official postgres images, you may need to execute the following command to allow regular users to connect to the database:

```shell
echo "host all all all md5" >> /var/lib/postgresql/data/pg_hba.conf
```
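If the pod is already running, the same edit can be applied from outside, followed by a configuration reload (the path assumes the official image's default data directory):

```shell
kubectl exec -n default gitlab-database-0 -- bash -c \
  'echo "host all all all md5" >> /var/lib/postgresql/data/pg_hba.conf'
kubectl exec -n default gitlab-database-0 -- psql -U gitlab -c 'SELECT pg_reload_conf();'
```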
Troubleshooting
- If no applications can connect after you started the database, verify that you have configured `pg_hba.conf` correctly. Use `kubectl port-forward` to manually connect to the database if needed.
- You can validate disk usage in the pod by using `kubectl exec pod/<db> -- df -h`.
- Failures that occurred during the migration job are most often due to an incorrect `POSTGRES_INITDB_ARGS` configuration.
- If the migration job failed to start right away, double-check that you have the correct `subPath` for your volumes.
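A quick way to test connectivity by hand (password `gitlab` as set in the demo manifest):

```shell
kubectl port-forward -n default svc/gitlab-database 5432:5432 &
PGPASSWORD=gitlab psql -h 127.0.0.1 -p 5432 -U gitlab -d gitlab -c 'SELECT 1;'
```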