published: 2020-06-07
title: painless handling and migration on multiple PostgreSQL instances
tags: free-software
previous [‘Week 1 - learning to draw’] next [‘Python + pandas + matplotlib vs. R + tidyverse - a quick comparison’] parent directory [‘Blog’]
Touching databases has always been something that made my heart beat a little bit faster. Although I’ve become more comfortable with SQL syntax over the years, and I do my backups regularly and diligently, migrations still feel like there is some magic happening that I never fully understood.
I’m running PostgreSQL on an Ubuntu Server 18.04 that is used by
several applications. Over time, updates to the system brought with them
multiple PostgreSQL updates. However, the apt upgrade
won’t
migrate my databases when upgrading, rather it will spawn a new
PostgreSQL instance with a new major version which listens on a new
port.
Right now, I have three of those instances running (9.3, 9.6 and 10), listening on the ports 5432 (the default), 5433 and 5434, respectively:
sudo netstat -tulpn | grep postgres
tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 1573/postgres
tcp 0 0 0.0.0.0:5433 0.0.0.0:* LISTEN 1572/postgres
tcp 0 0 0.0.0.0:5434 0.0.0.0:* LISTEN 1651/postgres
The instances are used by different services, the PostgreSQL 10 has been freshly installed with the last apt upgrade and is currently rather empty:
ps fax | grep postgres
1572 ? S 0:02 /usr/lib/postgresql/9.5/bin/postgres\
-D /var/lib/postgresql/9.5/main\
-c config_file=/etc/postgresql/9.5/main/postgresql.conf
2182 ? Ss 0:00 \_ postgres: checkpointer process
2183 ? Ss 0:00 \_ postgres: writer process
2184 ? Ss 0:00 \_ postgres: wal writer process
2185 ? Ss 0:01 \_ postgres: autovacuum launcher process
2186 ? Ss 0:09 \_ postgres: stats collector process
4516 ? Ss 0:59 \_ postgres: gitlab gitlabhq_production 127.0.0.1(42648) idle
4517 ? Ss 0:26 \_ postgres: gitlab gitlabhq_production 127.0.0.1(42650) idle
...
22295 ? Ss 0:00 \_ postgres: jiradbuser jiradb 127.0.0.1(51292) idle
...
1573 ? S 0:01 /usr/lib/postgresql/9.3/bin/postgres\
-D /var/lib/postgresql/9.3/main\
-c config_file=/etc/postgresql/9.3/main/postgresql.conf
2158 ? Ss 0:00 \_ postgres: checkpointer process
2159 ? Ss 0:00 \_ postgres: writer process
2160 ? Ss 0:00 \_ postgres: wal writer process
2161 ? Ss 0:01 \_ postgres: autovacuum launcher process
2162 ? Ss 0:02 \_ postgres: stats collector process
2654 ? Ss 0:00 \_ postgres: gitea gitea 127.0.0.1(49174) idle
...
1651 ? S 0:01 /usr/lib/postgresql/10/bin/postgres\
-D /var/lib/postgresql/10/main\
-c config_file=/etc/postgresql/10/main/postgresql.conf
2088 ? Ss 0:00 \_ postgres: checkpointer process
2089 ? Ss 0:00 \_ postgres: writer process
2090 ? Ss 0:00 \_ postgres: wal writer process
2092 ? Ss 0:00 \_ postgres: autovacuum launcher process
2093 ? Ss 0:02 \_ postgres: stats collector process
2094 ? Ss 0:00 \_ postgres: bgworker: logical replication launcher
One thing that took me a while to fully get the grasp of it: Each server version comes with its own set of client tools you want to use to handle your instance. However, I didn’t understand how this works and how I access these different versions.
Let’s start with the example of dumping a database to disk for backup
purpose. The pg_dump
command is the tool of choice here,
however, there is no pg_dump9.3
or pg_dump10
to be found anywhere in your $PATH
. Instead,
pg_dump
is a link to
/usr/share/postgresql-common/pg_wrapper
, a perl script
which determines the name of the command it has been called with and
executes it with the evaluated given flags, options and switches.
Default database settings for that command can also be specified in
either of ~/.postgresqlrc
or
/etc/postgresql-commen/user_clusters
.
To specify the version of the client tool we want to use, we provide
the --cluster
option to our pg_wrapper
.
The following commands will be run as postgres user:
sudo -iu postgres
postgres@ubuntuRechenknecht:~$ pg_dump --version
pg_dump (PostgreSQL) 9.3.16
postgres@ubuntuRechenknecht:~$ pg_dump --cluster 9.5/main --version
pg_dump (PostgreSQL) 9.5.19
postgres@ubuntuRechenknecht:~$ pg_dump --cluster 10/main --version
pg_dump (PostgreSQL) 10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)
Similarly, we can connect to our databases with psql
,
which is another pg_wrapper
-handled command:
postgres@ubuntuRechenknecht:~$ psql
psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1), server 9.3.16)
Type "help" for help.
postgres> \q
postgres@ubuntuRechenknecht:~$ psql --cluster 9.5/main
psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1), server 9.5.19)
Type "help" for help.
postgres> \q
postgres@ubuntuRechenknecht:~$ psql --cluster 10/main
psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1))
Type "help" for help.
Although you could also connect to your instance of choice by specifying the port number like this
postgres@ubuntuRechenknecht:~$ psql --port 5434
psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1))
Type "help" for help.
Migrating a PostgreSQL database can be as simple as dumping in from the one instance and pushing it into the other. Just a few things that should be considered:
pg_dump
should always be the version of
the target database, to avoid incompatibilities.In my case, I had to migrate the database from our gitlab instance, since it did not meet the minimum PostgreSQL version[4] that is required to actually run the latest gitlab (yes, also PostgreSQL 10 does not meet that requirement, but it works while 9.5 does not).
postgres_gitlab_version [‘minimum PostgreSQL version’]
To dump the database from the old 9.5 instance with the new 10 client, I need to specify both, cluster and port:
pg_dump --cluster 10/main --port 5433 gitlabhq_production > /tmp/gitlabhq_production
I create the new database on the target instance, and set up the role and permissions needed for gitlab:
createdb --cluster 10/main gitlabhq_production
psql --cluster 10/main
postgres> CREATE ROLE gitlab WITH PASSWORD '********';
postgres> GRANT ALL ON DATABASE gitlabhq_production TO gitlab;
Now the database is ready to be fed with the dump:
psql --port 5434 gitlabhq_production < /tmp/gitlabhq_production
In case of gitlab, the final step would be to update the database
port in the gitlab configuration file
/etc/gitlab/gitlab.rb
...
['db_port'] = 5434
gitlab_rails...
and let gitlab reconfigure itself
sudo gitlab-ctl reconfigure
I feel much more confident now that I know how to handle my different PostgreSQL instances. Next I guess I will have to take some of them down, which I had avoided so far.