Painless handling and migration on multiple PostgreSQL instances
Touching databases has always been something that made my heart beat a little bit faster. Although I've become more comfortable with SQL syntax over the years, and I do my backups regularly and diligently, migrations still feel like there is some magic happening that I never fully understood.
I'm running PostgreSQL on an Ubuntu Server 18.04 that is used by several applications. Over time, updates to the system brought with them multiple PostgreSQL updates. However, the apt upgrade
won't migrate my databases when upgrading, rather it will spawn a new PostgreSQL instance with a new major version which listens on a new port.
Right now, I have three of those instances running (9.3, 9.6 and 10), listening on the ports 5432 (the default), 5433 and 5434, respectively:
sudo netstat -tulpn | grep postgres tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 1573/postgres tcp 0 0 0.0.0.0:5433 0.0.0.0:* LISTEN 1572/postgres tcp 0 0 0.0.0.0:5434 0.0.0.0:* LISTEN 1651/postgres
The instances are used by different services, the PostgreSQL 10 has been freshly installed with the last apt upgrade and is currently rather empty:
ps fax | grep postgres 1572 ? S 0:02 /usr/lib/postgresql/9.5/bin/postgres\ -D /var/lib/postgresql/9.5/main\ -c config_file=/etc/postgresql/9.5/main/postgresql.conf 2182 ? Ss 0:00 \_ postgres: checkpointer process 2183 ? Ss 0:00 \_ postgres: writer process 2184 ? Ss 0:00 \_ postgres: wal writer process 2185 ? Ss 0:01 \_ postgres: autovacuum launcher process 2186 ? Ss 0:09 \_ postgres: stats collector process 4516 ? Ss 0:59 \_ postgres: gitlab gitlabhq_production 127.0.0.1(42648) idle 4517 ? Ss 0:26 \_ postgres: gitlab gitlabhq_production 127.0.0.1(42650) idle ... 22295 ? Ss 0:00 \_ postgres: jiradbuser jiradb 127.0.0.1(51292) idle ... 1573 ? S 0:01 /usr/lib/postgresql/9.3/bin/postgres\ -D /var/lib/postgresql/9.3/main\ -c config_file=/etc/postgresql/9.3/main/postgresql.conf 2158 ? Ss 0:00 \_ postgres: checkpointer process 2159 ? Ss 0:00 \_ postgres: writer process 2160 ? Ss 0:00 \_ postgres: wal writer process 2161 ? Ss 0:01 \_ postgres: autovacuum launcher process 2162 ? Ss 0:02 \_ postgres: stats collector process 2654 ? Ss 0:00 \_ postgres: gitea gitea 127.0.0.1(49174) idle ... 1651 ? S 0:01 /usr/lib/postgresql/10/bin/postgres\ -D /var/lib/postgresql/10/main\ -c config_file=/etc/postgresql/10/main/postgresql.conf 2088 ? Ss 0:00 \_ postgres: checkpointer process 2089 ? Ss 0:00 \_ postgres: writer process 2090 ? Ss 0:00 \_ postgres: wal writer process 2092 ? Ss 0:00 \_ postgres: autovacuum launcher process 2093 ? Ss 0:02 \_ postgres: stats collector process 2094 ? Ss 0:00 \_ postgres: bgworker: logical replication launcher
One thing that took me a while to fully get the grasp of it: Each server version comes with its own set of client tools you want to use to handle your instance. However, I didn't understand how this works and how I access these different versions.
Let's start with the example of dumping a database to disk for backup purpose. The pg_dump
command is the tool of choice here, however, there is no pg_dump9.3
or pg_dump10
to be found anywhere in your $PATH
. Instead, pg_dump
is a link to /usr/share/postgresql-common/pg_wrapper
, a perl script which determines the name of the command it has been called with and executes it with the evaluated given flags, options and switches.
Default database settings for that command can also be specified in either of ~/.postgresqlrc
or /etc/postgresql-commen/user_clusters
.
To specify the version of the client tool we want to use, we provide the --cluster
option to our pg_wrapper
.
The following commands will be run as postgres user:
sudo -iu postgres postgres@ubuntuRechenknecht:~$ pg_dump --version pg_dump (PostgreSQL) 9.3.16 postgres@ubuntuRechenknecht:~$ pg_dump --cluster 9.5/main --version pg_dump (PostgreSQL) 9.5.19 postgres@ubuntuRechenknecht:~$ pg_dump --cluster 10/main --version pg_dump (PostgreSQL) 10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)
Similarly, we can connect to our databases with psql
, which is another pg_wrapper
-handled command:
postgres@ubuntuRechenknecht:~$ psql psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1), server 9.3.16) Type "help" for help. postgres> \q postgres@ubuntuRechenknecht:~$ psql --cluster 9.5/main psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1), server 9.5.19) Type "help" for help. postgres> \q postgres@ubuntuRechenknecht:~$ psql --cluster 10/main psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)) Type "help" for help.
Although you could also connect to your instance of choice by specifying the port number like this
postgres@ubuntuRechenknecht:~$ psql --port 5434 psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)) Type "help" for help.
Migrating a database from one instance to the other
Migrating a PostgreSQL database can be as simple as dumping in from the one instance and pushing it into the other. Just a few things that should be considered:
- The version of
pg_dump
should always be the version of the target database, to avoid incompatibilities. - The target database needs to be created, along with the roles and permissions needed to import the dump.
In my case, I had to migrate the database from our gitlab instance, since it did not meet the minimum PostgreSQL version that is required to actually run the latest gitlab (yes, also PostgreSQL 10 does not meet that requirement, but it works while 9.5 does not).
To dump the database from the old 9.5 instance with the new 10 client, I need to specify both, cluster and port:
pg_dump --cluster 10/main --port 5433 gitlabhq_production > /tmp/gitlabhq_production
I create the new database on the target instance, and set up the role and permissions needed for gitlab:
createdb --cluster 10/main gitlabhq_production psql --cluster 10/main postgres> CREATE ROLE gitlab WITH PASSWORD '********'; postgres> GRANT ALL ON DATABASE gitlabhq_production TO gitlab;
Now the database is ready to be fed with the dump:
psql --port 5434 gitlabhq_production < /tmp/gitlabhq_production
In case of gitlab, the final step would be to update the database port in the gitlab configuration file /etc/gitlab/gitlab.rb
... gitlab_rails['db_port'] = 5434 ...
and let gitlab reconfigure itself
sudo gitlab-ctl reconfigure
I feel much more confident now that I know how to handle my different PostgreSQL instances. Next I guess I will have to take some of them down, which I had avoided so far.