home blog tags about

Painless handling and migration on multiple PostgreSQL instances

written by Robin Schubert on 2020-06-07 | Tags: free-software

Touching databases has always been something that made my heart beat a little bit faster. Although I've become more comfortable with SQL syntax over the years, and I do my backups regularly and diligently, migrations still feel like there is some magic happening that I never fully understood.

I'm running PostgreSQL on an Ubuntu Server 18.04 that is used by several applications. Over time, updates to the system brought with them multiple PostgreSQL updates. However, the apt upgrade won't migrate my databases when upgrading, rather it will spawn a new PostgreSQL instance with a new major version which listens on a new port.

Right now, I have three of those instances running (9.3, 9.6 and 10), listening on the ports 5432 (the default), 5433 and 5434, respectively:

sudo netstat -tulpn | grep postgres

tcp     0   0 0.0.0.0:5432      0.0.0.0:*      LISTEN   1573/postgres
tcp     0   0 0.0.0.0:5433      0.0.0.0:*      LISTEN   1572/postgres
tcp     0   0 0.0.0.0:5434      0.0.0.0:*      LISTEN   1651/postgres

The instances are used by different services, the PostgreSQL 10 has been freshly installed with the last apt upgrade and is currently rather empty:

ps fax | grep postgres

 1572 ?  S    0:02 /usr/lib/postgresql/9.5/bin/postgres\
     -D /var/lib/postgresql/9.5/main\
     -c config_file=/etc/postgresql/9.5/main/postgresql.conf
 2182 ?  Ss   0:00  \_ postgres: checkpointer process
 2183 ?  Ss   0:00  \_ postgres: writer process
 2184 ?  Ss   0:00  \_ postgres: wal writer process
 2185 ?  Ss   0:01  \_ postgres: autovacuum launcher process
 2186 ?  Ss   0:09  \_ postgres: stats collector process
 4516 ?  Ss   0:59  \_ postgres: gitlab gitlabhq_production 127.0.0.1(42648) idle
 4517 ?  Ss   0:26  \_ postgres: gitlab gitlabhq_production 127.0.0.1(42650) idle
...
22295 ?  Ss   0:00  \_ postgres: jiradbuser jiradb 127.0.0.1(51292) idle
 ...

 1573 ?  S    0:01 /usr/lib/postgresql/9.3/bin/postgres\
     -D /var/lib/postgresql/9.3/main\
     -c config_file=/etc/postgresql/9.3/main/postgresql.conf
 2158 ?  Ss   0:00  \_ postgres: checkpointer process
 2159 ?  Ss   0:00  \_ postgres: writer process
 2160 ?  Ss   0:00  \_ postgres: wal writer process
 2161 ?  Ss   0:01  \_ postgres: autovacuum launcher process
 2162 ?  Ss   0:02  \_ postgres: stats collector process
 2654 ?  Ss   0:00  \_ postgres: gitea gitea 127.0.0.1(49174) idle
 ...

 1651 ?  S    0:01 /usr/lib/postgresql/10/bin/postgres\
     -D /var/lib/postgresql/10/main\
     -c config_file=/etc/postgresql/10/main/postgresql.conf
 2088 ?  Ss   0:00  \_ postgres: checkpointer process
 2089 ?  Ss   0:00  \_ postgres: writer process
 2090 ?  Ss   0:00  \_ postgres: wal writer process
 2092 ?  Ss   0:00  \_ postgres: autovacuum launcher process
 2093 ?  Ss   0:02  \_ postgres: stats collector process
 2094 ?  Ss   0:00  \_ postgres: bgworker: logical replication launcher

One thing that took me a while to fully get the grasp of it: Each server version comes with its own set of client tools you want to use to handle your instance. However, I didn't understand how this works and how I access these different versions.

Let's start with the example of dumping a database to disk for backup purpose. The pg_dump command is the tool of choice here, however, there is no pg_dump9.3 or pg_dump10 to be found anywhere in your $PATH. Instead, pg_dump is a link to /usr/share/postgresql-common/pg_wrapper, a perl script which determines the name of the command it has been called with and executes it with the evaluated given flags, options and switches.

Default database settings for that command can also be specified in either of ~/.postgresqlrc or /etc/postgresql-commen/user_clusters.

To specify the version of the client tool we want to use, we provide the --cluster option to our pg_wrapper.

The following commands will be run as postgres user:

sudo -iu postgres

postgres@ubuntuRechenknecht:~$ pg_dump --version
pg_dump (PostgreSQL) 9.3.16

postgres@ubuntuRechenknecht:~$ pg_dump --cluster 9.5/main --version
pg_dump (PostgreSQL) 9.5.19

postgres@ubuntuRechenknecht:~$ pg_dump --cluster 10/main --version
pg_dump (PostgreSQL) 10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)

Similarly, we can connect to our databases with psql, which is another pg_wrapper-handled command:

postgres@ubuntuRechenknecht:~$ psql
psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1), server 9.3.16)
Type "help" for help.

postgres> \q

postgres@ubuntuRechenknecht:~$ psql --cluster 9.5/main
psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1), server 9.5.19)
Type "help" for help.

postgres> \q

postgres@ubuntuRechenknecht:~$ psql --cluster 10/main
psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1))
Type "help" for help.

Although you could also connect to your instance of choice by specifying the port number like this

postgres@ubuntuRechenknecht:~$ psql --port 5434
psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1))
Type "help" for help.

Migrating a database from one instance to the other

Migrating a PostgreSQL database can be as simple as dumping in from the one instance and pushing it into the other. Just a few things that should be considered:

  1. The version of pg_dump should always be the version of the target database, to avoid incompatibilities.
  2. The target database needs to be created, along with the roles and permissions needed to import the dump.

In my case, I had to migrate the database from our gitlab instance, since it did not meet the minimum PostgreSQL version that is required to actually run the latest gitlab (yes, also PostgreSQL 10 does not meet that requirement, but it works while 9.5 does not).

To dump the database from the old 9.5 instance with the new 10 client, I need to specify both, cluster and port:

pg_dump --cluster 10/main --port 5433 gitlabhq_production > /tmp/gitlabhq_production

I create the new database on the target instance, and set up the role and permissions needed for gitlab:

createdb --cluster 10/main gitlabhq_production

psql --cluster 10/main
postgres> CREATE ROLE gitlab WITH PASSWORD '********';
postgres> GRANT ALL ON DATABASE gitlabhq_production TO gitlab;

Now the database is ready to be fed with the dump:

psql --port 5434 gitlabhq_production < /tmp/gitlabhq_production

In case of gitlab, the final step would be to update the database port in the gitlab configuration file /etc/gitlab/gitlab.rb

...
gitlab_rails['db_port'] = 5434
...

and let gitlab reconfigure itself

sudo gitlab-ctl reconfigure

I feel much more confident now that I know how to handle my different PostgreSQL instances. Next I guess I will have to take some of them down, which I had avoided so far.

Creative Commons License