
PostgreSQL 12 — Cascading Replication

I’ve got replicated PostgreSQL database pairs that each have some 50TB of data. The server operating systems need to be upgraded, but there is a constraint: no in-place upgrades. I don’t get to veto that constraint (i.e. the fact that we could just cross our fingers and upgrade a replica … and, if it fails, build new and pull the data again doesn’t matter). Unfortunately, trying to add a second replica off the primary delays the existing replication. Since all write operations go to the RW server and reads go to the read-only replica … having the read-only copy a day or two out of sync whilst this secondary replica comes online is a non-starter.

Fortunately, you can cascade replication — seed the new replica from the current read-only replica. Create a new replication slot — here new-pg-ro-replica-pgdata. You also need to verify the new server has an entry in the pg_hba.conf file so it can authenticate with the replication account.
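If you need the specifics: creating the slot is a one-liner on the server the new replica will pull from (here, the read-only replica that pg_basebackup points at), and the pg_hba.conf entry looks something like the line below. The client address is a placeholder; use the new server’s address.

SELECT pg_create_physical_replication_slot('new-pg-ro-replica-pgdata');

# pg_hba.conf on pg-ro-replica.example.net: allow the new server to connect for replication
host    replication    replicatorID    192.0.2.10/32    md5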

pg_basebackup -h pg-ro-replica.example.net -D /pgdata -U replicatorID -v -P --wal-method=stream --slot=new-pg-ro-replica-pgdata

Wait … wait … wait. It’ll finish eventually. Then tweak the standby connection settings. In PostgreSQL 12 these live in postgresql.conf (or postgresql.auto.conf) rather than recovery.conf, and standby_mode is no longer a valid setting:

primary_conninfo = 'host=pg-rw-replica.example.net port=5432 user=replicatorID password=your_password sslmode=require'
primary_slot_name = 'new-pg-ro-replica-pgdata'

And create the standby signal file, which is what tells PostgreSQL 12 to run as a standby
touch /pgdata/standby.signal

Finally, start the server

pg_ctl start -D /pgdata
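A quick sanity check once it’s up: query pg_stat_wal_receiver on the new replica and confirm it is streaming from the expected host and slot.

SELECT status, sender_host, slot_name FROM pg_stat_wal_receiver;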

Voila — a second read-only replica. Now they can decom the old server.
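One bit of housekeeping for the decommission: drop the replication slot the old server was using on its upstream, otherwise that host will hold onto WAL for a consumer that is never coming back. The slot name below is hypothetical.

SELECT pg_drop_replication_slot('old-pg-ro-replica-pgdata');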

PostgreSQL – Querying Hot Standby Server

We hit our maximum connection limit on some PostgreSQL servers — which made me wonder why the hot standby servers weren’t being used … well, at all. They’re equally big, expensive servers with loads of disk space. But they’re just sitting there “in case”.

So we directed some traffic over to the standby server. I’m also going to tweak a few settings related to user limits — increase the max connections since these are dedicated hosts and have plenty of available I/O, memory, CPU, etc resources; increase the number of reserved connections since replication filled up all of the reserved slots; implement a per-user connection limit on one account that runs a lot of threads — but directing some people who were only trying to look at data over to the standby server seemed like a quick fix.
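For reference, those limit tweaks boil down to a couple of postgresql.conf settings plus a per-role cap. The numbers and role name below are placeholders rather than our actual values.

# postgresql.conf
max_connections = 500
superuser_reserved_connections = 10

-- per-user cap on the account that runs a lot of threads
ALTER ROLE report_runner CONNECTION LIMIT 50;

Raising max_connections requires a restart; the ALTER ROLE limit applies to new connections immediately.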

Now, we discovered something interesting about how queries against the standby interact with replication. It makes a lot of sense when you start thinking about it — if you query against the writable replica, there’s some blocking that goes on. The system isn’t going to vacuum data that you’re currently trying to use. The standby, however, doesn’t have any way to clue the writable replica in on the fact that you are trying to use some data. So the writable replica gets a delete, does its thing to hide those rows from future queries, and eventually auto-vacuum comes through and cleans up those rows. All of this gets pushed over to the standby … and there goes the data you were trying to read.

Odds of this happening on a query that takes eight seconds? Incredibly low! Odds increase, however, the longer a query runs. So some of our super massive reports started seeing their queries cancelled with “ERROR: canceling statement due to conflict with recovery”.
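If you want to see how often this is happening, the standby keeps per-database counters of cancelled queries in pg_stat_database_conflicts; confl_snapshot is the one that increments for the vacuumed-away-rows case described above.

SELECT datname, confl_snapshot, confl_lock, confl_bufferpin FROM pg_stat_database_conflicts;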

There are two solutions in the PostgreSQL documentation — one is to increase the max_standby_streaming_delay value (there’s also an archive delay, but we aren’t particularly concerned about clients querying the server during recovery operations); the other is to avoid vacuuming data too quickly, either by setting hot_standby_feedback on the standby or by increasing vacuum_defer_cleanup_age on the primary.
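As a sketch, with arbitrary example values rather than recommendations:

# postgresql.conf on the standby
max_standby_streaming_delay = 600s   # default is 30s
hot_standby_feedback = on

# postgresql.conf on the primary
vacuum_defer_cleanup_age = 100000    # measured in transactions; adds bloat on the primary

Both hot_standby_feedback and vacuum_defer_cleanup_age keep the primary from cleaning up rows a standby query might still need, so they trade query safety on the standby for extra bloat on the primary.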

There’s a third option too — don’t use the standby for long-running queries. That’s easily done in our case … and doesn’t require tweaking any PostgreSQL settings. Ad hoc reporting and direct user access really shouldn’t be running such substantial queries (it’s always good to have a SQL expert plan out and optimize complex queries if that’s an option).