Category: Database

PostgreSQL 12 — Cascading Replication

I’ve got replicated PostgreSQL database pairs that each have some 50TB of data. The server operating systems need to be upgraded, but there is a constraint: no in-place upgrades. I don’t get to veto that constraint (i.e. the fact that we could just cross our fingers and upgrade a replica … and, if it fails, build new and pull the data again doesn’t matter). Unfortunately, trying to add a second replica delays the existing replication. Since all write operations go to the RW server and reads go to the read-only replica … having the read-only copy a day or two out of sync whilst this secondary replica comes online is a non-starter.

Fortunately, you can cascade replication — seed the new replica from the current read-only replica and have it stream from there. Create a new replication slot on that read-only replica — here new_pg_ro_replica_pgdata (slot names may only contain lowercase letters, numbers, and underscores). You also need to verify the new server is in the pg_hba.conf file so it can authenticate with the replication account.
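A minimal sketch of those two prerequisites (slot creation and host access), with the psql superuser and the 10.0.0.99 address as placeholders, assuming the replication account authenticates with a password:

# On the current read-only replica (the server pg_basebackup will pull from),
# create the physical slot ahead of time:
psql -h pg-ro-replica.example.net -U postgres -c "SELECT pg_create_physical_replication_slot('new_pg_ro_replica_pgdata');"

# And confirm pg_hba.conf on that replica has a replication entry for the new
# server's address (placeholder address):
#   host  replication  replicatorID  10.0.0.99/32  md5

With the slot in place and access sorted, kick off the base backup: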

pg_basebackup -h pg-ro-replica.example.net -D /pgdata -U replicatorID -v -P --wal-method=stream --slot=new_pg_ro_replica_pgdata

Wait … wait … wait. It’ll finish eventually. Then add the standby connection settings to postgresql.conf (or postgresql.auto.conf) on the new replica; PostgreSQL 12 no longer uses recovery.conf:

primary_conninfo = 'host=pg-ro-replica.example.net port=5432 user=replicatorID password=your_password sslmode=require'
primary_slot_name = 'new_pg_ro_replica_pgdata'

And create the standby signal file, which replaces the old standby_mode = 'on' setting:
touch /pgdata/standby.signal

Finally, start the server

pg_ctl start -D /pgdata

Voila — a second read-only replica. Now they can decom the old server.
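Before the old server goes away, it’s worth confirming the new standby is actually streaming and caught up. A quick check on whichever host its primary_conninfo points at (column names as of PostgreSQL 12):

# One row per connected standby; replay_lsn close to sent_lsn means it is caught up
psql -U postgres -c "SELECT client_addr, state, sent_lsn, replay_lsn FROM pg_stat_replication;"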

PostgreSQL with File System Compression – VDO and ZFS

Our database storage is sizable. To reduce the financial impact of storing so much data, we opted to use a compressed file system. This allows us to maintain, for example, 8TB of data in under 2TB of space. Unfortunately, the ZFS file system we use to compress our data is no longer “built in” with newer versions of RedHat.

There are alternatives. BTRFS is a long-standing option; however, it’s got reliability issues (we piloted BTRFS on one of the read-only replicas, and the compression ratio is nowhere near as good — the 2TB of ZFS data filled the 10TB BTRFS disk even using the better compression option, and I/O was so slow there was a continual replication backlog). RedHat introduced Virtual Data Optimizer (VDO) to replace ZFS. In theory, it’s better since it also deduplicates data (e.g. if every one of us saved the same PPT presentation to the disk, only one copy would actually be stored). That’s great for email and file shares where a lot of people are likely to store the same information. Not so useful on a database server where there’s little to de-duplicate. It does, however, compress data … so we decided to try it out.

The results, unfortunately, are not spectacular. VDO does not allow you to do much customization of the compression. It’s on or off. I’ve found some people tweaking it up in unsupported ways, but the impetus behind trying VDO was that it’s supported by RedHat. Making unsupported changes to it defeats that purpose. And the compression that we’re seeing is far less than what we get in ZFS. Our existing servers run between 4.5x and 6x compression.
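For comparison’s sake, the ratio on the ZFS side is easy to pull; ‘pgpool’ below is just an example dataset name:

# Reports the compression ratio actually achieved for the dataset
zfs get compressratio pgpool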

In VDO, however, we don’t even get a 2x compression factor. 11TB of information is stored in 8TB of space! That’s only about 1.4x.

So, while we found the performance of VDO to be satisfactory and it’s really easy to use in newer RedHat releases … we’d have to increase our 20TB LUNs to 80TB to continue storing the data we store today. That seems like A Really Bad Idea(tm).

Seems like I’m going to have to sort out using OpenZFS on the new servers.

Oracle SQLNET.ORA Active Directory Anonymous Binds

A very, very long time ago (2002-ish), we moved to using AD to store our Oracle connections — it’s far easier to edit the one connection entry in Active Directory than to distribute the latest connection file to every desktop and server in the company. Frankly, they never get to the servers. Individuals enter the connections they need … and update them when something stops working and they find the new host/port/etc. Unfortunately, Oracle used an anonymous connection to retrieve the data. So we’ve had anonymous binds enabled in Active Directory ever since. I no longer support AD, so haven’t really kept up with it … until a coworker asked why this huge security vulnerability was specifically configured for our domain. And I gave him the whole history. While we were chatting, a quick search revealed that Oracle 21c and later clients actually can use a wallet for credentials in the sqlnet.ora file:

NAMES.LDAP_AUTHENTICATE_BIND = TRUE
NAMES.LDAP_AUTHENTICATE_BIND_METHOD = LDAPS_SIMPLE_AUTH
WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA =
      (DIRECTORY = /path/to/wallet.file)
    )
  )

From https://www.oracle.com/a/otn/docs/database/oracle-net-active-directory-naming.pdf
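For what it’s worth, creating the wallet itself is a standard orapki operation; this is just a sketch using the same placeholder directory as the WALLET_LOCATION above, and the steps for loading the AD bind credential into the wallet are covered in the linked Oracle document:

# Create an auto-login wallet in the directory referenced by WALLET_LOCATION
orapki wallet create -wallet /path/to/wallet.file -auto_login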


PostgreSQL and Timescale with RedHat VDO

RedHat is phasing out ZFS – there are several reasons for this move, but primarily ZFS is a closed-source Solaris (now Oracle) codebase. While OpenZFS exists, it’s not quite ‘the same’. RedHat’s preferred solution is Virtual Data Optimizer (VDO). This page walks through installing PostgreSQL, creating a database cluster on VDO, and installing the TimescaleDB extension on that cluster, all on RedHat Enterprise Linux 8 (RHEL8).

Before we create a VDO disk, we need to install it

yum install vdo kmod-kvdo

Then we need to create a VDO volume – here a VDO named ‘PGData’ is created on /dev/sdb, a 9TB device on which we will present 16TB of logical space

vdo create --name=PGData --device=/dev/sdb --vdoLogicalSize=16T

Check to verify that the object was created – it is /dev/mapper/PGData in this instance

vdo list

Now format the volume using xfs.

mkfs.xfs /dev/mapper/PGData

And finally add a mount point

# Create the mount point folder
mkdir /pgpool
# Add the new volume to /etc/fstab so it mounts to that mount point
echo '/dev/mapper/PGData /pgpool xfs defaults,x-systemd.requires=vdo.service 0 0' >> /etc/fstab
# Load the updated fstab
systemctl daemon-reload
# and mount the volume
mount -a

It should now be mounted at ‘/pgpool/’

The main reason for using VDO with Postgres is its compression feature – this is automatically enabled, although we may need to tweak settings as we test it.
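VDO reports how much space it is actually saving, which is handy to watch as data gets loaded:

# Shows used/available space and the space-saving percentage for each VDO volume
vdostats --human-readable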

We now have a place in our pool where we want our Postgres database to store its data, so let’s go ahead and install PostgreSQL. Here we are using RHEL8 and installing PostgreSQL 12:

# Install the repository RPM:
dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
dnf clean all
# Disable the built-in PostgreSQL module:
dnf -qy module disable postgresql
# Install PostgreSQL:
dnf install -y postgresql12-server

Once the installation is done, we need to initialize the database cluster and start the server. Since we want Postgres to store data on our VDO volume, we need to initialize it into our custom directory. We can do that in a couple of ways.

In either case, we need to make sure that the data directory on our VDO volume, i.e. ‘/pgpool/pgdata/’, is owned by the ‘postgres’ user which is created when we install PostgreSQL. We can do that by running the below commands before starting the Postgres server:

mkdir /pgpool/pgdata
chown -R postgres:postgres /pgpool

Customize the systemd service by editing the postgresql-12 unit file and updating the PGDATA environment variable:

vdotest-uos:pgpool # grep Environment /usr/lib/systemd/system/postgresql-12.service
# Note: avoid inserting whitespace in these Environment= lines, or you may
Environment=PGDATA=/pgpool/pgdata

and then initialize, enable, and start our server as below

/usr/pgsql-12/bin/postgresql-12-setup initdb
systemctl enable postgresql-12
systemctl start postgresql-12

Here ‘/usr/pgsql-12/bin/’ is the bin directory of the Postgres installation; substitute your own bin directory path if it differs.

or

We can also pass the data directory directly when initializing the database cluster using the below command

/usr/pgsql-12/bin/initdb -D /pgpool/pgdata/

and then start the server using

systemctl start postgresql-12
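Either way, it’s worth confirming the running cluster really is using the VDO-backed directory. A quick check, assuming local peer authentication for the postgres OS user:

# Should return /pgpool/pgdata
sudo -u postgres psql -c "SHOW data_directory;"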

Now that we have installed PostgreSQL and started the server, we will install the Timescale extension for Postgres.

Add the Timescale repo with the below command

tee /etc/yum.repos.d/timescale_timescaledb.repo <<EOL
[timescale_timescaledb]
name=timescale_timescaledb
baseurl=https://packagecloud.io/timescale/timescaledb/el/8/\$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/timescale/timescaledb/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300
EOL
sudo yum update -y

Then install it using the below command

yum install -y timescaledb-postgresql-12

After installing, we need to add ‘timescaledb’ to shared_preload_libraries in our postgresql.conf. Timescale provides ‘timescaledb-tune’, which can do this and also tune various settings for our database. Since we initialized our PG database cluster in a custom location, we need to point timescaledb-tune at our postgresql.conf; it also requires the path to our pg_config file. We can do both with the following command.

timescaledb-tune --pg-config=/usr/pgsql-12/bin/pg_config --conf-path=/pgpool/pgdata/postgresql.conf

After running the above command, we need to restart our Postgres server:

systemctl restart postgresql-12

After restarting, connect to the database in which you want to use Timescale hypertables and run the below statement to load the Timescale extension:

CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;

You can check that Timescale is loaded by passing the ‘\dx’ command to psql, which will list the installed extensions.
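For example (the database name below is just a placeholder):

sudo -u postgres psql -d yourdatabase -c '\dx'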

In order to configure PostgreSQL to allow remote connections, we need to make a couple of changes, as sketched below.
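A minimal sketch of the usual two changes, assuming password authentication and that you scope the pg_hba.conf rule to your own client network (the 10.0.0.0/24 range below is just an example):

# In /pgpool/pgdata/postgresql.conf, listen on more than just localhost:
#   listen_addresses = '*'
# In /pgpool/pgdata/pg_hba.conf, allow the client network to log in with a password:
#   host  all  all  10.0.0.0/24  md5
# Then restart so the new listen_addresses takes effect:
systemctl restart postgresql-12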

MongoDB: Changing Host in Replica Set

When we get replacement servers at work, they frequently build a new server with a temporary name and IP address with the plan of swapping the host name and IP with the decommed server. So my Server123 gets turned down, Server123-Temp gets renamed to Server123, and the IP from the old server is configured on the replacement. Everything is operating exactly as it was before even if the direct host name or IP address were used — great for not needing to update firewall rules and vpn profiles, but I encountered a bit of a problem with the MongoDB cluster.

When I initiated the replica set, I did not have to specify a host name. It pulled the host name from the system — which happily provided that temporary name that doesn’t really exist (it’s not in DNS). Which was fine — I could add the temporary name to /etc/hosts along with the future name that I’ve remapped to the current IP so my “new” VMs all talk to each other and the old servers don’t get mucked up.

But, eventually, I’d like the replica set to have the right names. Had I known about this ahead of time, I’d simply have changed the host name value on the box to be the permanent name, initialized the replica set, and returned the temporary name to the box. But I didn’t, and I didn’t really want to start from 0 with the database I’d restored. Luckily, it turns out there’s a process for re-creating the replica set without destroying the data.

First, edit the mongo.conf file and comment out the replica set block. Restart each server with the new config. Then delete the “local” database from each MongoDB server using mongo local --eval "db.dropDatabase()"

Uncomment the replica set block in mongo.conf and restart MongoDB again — initialize the replica set again (make sure the server “knows” its proper name first!)
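As a rough sketch of the whole sequence in shell terms, assuming a systemd unit named mongod, the legacy mongo shell these servers run, and hypothetical permanent host names:

# 1. Comment out the replication: block in mongo.conf, then on each member:
systemctl restart mongod
mongo local --eval "db.dropDatabase()"
# 2. Fix the host name, uncomment the replication: block, and restart again:
systemctl restart mongod
# 3. On one member, re-initiate the replica set with the permanent names:
mongo --eval 'rs.initiate({_id: "myReplicaSet", members: [ {_id: 0, host: "server123.example.net:27017"}, {_id: 1, host: "server124.example.net:27017"}, {_id: 2, host: "server125.example.net:27017"} ]})'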

MongoDB: Increasing Log Level

We had a problem with an application accessing our MongoDB cluster, and the log files didn’t provide much useful information. I could see the client connect and disconnect … but nothing in between. I discovered that the default logging level is very low. Good for disk utilization and I/O, but not great for troubleshooting.

db.runCommand({getParameter: 1, logLevel: 1}) # Get the current logging level
db.setLogLevel(3) # Fairly robust logging
db.setLogLevel(5) # don't try this in prod: a huge stream of text, super-verbose logging
db.setLogLevel(0) # and set logging back to a low level once you are done troubleshooting

You can also increase the log level for individual components of MongoDB to minimize logging I/O:

db.setLogLevel(2, "accessControl")


MongoDB: Setting Up a Replica Set

On one server create a key file. Copy this key file to all other servers that will participate in the replica set

mkdir -p /opt/mongodb/keys/
openssl rand -base64 756 > /opt/mongodb/keys/$(date '+%Y-%m-%d').key
chmod 400 /opt/mongodb/keys/$(date '+%Y-%m-%d').key
chown -R mongodb:mongodb /opt/mongodb/keys/$(date '+%Y-%m-%d').key

On each server, edit /etc/mongo.conf to add the keyfile to the security section and define a replica set

security:
  authorization: enabled
  keyFile: /opt/mongodb/keys/YYYY-MM-DD.key   # the key file created above
replication:
  replSetName: "myReplicaSet"

Restart MongoDB on each node.
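Assuming MongoDB runs under the usual mongod systemd unit (adjust the service name if your packaging differs):

systemctl restart mongod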

On one server, use mongosh to enter the MongoDB shell.

rs.initiate(
  {
    _id: "myReplicaSet",
    members: [
      { _id: 0, host: "mongohost1.example.net" },
      { _id: 1, host: "mongohost2.example.net" },
      { _id: 2, host: "mongohost3.example.net" }
    ]
  }
)

Use rs.status() to view the status of the replica set. If it is stuck in STARTING … check connectivity. If the port is open, you may have hit the snag I ran into with some replacement servers: they’ve got temporary hostnames, and you cannot add a host on itself by name — it ignores that you typed mongohost1.example.net, takes its own hostname value, and sends that value to the other servers in the replica set. If you cannot change the hostname to match what you want, there is a process to change the hostname in a replica set (see “MongoDB: Changing Host in Replica Set” above).
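For a quick view of each member’s state, rs.status() can also be run from the command line; a mongosh one-liner (add -u / -p as needed):

mongosh --quiet --eval 'rs.status().members.forEach(m => print(m.name, m.stateStr))'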

MongoDB: Where is my shell?!?

We are upgrading servers from really old MongoDB (4.2.15) to much newer MongoDB (6.something). I am used to getting into the MongoDB shell using:

mongoserver:~ # mongo -u $STRMONGOUSER -p $STRMONGOPASS
MongoDB shell version v4.2.15
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("5658a72f-fea0-4316-aa97-4b0c0ffab7ff") }
MongoDB server version: 4.2.15

Except the new server says there’s no such file. And there isn’t. A bit of research later, I learn that the shell is now called mongosh … which is a more reasonable name. It works the same way: mongosh -u $STRMONGOUSER -p $STRMONGOPASS gets me there, and all of the commands I know work.

Backing up (and restoring) *All* Data in MongoDB

The documentation on Mongo’s website tells you to use mongodump with a username, password, destination, and which database you want to back up. Except I wanted to back up and restore everything: users, multiple databases, and I don’t really know what else is in there … hence I want everything instead of enumerating the things I want.

Turns out you can just omit the database name and it dumps everything

mongodump --uri="mongodb://<host URL/IP>:<Port>" -u $STRMONGODBUSER -p $STRMONGODBPASS

And restore with

mongorestore --uri="mongodb://<host URL/IP>:<Port>"

No credentials are needed on the restore since the new server is a blank slate with no authentication or users defined yet.
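If you want to keep the dumps organized (and restore from a specific one), mongodump’s --out flag and mongorestore’s directory argument handle that; the path below is just an example:

# Dump everything into a dated folder
mongodump --uri="mongodb://<host URL/IP>:<Port>" -u $STRMONGODBUSER -p $STRMONGODBPASS --out=/backups/mongo-$(date '+%Y-%m-%d')
# Restore everything from that folder onto the new, empty server
mongorestore --uri="mongodb://<host URL/IP>:<Port>" /backups/mongo-YYYY-MM-DD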

MongoDB: Basics

We inherited a system that uses MongoDB, and I managed to get the sandbox online without actually learning anything about Mongo. The other environments, though, have data people care about set up in a replicated cluster of database servers. That seems like the sort of thing that’s going to require knowing more than “it’s a NoSQL database of some sort”.

It is a NoSQL database — documents are organized into ‘collections’ within the database. You can have multiple databases hosted on a server, too. A document is a group of key/value pairs with dynamic schema (i.e. you can just make up keys as you go).

There are GUI clients and a command-line shell … of course I’m going with the shell 🙂 There is a db object for basic CRUD operations, using db.nameOfCollection followed by the operation type:

db.collectionName.insert({"key1": "string1", "key2" : false, "key3": 12345})
db.collectionName.find({key3 : {$gt : 10000} })
db.collectionName.update({key1 : "string1"}, {$set: {key3: 100}})
db.collectionName.remove({key1: "string1"});

CRUD operations can also be performed with NodeJS code — create a file with the script you want to run, then run “node myfile.js”

Create a document in a collection

var objMongoClient = require('mongodb').MongoClient;
var strMongoDBURI = "mongodb://mongodb.example.com:27017/";
  
objMongoClient.connect(strMongoDBURI, function(err, db) {
  if (err) throw err;
    var dbo = db.db("dbNameToSelect");
    var objRecord = { key1: "String Value1", key2: false };
    dbo.collection("collectionName").insertOne(objRecord, function(err, res) {
         if (err) throw err;
         console.log("document inserted");
         db.close();
    });
}); 

Read a document in a collection

var objMongoClient = require('mongodb').MongoClient;
var strMongoDBURI = "mongodb://mongodb.example.com:27017/";

objMongoClient.connect(strMongoDBURI, function(err, db) {
  if (err) throw err;
    var dbo = db.db("dbNameToSelect");
    var objQuery = { key1: "String Value 1" };
    dbo.collection("collectionName").find(objQuery).toArray(function(err, result) {
     if (err) throw err;
     console.log(result);
     db.close();
  });
}); 

Delete a document in a collection

var objMongoClient = require('mongodb').MongoClient;
var strMongoDBURI = "mongodb://mongodb.example.com:27017/";

objMongoClient.connect(strMongoDBURI, function(err, db) {
if (err) throw err;
  var dbo = db.db("dbNameToSelect");
  var objRecord= { key1: "String Value 1" };
  dbo.collection("collectionName").deleteOne(objRecord, function(err, obj) {
    if (err) throw err;
    console.log("Record deleted");
    db.close();
});
}); 

Update a document in a collection

var objMongoClient = require('mongodb').MongoClient;
var strMongoDBURI = "mongodb://mongodb.example.com:27017/";

objMongoClient.connect(strMongoDBURI, function(err, db) {
if (err) throw err;
  var dbo = db.db("dbNameToSelect");
  var objQuery = { key1: "String Value 1" };
  var objNewValues = { $set: {key3: 12345, key4: "Another string value" } };
  dbo.collection("collectionName").updateOne(objQuery, objNewValues , function(err, res) {
    if (err) throw err;
    console.log("Record updated");
    db.close();
   });
});