Category: Technology

September 13, 2023

Kubernetes / Containerd Image Pull Failure

We are in the process of moving our k8s environment from CentOS 7 to RHEL 8.8 hosts — which means the version of pretty much everything involved is being updated. All of the pods that use images from an internal registry fail to load. At first, we were thinking DNS resolution … but the test pods we spun up all resolved names appropriately.

2023-09-13 13:48:34 [root@k8s ~/]# kubectl describe pod data-sync-app-deployment-78d58f7cd4-4mlsb -n streams
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Normal   Scheduled         15m                 default-scheduler  Successfully assigned kstreams/data-sync-app-deployment-78d58f7cd4-4mlsb to ltrkarkvm1593-uos
  Normal   Pulled            15m                 kubelet            Container image "docker.elastic.co/beats/filebeat:7.9.1" already present on machine
  Normal   Created           15m                 kubelet            Created container filebeat
  Normal   Started           15m                 kubelet            Started container filebeat
  Normal   BackOff           15m (x3 over 15m)   kubelet            Back-off pulling image "imageregistry.example.net:5000/myapp/app_uat"
  Warning  Failed            15m (x3 over 15m)   kubelet            Error: ImagePullBackOff
  Normal   Pulling           14m (x3 over 15m)   kubelet            Pulling image "imageregistry.example.net:5000/myapp/app_uat"
  Warning  Failed            14m (x3 over 15m)   kubelet            Failed to pull image "imageregistry.example.net:5000/myapp/app_uat": rpc error: code = Unknown desc = failed to pull and unpack image "imageregistry.example.net:5000/myapp/app_uat:latest": failed to resolve reference "imageregistry.example.net:5000/npm/app_uat:latest": get registry endpoints: parse endpoint url: parse " http://imageregistry.example.net:5000": first path segment in URL cannot contain colon
  Warning  Failed            14m (x3 over 15m)   kubelet            Error: ErrImagePull
  Warning  DNSConfigForming  31s (x73 over 15m)  kubelet            Search Line limits were exceeded, some search paths have been omitted, the applied search line is: kstreams.svc.cluster.local svc.cluster.local cluster.local mgmt.windstream.net dsys.windstream.net dnoc.windstream.net

Searching for “first path segment in URL cannot contain colon” turns up references to Go’s URL parsing, mostly for older versions of Go at that. There were all sorts of suggestions for working around the issue: escaping the colon, starting with “//”, adding single or double quotes around the string, downgrading to a version of Go not impacted by the problem. Nothing worked.

After a few hours with no progress, I thought some time investigating “how can I work around this?” was in order. Kubernetes is using containerd … so it should be feasible to pre-stage the image in containerd and then set our imagePullPolicy values to IfNotPresent or Never.

To pre-seed the images in containerd so that they are available to Kubernetes, run:

ctr -n=k8s.io image pull -u $REGISTRYUSER:$REGISTRYPASSWORD --plain-http imageregistry.example.net:5000/myapp/app_uat:latest

This must be run on every k8s worker in the environment — if a pod tries to spin up on server2 but you’ve only seeded the image file on server1 … the pod will fail to load. We need to update this staged image every time we make changes to the application. Better than not using the new servers, so that’ll just be the process for a while.

Ultimately, the problem ended up being that a few of the workers had a leading space in front of the registry URL in the TOML file. The space is visible in the error above, just inside the opening quote: parse " http://imageregistry.example.net:5000". How that got there, I have no idea. But once the extraneous whitespace was gone, we could deploy the pods without issue. Now that it’s working “as designed”, we deleted the pre-seeded image using:

ctr -n=k8s.io images rm ImageNameHere
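
A stray space like that is hard to spot by eye. As a rough sketch (not something we ran at the time), a few lines of Node can flag quoted values that start or end with whitespace. The function name and the sample TOML layout are assumptions for illustration; the real file on your workers may look different, and this is a simple line scan rather than a TOML parser.

```javascript
// Hypothetical helper: flag double-quoted values that start or end with whitespace.
function findPaddedValues(configText) {
  const findings = [];
  configText.split(/\r?\n/).forEach((line, index) => {
    // Collect every double-quoted string on this line.
    const quoted = line.match(/"([^"]*)"/g) || [];
    for (const token of quoted) {
      const value = token.slice(1, -1); // strip the surrounding quotes
      if (value !== value.trim()) {
        findings.push({ line: index + 1, value });
      }
    }
  });
  return findings;
}

// Assumed sample; the section name and key are illustrative only.
const sample = [
  '[plugins."io.containerd.grpc.v1.cri".registry.mirrors."imageregistry.example.net:5000"]',
  '  endpoint = [" http://imageregistry.example.net:5000"]',
].join('\n');

console.log(findPaddedValues(sample));
```

To check a real file, read it with fs.readFileSync(pathToFile, 'utf8') and pass the text to the same function.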

September 7, 2023

MongoDB: Changing Host in Replica Set

When we get replacement servers at work, they frequently build a new server with a temporary name and IP address with the plan of swapping the host name and IP with the decommed server. So my Server123 gets turned down, Server123-Temp gets renamed to Server123, and the IP from the old server is configured on the replacement. Everything is operating exactly as it was before even if the direct host name or IP address were used — great for not needing to update firewall rules and vpn profiles, but I encountered a bit of a problem with the MongoDB cluster.

When I initiated the replica set, I did not have to specify a host name. It pulled the host name from the system — which happily provided that temporary name that doesn’t really exist (it’s not in DNS). Which was fine — I could add the temporary name to /etc/hosts along with the future name that I’ve remapped to the current IP so my “new” VMs all talk to each other and the old servers don’t get mucked up.

But, eventually, I’d like the replica set to have the right names. Had I known about this ahead of time, I’d simply have changed the host name value on the box to be the permanent name, initialized the replica set, and returned the temporary name to the box. But I didn’t, and I didn’t really want to start from 0 with the database I’d restored. Luckily, it turns out there’s a process for re-creating the replica set without destroying the data.

First, edit the mongo.conf file and comment out the replica set block. Restart each server with the new config. Then delete the “local” database from each MongoDB server using mongo local --eval "db.dropDatabase()" (on newer versions, the shell is mongosh instead of mongo).

Uncomment the replica set block in mongo.conf and restart MongoDB again, then initialize the replica set again (make sure the server “knows” its proper name first!).

September 6, 2023

Redis Continually Receiving SIGTERM

I brought up a redis cluster — three servers which all logged basically nothing apart from the fact they were about to shut down. The service status showed as “Activating” — never started — and the server wasn’t doing anything useful.

The redis log reads:

2920940:signal-handler (1694019281) Received SIGTERM scheduling shutdown...
2921151:signal-handler (1694019374) Received SIGTERM scheduling shutdown...
2921518:signal-handler (1694019468) Received SIGTERM scheduling shutdown...
2921726:signal-handler (1694019561) Received SIGTERM scheduling shutdown...
2922133:signal-handler (1694019655) Received SIGTERM scheduling shutdown...
2922410:signal-handler (1694019748) Received SIGTERM scheduling shutdown...
2923173:signal-handler (1694019842) Received SIGTERM scheduling shutdown...
2923537:signal-handler (1694019935) Received SIGTERM scheduling shutdown...
2923747:signal-handler (1694020029) Received SIGTERM scheduling shutdown...
2924110:signal-handler (1694020122) Received SIGTERM scheduling shutdown...
2924319:signal-handler (1694020216) Received SIGTERM scheduling shutdown...
2924687:signal-handler (1694020309) Received SIGTERM scheduling shutdown...
2924900:signal-handler (1694020403) Received SIGTERM scheduling shutdown...
2925266:signal-handler (1694020496) Received SIGTERM scheduling shutdown...
2925467:signal-handler (1694020590) Received SIGTERM scheduling shutdown...

Turns out this is a hazard of copy/pasting a unit file from an older server. With a service type of “forking”, systemd waits for the process it launched to fork into the background and exit; this redis evidently stays in the foreground, so systemd eventually gives up, sends SIGTERM, and tries again. To resolve the issue, edit /etc/systemd/system/redis.service and update the type to “simple”. Use systemctl daemon-reload and then systemctl restart redis to launch redis with the new config … voila, I’ve got a cluster of three servers that are started and communicating.
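
The numbers in parentheses in the log are epoch seconds, and the spacing between them is a useful clue. Here is a quick sketch (the function name is made up for illustration) that pulls the timestamps out and prints the gap between consecutive SIGTERMs, using the first few lines of the log above:

```javascript
// Hypothetical helper: seconds between consecutive SIGTERM log entries.
function sigtermIntervals(logText) {
  const stamps = [];
  for (const line of logText.split(/\r?\n/)) {
    // Lines look like: 2920940:signal-handler (1694019281) Received SIGTERM ...
    const match = line.match(/signal-handler \((\d+)\) Received SIGTERM/);
    if (match) stamps.push(Number(match[1]));
  }
  // Subtract each timestamp from the one that follows it.
  return stamps.slice(1).map((stamp, i) => stamp - stamps[i]);
}

const redisLog = [
  '2920940:signal-handler (1694019281) Received SIGTERM scheduling shutdown...',
  '2921151:signal-handler (1694019374) Received SIGTERM scheduling shutdown...',
  '2921518:signal-handler (1694019468) Received SIGTERM scheduling shutdown...',
  '2921726:signal-handler (1694019561) Received SIGTERM scheduling shutdown...',
].join('\n');

console.log(sigtermIntervals(redisLog)); // [ 93, 94, 93 ]
```

A steady gap of a little over 90 seconds is what I would expect if systemd’s start timeout (90 seconds by default, assuming the unit does not override it) is what keeps killing the process.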

September 5, 2023

MongoDB: Increasing Log Level

We had a problem with an application accessing our MongoDB cluster, and the log files didn’t provide much useful information. I could see the client connect and disconnect … but nothing in between. I discovered that the default logging level is very low. Good for disk utilization and I/O, but not great for troubleshooting.

db.adminCommand({getParameter: 1, logLevel: 1}) // Get the current logging level (getParameter runs against the admin database)
db.setLogLevel(3) // Fairly robust logging
db.setLogLevel(5) // Don't try this in prod: a huge stream of text, super logging
db.setLogLevel(0) // And set logging back to a low level once you are done troubleshooting

You can also increase the log level for individual components of MongoDB to minimize logging I/O:

db.setLogLevel(2, "accessControl" )

September 1, 2023

Tableau: Workbooks and Views Created or Modified By a Specific Individual

I had a manager looking to locate a ‘something in Tableau’ that was created by a specific individual — in this case, it was a terminated employee so “just ask the person” was not a viable solution. I put together a query to list all workbooks owned by or modified by an individual:

SELECT w.id, w.name, w.description, w.owner_id, w.modified_by_user_id, owner_system_users.email AS owner_email, modified_system_users.email AS modifier_email
     FROM  public.workbooks AS w
      LEFT OUTER JOIN public.users AS owner_users ON w.owner_id = owner_users.id
      LEFT OUTER JOIN public.users AS modified_users ON w.modified_by_user_id = modified_users.id
      LEFT OUTER JOIN public.system_users AS owner_system_users ON owner_system_users.id = owner_users.system_user_id
      LEFT OUTER JOIN public.system_users AS modified_system_users ON modified_system_users.id = modified_users.system_user_id
      WHERE owner_system_users.name = 'UserLogonID' OR modified_system_users.name = 'UserLogonID'
--      WHERE owner_system_users.email LIKE '%Smith%' OR modified_system_users.email LIKE '%Smith%'
      ;

As well as a query to identify all views owned by an individual:

SELECT views.*, owner_system_users.email AS owner_email
     FROM  public.views
      LEFT OUTER JOIN public.users AS owner_users ON views.owner_id = owner_users.id
      LEFT OUTER JOIN public.system_users AS owner_system_users ON owner_system_users.id = owner_users.system_user_id
      WHERE owner_system_users.name = 'UserLogonID'
--      WHERE owner_system_users.email LIKE '%Smith%'
      ;

The email address based search is most reasonable — our email addresses are algorithmically based on our names, so we always know what the address would have been. Many contractors, however, don’t have Office 365 licenses or mailboxes … so I have to fall back to finding their logon ID in those cases.

August 31, 2023

MongoDB: Setting Up a Replica Set

On one server, create a key file. Copy this key file to all other servers that will participate in the replica set.

mkdir -p /opt/mongodb/keys/
openssl rand -base64 756 > /opt/mongodb/keys/$(date '+%Y-%m-%d').key
chmod 400 /opt/mongodb/keys/$(date '+%Y-%m-%d').key
chown -R mongodb:mongodb /opt/mongodb/keys/$(date '+%Y-%m-%d').key
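
If openssl is not handy, a key of the same size can be generated from Node. This is only a sketch: the helper name is made up, the demo writes to a throwaway temp directory rather than /opt/mongodb/keys/, and it uses the UTC date where the date command above uses local time.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: write a new random key file and return its path.
function writeKeyFile(directory, fileName) {
  fs.mkdirSync(directory, { recursive: true });
  const keyPath = path.join(directory, fileName);
  // 756 random bytes encode to 1008 base64 characters, the same amount of
  // key material that "openssl rand -base64 756" produces.
  const key = crypto.randomBytes(756).toString('base64');
  // Mode 0o400 = readable by the owner only, like chmod 400.
  fs.writeFileSync(keyPath, key + '\n', { mode: 0o400 });
  return keyPath;
}

// Demo: write into a fresh temp directory, named after today's (UTC) date.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'mongo-key-'));
const demoKeyPath = writeKeyFile(demoDir, new Date().toISOString().slice(0, 10) + '.key');
console.log(demoKeyPath);
```

The file still needs to be owned by the account MongoDB runs as, so the chown step above applies either way.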

On each server, edit /etc/mongo.conf, add the key file to the security section (keyFile needs to be the path where the key actually lives on that server), and define a replica set:

security:
 authorization: enabled
 keyFile:  /etc/mongodb/keys/mongo-key
#replication:
replication:
  replSetName: "myReplicaSet"

Restart MongoDB on each node.

On one server, use mongosh to enter the MongoDB shell.

rs.initiate(
{
_id: "myReplicaSet",
members: [
{ _id: 0, host: "mongohost1.example.net" },
{ _id: 1, host: "mongohost2.example.net" },
{ _id: 2, host: "mongohost3.example.net" }
]
})

Use rs.status() to view the status of the replica set. If it is stuck in STARTUP … check connectivity. If the port is open, you may have hit the snag I ran into with some replacement servers: they’ve got temporary hostnames, and you cannot choose the name a host uses for itself. It ignores that you typed mongohost1.example.net … and takes its own hostname value, then sends that value to the other servers in the replica set. If you cannot change the hostname to match what you want, there is a process to change the hostname in a replica set.
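
One way to spot the problem is to compare the member hosts in the replica set configuration against the names you meant to use. A small sketch, assuming an object shaped like the output of rs.conf(); the helper name and the “server123-temp” host are made up for illustration:

```javascript
// Hypothetical helper: list members whose host is not one of the names we expect.
function unexpectedHosts(replicaSetConfig, expectedNames) {
  const expected = new Set(expectedNames.map((name) => name.toLowerCase()));
  return replicaSetConfig.members
    .map((member) => member.host)
    // Members are stored as "hostname:port"; compare the hostname part only.
    .filter((host) => !expected.has(host.split(':')[0].toLowerCase()));
}

// Made-up example; "server123-temp" stands in for a temporary hostname
// picked up from the box that ran rs.initiate().
const exampleConfig = {
  _id: 'myReplicaSet',
  members: [
    { _id: 0, host: 'server123-temp:27017' },
    { _id: 1, host: 'mongohost2.example.net:27017' },
    { _id: 2, host: 'mongohost3.example.net:27017' },
  ],
};

console.log(unexpectedHosts(exampleConfig, [
  'mongohost1.example.net',
  'mongohost2.example.net',
  'mongohost3.example.net',
])); // [ 'server123-temp:27017' ]
```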

August 30, 2023

MongoDB: Where is my shell?!?

We are upgrading servers from really old MongoDB (4.2.15) to much newer MongoDB (6.something). I am used to getting into the MongoDB shell using:

mongoserver:~ # mongo -u $STRMONGOUSER -p $STRMONGOPASS
MongoDB shell version v4.2.15
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("5658a72f-fea0-4316-aa97-4b0c0ffab7ff") }
MongoDB server version: 4.2.15

Except the new server says there’s no such file. And there isn’t. A bit of research later, I learn that the shell is now called mongosh … which is a more reasonable name. It works the same way: mongosh -u $STRMONGOUSER -p $STRMONGOPASS gets me there, and all of the commands I know work.

August 29, 2023

Backing up (and restoring) All Data in MongoDB

The documentation on Mongo’s website tells you to use mongodump with a username, password, destination, and which database you want to back up. Except I wanted to back up and restore everything: users, multiple databases, and whatever else is in there. I don’t really know what that is, hence wanting everything instead of enumerating the things I want.

Turns out you can just omit the database name and it dumps everything:

mongodump --uri="mongodb://<host URL/IP>:<Port>" -u $STRMONGODBUSER -p $STRMONGODBPASS

And restore with:

mongorestore --uri="mongodb://<host URL/IP>:<Port>"

No credentials are needed for the restore since the new server is a blank slate with no authentication or users defined yet.

August 25, 2023

Redis High Availability — Options

The two primary approaches to high availability with Redis are Redis Sentinel and Redis Cluster. There are also third-party solutions, but the provided budget is zero dollars. That is … limiting.

Sentinel is the official high availability solution provided by Redis. It monitors Redis instances, detects failures, and automatically handles failover to a replica. It also provides monitoring/alerting to advise administrators when a problem has been detected.

Sentinel does not provide much in the way of scalability (it adds additional ‘read only’ copies, but there is a single master) but this architecture better ensures consistency (i.e. the same data is present on all nodes). It does, however, promote a read replica to master in the event the master fails, so high availability is achieved.

More than half of the Sentinels need to consider a master down to invoke failover (quorum), so we would want at least three nodes. We experienced issues with two-node Microsoft quorum-based clustering when the two nodes were unable to communicate. Each node considered its partner to be ‘down’ and decided to be the server in charge. And having two servers in charge corrupts data. With three nodes, should they all become separated … they cannot reach a quorum of two servers agreeing on state, so nobody promotes itself.
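
The arithmetic behind wanting three nodes can be sketched in a few lines. This illustrates majority voting in general, not Sentinel’s actual implementation, and the function names are my own:

```javascript
// Smallest number of votes that is more than half of the group.
function majority(totalNodes) {
  return Math.floor(totalNodes / 2) + 1;
}

// A partition may act only if the nodes it can see (itself included) form a majority.
function canAct(visibleNodes, totalNodes) {
  return visibleNodes >= majority(totalNodes);
}

console.log(majority(2));  // 2
console.log(majority(3));  // 2
console.log(canAct(1, 2)); // false
console.log(canAct(2, 3)); // true
console.log(canAct(1, 3)); // false
```

With two nodes, losing either one means nobody has a majority, so a strict majority rule can never fail over. With three, a pair that can still see each other can act, while an isolated node cannot.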

Cluster automatically distributes data across multiple Redis nodes (called shards). Doing so allows more data to be processed in parallel. Redis Cluster also supports replication and automatic failover.
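
How the distribution works, as I understand the Redis Cluster specification: each key is hashed with CRC16 and mapped to one of 16384 hash slots, and each master owns a range of slots. Below is a minimal sketch of the slot calculation. It ignores hash tags (the {...} syntax) for brevity, and the function names are my own.

```javascript
// CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), computed bit by bit.
function crc16(bytes) {
  let crc = 0;
  for (const byte of bytes) {
    crc ^= byte << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) ? ((crc << 1) ^ 0x1021) : (crc << 1);
      crc &= 0xffff; // keep the value within 16 bits
    }
  }
  return crc;
}

// Hash slot for a key, without hash tag handling.
function keySlot(key) {
  return crc16(Buffer.from(key, 'utf8')) % 16384;
}

console.log(keySlot('123456789')); // 12739
```

On a running cluster, CLUSTER KEYSLOT reports the slot Redis itself assigns to a key, which is a better source of truth than this sketch.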

Since clustering provides both high availability and scaling, if the write load is a consideration, this may be a preferred option; but distributed data means inconsistent data values may be encountered. If data consistency is paramount, clustering may be undesirable. Additionally, not all Redis clients support communicating with a clustered environment. We would need to have our vendor confirm that the application could use a clustered solution.

The minimum recommended environment for production is larger – six servers. This constitutes three master nodes and three replica nodes.

August 25, 2023

MongoDB: Basics

We inherited a system that uses MongoDB, and I managed to get the sandbox online without actually learning anything about Mongo. The other environments, though, have data people care about set up in a replicated cluster of database servers. That seems like the sort of thing that’s going to require knowing more than “it’s a NoSQL database of some sort”.

It is a NoSQL database — documents are organized into ‘collections’ within the database. You can have multiple databases hosted on a server, too. A document is a group of key/value pairs with dynamic schema (i.e. you can just make up keys as you go).

There are GUI clients and a command-line shell … of course I’m going with the shell 🙂 There is a db function for basic CRUD operations using db.nameOfCollection then the operation type:

db.collectionName.insert({"key1": "string1", "key2" : false, "key3": 12345})

db.collectionName.find({key3 : {$gt : 10000} })

db.collectionName.update({key1 : "string1"}, {$set: {key3: 100}})

db.collectionName.remove({key1: "string1"});

CRUD operations can also be performed with NodeJS code — create a file with the script you want to run, then run “node myfile.js”

Create a document in a collection

var objMongoClient = require('mongodb').MongoClient;
var strMongoDBURI = "mongodb://mongodb.example.com:27017/";
  
objMongoClient.connect(strMongoDBURI, function(err, db) {
  if (err) throw err;
    var dbo = db.db("dbNameToSelect");
    var objRecord = { key1: "String Value 1", key2: false };
    dbo.collection("collectionName").insertOne(objRecord, function(err, res) {
         if (err) throw err;
         console.log("document inserted");
         db.close();
    });
});

Read a document in a collection

var objMongoClient = require('mongodb').MongoClient;
var strMongoDBURI = "mongodb://mongodb.example.com:27017/";

objMongoClient.connect(strMongoDBURI, function(err, db) {
  if (err) throw err;
    var dbo = db.db("dbNameToSelect");
    var objQuery = { key1: "String Value 1" };
    dbo.collection("collectionName").find(objQuery).toArray(function(err, result) {
     if (err) throw err;
     console.log(result);
     db.close();
  });
});

Update a document in a collection

var objMongoClient = require('mongodb').MongoClient;
var strMongoDBURI = "mongodb://mongodb.example.com:27017/";

objMongoClient.connect(strMongoDBURI, function(err, db) {
  if (err) throw err;
  var dbo = db.db("dbNameToSelect");
  var objQuery = { key1: "String Value 1" };
  var objNewValues = { $set: {key3: 12345, key4: "Another string value" } };
  dbo.collection("collectionName").updateOne(objQuery, objNewValues, function(err, res) {
    if (err) throw err;
    console.log("Record updated");
    db.close();
  });
});

Delete a document in a collection

var objMongoClient = require('mongodb').MongoClient;
var strMongoDBURI = "mongodb://mongodb.example.com:27017/";

objMongoClient.connect(strMongoDBURI, function(err, db) {
  if (err) throw err;
  var dbo = db.db("dbNameToSelect");
  var objRecord = { key1: "String Value 1" };
  dbo.collection("collectionName").deleteOne(objRecord, function(err, obj) {
    if (err) throw err;
    console.log("Record deleted");
    db.close();
  });
});