We’ve been seeing an error that prevents clients from connecting to PostgreSQL servers – basically that all available connections are in use and the remaining connection slots are reserved for superuser and replication activity.
First, we need to determine what the connection limit is —
SELECT setting, source, sourcefile, sourceline FROM pg_settings WHERE name = 'max_connections';
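It also helps to see how close we are to that limit right now. As a rough sketch, the current connection count can be compared against max_connections and the slots reserved for superusers:
SELECT (SELECT count(*) FROM pg_stat_activity) AS current_connections
     , current_setting('max_connections')::int AS max_connections
     , current_setting('superuser_reserved_connections')::int AS reserved_for_superuser;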
Next, check whether there are any per-user connection limits – a limit of -1 means unlimited connections are allowed.
SELECT rolname, rolconnlimit FROM pg_roles;
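To see how each role’s current usage stacks up against its limit, a sketch like the following works – it joins pg_roles to pg_stat_activity and counts sessions per role:
SELECT r.rolname, r.rolconnlimit, count(a.pid) AS current_connections
FROM pg_roles r
LEFT JOIN pg_stat_activity a ON a.usename = r.rolname
GROUP BY r.rolname, r.rolconnlimit
ORDER BY current_connections DESC;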
The next step is to identify which connections are exhausting the pool – are there a lot of long-running queries? Are there more active queries than anticipated? Are there a bunch of idle connections?
SELECT pid, usename, client_addr, client_port
     , to_char(query_start, 'YYYY-MM-DD HH24:MI:SS') AS query_start
     , state, query
FROM pg_stat_activity
-- WHERE state = 'idle'
-- AND usename = 'app_user'
ORDER BY query_start;
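A quick way to summarize where the connections are going is to count sessions by user and state – a rough sketch:
SELECT usename, state, count(*) AS connection_count
FROM pg_stat_activity
GROUP BY usename, state
ORDER BY connection_count DESC;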
In our case, there were over 100 idle connections using up about 77% of the available connections. Auto-vacuum, client read operations, and replication easily filled up the remaining available connections.
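To see how long those idle sessions have been sitting around, the state_change column in pg_stat_activity can be used – a sketch:
SELECT pid, usename, client_addr, client_port
     , now() - state_change AS idle_for
FROM pg_stat_activity
WHERE state = 'idle'
ORDER BY idle_for DESC;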
Because the clients keeping these idle connections open are an app running in a Kubernetes cluster, there’s an extra layer of complexity in identifying where the connection is actually sourced. When you view the list of connections from the PostgreSQL server’s perspective, “client_addr” is the IP address of the worker node hosting the pod.
On the worker server, use conntrack to identify the actual source of the connection – the IP address in “-d” is the IP address of the PostgreSQL server. To isolate a specific connection, select a “client_port” from the list of connections (37900 in this case) and grep for the port. You will see the src IP of the individual pod.
lhost1750:~ # conntrack -L -f ipv4 -d 10.24.29.140 -o extended | grep 37900
ipv4 2 tcp 6 86394 ESTABLISHED src=10.244.4.80 dst=10.24.29.140 sport=37900 dport=5432 src=10.24.29.140 dst=10.24.29.155 sport=5432 dport=37900 [ASSURED] mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 27 flow entries have been shown.
Then use kubectl to identify which pod is assigned that address:
lhost1745:~ # kubectl get po --all-namespaces -o wide | grep "10.244.4.80"
kstreams kafka-stream-app-deployment-1336-d8f7d7456-2n24x 2/2 Running 0 10d 10.244.4.80 lhost0.example.net <none> <none>
In this case, we’ve got an application that automatically scales up and can have 25 connections held open and idle … so there isn’t really a solution other than increasing the number of available connections to a number that’s appropriate for the number of client connections we plan on leaving open. I also want to set a connection limit on the individual account – if there are 250 connections available on the PostgreSQL server, then limit the application to 200 of those connections.
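As a sketch of those two changes – assuming the application connects as the app_user role referenced in the query above, and that the server can be restarted for the new max_connections to take effect – the statements would look something like:
ALTER ROLE app_user CONNECTION LIMIT 200;
-- raising the server-wide limit requires a restart to take effect
ALTER SYSTEM SET max_connections = 250;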