Category: Technology

DNF — What Provides This File?

“Dependency hell” used to be a big problem: you’d download one package, attempt to install it, and find out you needed three other packages. Download one of them, attempt to install it, and learn about five other packages. Fifty-seven packages later, you forgot what you were trying to install in the first place and went home. Or, I suppose, you managed to install that first package and actually use it. The advent of repo-based deployments, where dependencies can be resolved and automatically downloaded, has mostly eliminated dependency hell. But, occasionally, you’ll have a manual install that says “oh, I cannot complete. I need libgdkglext-x11-1.0.so.0 or libminizip.so.1” … and, if there’s a package named libgdkglext-x11 or libminizip, you’re good. There’s not. Fortunately, you can use “dnf provides” with the file name (for example, dnf provides '*/libminizip.so.1') to search for a package that provides a specific file, thus learning that you need the gtkglext-libs and minizip-compat packages to resolve your dependencies.

Adding MariaDB/MySQL Users

Quick notes on creating a database user — MariaDB and MySQL use a combination of username and source host to determine access. This means ‘me’@’localhost’ and ‘me’@’remotehost’ can have different passwords and privilege sets. How do you know what the hostname is for your connection? I usually try to connect and read the host from the error message — it’ll say ‘someone’@’something’ cannot access the database.

# Create a user that is allowed to connect from a specific host
CREATE USER 'username'@'hostname' IDENTIFIED BY 'S0m3P@s5w0rd';
GRANT ALL PRIVILEGES ON dbname.* TO 'username'@'hostname';

# Create a user that is allowed to connect from a specific IP
CREATE USER 'username1'@'10.5.1.2' IDENTIFIED BY 'S0m3P@s5w0rd';
GRANT ALL PRIVILEGES ON dbname.* TO 'username1'@'10.5.1.2';

# Create a user that is allowed to connect from the database server itself
CREATE USER 'username2'@'localhost' IDENTIFIED BY 'S0m3P@s5w0rd';
GRANT ALL PRIVILEGES ON dbname.* TO 'username2'@'localhost';

# Create a user that is allowed to connect from any host
CREATE USER 'username3'@'%' IDENTIFIED BY 'S0m3P@s5w0rd';
GRANT ALL PRIVILEGES ON dbname.* TO 'username3'@'%';

# Reload the grant tables. This is only required after editing the grant tables directly, but it is harmless after CREATE USER and GRANT
FLUSH PRIVILEGES;

# View list of database users
SELECT User, Host FROM mysql.user;
+----------------+------------+
| User           | Host       |
+----------------+------------+
| username3      | %          |
| username2      | 10.5.1.2   |
| username       | hostname   |
| root           | 127.0.0.1  |
| root           | ::1        |
| root           | localhost  |
+----------------+------------+
6 rows in set (0.000 sec)

Bootleg Zoom Recordings

Our township has been holding meetings in Zoom, but the recordings are for the purpose of generating minutes only. It’s an end-run around document retention management (it’s not a record, so we don’t need to retain it) … but it also means residents cannot just catch the meeting whenever they’ve got time. In theory, it seemed easy enough to use a screen recording program (I use OBS Studio) to record the meetings. Problem is, though … I’m one of the people who isn’t generally available in the late afternoon / early evening when they schedule their meetings. Surely I could join the meeting early & just edit out the “dead air” within the recording …

(0) The Zoom client settings are only available once the client is signed in, so you need to sign up for an account first. Once you have signed in, the settings described below appear.

(1) Audio isn’t automatically connected. In the settings window, under “Audio”, there is an option to automatically join computer audio. While you’re there, mute the mic when joining a meeting too.

(2) The window into which the main video is placed might be 2″ across. Under “General” is an option to use full screen when joining a meeting.

(3) Disable your video when joining the meeting — there’s no reason for anyone to stare at the wall behind my desk!

Hopefully these settings allow me to successfully record a Township meeting to watch later. In OBS Studio, I’ve got the audio input linked to my speakers and have the mic disabled as a recording source … this means someone talking near my computer doesn’t ruin my recording. There’s another meeting tomorrow … we’ll see if I’ve finally got it!

Update: this works unless the meeting organizer throws up a “this meeting is being recorded” banner, in which case you get a big dialog box in the middle of the meeting window. I give up, and we’ll just ensure someone’s at the computer to actually set up the recording.

Proof of Concept

Reading about the meat processor that’s been attacked by ransomware, and thinking about the petrol pipeline … this really seems like proof of concept stuff to me. I’m sure there’s some ‘making money’ and more than a little ego stroking involved. Before we purchase and implement some major system at work (or spend a lot of time developing code), we run a proof of concept test: a quick, slimmed-down implementation on some virtual system that lets people see how it’ll work without sinking the time and money into a full-scale implementation. If the thing seems useful, then we buy it and have a capital budget for implementation. If it wasn’t useful … well, we lost some time, but not much.

Attacking small players in various industries to see what kind of impact you can have … seems a lot like a proof of concept series of attacks. How well secured was the company? What kind of incident response were they able to mount? How much access did you manage? What came offline? What was the public impact?

Microsoft’s 1601 Time Base

Microsoft uses the number of 100-nanosecond intervals since 01 January 1601 (UTC). Why? No idea. But I’ve had to deal with their funky large integer for a DateTime value as long as I’ve been working with AD. I’ve written functions to turn it into something useful, but that’s a lot of effort when I see a lockoutTime and need to know how recent that is. Enter w32tm, which has an “ntte” switch (w32tm /ntte followed by the integer value) that converts the value to a readable date. This allows me to readily tell that the lockout was at 3:01 today and is something I need to be investigating.
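When w32tm isn't handy (say, on a Linux box parsing an LDAP export), the same arithmetic is short enough to sketch in Python. This is a minimal sketch rather than the function I actually use; the name filetime_to_datetime is made up for illustration, and it assumes the value is a plain integer count of 100-nanosecond intervals.

```python
from datetime import datetime, timedelta, timezone

# The Windows time base: 1 January 1601 00:00:00 UTC
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a count of 100-nanosecond intervals since 1601-01-01 UTC to a datetime."""
    # Ten 100 ns intervals make one microsecond; integer division drops
    # the sub-microsecond remainder that datetime cannot represent anyway
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


if __name__ == "__main__":
    # 116444736000000000 is the 1601-based value for the Unix epoch
    print(filetime_to_datetime(116444736000000000))  # 1970-01-01 00:00:00+00:00
```

The result is in UTC, so convert to local time before comparing it against the clock on the wall.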

On UUIDs

RFC 4122 UUID Versions:

1 — Datetime and MAC based
48-bit MAC address, 60-bit timestamp, 13-14 bit uniquifying sequence

2 — Datetime and MAC based with DCE security
The 8 least significant bits of the clock sequence and the least significant 32 bits of the timestamp are replaced with a local domain and a local identifier. The RFC doesn’t really provide details on DCE security

3 — Hashed Namespace
MD5 hash of a namespace identifier plus a name

4 — Random
6-7 pre-determined bits (4 bits for the version, plus 2 bits for variant 1 or 3 bits for variant 2), leaving 122 random bits and therefore 2^122 possible v4 variant 1 UUIDs

5 — Hashed Namespace
SHA-1 hash of a namespace identifier plus a name
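The version and variant bits in the list above can be checked with Python's standard uuid module. This is only a quick illustration: the "example.com" name is a placeholder, and the module has no generator for v2, so that version is skipped.

```python
import uuid

# One UUID of each version the standard library can generate
v1 = uuid.uuid1()                                    # timestamp + node (MAC) based
v3 = uuid.uuid3(uuid.NAMESPACE_DNS, "example.com")   # MD5 of namespace + name
v4 = uuid.uuid4()                                    # random
v5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")   # SHA-1 of namespace + name

for u in (v1, v3, v4, v5):
    # .version reads the 4 version bits, .variant reads the variant bits
    print(u.version, u.variant, u)

# The hashed versions are deterministic: same namespace and name, same UUID
print(v5 == uuid.uuid5(uuid.NAMESPACE_DNS, "example.com"))  # True
```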

In my case, I hesitate to use a v1 or v2 UUID because I have scripts executing in cron on the same host. The probability of the function being called at the same microsecond time seems higher than the pseudo-random number generator popping the same value in the handful of hours for which the UUIDs will be persisted for deduplication.

v3 or v5 UUIDs are my fallback position if we’re seeing dups in v4 — the namespace would need to glom together the script name and microsecond time to make a unique string when multiple scripts are running the function concurrently.
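If I do end up on the fallback, it might look something like the sketch below. Everything named here is an assumption for illustration: the SCRIPT_NAMESPACE value, the "scripts.example.com" seed, the script_scoped_uuid function, and the "script name, colon, microseconds" format of the name string.

```python
import time
import uuid
from typing import Optional

# Hypothetical private namespace so these names cannot collide with anyone else's v5 UUIDs
SCRIPT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "scripts.example.com")


def script_scoped_uuid(script_name: str, timestamp_us: Optional[int] = None) -> uuid.UUID:
    """Build a v5 UUID from the script name glommed together with a microsecond timestamp."""
    if timestamp_us is None:
        # Current time in whole microseconds since the Unix epoch
        timestamp_us = time.time_ns() // 1000
    name = f"{script_name}:{timestamp_us}"
    return uuid.uuid5(SCRIPT_NAMESPACE, name)


if __name__ == "__main__":
    # Two scripts calling in the same microsecond still get different UUIDs
    print(script_scoped_uuid("loader_a.py", 1622548800000000))
    print(script_scoped_uuid("loader_b.py", 1622548800000000))
```

Two calls from the same script within the same microsecond would still collide, so this only helps with the "multiple scripts running concurrently" case.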

Kafka Troubleshooting (for those who enjoy reading network traces)

I finally had a revelation that allowed me to definitively prove that I am not doing anything strange that is causing duplicated messages to appear in the Kafka stream: it’s a clear text protocol! The traffic on port 9092 here isn’t encrypted, which means you can use Wireshark, tcpdump, etc. to capture everything that goes over the wire. The capture shows that the GUID I generated for the duplicated message only appears one time in the network trace. Whatever funky stuff is going on that makes the client see it twice? Not me 😊

I used tcpdump because the batch server doesn’t have tshark (and it’s not my server, so I’m not going to go requesting additional binaries if there’s something sufficient for my need already available). Ran tcpdump -w /srv/data/ljr.cap port 9092 to grab everything that transits port 9092 while my script executed. Once the batch completed, I stopped tcpdump and transferred the file over to my workstation to view the capture in Wireshark. Searched the packet bytes for my duplicated GUID … and there’s only one.
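The "search the packet bytes" step can also be roughed out without Wireshark by counting how often the GUID's bytes show up in the raw capture file. This is a crude sketch, not a protocol parser: it assumes the payload is neither compressed nor encrypted and that the GUID isn't split across two TCP segments. The function name and the GUID in the comment are placeholders.

```python
from pathlib import Path


def count_in_capture(capture_path: str, needle: str) -> int:
    """Count how many times the ASCII form of needle appears in the raw capture bytes."""
    # Reads the whole file into memory, which is fine for a capture of one short batch run
    data = Path(capture_path).read_bytes()
    return data.count(needle.encode("ascii"))


# Example call (file name from the tcpdump command above, GUID is a made-up placeholder):
# count_in_capture("/srv/data/ljr.cap", "123e4567-e89b-12d3-a456-426614174000")
```

A count of 1 matches what I saw in Wireshark. A count of 0 would more likely mean one of the assumptions doesn't hold than that the message was never sent.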

Confluent Kafka Queue Length

The documentation for the Python Confluent Kafka module includes a len function on the producer. I wanted to use the function because we’re getting a number of duplicated messages on the client, and I was trying to isolate what might be causing the problem. Unfortunately, calling producer.len() failed, indicating there’s no len() method. I used dir(producer) to confirm that, no, there isn’t a len() method.

I realized today that the documentation is telling me that I can call the built-in len() function on a producer to get the queue length.

Code:

print(f"Before produce there are {len(producer)} messages awaiting delivery")
producer.produce(topic, key=bytes(str(int(cs.timestamp)), 'utf8'), value=cs.SerializeToString())
print(f"After produce there are {len(producer)} messages awaiting delivery")
producer.poll(0) # Per https://github.com/confluentinc/confluent-kafka-python/issues/16 for queue full error
print(f"After poll0 there are {len(producer)} messages awaiting delivery")

Output:

Before produce there are 160 messages awaiting delivery
After produce there are 161 messages awaiting delivery
After poll0 there are 155 messages awaiting delivery
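The reason producer.len() fails while len(producer) works is Python's __len__ protocol: the built-in len() looks for a special method named __len__, not a method named len. The class below is a made-up stand-in (not the Confluent producer) just to show the behavior.

```python
class PendingQueue:
    """Hypothetical stand-in for an object that reports its size through __len__."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def __len__(self):
        # Called by the built-in len()
        return len(self._items)


queue = PendingQueue()
queue.add("message")

print(len(queue))               # 1
print(hasattr(queue, "len"))    # False
print("__len__" in dir(queue))  # True
```

So the dir() output did hold the answer all along, just under the double-underscore name.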

Boolean Opts in Python

I have a few command line arguments on a Python script that are most readily used if they are boolean. I sometimes need a “verbose” option for script debugging — print a lot of extra stuff to show what’s going on, and I usually want a “dry run” option where the script reads data, performs calculations, and prints results to the screen without making any changes or sending data anywhere (database, email, etc). To use command line arguments as boolean values, I use a function that converts a variety of possible inputs to True/False.

import argparse


def string2boolean(strInput):
    """
    Convert a command line string such as 'yes' or '0' to a boolean.

    :param strInput: String (or bool) to be converted to boolean
    :return: Boolean representation of input
    :raises argparse.ArgumentTypeError: if the input is not a recognized value
    """
    if isinstance(strInput, bool):
        return strInput
    if strInput.lower() in ('yes', 'true', 't', 'y', '1'):
        return True
    elif strInput.lower() in ('no', 'false', 'f', 'n', '0'):
        return False
    else:
        raise argparse.ArgumentTypeError('Boolean value expected.')

Use “type” when adding the argument to run the input through your function.

    parser.add_argument('-r', '--dryrun', action='store', type=string2boolean, dest='boolDryRun', default=False, help="Preview data processing without sending data to DB or Kafka. Valid values: 'true' or 'false'.")
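Putting the two pieces together, a stripped-down script might look like the sketch below. The parser description and the explicit argument list passed to parse_args are only there to make the example self-contained; a real script would call parse_args() with no arguments so it reads the actual command line.

```python
import argparse


def string2boolean(strInput):
    """Convert strings like 'yes', 'n', or '1' to True/False for argparse."""
    if isinstance(strInput, bool):
        return strInput
    if strInput.lower() in ('yes', 'true', 't', 'y', '1'):
        return True
    elif strInput.lower() in ('no', 'false', 'f', 'n', '0'):
        return False
    else:
        raise argparse.ArgumentTypeError('Boolean value expected.')


parser = argparse.ArgumentParser(description="Illustrative dry run example")
parser.add_argument('-r', '--dryrun', action='store', type=string2boolean,
                    dest='boolDryRun', default=False,
                    help="Preview data processing without sending data to DB or Kafka. Valid values: 'true' or 'false'.")

# Simulate running the script with: --dryrun yes
args = parser.parse_args(['--dryrun', 'yes'])
print(args.boolDryRun)  # True

if args.boolDryRun:
    print("Dry run: nothing will be written")
```

An unrecognized value such as --dryrun maybe makes argparse print a usage error that includes the 'Boolean value expected.' message and exit.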