Month: September 2024

Fedora 40: NFTables not logging

We upgraded Anya’s laptop to Fedora 40, and Skype has evidently moved from an installable RPM to a snap package. The snap didn’t work with the firewall rules we built earlier in the year (video and audio calls would not connect) and, worse, nothing was logged. It looks like netfilter kernel logging wasn’t enabled.

Enabled the logging (by default only the initial network namespace can write LOG-target packets to the kernel log; this sysctl turns it on for all namespaces):

echo 1 | sudo tee /proc/sys/net/netfilter/nf_log_all_netns

And, voila, we’ve got log records from nftables. And now Skype works … so I don’t know what to add. Sigh!

Azure DevOps Pipeline Error – Veracode Scan Fails

Pipeline Error:

Build Failed: Error: Exiting Veracode Upload and Scan Task: App not in state where new builds are allowed.

Resolution: There’s a scan in Veracode that never completed. Log into the web UI and delete it!

View the scans in the sandbox. Select the one that says “Request Incomplete”.

Use the ellipsis button to select “Delete Request”.

Confirm deletion.

Voila – now you can re-run the pipeline and the scan will proceed.

Kafka Streams, Consumer Groups, and Stickiness

The Java application I recently inherited had a lot of … quirks. One of the strangest was that it calculated throughput statistics based on ‘start’ values in a cache that was only refreshed every four hours. So at a minute past the data refresh, the throughput is averaged out over that minute. At three hours and fifty-nine minutes past the data refresh, the throughput is averaged out over three hours and fifty-nine minutes. In the process of correcting this (reading directly from the cached data rather than using an in-memory copy of it), I noticed that the running application paused a lot as the Kafka group was re-balanced.

The re-balancing is odd on two counts. I’ve got a stable number of clients in each consumer group, so membership should rarely change. But pods restart occasionally, and nothing had been done to stabilize partition assignment, even though Kafka has had mechanisms to reduce re-balancing for years. StickyAssignor was added in 0.11:

        // Set the partition assignment strategy to StickyAssignor
        config.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, "org.apache.kafka.clients.consumer.StickyAssignor");

And group.instance.id (static group membership) in 2.3.0:

        // Set the group instance ID
        String groupInstanceId = UUID.randomUUID().toString();
        config.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, groupInstanceId);

Now, I’m certain that a UUID isn’t the best way to go about crafting your group instance ID name … but it produces a “name” that isn’t likely to be duplicated. Since deploying this change, I went from seeing three or four re-balance operations an hour to zero.
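One alternative to a random UUID, sketched under assumptions: static membership works by the broker recognizing the same group.instance.id when a member rejoins, so an ID that survives a restart is the natural fit. The sketch below derives the ID from an application name plus the host name. The class name, the method names, the "stats-worker" application name, and the use of the HOSTNAME environment variable are all made up for illustration, and the host name is only stable where the pod name is (a StatefulSet, for example). The two property keys are the string values behind the ConsumerConfig constants used above, written out so the sketch runs without the Kafka client on the classpath.

```java
import java.util.Properties;

class StaticMemberId {
    // Hypothetical helper: builds an instance ID that stays the same across
    // restarts for as long as the host name stays the same.
    static String groupInstanceId(String appName, String hostName) {
        String host = (hostName == null || hostName.trim().isEmpty())
                ? "unknown-host"
                : hostName.trim();
        return appName + "-" + host;
    }

    // Collects the two consumer settings discussed above into a Properties object.
    static Properties consumerOverrides(String appName, String hostName) {
        Properties config = new Properties();
        // String value of ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG
        config.put("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.StickyAssignor");
        // String value of ConsumerConfig.GROUP_INSTANCE_ID_CONFIG
        config.put("group.instance.id", groupInstanceId(appName, hostName));
        return config;
    }

    public static void main(String[] args) {
        // Assumption: HOSTNAME holds the pod name; it may be unset elsewhere.
        String host = System.getenv("HOSTNAME");
        Properties config = consumerOverrides("stats-worker", host);
        System.out.println(config.getProperty("group.instance.id"));
    }
}
```

Each running consumer still needs its own distinct ID, so this only works if no two pods in a group share a host name.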

Kafka Streams Group Members and Topic Partitions

I encountered an oddity in a Java application that uses Kafka Streams to read data from Kafka topics at scale. Data is broken out into multiple topics, and there are Kubernetes pods (“workers”) reading from each topic. The deployments have different numbers of replicas defined, but it appears that no one ever aligned the topic partitions with the number of workers being deployed.

Kafka Streams assigns “work” to group members by partition. If you have ten partitions and five workers, each worker processes the data from two partitions. However, when the numbers don’t line up … some workers get more partitions than others. Were you to have eleven partitions and five workers, four workers would get data from two partitions and the fifth worker would get data from three.

Worse – in some cases we have more workers than partitions. Those extra workers are using up some resources, but they’re not actually processing data.
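The arithmetic from the last two paragraphs can be sketched in a few lines. This is an idealized "as even as possible" spread, not the actual Kafka Streams assignor, and the class and method names are invented for illustration. Which worker ends up with the extra partition is arbitrary; here it is simply the first one.

```java
import java.util.Arrays;

class PartitionSpread {
    // Returns how many partitions each worker would own if the partitions
    // were spread as evenly as possible across the workers.
    static int[] partitionsPerWorker(int partitions, int workers) {
        int base = partitions / workers;   // every worker gets at least this many
        int extra = partitions % workers;  // this many workers get one more
        int[] counts = new int[workers];
        for (int i = 0; i < workers; i++) {
            counts[i] = base + (i < extra ? 1 : 0);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Ten partitions, five workers: an even split.
        System.out.println(Arrays.toString(partitionsPerWorker(10, 5))); // [2, 2, 2, 2, 2]
        // Eleven partitions, five workers: one worker carries an extra partition.
        System.out.println(Arrays.toString(partitionsPerWorker(11, 5))); // [3, 2, 2, 2, 2]
        // Four partitions, five workers: one worker sits idle.
        System.out.println(Arrays.toString(partitionsPerWorker(4, 5)));  // [1, 1, 1, 1, 0]
    }
}
```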

It’s a quick fix. Partitions can be added mostly invisibly: the consumer group is re-balanced, write operations don’t really change, and new data just starts landing in the new partitions. So I increased our partition counts to 2x the number of workers. This lets us add a few workers to a topic if it gets backlogged, while still distributing the work evenly across all of the normally running pods.

Springfield

Governor DeWine had an editorial in the NY Times this morning (https://www.nytimes.com/2024/09/20/opinion/springfield-haitian-migrants-ohio.html). He starts out OK. Essentially: I’ve got a history in that area, I know the area, and I know this nonsense to be untrue. But then he says he still supports Trump because of the horrible problems with immigration elsewhere.

Without mentioning that a bipartisan attempt to do something got scuttled by the very guy I am telling you will fix it.

Worse – without thinking maybe some of those reports are made up or embellished like this situation has been? If I know first-hand what is going on in Springfield, and I know that Trump’s portrayal of it is wildly inaccurate? Why wouldn’t I question how accurate his depiction of Colorado or even the Mexican border is?!?

They just let you steal!

This is another case of simplifying a complex situation to make someone look bad: Cali lets you walk right out if you are stealing $950 or less. And ignoring the real problem (if you want to have all of these crimes investigated and prosecuted, your local/state taxes need to go up so departments can staff accordingly and enough jails can be built to house all of these criminals).

Every state has a law that differentiates between misdemeanor theft and felony theft (e.g. https://law.justia.com/codes/arkansas/title-5/subtitle-4/chapter-36/subchapter-1/section-5-36-103/ for Arkansas at $1,000). I think most people would agree that someone who steals ten dollars worth of merchandise and someone who steals three grand worth of merchandise should get different punishments. Punishments are generally defined along with the classification of the crime, and each state’s law reflects this. Different states have different dollar amounts, and $950 sounds like a lot of money. But compared to the other states? California’s demarcation is hardly an outlier: it sits just under the $1,000 line that is the most common threshold in the list below.

Unfortunately the entire criminal justice system is overloaded. Police may be too busy to deal with your theft complaint. The prosecutor may not get around to filing charges. And what happens if they do get a conviction? Now we need to find somewhere to detain the thief. And, again, if there’s a person stealing cars, a person kidnapping minors, and a person stealing the latest video game … ideally, we’d have resources to punish all of them. But we don’t. And I’d be way more upset if the kidnapper skated because they had a task force down at the GameStop gathering video from all the surrounding stores to track down this thief.

Prop 47 absolutely has some flaws. There’s a Prop 36 this year that seems like an attempt to fix some of the unintended consequences — two or more theft charges under $950 would be a felony. Stealing from multiple places that add up to over $950 would be a felony. Won’t know if that passes until November, but it’s not like all the liberals are dancing around saying this was 100% perfect and everyone else should do it too.

 

Felony theft thresholds by state:

Alabama: $500
Alaska: $750
Arizona: $1,000
Arkansas: $1,000
California: $950
Colorado: $2,000
Connecticut: $2,000
Delaware: $1,500
Florida: $750
Georgia: $1,500
Hawaii: $750
Idaho: $1,000
Illinois: $500
Indiana: $750
Iowa: $300
Kansas: $1,500
Kentucky: $1,000
Louisiana: $1,000
Maine: $1,000
Maryland: $1,500
Massachusetts: $1,200
Michigan: $1,000
Minnesota: $1,000
Mississippi: $1,000
Missouri: $750
Montana: $1,500
Nebraska: $1,500
Nevada: $1,200
New Hampshire: $1,000
New Jersey: $200
New Mexico: $500
New York: $1,000
North Carolina: $1,000
North Dakota: $1,000
Ohio: $1,000
Oklahoma: $1,000
Oregon: $1,000
Pennsylvania: $2,000
Rhode Island: $1,500
South Carolina: $2,000
South Dakota: $1,000
Tennessee: $1,000
Texas: $2,500
Utah: $1,500
Vermont: $900
Virginia: $1,000
Washington: $750
West Virginia: $1,000
Wisconsin: $2,500
Wyoming: $1,000

PostgreSQL 12 — Cascading Replication

I’ve got replicated PostgreSQL database pairs that each have some 50TB of data. The server operating systems need to be upgraded, but there is a constraint: no in-place upgrades. I don’t get to veto that constraint (i.e. it doesn’t matter that we could just cross our fingers and upgrade a replica … and, if it fails, build new and pull the data again). Unfortunately, trying to add a second replica delays the existing replication. Since all write operations go to the RW server and reads go to the read-only replica … having the read-only copy a day or two out of sync whilst this secondary replica comes online is a non-starter.

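Fortunately, you can cascade replication: seed the new replica from the current read-only replica. Create a new physical replication slot on that replica, here new_pg_ro_replica_pgdata (slot names may only contain lowercase letters, numbers, and underscores, so no hyphens). You also need to verify the new server is in the read-only replica’s pg_hba.conf file so it can authenticate with the replication account.

pg_basebackup -h pg-ro-replica.example.net -D /pgdata -U replicatorID -v -P --wal-method=stream --slot=new_pg_ro_replica_pgdata

Wait … wait … wait. It’ll finish eventually. PostgreSQL 12 no longer reads recovery.conf (the server refuses to start if the file exists) and standby_mode is gone, so the connection settings go in postgresql.conf on the new replica. Point primary_conninfo at the server that holds the slot, which here is the existing read-only replica:

primary_conninfo = 'host=pg-ro-replica.example.net port=5432 user=replicatorID password=your_password sslmode=require'
primary_slot_name = 'new_pg_ro_replica_pgdata'

And
touch /pgdata/standby.signal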

Finally, start the server

pg_ctl start -D /pgdata

Voila — a second read-only replica. Now they can decom the old server.

OpenSearch 2.x CACerts Permission Error

In my dev OpenSearch 2.x environment, I get a strange error indicating that the application cannot read the cacerts file. Except the file is world readable, SELinux is disabled, and there’s nothing actually preventing access at the OS level.

[2024-09-17T12:48:52,666][ERROR][c.a.d.a.h.j.AbstractHTTPJwtAuthenticator] [linux1569.mgmt.windstream.net] Error creating JWT authenticator. JWT authentication will not work
com.amazon.dlic.util.SettingsBasedSSLConfigurator$SSLConfigException: Error loading trust store from /opt/elk/opensearch/jdk/lib/security/cacerts
        at com.amazon.dlic.util.SettingsBasedSSLConfigurator.initFromKeyStore(SettingsBasedSSLConfigurator.java:338) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        at com.amazon.dlic.util.SettingsBasedSSLConfigurator.configureWithSettings(SettingsBasedSSLConfigurator.java:196) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        at com.amazon.dlic.util.SettingsBasedSSLConfigurator.buildSSLContext(SettingsBasedSSLConfigurator.java:117) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        at com.amazon.dlic.util.SettingsBasedSSLConfigurator.buildSSLConfig(SettingsBasedSSLConfigurator.java:131) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        at com.amazon.dlic.auth.http.jwt.keybyoidc.HTTPJwtKeyByOpenIdConnectAuthenticator.getSSLConfig(HTTPJwtKeyByOpenIdConnectAuthenticator.java:65) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        at com.amazon.dlic.auth.http.jwt.keybyoidc.HTTPJwtKeyByOpenIdConnectAuthenticator.initKeyProvider(HTTPJwtKeyByOpenIdConnectAuthenticator.java:47) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        at com.amazon.dlic.auth.http.jwt.AbstractHTTPJwtAuthenticator.<init>(AbstractHTTPJwtAuthenticator.java:89) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at com.amazon.dlic.auth.http.jwt.keybyoidc.HTTPJwtKeyByOpenIdConnectAuthenticator.<init>(HTTPJwtKeyByOpenIdConnectAuthenticator.java:26) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) ~[?:?]
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) ~[?:?]
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) ~[?:?]
        at org.opensearch.security.support.ReflectionHelper.instantiateAAA(ReflectionHelper.java:62) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.securityconf.DynamicConfigModelV7.lambda$newInstance$1(DynamicConfigModelV7.java:432) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:319) [?:?]
        at org.opensearch.security.securityconf.DynamicConfigModelV7.newInstance(DynamicConfigModelV7.java:430) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.securityconf.DynamicConfigModelV7.buildAAA(DynamicConfigModelV7.java:329) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.securityconf.DynamicConfigModelV7.<init>(DynamicConfigModelV7.java:102) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.securityconf.DynamicConfigFactory.onChange(DynamicConfigFactory.java:288) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.configuration.ConfigurationRepository.notifyAboutChanges(ConfigurationRepository.java:570) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.configuration.ConfigurationRepository.notifyConfigurationListeners(ConfigurationRepository.java:559) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.configuration.ConfigurationRepository.reloadConfiguration0(ConfigurationRepository.java:554) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.configuration.ConfigurationRepository.loadConfigurationWithLock(ConfigurationRepository.java:538) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.configuration.ConfigurationRepository.reloadConfiguration(ConfigurationRepository.java:531) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.configuration.ConfigurationRepository.initalizeClusterConfiguration(ConfigurationRepository.java:284) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.configuration.ConfigurationRepository.lambda$initOnNodeStart$10(ConfigurationRepository.java:439) [opensearch-security-2.15.0.0.jar:2.15.0.0]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: java.security.AccessControlException: access denied ("java.io.FilePermission" "/opt/elk/opensearch/jdk/lib/security/cacerts" "read")
        at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:488) ~[?:?]
        at java.base/java.security.AccessController.checkPermission(AccessController.java:1071) ~[?:?]
        at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411) ~[?:?]
        at java.base/java.lang.SecurityManager.checkRead(SecurityManager.java:742) ~[?:?]
        at java.base/sun.nio.fs.UnixPath.checkRead(UnixPath.java:789) ~[?:?]
        at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:49) ~[?:?]
        at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:171) ~[?:?]
        at java.base/sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99) ~[?:?]
        at java.base/java.nio.file.spi.FileSystemProvider.readAttributesIfExists(FileSystemProvider.java:1270) ~[?:?]
        at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributesIfExists(UnixFileSystemProvider.java:191) ~[?:?]
        at java.base/java.nio.file.Files.isDirectory(Files.java:2319) ~[?:?]
        at org.opensearch.security.support.PemKeyReader.checkPath(PemKeyReader.java:214) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.support.PemKeyReader.resolve(PemKeyReader.java:290) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        at org.opensearch.security.support.PemKeyReader.resolve(PemKeyReader.java:276) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        at com.amazon.dlic.util.SettingsBasedSSLConfigurator.initFromKeyStore(SettingsBasedSSLConfigurator.java:327) ~[opensearch-security-2.15.0.0.jar:2.15.0.0]
        ... 25 more

It turns out the JVM has its own permission layer, the Java Security Manager, which is what threw the AccessControlException above. The java.policy file needed to be updated to allow read access to cacerts (why!?!?!?)

vi /opt/elk/opensearch/jdk/conf/security/java.policy

Add this permission inside the existing grant { ... } block:

    permission java.io.FilePermission "/opt/elk/opensearch/jdk/lib/security/cacerts", "read";


Why so militant

Someone asked why some feminists are so anti-anything-feminine. I think a silly analogy makes it easier to understand:

If there were a law that required everyone to eat pizza for dinner every day, and a whole freedom movement evolved to ensure we could all pick our own dinner? I expect some people would be so averse to pizza as to never eat it again. Electing to eat pizza (because I love pizza, just not every day) could be seen as an insult to the Dinner Liberation movement.

They view the movement as action so you never have to eat pizza again, rather than action so you can choose from the entire world of options, including the one that had formerly been forced upon you.

Trusting Science

Kinda hard question for me, as a scientist, if I trust science or trust experts. Few who ask are honestly curious – they’ve got an agenda. I generally trust Science and Experts. *But* I also know that Science and Experts aren’t always right. They are generally right with the information they had available at the time, the measuring tools they had available at the time, etc. It’s surprisingly easy to do nothing wrong and still manage to arrive at the wrong conclusion. There are some things that have remained consistent over enough time and testing that they’re generally accepted as true (scientific theories). But even that name … scientists aren’t out there claiming it’s the complete, never changing truth. It’s the current theory.

What I don’t trust is second-hand accounts of science or experts. There is generally a peer-reviewed publication that makes a cumbersome read, with a lot of details you don’t really need. But! It’s also exactly what was studied, how it was studied, what conclusions the researchers drew, how statistically significant the findings were, and what other factors should be included in future studies. A newspaper article claiming researchers say XYZ? That’s a newspaper’s summary of a PR guy’s summary of the abstract written by an expert to explain something that requires domain knowledge to understand well. I’ll use my internet search engine of choice to find the actual paper if I’m interested in the claim.