Category: System Administration

Fedora — Disabling IPv6

Since it’s the third time I’ve had to do this so far this year, I’m going to write down how I disable IPv6 in Fedora. Add these lines to /etc/sysctl.conf

[lisa@server~]# grep ipv6 /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1

Then load the sysctl settings (sysctl -p) or reboot.

Without IPv6, if you do X-redirection, you may get an error indicating the redirection was refused. In journalctl, there’s an error “error: Failed to allocate internet-domain X11 display socket”. Evidently you’ve got to configure sshd to use IPv4 by setting “AddressFamily inet” in /etc/ssh/sshd_config

[lisa@server~/]# grep AddressFamily /etc/ssh/sshd_config
AddressFamily inet

 

Fedora – Why were my packets dropped?

We’ve been seeing dropped packets on one of our servers — that usually means more data is coming in than can be processed, but it’s nice to confirm rather than guess. The command “netstat -s” displays summary statistics that are nicely grouped into causes:

TcpExt:
16 invalid SYN cookies received
88 resets received for embryonic SYN_RECV sockets
18 packets pruned from receive queue because of socket buffer overrun
2321 ICMP packets dropped because they were out-of-window
838512 TCP sockets finished time wait in fast timer

Notes on Adding Drive to Linux Host

Create partition, format, find UUID, and add line to fstab to mount the volume

[lisa@linuxhost ~]# parted /dev/sdb
GNU Parted 3.2.153
Using /dev/sdb
Welcome to GNU Parted! Type ‘help’ to view a list of commands.
(parted) mklabel GPT
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? y
(parted) mkpart primary 2048s 100%
(parted) q
Information: You may need to update /etc/fstab.

[lisa@linuxhost ~]# mkfs.xfs -f /dev/sdb1
meta-data=/dev/sdb1 isize=512 agcount=4, agsize=163839872 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1
data = bsize=4096 blocks=655359488, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=319999, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0

[lisa@linuxhost ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 19G 0 part
├─fedora-lisa 253:0 0 17G 0 lvm /
└─fedora-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 2.5T 0 disk
└─sdb1 8:17 0 2.5T 0 part
sr0 11:0 1 650M 0 rom
[lisa@linuxhost ~]# blkid | grep sdb1
/dev/sdb1: UUID=”801ebed3-ddd6-459d-bd62-04a0a75f91b8″ TYPE=”xfs” PARTLABEL=”primary” PARTUUID=”b9a9a340-28f5-4efb-b649-af804ef5bc4c”

Add a line to /etc/fstab to mount the volume — here I’m mounting it to /mnt/data/mythtv:

UUID=801ebed3-ddd6-459d-bd62-04a0a75f91b8 /mnt/data/mythtv xfs defaults 0 0

Discourse acme.sh Script Failure

I had a hellacious time updating the certificate on my Dockerized Discourse server — the acme.sh script doesn’t have a slash delimiter between the hostname and the ./well-known folder within the URI. Which means the request fails. Repeatedly.

 

[Sat Oct 10 00:01:09 UTC 2020] _post_url='https://acme-v02.api.letsencrypt.org/acme/chall-v3/7784162898/nr42-g'
[Sat Oct 10 00:01:09 UTC 2020] _CURL='curl -L --silent --dump-header /shared/letsencrypt/http.header -g '
[Sat Oct 10 00:01:10 UTC 2020] _ret='0'
[Sat Oct 10 00:01:10 UTC 2020] code='200'
[Sat Oct 10 00:01:10 UTC 2020] trigger validation code: 200
[Sat Oct 10 00:01:10 UTC 2020] sleep 2 secs to verify
[Sat Oct 10 00:01:12 UTC 2020] checking
[Sat Oct 10 00:01:12 UTC 2020] url='https://acme-v02.api.letsencrypt.org/acme/chall-v3/7784162898/nr42-g'
[Sat Oct 10 00:01:12 UTC 2020] payload
[Sat Oct 10 00:01:12 UTC 2020] POST
[Sat Oct 10 00:01:12 UTC 2020] _post_url='https://acme-v02.api.letsencrypt.org/acme/chall-v3/7784162898/nr42-g'
[Sat Oct 10 00:01:12 UTC 2020] _CURL='curl -L --silent --dump-header /shared/letsencrypt/http.header -g '
[Sat Oct 10 00:01:13 UTC 2020] _ret='0'
[Sat Oct 10 00:01:13 UTC 2020] code='200'
[Sat Oct 10 00:01:13 UTC 2020] discourse.example.com:Verify error:Fetching https://discourse.example.com.well-known/acme-challenge/XY02T_40TL92IADByQ45JMj4JzC2qJCatVd2odJMAlU: Invalid host in redirect target
[Sat Oct 10 00:01:13 UTC 2020] pid
[Sat Oct 10 00:01:13 UTC 2020] No need to restore nginx, skip.

 

Turns out that’s my bad config — I’ve got a reverse proxy in front of Discourse, and we don’t use the clear text http site. The reverse proxy just bounces you over to the https site. Two problems — one, I failed to put the trailing slash after my redirect, s http://discourse.example.com/.well-known/blah is being redirected to https://discourse.example.com.well-known/blah

<VirtualHost 10.1.2.3:80>
ServerName discourse.example.com
ServerAlias discourse

Redirect 301 / https://discourse.example.com

</VirtualHost>

 

That’s easy enough to fix — add the trailing slash I should have had anyway. But the subsequent problem is that the bootstrap nginx config that is used to serve up the validation page only listens on port 80. So I cannot redirect the clear-text traffic over to the SSL site. I have to reverse proxy the clear text site as well (at least whenever the certificate needs to be renewed).

ProxyPass / https://discourse.example.com/
ProxyPassReverse / https://discourse.example.com/

Voila, a web server with an updated certificate.

Using Process Monitor To Troubleshoot Applications

SysInternals used to produce a suite of tools for working with Microsoft Windows systems — the company appears to have been acquired by Microsoft, and the tools continue to be developed. I used PSKill and PSExec to automate a lot of system administration tasks. ProcessMonitor is like truss/strace for Windows. Unlike the HFS standard, Windows files end up all over the place (plus info is stashed in the registry). Sometimes applications or services fall over for no reason. Process monitor reports out

When you open procmon, you can build filters to exclude uninteresting operations — there’s a default set of exclusions (no need to log out what procmon is doing!)

Adding exclusions for specific process names can eliminate a lot of I/O — I was looking to troubleshoot a problem on a Domain Controller that had nothing to do with AD specifically, so excluding activity by lsass.exe significantly reduced the amount of data being logged. If I’m using a browser to troubleshoot the problem, I’ll exclude the firefox.exe or chrome.exe binary too.

From the filter screen, click “OK” to begin grabbing data. The easiest thing I’ve found to do is stop capturing data when the program opens (use ctrl-a followed by ctrl-x to clear the already logged stuff). Stage whatever you want to log, use ctrl-e to start capturing. Perform the actions you want to log, return to procmon and use ctrl-e to stop again.

You’ll see reads (and writes) against the registry, including the specific keys. Network operations. File reads and writes. In the “Result” and “Detail” column, you can determine if the operation was successful. There are a lot of expected not found failures — I see these in truss/strace logs too, programs try a bunch of different things and one of them needs to work.

I’ve had programs using a specific, undocumented file for a critical operation — like the service would fail to start because the file didn’t exist. And seeing the path and file open failure allowed me to create that needed file and run my service. I’ve wanted to find out where a program stashes data, and procmon makes that easy to identify.

List Extensions Within Folder

It didn’t occur to me that Apache serves everything under a folder and the .git folder may well be under a folder (you can have your project up a level so there’s a single folder at the root of the project & that folder is DocumentRoot for the web site). Without knowing specific file names, you cannot get anything since directory browsing is disabled. But git has a well-known structure so browsing to /.git/index or really scary for someone who stuffs their password in the repo URL /.git/config is there and Apache happily serves it unless you’ve provided instructions otherwise.

A coworker brought up the intriguing idea of, instead of blocking the .git folder so things subordinate to .git are never served, having a specific list of known good extensions the web server was willing to serve. Which … ironically was one of the things I really didn’t like about IIS. Kind of like the extra frustration of driving behind someone who is going the speed limit. Frustrating because I want to go faster, extra frustrating because they aren’t actually wrong.

But configuring a list of good-to-serve extensions means you’ve got to get a handle on what extensions are on your server in the first place. This command provides a list of extensions and a count per extension (so you can easily identify one-offs that may not be needed):

find /path/to/search/ -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort | uniq -c

 

Tar Excluding Git Folders

You can, of course, use –exclude and avoid adding the .git folders to your tar archive, but I discovered a really cool option that excludes the folders created by a whole host of version control systems:

--exclude-vcs

Which, as of version 1.32, means excluding CVS, RCS, SCCS, git, SVN, Arch, Bazaar, Mercurial, and Darcs as follows:

  • CVS/ — recursive
  • RCS/ — recursive
  • SCCS/ — recursive
  • .git/ — recursive
  • .gitignore
  • .gitmodules
  • .gitattributes
  • .cvsignore
  • .svn/ — recursive
  • .arch-ids/ — recursive
  • {arch}/ — recursive
  • =RELEASE-ID
  • =meta-update
  • =update
  • .bzr
  • .bzrignore
  • .bzrtags
  • .hg
  • .hgignore
  • .hgrags
  • _darcs

Cyberark — Error Listing Accounts

I was getting an odd error from my attempt to list accounts in Cyberark — “Object reference not set to an instance of an object”. Searching the Internet yielded a lot of issues that weren’t my problem (ampersands in account names in an older version, issues with SSL {and, seriously, someone says disable SSL on the connection they use to retrieve passwords!?! And not just random someone, but RAND?!?}). My issue turned out to be that I was copy/pasting code and used requests.post instead of requests.get — attempting to POST to a GET URL generates this error too.

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): cyberark.example.com:443
DEBUG:urllib3.connectionpool:https://cyberark.example.com:443 “POST /PasswordVault/API/auth/Cyberark/Logon HTTP/1.1” 200 182
Before request, header is {‘Content-Type’: ‘application/json’, ‘Authorization’: ‘5TQz5WVjYm5tMjBh5C00M5YyLT50MjYt5Tc2Y5I2ZDI5…AwMDA5MDA7’}
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): cyberark.example.com:443
DEBUG:urllib3.connectionpool:https://cyberark.example.com:443 “POST /PasswordVault/api/Accounts?search=sample_account&searchType=contains HTTP/1.1” 500 97
{“ErrorCode”:”CAWS00001E”,”ErrorMessage”:”Object reference not set to an instance of an object.”} 500 Internal Server Error

PrivateTmp Strangeness — apachectl v/s systemctl

This is a very strange problem — we had a web server upgraded recently. We use “sudo apachectl start” to bring up the server since the server is maintained by a dedicated Unix support team, and the site worked fine. Until … Sunday morning after the log rotation. Then Box Spout was unable to access the XML data for an Excel file to compress it. Lots of errors:

 

[Mon Sep 14 10:59:39.137728 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Warning: ZipArchive::close(): Zlib error: stream error in /path/to/site/classes/vendor/box/spout/src/Spout/Writer/Common/Helper/ZipHelper.php on line 199, referer: https://hostname.example.com/path/to/code.php
[Mon Sep 14 10:59:39.137785 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Warning: fopen(/tmp/xlsx5f5f934699dcd9.17695276.zip): failed to open stream: No such file or directory in /path/to/site/classes/vendor/box/spout/src/Spout/Writer/Common/Helper/ZipHelper.php on line 213, referer: https://hostname.example.com/path/to/code.php
[Mon Sep 14 10:59:39.137803 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Warning: stream_copy_to_stream() expects parameter 1 to be resource, boolean given in /path/to/site/classes/vendor/box/spout/src/Spout/Writer/Common/Helper/ZipHelper.php on line 214, referer: https://hostname.example.com/path/to/code.php
[Mon Sep 14 10:59:39.137814 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Warning: fclose() expects parameter 1 to be resource, boolean given in /path/to/site/classes/vendor/box/spout/src/Spout/Writer/Common/Helper/ZipHelper.php on line 215, referer: https://hostname.example.com/path/to/code.php
[Mon Sep 14 10:59:39.138360 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Fatal error: Uncaught exception ‘Box\\Spout\\Common\\Exception\\IOException’ with message ‘Cannot perform I/O operation outside of the base folder: /tmp’ in /path/to/site/classes/vendor/box/spout/src/Spout/Common/Helper/FileSystemHelper.php:130\nStack trace:\n#0 /path/to/site/classes/vendor/box/spout/src/Spout/Common/Helper/FileSystemHelper.php(82): Box\\Spout\\Common\\Helper\\FileSystemHelper->throwIfOperationNotInBaseFolder(‘/tmp/xlsx5f5f93…’)\n#1 /path/to/site/classes/vendor/box/spout/src/Spout/Writer/XLSX/Helper/FileSystemHelper.php(369): Box\\Spout\\Common\\Helper\\FileSystemHelper->deleteFile(‘/tmp/xlsx5f5f93…’)\n#2 /path/to/site/classes/vendor/box/spout/src/Spout/Writer/XLSX/Internal/Workbook.php(134): Box\\Spout\\Writer\\XLSX\\Helper\\FileSystemHelper->zipRootFolderAndCopyToStream(Resource id #26)\n#3 /path/to/site/ in /path/to/site/classes/vendor/box/spout/src/Spout/Common/Helper/FileSystemHelper.php on line 130, referer: https://hostname.example.com/path/to/code.php
[Mon Sep 14 10:59:39.139468 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Warning: ZipArchive::close(): Zlib error: stream error in /path/to/site/classes/vendor/box/spout/src/Spout/Writer/Common/Helper/ZipHelper.php on line 199, referer: https://hostname.example.com/path/to/code.php
[Mon Sep 14 10:59:39.139504 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Warning: fopen(/tmp/xlsx5f5f934699dcd9.17695276.zip): failed to open stream: No such file or directory in /path/to/site/classes/vendor/box/spout/src/Spout/Writer/Common/Helper/ZipHelper.php on line 213, referer: https://hostname.example.com/path/to/code.php
[Mon Sep 14 10:59:39.139515 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Warning: stream_copy_to_stream() expects parameter 1 to be resource, boolean given in /path/to/site/classes/vendor/box/spout/src/Spout/Writer/Common/Helper/ZipHelper.php on line 214, referer: https://hostname.example.com/path/to/code.php
[Mon Sep 14 10:59:39.139533 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Warning: fclose() expects parameter 1 to be resource, boolean given in /path/to/site/classes/vendor/box/spout/src/Spout/Writer/Common/Helper/ZipHelper.php on line 215, referer: https://hostname.example.com/path/to/code.php
[Mon Sep 14 10:59:39.139599 2020] [:error] [pid 57117] [client 10.1.2.3:49276] PHP Fatal error: Uncaught exception ‘Box\\Spout\\Common\\Exception\\IOException’ with message ‘Cannot perform I/O operation outside of the base folder: /tmp’ in /path/to/site/classes/vendor/box/spout/src/Spout/Common/Helper/FileSystemHelper.php:130\nStack trace:\n#0 /path/to/site/classes/vendor/box/spout/src/Spout/Common/Helper/FileSystemHelper.php(82): Box\\Spout\\Common\\Helper\\FileSystemHelper->throwIfOperationNotInBaseFolder(‘/tmp/xlsx5f5f93…’)\n#1 /path/to/site/classes/vendor/box/spout/src/Spout/Writer/XLSX/Helper/FileSystemHelper.php(369): Box\\Spout\\Common\\Helper\\FileSystemHelper->deleteFile(‘/tmp/xlsx5f5f93…’)\n#2 /path/to/site/classes/vendor/box/spout/src/Spout/Writer/XLSX/Internal/Workbook.php(134): Box\\Spout\\Writer\\XLSX\\Helper\\FileSystemHelper->zipRootFolderAndCopyToStream(Resource id #26)\n#3 /path/to/site in /path/to/site/classes/vendor/box/spout/src/Spout/Common/Helper/FileSystemHelper.php on line 130, referer: https://hostname.example.com/path/to/code.php

 

The postupdate script is “systemctl reload httpd.service” — so not exactly the same thing we used to launch the service originally. But I’ve never seen differing behavior between apachectl and systemctl started HTTPD instances. Quick/dirty solution is to disable PrivateTmp, but I’m hoping to be able to isolate why exactly the postupdate script appears to break the service’s access to the private tmp space.

30 November 2020 addendum — In discussing the issue with RedHat, they suggested using either

/sbin/killall -HUP httpd

or

/bin/systemctl restart httpd.service > /dev/null 2>/dev/null || true

Doing this has allowed continual access to the Private Tmp space after log rotation. Woohoo! Not sure why the default configuration that came from the Apache httpd package didn’t work (i.e. it’s not like we built some funky weird log rotation script). But success is good enough for me.