Upgrade some things

Well, I sort of realised I had a web server or two still on Debian Buster, and it was time to move to Bullseye or Bookworm. As usual, the Debian upgrade procedure was mostly straightforward and uneventful.

Interesting findings:

  • hitch, which I use as an SSL frontend to varnish, doesn’t seem to get along all that well with systemd, and silently fails if your config has “daemon = on” set in /etc/hitch/hitch.conf (there’s a config sketch after this list). Annoyingly, when testing the configuration with “hitch -t” you’ll get an error like “No x509 certificate PEM file specified for frontend ‘default’!” – the solution is to specify the config file explicitly, i.e. hitch -t --config /etc/hitch/hitch.conf
  • hitch hasn’t had a release in its packagecloud.io repository for the last 3 years, so the Debian-packaged variant looks more appealing.
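For reference, the sort of hitch.conf shape that plays nicely with systemd might look like the below – a sketch, where the backend port and PEM path are assumptions rather than my actual config:

# /etc/hitch/hitch.conf
daemon = off                          # let systemd manage the process
frontend = "[*]:443"                  # listen on all addresses, port 443
backend = "[127.0.0.1]:6086"          # varnish listener (assumed port)
pem-file = "/etc/hitch/example.pem"   # hypothetical certificate path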

In other news, I noticed this post where someone moaned about systemd-resolved the other day – https://www.reddit.com/r/linux/comments/18kh1r5/im_shocked_that_almost_no_one_is_talking_about/ – I’ve had similar problems to the people in the thread (resolved stops working, etc.), so I thought it was time to try ‘unbound’ instead.

apt-get install unbound

and then tell /etc/resolv.conf to use 127.0.0.1 for DNS.

Annoyingly, unbound-control stats isn’t quite as pretty as resolvectl statistics, but oh well.

echo -e "nameserver 127.0.0.1\nnameserver 8.8.8.8\noptions timeout:4" > /etc/resolv.conf

and an /etc/unbound/unbound.conf file that looks perhaps like :

server:
    interface: 127.0.0.1
    access-control: 127.0.0.0/8 allow
    access-control: ::1/128 allow
    # The following line will configure unbound to perform cryptographic
    # DNSSEC validation using the root trust anchor.
    auto-trust-anchor-file: "/var/lib/unbound/root.key"
    tls-cert-bundle: "/etc/ssl/certs/ca-certificates.crt"

remote-control:
    control-enable: yes
    # by default the control interface is 127.0.0.1 and ::1 on port 8953;
    # it is possible to use a unix socket too
    control-interface: /run/unbound.ctl

forward-zone:
    name: "."
    forward-tls-upstream: yes
    forward-addr: 1.1.1.1@853#cloudflare-dns.com
    forward-addr: 1.0.0.1@853#cloudflare-dns.com

(Unfortunately my ISP is shitty, and doesn’t yet give me an IPv6 address.)
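After editing the config, it’s worth validating it and restarting before relying on it:

unbound-checkconf /etc/unbound/unbound.conf
systemctl restart unbound
unbound-control status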

Looking at https://1.1.1.1/help, I do sometimes see that ‘DNS over TLS’ is “Yes”… so I guess something is right; annoyingly, I don’t see anything useful in unbound’s stats (unbound-control stats) to show it’s done a secure query…
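One crude way to convince yourself queries really are leaving over TLS is to watch for outbound traffic on port 853 while doing some lookups (the interface name here is an assumption):

tcpdump -ni eth0 'tcp port 853'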

“unbound-host” (another Debian package) will helpfully tell you whether a lookup was done ‘securely’ or not – e.g.

$ unbound-host google.com -D -v
google.com has address 142.250.178.14 (insecure)
google.com has IPv6 address 2a00:1450:4009:815::200e (insecure)
google.com mail is handled by 10 smtp.google.com. (insecure)

which seems a little odd to me (I’d have thought Google would support DNSSEC), but some domains do work – e.g.

$ unbound-host mythic-beasts.com -D -v
mythic-beasts.com has address 93.93.130.166 (secure)
mythic-beasts.com has IPv6 address 2a00:1098:0:82:1000:0:1:2 (secure)
mythic-beasts.com mail is handled by 10 mx1.mythic-beasts.com. (secure)
mythic-beasts.com mail is handled by 10 mx2.mythic-beasts.com. (secure)

Beelink SER6 Max

“New PC Time”

I’ve had an ASUS PN50 (AMD 4800U processor) as my desktop/daily driver for some time, and it’s nice and power efficient, but increasingly I found it slow.

I eventually discovered I could turn on the CPU ‘boost’ feature (doh!) – but doing that seemed to result in it crashing within 24–48 hours… which isn’t good. I don’t know if it’s a hardware or Linux problem, but I had already sort of decided it was time to consider upgrading to something with more ‘ooomph’.

So, I came across a slightly dodgy-looking listing on Amazon for a Beelink SER6 Max (32GB RAM, 500GiB SSD). The SER6 Max is a fairly new release, and Beelink are a relatively cheap, newish hardware supplier with some past quality issues. Anyway, I thought I’d stop dithering over it, buy it, and rely on Amazon’s returns policy if there were problems with the PC/hardware.

My reason for choosing the SER6 Max was that it has enough rear ports for all three of my monitors; most other mini PC variants don’t. I did contemplate the Geekom AS6 (which is an ASUS PN53 with the same CPU as this Beelink), but it has slower RAM and I was concerned it might be noisy.

So, I “pulled the trigger” on https://www.amazon.co.uk/dp/B0C279T4P6 and, on a whim, tried installing Siduction Linux… so now I’ve got full disk encryption and what looks like a fairly up-to-date stack of stuff (with XFCE).

The SER6 has at least passed a token memory test and some system tests, so I’m fairly optimistic about it – although I did have one unexplained hard lock-up/crash yesterday.

(1 week later, and it seems well stable/reliable … )

bash – escaping variables for use within commands

Escaping quotes within variables is always (somehow) painful in bash – e.g. a value like:

foo"bar

and it’s not obvious that you’d need to write e.g.

"foo"\""bar"

(at least to me).

Thankfully a bash built-in magical thing can be used to do the escaping for you.

In my case, I need to pass a ‘PASSWORD’ variable through to run within a container. The PASSWORD variable needs escaping so it can safely contain things like ; or quote marks (" or ').

e.g. docker compose run app /bin/bash -c "echo $PASSWORD > /some/file"

or e.g. ssh user@server "echo $PASSWORD > /tmp/something"

The fix is to use the ${PASSWORD@Q} variable syntax – for example:

#!/bin/bash

FOO="bar'\"baz"

ssh user@server "echo $FOO > /tmp/something"

This will fail with something like: “bash: -c: line 1: unexpected EOF while looking for matching `'”

As the shell at the remote end is seeing echo bar'"baz and expects the quote mark to be closed.

So using the @Q magic –

ssh user@server "echo ${FOO@Q} > /tmp/something"

which will result in /tmp/something containing bar'"baz – which is correct.
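As an aside, ${FOO@Q} needs bash 4.4 or newer; on older bash, printf %q does the same escaping job – e.g.:

#!/bin/bash
FOO="bar'\"baz"
# printf %q escapes the value so the remote shell sees it as one safe word
ssh user@server "echo $(printf '%q' "$FOO") > /tmp/something"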

See also https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html#Shell-Parameter-Expansion

asus pn50 and cpufreq/boost

I’ve been using an ASUS PN50 (a mini PC with an AMD Ryzen 4800U processor – so sort of a laptop without a screen) as my desktop for ages.

Increasingly I’ve found it sluggish, and I was contemplating replacing it with something newer… and then I discovered why the CPU speed in /proc/cpuinfo was always 1400MHz…

I needed to:

echo 1 > /sys/devices/system/cpu/cpufreq/boost

Once that’s done, the CPU cores can go up to about 4.2GHz… #doh
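That echo doesn’t survive a reboot; one way to persist it is a small oneshot systemd unit – a sketch, and the unit name is made up:

# /etc/systemd/system/cpu-boost.service (hypothetical name)
[Unit]
Description=Enable CPU frequency boost

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo 1 > /sys/devices/system/cpu/cpufreq/boost'

[Install]
WantedBy=multi-user.target

followed by a systemctl daemon-reload && systemctl enable --now cpu-boost.service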

In other news – https://www.phoronix.com/news/Linux-Per-Policy-CPUFreq-Boost looks interesting.

Unfortunately my mini PC’s fan is now always speeding up/slowing down, when it used to be pretty quiet :-/

Thanks to https://www.reddit.com/r/MiniPCs/comments/16cuzd8/asus_pn50_unlock_cpu_speed_under_linux/

Resizing a VM’s disk within Azure

Random notes on resizing a disk attached to an Azure VM …

Check what you have already –

az disk list --resource-group MyResourceGroup --query '[*].{Name:name,Gb:diskSizeGb}' --output table

might output something a bit like:

Name      Gb
--------  ----
foo-os    30
bar-os    30
foo-data  512
bar-data  256

So here, we can see the ‘bar-data’ disk is only 256Gb.

Assuming you want to change it to be 512Gb (Azure doesn’t support arbitrary sizes, you need to choose a supported size…):

az disk update --resource-group MyResourceGroup --name bar-data --size-gb 512

Then wait a bit …
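You can confirm the new size has taken effect with something like:

az disk show --resource-group MyResourceGroup --name bar-data --query diskSizeGb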

In my case, the VMs are running Debian Buster, and I see this within the ‘dmesg’ output (on the server itself) after the resize has completed:

[31197927.047562] sd 1:0:0:0: [storvsc] Sense Key : Unit Attention [current]
[31197927.053777] sd 1:0:0:0: [storvsc] Add. Sense: Capacity data has changed
[31197927.058993] sd 1:0:0:0: Capacity data has changed

Unfortunately the new size doesn’t show up to the O/S straight away, so I think you either need to reboot the VM, or (what I do):

echo 1 > /sys/class/block/sda/device/rescan

at which point the new size appears within your ‘lsblk’ output – and the filesystem can be resized using e.g. resize2fs.
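If the filesystem sits on a partition rather than the bare device, the partition needs growing first – a sketch, assuming the data disk is /dev/sda with the filesystem on partition 1:

# growpart comes from the cloud-guest-utils package on Debian
growpart /dev/sda 1
# then grow the ext4 filesystem to fill the enlarged partition
resize2fs /dev/sda1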

Don’t forget to defragment /home if you’re using BTRFS

As root (as a regular user it just won’t work):

btrfs filesystem defragment /home -r

You probably want to run that weekly.
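e.g. as a root cron entry – a sketch; the file name and schedule are just examples:

# /etc/cron.d/btrfs-defrag (hypothetical) – Sundays at 03:00, as root
0 3 * * 0  root  btrfs filesystem defragment -r /home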

I eventually noticed Thunderbird and PhpStorm were being really slow and laggy… at which point I realised the cron job I had (as my non-root user) wasn’t working.

(Using filefrag /path/to/file you can see the number of extents change after defragmenting.)

Excessive uptime(!?)

Somewhere on the internet there’s a mailserver with a larger uptime, I guess?

[root@xxxxxxxx ~]# uname -a
Linux xxxxxxxxxxxxxxx 2.6.18-419.el5 #1 SMP Fri Feb 24 22:47:42 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

[root@xxxxxxxx ~]# uptime
09:34:38 up 2290 days,  1:47,  ....

I don’t think anyone dares to reboot it …. (this is a server the customer was going to migrate off about 5 years ago …. somehow it’s still in use)

(2290 days is a little over 6 years)

btrfs & ext4 – error handling when the hardware fails …

I have a mini PC (an old Intel NUC) I use for taking backups of my desktop. It has a single 4TiB SSD in it.

Filesystem  Type   Size  Used  Avail  Use%  Mounted on
/dev/sda3   ext4   916G   80G   790G   10%  /
/dev/sda4   btrfs  2.8T  106G   2.7T    4%  /backup

I’ve been using btrfs for /backup for ages, as I use the snapshot functionality of btrfs with an hourly rsync job from my desktop to copy changes over (roughly sketched below).
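The hourly job is roughly this shape – a sketch, where the hostname and paths are assumptions, and /backup/current is assumed to be a btrfs subvolume:

#!/bin/bash
# pull changes over from the desktop...
rsync -a --delete desktop:/home/ /backup/current/
# ...then take a read-only, timestamped snapshot
btrfs subvolume snapshot -r /backup/current "/backup/snapshots/$(date +%Y%m%d-%H%M)"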

Recently the fan on the NUC failed, and while overheating it appears (I think) to have written garbage in various places – this was seen on the ext4 rootfs as well as the /backup btrfs volume.

BTRFS

Trying to scrub the filesystem highlights the problems –

root@nectarine:~# btrfs scrub status /backup
UUID:             36f93b26-6187-4874-8cc6-4d4bd092e7d8
Scrub resumed:    Sat Jun 17 13:48:33 2023
Status:           finished
Duration:         1:21:28
Total to scrub:   1.23TiB
Rate:             263.66MiB/s
Error summary:    csum=60
  Corrected:      0
  Uncorrectable:  60
  Unverified:     0

(As I only have one underlying block device, it’s not possible for it to repair itself).

I now also see messages like this in ‘dmesg’ –

[ 3570.123946] BTRFS error (device sda4): unable to fixup (regular) error at logical 1870167986176 on dev /dev/sda4
[ 3570.128866] BTRFS error (device sda4): bdev /dev/sda4 errs: wr 0, rd 0, flush 0, corrupt 199, gen 0
[ 3570.128862] BTRFS warning (device sda4): checksum error at logical 1870167683072 on dev /dev/sda4, physical 1477245284352, root 8890, inode 3750321, offset 384077824, length 4096, links 1 (path: .icedove/e1kre066.default-release-2/ImapMail/imap.gmail-2.com/INBOX-1)

Before trying to re-initialise the checksum tree (and then just letting the corrupt files expire out of the filesystem over time as they get rsync’ed over), I thought I’d try:

root@nectarine:~# btrfs check -p /dev/sda4 
Opening filesystem to check...
Checking filesystem on /dev/sda4
UUID: 36f93b26-6187-4874-8cc6-4d4bd092e7d8
[1/7] checking root items                      (0:00:10 elapsed, 6406461 items checked)
Segmentation faultents                         (0:00:02 elapsed, 7542 items checked)

So that didn’t work very well.

So I thought I might as well try just re-initialising the checksum tree –

root@nectarine:~# btrfs check -p --init-csum-tree /dev/sda4 
Creating a new CRC tree
WARNING:

	Do not use --repair unless you are advised to do so by a developer
	or an experienced user, and then only after having accepted that no
	fsck can successfully repair all types of filesystem corruption. Eg.
	some software or hardware bugs can fatally damage a volume.
	The operation will start in 10 seconds.
	Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/sda4
UUID: 36f93b26-6187-4874-8cc6-4d4bd092e7d8
Reinitialize checksum tree
kernel-shared/extent_io.c:650: free_extent_buffer_internal: BUG_ON `eb->refs < 0` triggered, value 1
btrfs(+0x2b1f7)[0x5590e079d1f7]
btrfs(+0x2b381)[0x5590e079d381]
btrfs(+0x2b68e)[0x5590e079d68e]
btrfs(alloc_extent_buffer+0x77)[0x5590e079e740]
btrfs(read_tree_block+0x47)[0x5590e0796066]
btrfs(read_node_slot+0x47)[0x5590e078f7fd]
btrfs(btrfs_next_sibling_tree_block+0x95)[0x5590e0792900]
btrfs(+0x19e14)[0x5590e078be14]
btrfs(+0x1a8a8)[0x5590e078c8a8]
btrfs(iterate_extent_inodes+0x68)[0x5590e078d5dc]
btrfs(fill_csum_tree+0x46b)[0x5590e07f9440]
btrfs(+0x74bf2)[0x5590e07e6bf2]
btrfs(main+0x3d3)[0x5590e078a203]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x7ff38d37fd0a]
btrfs(_start+0x2a)[0x5590e078a86a]
Aborted

So I don’t feel that worked all that well.

I guess I’ll copy off the data I don’t want to lose, and just reformat it. I was hoping the repair tools (btrfs-progs v6.2, kernel 6.1.34) had matured since I last broke a btrfs filesystem (a few years ago). I guess not?

I know btrfs is at least alerting me to issues with the data – which ext4 definitely isn’t (given /var/lib/dpkg/status contained a load of trash) – so I’ll give it credit for that. It’s just a shame the ‘repair’ tools aren’t working that well.

ext4

This filesystem isn’t written to much on this system – there’s a munin daemon running (so /var/lib/munin will have been written to) and a few log files.

Interestingly, when I first noticed a problem with the device, after logging in, I instinctively ran ‘apt-get update’ (I was hoping a reboot would fix it, at which point I might as well make sure any updates were installed).

Running ‘apt-get update’ resulted in /var/lib/dpkg/status being full of rubbish.

After the PC had been turned on for a few hours, ext4 eventually figured out there were problems – by logging this:

[11591.230282] munin-html[22255]: segfault at a400000e ip 0000557783eaf0e9 sp 00007ffca1d969f0 error 4 in perl[557783de1000+185000] likely on CPU 3 (core 1, socket 0)
[11591.230298] Code: 4e 0c 89 56 08 83 e9 09 83 f9 01 76 14 83 fa 01 76 3f 83 ea 01 89 55 08 48 83 c4 10 5d c3 0f 1f 00 48 8b 70 08 48 85 f6 74 e3 <f6> 46 0e 10 74 dd 48 c7 40 08 00 00 00 00 8b 56 08 83 fa 01 76 22
[11591.432906] munin-graph[22257]: segfault at 55a6b77c7df0 ip 000055a64601ebc2 sp 00007ffcd88c5150 error 4 in perl[55a645fc0000+185000] likely on CPU 3 (core 1, socket 0)
[11591.432927] Code: 0f 1f 84 00 00 00 00 00 48 8b 4f 10 48 85 c9 74 5f 48 83 ec 08 48 8b 87 30 01 00 00 48 8b 50 10 48 39 d1 75 4c 48 85 f6 74 55 <48> 8b 04 f1 48 85 c0 74 20 48 8d 97 50 01 00 00 48 39 d0 74 14 8b
[12723.693630] EXT4-fs error (device sda3): htree_dirblock_to_tree:1080: inode #28706704: comm find: Directory block failed checksum
[12723.693673] Aborting journal on device sda3-8.
[12723.696920] EXT4-fs error (device sda3): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
[12723.696945] EXT4-fs error (device sda3): ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted journal
[12723.708257] EXT4-fs (sda3): Remounting filesystem read-only

Rebooting and running fsck -Cy /dev/sda3 MIGHT have fixed the rootfs.

intel nuc d54250wyk (haswell) ~10 years later

This little NUC I bought ages ago is still chugging along in continual use (albeit only as a backup ‘server’ with a large 4TiB SSD in it).

It’s recently had ‘open heart’ surgery to replace a failing fan and to clean the dust out of it (for the first time in 10 years).

Wow, it’s quiet now.

In other news, I’m tempted to buy a new desktop mini-pc to replace the ASUS PN50 I have (which seems to struggle a little, perhaps due to me having 3 monitors and it having a relatively weak graphics card).

So I’m now torn between waiting a bit longer, getting a NUC 13 Pro or an ASUS PN53, or hoping Beelink/someone releases something new (I’m skeptical any of the cheaper Chinese manufacturers will produce anything that’ll last >10 years though).