btrfs & ext4 – error handling when the hardware fails …

I have a mini PC (an old Intel NUC) I use for taking backups of my desktop. It has a single 4TiB SSD in it.

Filesystem Type Size Used Avail Use% Mounted on
/dev/sda3 ext4 916G 80G 790G 10% /
/dev/sda4 btrfs 2.8T 106G 2.7T 4% /backup

I’ve been using btrfs for /backup for ages, combining its snapshot functionality with an hourly rsync job from my desktop to copy changes over.
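
The job itself is nothing special – a minimal sketch of the idea (the paths, snapshot naming and pull-rather-than-push direction here are made up for illustration):

#!/bin/bash
# Hourly: sync changes into a working subvolume, then freeze the
# result as a read-only snapshot.
# /backup/current must be a btrfs subvolume for the snapshot to work.
set -eu

rsync -a --delete desktop:/home/ /backup/current/
btrfs subvolume snapshot -r /backup/current \
    "/backup/snapshots/$(date +%Y-%m-%d_%H%M)"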

Recently the fan on the NUC failed, and while overheating (I think) it appears to have written garbage in various places (this was seen on the ext4 rootfs as well as the /backup btrfs volume).

BTRFS

Trying to scrub the filesystem highlights the problems –

root@nectarine:~# btrfs scrub status /backup
UUID:             36f93b26-6187-4874-8cc6-4d4bd092e7d8
Scrub resumed:    Sat Jun 17 13:48:33 2023
Status:           finished
Duration:         1:21:28
Total to scrub:   1.23TiB
Rate:             263.66MiB/s
Error summary:    csum=60
  Corrected:      0
  Uncorrectable:  60
  Unverified:     0

(As I only have one underlying block device, there’s no redundant copy of the data for btrfs to repair from.)

I now also see messages like this in ‘dmesg’ –

[ 3570.123946] BTRFS error (device sda4): unable to fixup (regular) error at logical 1870167986176 on dev /dev/sda4
[ 3570.128866] BTRFS error (device sda4): bdev /dev/sda4 errs: wr 0, rd 0, flush 0, corrupt 199, gen 0
[ 3570.128862] BTRFS warning (device sda4): checksum error at logical 1870167683072 on dev /dev/sda4, physical 1477245284352, root 8890, inode 3750321, offset 384077824, length 4096, links 1 (path: .icedove/e1kre066.default-release-2/ImapMail/imap.gmail-2.com/INBOX-1)
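
The scrub summary only counts errors; a logical address from one of these messages can be mapped back to the file(s) using it via btrfs’s inspect tooling – e.g. for the first error above:

root@nectarine:~# btrfs inspect-internal logical-resolve 1870167986176 /backup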

Before trying to re-initialise the checksum tree (and then just letting the corrupt files expire out of the filesystem over time as they get rsync’ed over), I thought I’d try:

root@nectarine:~# btrfs check -p /dev/sda4 
Opening filesystem to check...
Checking filesystem on /dev/sda4
UUID: 36f93b26-6187-4874-8cc6-4d4bd092e7d8
[1/7] checking root items                      (0:00:10 elapsed, 6406461 items checked)
Segmentation faultents                         (0:00:02 elapsed, 7542 items checked)

So that didn’t work very well (the segfault message has overwritten the ‘[2/7] checking extents’ progress line above).

So I thought I might as well try just re-initialising the checksum tree –

root@nectarine:~# btrfs check -p --init-csum-tree /dev/sda4 
Creating a new CRC tree
WARNING:

	Do not use --repair unless you are advised to do so by a developer
	or an experienced user, and then only after having accepted that no
	fsck can successfully repair all types of filesystem corruption. Eg.
	some software or hardware bugs can fatally damage a volume.
	The operation will start in 10 seconds.
	Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/sda4
UUID: 36f93b26-6187-4874-8cc6-4d4bd092e7d8
Reinitialize checksum tree
kernel-shared/extent_io.c:650: free_extent_buffer_internal: BUG_ON `eb->refs < 0` triggered, value 1
btrfs(+0x2b1f7)[0x5590e079d1f7]
btrfs(+0x2b381)[0x5590e079d381]
btrfs(+0x2b68e)[0x5590e079d68e]
btrfs(alloc_extent_buffer+0x77)[0x5590e079e740]
btrfs(read_tree_block+0x47)[0x5590e0796066]
btrfs(read_node_slot+0x47)[0x5590e078f7fd]
btrfs(btrfs_next_sibling_tree_block+0x95)[0x5590e0792900]
btrfs(+0x19e14)[0x5590e078be14]
btrfs(+0x1a8a8)[0x5590e078c8a8]
btrfs(iterate_extent_inodes+0x68)[0x5590e078d5dc]
btrfs(fill_csum_tree+0x46b)[0x5590e07f9440]
btrfs(+0x74bf2)[0x5590e07e6bf2]
btrfs(main+0x3d3)[0x5590e078a203]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x7ff38d37fd0a]
btrfs(_start+0x2a)[0x5590e078a86a]
Aborted

So I don’t feel that worked all that well.

I guess I’ll copy off the data I don’t want to lose, and just reformat it. I was hoping the repair tools (btrfs-progs v6.2, kernel 6.1.34) had matured since I last broke a btrfs filesystem (a few years ago). I guess not?

I know btrfs is at least alerting me to issues with the data – which ext4 definitely isn’t (given /var/lib/dpkg/status contained a load of trash) – so I’ll give it credit for that. It’s just a shame the ‘repair’ tools aren’t working that well.

ext4

This isn’t written to much on this system – there’s a munin daemon running (so /var/lib/munin will have been written to) and a few log files.

Interestingly, when I first noticed a problem with the device, after logging in, I instinctively ran ‘apt-get update’ (I was hoping a reboot would fix it, at which point I might as well make sure any updates were installed).

Running ‘apt-get update’ resulted in /var/lib/dpkg/status being full of rubbish.

After the PC had been turned on for a few hours, ext4 eventually noticed there were problems too – logging this:

[11591.230282] munin-html[22255]: segfault at a400000e ip 0000557783eaf0e9 sp 00007ffca1d969f0 error 4 in perl[557783de1000+185000] likely on CPU 3 (core 1, socket 0)
[11591.230298] Code: 4e 0c 89 56 08 83 e9 09 83 f9 01 76 14 83 fa 01 76 3f 83 ea 01 89 55 08 48 83 c4 10 5d c3 0f 1f 00 48 8b 70 08 48 85 f6 74 e3 <f6> 46 0e 10 74 dd 48 c7 40 08 00 00 00 00 8b 56 08 83 fa 01 76 22
[11591.432906] munin-graph[22257]: segfault at 55a6b77c7df0 ip 000055a64601ebc2 sp 00007ffcd88c5150 error 4 in perl[55a645fc0000+185000] likely on CPU 3 (core 1, socket 0)
[11591.432927] Code: 0f 1f 84 00 00 00 00 00 48 8b 4f 10 48 85 c9 74 5f 48 83 ec 08 48 8b 87 30 01 00 00 48 8b 50 10 48 39 d1 75 4c 48 85 f6 74 55 <48> 8b 04 f1 48 85 c0 74 20 48 8d 97 50 01 00 00 48 39 d0 74 14 8b
[12723.693630] EXT4-fs error (device sda3): htree_dirblock_to_tree:1080: inode #28706704: comm find: Directory block failed checksum
[12723.693673] Aborting journal on device sda3-8.
[12723.696920] EXT4-fs error (device sda3): ext4_journal_check_start:83: comm systemd-journal: Detected aborted journal
[12723.696945] EXT4-fs error (device sda3): ext4_journal_check_start:83: comm rs:main Q:Reg: Detected aborted journal
[12723.708257] EXT4-fs (sda3): Remounting filesystem read-only

Rebooting and running ‘fsck -Cy /dev/sda3’ might have fixed the rootfs.

systemd-resolve (DNS is always to blame)

For the record, this is using systemd v247, from Debian’s buster-backports.

I think I was enticed by the Kool-Aid, hoping to be able to have DNSSEC or DNS-over-TLS …. and caching … and to be fair, it appeared to work on all the servers I’d installed it on (although they were just ‘boring’ LAMP-style webservers).

Anyway, everything seemed to be going well, with the default /etc/resolv.conf like :

nameserver 127.0.0.53

options edns0

and /etc/systemd/resolved.conf looking like :

[Resolve]
DNS=8.8.8.8#dns.google 8.8.4.4#dns.google 1.1.1.1
FallbackDNS=1.1.1.1 8.8.4.4 9.9.9.9
LLMNR=no
DNSOverTLS=opportunistic
DNSSEC=no
Cache=yes

Unfortunately, on one relatively busy server which makes multiple outbound HTTP requests every second, I saw sporadic failures where curl would report a timeout for e.g. graph.facebook.com (>10s connect time).

The timeouts were grouped together (no timeouts for a number of hours, and then a load of requests would fail at once) and, to make it extra annoying, this only happened in production and wasn’t something I could reproduce.

As best I can tell, a failed lookup was being cached, so all requests for that hostname would then fail until the cache entry expired (30 seconds?).
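
If you want to poke at resolved’s cache while debugging, resolvectl (shipped with systemd in this version range) can show cache counters, flush it, and do a one-off query:

resolvectl statistics                  # cache hit/miss counters
resolvectl flush-caches                # clear the cache
resolvectl query graph.facebook.com    # one-off lookup via resolved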

So I ended up with /etc/resolv.conf looking a bit more like a traditional one: 8.8.8.8 as the first nameserver, plus some options to lower the timeout and trigger multiple lookup attempts.
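
Something along these lines – the exact timeout/attempts values are a judgement call rather than anything scientific:

nameserver 8.8.8.8
nameserver 1.1.1.1
options edns0 timeout:1 attempts:3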

So, perhaps …. perhaps … systemd-resolve isn’t quite ready for production yet?

faster rsync (ssh cipher choice)

Perhaps the bottleneck isn’t always bandwidth – but does changing the ssh cipher make any difference?

Using a variant of the following (-W transfers whole files, so rsync’s delta algorithm doesn’t get in the way):

rsync -W --delete --no-owner --no-group --no-perms \
    -e ssh \
    -arv /source/ remote@destination:/destination/path/

In unscientific tests – copying a 4GiB file between two random virtual machines in different data centres, but both in London – the choice of ssh parameters looks like it makes a small difference:

SSH variant                                       Speed
-e "ssh"                                          ~45MB/s
-e "ssh -x -T"                                    ~44MB/s
-e "ssh -x -T -c chacha20-poly1305@openssh.com"   ~42MB/s
-e "ssh -x -T -c aes128-ctr"                      ~47MB/s
-e "ssh -x -T -c aes256-gcm@openssh.com"          ~50MB/s
-e "ssh -x -T -c aes128-gcm@openssh.com"          ~45MB/s

I’m not sure if these results are particularly insightful / useful.
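
To take rsync and the disks out of the picture, a crude way to compare ciphers is to push zeroes through ssh and let dd report the throughput (‘remote’ is a placeholder hostname):

for c in chacha20-poly1305@openssh.com aes128-ctr aes256-gcm@openssh.com; do
    echo "== $c =="
    # dd prints the transfer rate on stderr when it completes
    dd if=/dev/zero bs=1M count=1024 | ssh -x -T -c "$c" remote 'cat > /dev/null'
done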

(re)building varnish modules

I’m using Varnish 6.0 LTS in some places, and need a way to rebuild dependent modules …. which need recompiling even for a minor feature release (e.g. 6.0.1 to 6.0.2).

I use dynamic (DNS routing), var and vsthrottle.

Firstly, here’s a Dockerfile –

FROM debian:buster as builder

ARG VARNISH_VERSION=6.0.8-1~buster

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get -qy update && \
    apt-get -qy install eatmydata apt-transport-https lsb-release ca-certificates curl gnupg wget && \
    apt-get clean

RUN echo "\
Package: varnish\n\
Pin: version ${VARNISH_VERSION}\n\
Pin-Priority: 1001 \
\
Package: varnish-dev \n\
Pin: version ${VARNISH_VERSION} \n\
Pin-Priority: 1001 \
" >> /etc/apt/preferences.d/varnish 

RUN echo "deb https://packagecloud.io/varnishcache/varnish60lts/debian/ buster main" > /etc/apt/sources.list.d/varnish.list

RUN wget -qO /tmp/varnish.gpg https://packagecloud.io/varnishcache/varnish60lts/gpgkey && \
    apt-key add /tmp/varnish.gpg && \
    apt-get -q update && \
    eatmydata -- apt-get -qy install varnish varnish-dev automake libtool make libncurses-dev pkg-config python3-docutils unzip libgetdns10 libgetdns-dev

RUN apt-cache policy varnish

WORKDIR /tmp

RUN wget -qO /tmp/varnish.zip https://github.com/varnish/varnish-modules/archive/refs/heads/6.0.zip && \
    unzip /tmp/varnish.zip && \
    cd varnish-modules-6.0 && \
    bash bootstrap && \
    ./configure --disable-dependency-tracking && \
    make && \
    make check && \
    make install 

RUN wget -qO /tmp/dynamic.zip https://github.com/nigoroll/libvmod-dynamic/archive/refs/heads/6.0.zip && \
    unzip /tmp/dynamic.zip && \
    cd libvmod-dynamic-6.0 && \
    bash autogen.sh && \
    bash configure && \
    make && \
    make install


FROM debian:buster
    
WORKDIR /srv/export
COPY --from=builder /usr/lib/varnish/vmods/libvmod_dynamic.so /srv/export/
COPY --from=builder /usr/lib/varnish/vmods/libvmod_proxy.so /srv/export/
COPY --from=builder /usr/lib/varnish/vmods/libvmod_var.so /srv/export/
COPY --from=builder /usr/lib/varnish/vmods/libvmod_vsthrottle.so /srv/export/
COPY --from=builder /usr/lib/varnish/vmods/libvmod_header.so /srv/export/

And then I copy the files out of that build pipeline (dare I call it that?) with this shell script –

#!/bin/bash

set -eux

# Build a new set of varnish modules.

# Each version of varnish needs its own build of some modules - moving from e.g. varnish 6.0.7~1-stretch to 6.0.8~1-stretch 
# isn't possible without these modules being rebuilt.

rm -rf "$(pwd)/tmp"

docker build --pull -f Dockerfile -t builder .

mkdir tmp

docker run -v "$(pwd)/tmp:/srv/tmp" -ti builder bash -c 'cp /srv/export/* /srv/tmp'

Then it’s just a case of running ‘build.sh’ and waiting …. and you’ll find the files you want in ‘tmp’.
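
On the target host it should then just be a case of copying the modules into varnish’s vmod directory (the same path the Dockerfile exports from) and restarting:

cp tmp/*.so /usr/lib/varnish/vmods/
systemctl restart varnish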

AWS vs Azure … round 1, fight!

So, for whatever reason, I need to move some virtual machines and things from AWS (EC2, RDS) to Azure. I have a few years’ experience with AWS, but until recently I’ve not really used Azure ….

Here are some initial notes……

  • AWS tooling feels more mature (with the ‘stock’ ansible that ships with Ubuntu 20.10, I’m not able to create a virtual machine in Azure without having python module errors appear)
  • AWS EBS disks are more flexible – I can enlarge and/or change their performance profile at runtime (no downtime). With Azure, I have to shutdown the server before I can change them.
  • AWS SSL certificates are better: for Azure I had to install a LetsEncrypt application and integrate it with my DNS provider (e.g. https://github.com/shibayan/keyvault-acmebot). AWS has its built-in certificate service that issues free certs, and if the domain is already in Route53 there’s hardly anything to do.
  • Azure gives you more control over availability (with its concept of availability sets, it allows you to have some control over VM placement and the order updates are applied). It also offers Proximity Placement Groups – allowing you to influence the physical placement of resources to reduce latency etc.
  • Azure feels more ‘commercial’ (with the various different third party products appearing in the portal when you search etc).
  • Azure has worse support for IPv6 (e.g. if you have a VPN within your Virtual Network you can’t have IPv6).
  • Azure doesn’t seem to offer ARM-based virtual machines (see: EC2’s Graviton 2), and has fewer AMD equivalents.
  • Azure’s pricing feels harder to understand – there’s often a ‘standard’ and ‘premium’ option for most products, but the description of differences is often buried in documentation away from the portal ….. I often see ‘Pricing unavailable’.
    • Do I want a premium IP address?
    • Do I need Ultra or Premium SSDs or will Standard SSD suffice? Will I be able to change/revert if I’ve chosen the wrong one without deleting and recreating something?
    • Why do I need to choose a VPN server SKU?
  • Azure networks all have outbound NAT-based internet access by default – so even if you’ve not assigned a public IP address to a resource, it can reach out. At the same time, you can also buy a NAT Gateway. If you give a VM a public IP address then it will use that for its outbound traffic.
  • Azure has a lot of services in ‘preview’ (read: beta). At the time of writing (March 2021), it doesn’t yet offer a production-ready ….
    • MySQL database service that has zone redundancy (i.e. no real high availability)
    • Storage equivalent of EFS (NFS is in preview)
  • Azure does provide a working serial console for VMs, which is quite handy when systemd decides to throw a fit on bootup (2021/04/02 – AWS apparently now provides this too!).
  • Azure doesn’t let you detach the root volume from a stopped server to mount it elsewhere (e.g. for maintenance to fix something that won’t boot up!).
  • When deleting a VM in Azure, it’s necessary to manually delete linked disks. In AWS they can be cleaned up at the same time.

rsyslog filtering (with loggly)

If you’re a bit slow on the uptake, like me … this might help.

Basic logging to Loggly is simple enough –

Reference: https://www.loggly.com/docs/rsyslog-tls-configuration/ – it gets you to add an omfwd action and a template with your auth details in …

However, when you also want to mix in sending Apache logs to loggly, and at the same time want to suppress sending some lines ….. life becomes a bit harder.

Here’s what worked for me anyway… replace MAGIC_AUTH_TOKEN_HERE with your loggly auth details.

Place this in /etc/rsyslog.d/loggly.conf.

# Setup disk assisted queues
$WorkDirectory /var/spool/rsyslog # where to place spool files
$ActionQueueFileName fwdRule1     # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g       # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on     # save messages to disk on shutdown
$ActionQueueType LinkedList       # run asynchronously
$ActionResumeRetryCount -1        # infinite retries if host is down

#RsyslogGnuTLS
$DefaultNetstreamDriverCAFile /etc/rsyslog.d/keys/ca.d/logs-01.loggly.com_sha12.crt


$ActionSendStreamDriver gtls # use gtls netstream driver
$ActionSendStreamDriverMode 1 # require TLS
$ActionSendStreamDriverAuthMode x509/name # authenticate by hostname
$ActionSendStreamDriverPermittedPeer *.loggly.com

template(name="LogglyFormat" type="string"
string="< %pri%>%protocol-version% %timestamp:::date-rfc3339% %HOSTNAME% %app-name% %procid% %msgid% [MAGIC_AUTH_TOKEN_HERE tag=\"Syslog\"] %msg%\n"
)


module(load="imfile") 

# Apache file inputs :

input(type="imfile"
    File="/var/log/apache2/access.log"
    Tag="apache-access"
    Severity="info"
    Facility="local7")

input(type="imfile"
    File="/var/log/apache2/error.log"
    Tag="apache-error"
    Severity="error"
    Facility="local7")


# Format for Apache things.
$template LogglyFormatApache,"<%pri%>%protocol-version% %timestamp:::date-rfc3339% %HOSTNAME% %app-name% %procid% %msgid% [MAGIC_AUTH_TOKEN_HERE tag=\"apache\"] %msg%\n"

if ( $programname == 'apache-access' ) and not ( $msg contains "/something-to-skip/" ) then {
     action(
        type="omfwd" 
        protocol="tcp" 
        target="logs-01.loggly.com" 
        port="6514" template="LogglyFormatApache" 
        StreamDriver="gtls" 
        StreamDriverMode="1" 
        StreamDriverAuthMode="x509/name" 
        StreamDriverPermittedPeers="*.loggly.com"
    )
    stop
} 

# no further processing of apache-access things 
if ( $programname == 'apache-access') then stop

if ( $programname == 'apache-error' ) then {
         action(
                type="omfwd" 
                protocol="tcp" 
                target="logs-01.loggly.com" 
                port="6514" template="LogglyFormatApache" 
                StreamDriver="gtls" 
                StreamDriverMode="1" 
                StreamDriverAuthMode="x509/name" 
                StreamDriverPermittedPeers="*.loggly.com"
        )
    stop
} 

if ( $programname == 'apache-error') then stop

# Anything else ... sent to loggly.
action(
    type="omfwd" 
    protocol="tcp" 
    target="logs-01.loggly.com" 
    port="6514" template="LogglyFormatApache" 
    StreamDriver="gtls" 
    StreamDriverMode="1" 
    StreamDriverAuthMode="x509/name" 
    StreamDriverPermittedPeers="*.loggly.com"
)
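
After dropping the file in place, it’s worth syntax-checking the config before restarting rsyslog, then sending a test message:

rsyslogd -N1                  # validate the configuration
systemctl restart rsyslog
logger "hello loggly"         # should show up in Loggly shortly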

First steps with a Pixelbook

So, my 2009 MacBook Pro decided to slowly die … and after dithering for about 3 years over what to buy to replace it …. I chose a Google Pixelbook (i7 variant, 500GB NVMe disk etc) (via eBay).

Here are some findings …

  • Installing Linux within the supported VM environment is straightforward (see docs), but it’s a 4.14 kernel with Debian Stretch. Given it’s using btrfs, I’d prefer a newer kernel (or at least the ability to choose which kernel the VM boots …)
  • I can’t seem to find a way of getting a clipboard manager that works across all applications (so I can copy+paste multiple things between windows). I’ve been using ClipIt for years on my main desktop.
  • Sharing files between the Linux environment and native ChromeOS is kind of annoying (go into the Files app, and drag/drop the file(s) around). The UI hints at there being shared folders, but I’m guessing they’ll be enabled in a future release.
  • Sound from a Linux app doesn’t work (when running vlc within the Linux VM, there is no sound); apparently a known bug so I’ll hope it’ll get fixed soon.
  • It’s fast. Especially browsing the web.
  • It’s not burnt my lap yet (unlike the MBP)
  • It’s possible to get sound to stutter from e.g. Play Music, if you’re doing a reasonable amount of I/O (like PHPStorm rebuilding its indexes)
  • Installing PHPStorm (and other Linux apps) was fairly straightforward (either via apt or however I’d normally do it in Linux) and generally works fine …
  • There’s no “right click” button on the mouse pad; instead you ctrl+click or use a two-finger tap.
  • Tablet mode is great for Android Apps – I’ve tried a couple of toddler apps and they just worked fine.
  • Not all Android apps work properly – e.g. using Authenticator Plus for 2FA auth codes – doesn’t seem to be able to sync with my Google Drive backup and when opening it, there are always two windows for some reason.
  • Thankfully you can “right click” on the launcher tray and configure it to auto-hide and pin apps you use often.

I’m toying with the idea of replacing ChromeOS with a native Linux install; but I’ve not yet seen enough evidence to suggest that it’ll work well.

Hopefully the Campfire project will have a release soon …. Until then I’ll be watching https://www.reddit.com/r/pixelbook etc

Using hitch with varnish on Debian Jessie

I ended up needing to install hitch on a server recently, so the https:// traffic could be routed through Varnish (along with the existing ‘http’ stuff) for performance reasons.

The server only runs WordPress sites, so there are WordPress specific things in the Varnish configuration (vcl) file below.

Versions: Varnish 5.2, Hitch 1.4.4, Apache 2.4 and Debian Jessie.


postsrsd monit config

This might work to configure monit on Debian (Jessie) to monitor postsrsd.

check process postsrsd matching "/usr/sbin/postsrsd"
    group postsrsd
    start program = "/etc/init.d/postsrsd start"
    stop  program = "/etc/init.d/postsrsd stop"
    # postsrsd's default forward (SRS) and reverse lookup ports
    if failed host localhost port 10001 then restart
    if failed host localhost port 10002 then restart
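
Drop that into /etc/monit/conf.d/ (the usual spot on Debian), then:

monit -t        # syntax-check the monit control files
monit reload    # pick up the new check
monit status    # confirm postsrsd is being monitored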