tailscale account switching (further fumblings)

So, I’ve been using headscale for the last few months, combined with a cheap low spec VM from MythicBeasts.com (as my VPN “server” or at least exit node).

Recently, we decided to ditch using bastion ssh (jump) hosts at work, and move to using a VPN instead. This saves us from having a VM running ssh listening for inbound connections.

Then I wondered how I could access both the work and my home tailscale networks from my laptop etc.

Initially I came across a blog/article discussing how to access two tailscale networks at once, which involved using Linux network namespaces and adding various iptables rules etc. I sort of had a go, but it didn’t seem to want to work, and it felt like it was going to cause me trouble.

So I thought I’d probably have to keep switching tailscale networks somehow (e.g. tailscale down ; tailscale up --login-server … etc.). But this means I’d need to keep approving it on the headscale side etc.

Then I saw there’s a ‘tailscale switch’ command ….

# tailscale switch --list
ID    Tailnet              Account
0101  my.headscale.server  david*
1010  some-label           david.goodwin@work.corp

and switching is just a “tailscale switch some-label” or “tailscale switch my.headscale.server”.

That’s a bit easier than having to reauthenticate with the appropriate tailscale network etc.
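For completeness, getting the two accounts registered in the first place looked roughly like this – a sketch only; the headscale URL is a placeholder, and each login step prompts for browser-based authentication:

# log in to the work tailnet (hosted tailscale) - becomes the first profile
tailscale login

# log in to the personal headscale server - adds a second profile
tailscale login --login-server https://my.headscale.server

# then list and switch between them
tailscale switch --list
tailscale switch some-label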

A little over ten years ago ….

A little over 10 years ago, in a previous role/company, I designed and implemented a website hosting environment (with a catchy name of “w 3 p cloud”) to ….

  • support WordPress/LAMP like environments
  • have some sort of process/file isolation between sites, so a malware infection in one shouldn’t be able to spread/reach other sites
  • have resource limits in place (the business also liked the idea of charging for more “firepower”, I just wanted to try and stop one site from doing a denial of service on others)
  • be hosted in AWS (EC2) because it was cool to be moving to the cloud (despite the cost)

Eventually, I settled on using LXC containers with a Varnish server as an HTTP frontend router. Hosting within AWS (EC2) was basically a non-negotiable requirement, and there weren’t many alternatives either.

A crude web UI was added for managing sites, which was quickly adapted to have a JSON API on top. Then background tasks were added – involving a job queue (originally gearman, later on beanstalk) and some management of iptables rules.

Fast forward to 2026, and it’s still (just about) in use.

As programmers/developers, we often don’t think too much about the distant future when creating something. We’ve got immediate deadlines, and thinking more than 2-3 years ahead is difficult. There are plenty of uncertainties in life after all!

Over the years, there have been multiple upgrades of various bits (Varnish, PHP, Debian release, Linux kernels etc).

I think it had between 5-8k sites at its peak, and I often found it amusing when I realised I was ordering from/using a website hosted on it as a member of the public.

Anyway, all good things come to an end, I guess …. and finally a migration to something bigger/better/faster/shinier has begun, as this count of sites being hosted on it shows:

[graph of the number of sites being hosted over time, showing a recent decline]

Trying out headscale (tailscale vpn stuff)

For some time, I’ve been using WireGuard for a VPN to use when I’m out and about etc.

As I’m fairly stupid, I used wg-quick to generate the config – however when the config looks a bit like this –


[Peer]
PublicKey = cm+t2u0giNynMkcX1+afPu6SlKyLMeTe8iWKhT1FsDk=
AllowedIPs = 10.0.0.13/32
Endpoint = 192.168.122.13:51820
....

I began to find management became a problem – i.e. which computer is that, exactly?

wg show does give you something a bit like this –


...
peer: cm+t2u0giNynMkcX1+afPu6SlKyLMeTe8iWKhT1FsDk=
endpoint: 192.168.122.13:51820
allowed ips: 10.0.0.13/32
...

which is sort of useful, but it still doesn’t tell me a human-readable name. I’ve tried leaving comments in the config before, but they just get wiped out.

I’ve often thought about using Tailscale, but wasn’t overly happy with the idea of some third party being involved. Eventually I came across headscale – which offers a self-hosted option for the backend (so your devices use the tailscale frontend).

After a bit of poking around over the weekend, headscale nodes list now shows each of my machines with a human-readable name – which is a bit nicer.

I’m still pretty new to using Tailscale for a VPN, but I did at least eventually get my phone to join the network, and everything seems to work.

It’s sort of interesting that tailscale doesn’t just add an entry to your routing table – instead it adds a few iptables rules (in the nat table) to mess around with things.
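If you’re curious, you can poke at what it has installed – something like the below (the ts-* chain names are what I’d expect from recent tailscale versions, so treat them as an assumption):

# NAT rules tailscale has added (chain names such as ts-postrouting may vary)
iptables -t nat -S | grep -i ts

# tailscale also uses policy routing on Linux - its routes live in table 52
ip rule show
ip route show table 52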

Upgrade some things

Well, I sort of realised I had a web server or two that were still on Debian Buster, and it was time to move to Bullseye or Bookworm. As usual, the Debian upgrade procedure was mostly straightforward and uneventful.

Interesting findings :

  • hitch, which I use as an SSL frontend to varnish, doesn’t seem to get along all that well with systemd, and silently fails if your config has the “daemon = on” setting in /etc/hitch/hitch.conf. Annoyingly, when trying to test the configuration with “hitch -t” you will get an error like: “No x509 certificate PEM file specified for frontend ‘default’!” – the solution to that is to specify the config file – i.e.: hitch -t --config /etc/hitch/hitch.conf (see the example config after this list).
  • hitch hasn’t had a release in its packagecloud.io repository for the last 3 years; so the Debian-supported variant looks more appealing.
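For reference, a minimal hitch.conf that plays nicely with systemd might look like the below – a sketch only; the pem-file path and backend port are placeholders for whatever your setup actually uses:

# /etc/hitch/hitch.conf
daemon = off                         # let systemd supervise the process

frontend = {
    host = "*"
    port = "443"
}

pem-file = "/etc/hitch/example.pem"  # hypothetical combined key+cert bundle
backend = "[127.0.0.1]:6086"         # local varnish listener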

In other news, I noticed this post where someone moaned about systemd-resolved the other day – https://www.reddit.com/r/linux/comments/18kh1r5/im_shocked_that_almost_no_one_is_talking_about/ – I’ve had similar problems to the people on the thread (resolved stops working etc.) so I thought it was time to try using ‘unbound’ instead.

apt-get install unbound

and then tell /etc/resolv.conf to use 127.0.0.1 for DNS.

Annoyingly, unbound-control stats isn’t quite as pretty as resolvectl statistics, but oh well.

echo -e "nameserver 127.0.0.1\nnameserver 8.8.8.8\noptions timeout:4" > /etc/resolv.conf

and an /etc/unbound/unbound.conf file that looks perhaps like :

server:
    interface: 127.0.0.1
    access-control: 127.0.0.0/8 allow
    access-control: ::1/128 allow
    # The following line will configure unbound to perform cryptographic
    # DNSSEC validation using the root trust anchor.
    auto-trust-anchor-file: "/var/lib/unbound/root.key"
    tls-cert-bundle: "/etc/ssl/certs/ca-certificates.crt"

remote-control:
    control-enable: yes
    # by default the control interface is 127.0.0.1 and ::1 on port 8953;
    # it is possible to use a unix socket too
    control-interface: /run/unbound.ctl

forward-zone:
    name: "."
    forward-tls-upstream: yes
    forward-addr: 1.1.1.1@853#cloudflare-dns.com
    forward-addr: 1.0.0.1@853#cloudflare-dns.com

(Unfortunately my ISP is shitty, and doesn’t yet give me an IPv6 address.)

Looking at https://1.1.1.1/help – I do sometimes see that ‘DNS over TLS’ is “yes”…. so I guess something is right; annoyingly I don’t see anything useful from unbound’s stats (unbound-control stats) to show it’s done a secure query…
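One crude way to convince yourself the forwarding really is happening over TLS is to just watch the wire – assuming your outbound interface is eth0 (adjust as needed):

# queries to the forwarder should appear on TCP/853, not plain UDP/53
tcpdump -ni eth0 'host 1.1.1.1 and (port 853 or port 53)'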

“unbound-host” (another Debian package) will helpfully tell you whether a lookup was done ‘securely’ or not – e.g.

$ unbound-host google.com -D -v
google.com has address 142.250.178.14 (insecure)
google.com has IPv6 address 2a00:1450:4009:815::200e (insecure)
google.com mail is handled by 10 smtp.google.com. (insecure)

which seems a little odd to me (I’d have thought Google would support DNSSEC), but some domains do work – e.g.

$ unbound-host mythic-beasts.com -D -v
mythic-beasts.com has address 93.93.130.166 (secure)
mythic-beasts.com has IPv6 address 2a00:1098:0:82:1000:0:1:2 (secure)
mythic-beasts.com mail is handled by 10 mx1.mythic-beasts.com. (secure)
mythic-beasts.com mail is handled by 10 mx2.mythic-beasts.com. (secure)

Beelink SER6 Max

“New PC Time”

I’ve had an ASUS PN50 (AMD 4800U processor) as my desktop/daily driver for some time, and while it’s nice and power efficient, I increasingly found it slow.

I eventually discovered I could turn on the CPU ‘boost’ feature (doh!) – but doing that seemed to result in it crashing within the next 24-48 hours…. which isn’t good. I don’t know if it’s a hardware or Linux problem – but I had already sort of decided it was time to consider upgrading to something with more ‘ooomph’.
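(For anyone else looking: with the acpi-cpufreq driver the boost toggle lives in sysfs, roughly as below – the amd_pstate driver exposes it differently, so treat this as a sketch.)

# check whether CPU boost is currently enabled (acpi-cpufreq driver)
cat /sys/devices/system/cpu/cpufreq/boost

# enable it (as root)
echo 1 > /sys/devices/system/cpu/cpufreq/boost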

So, I came across a slightly dodgy looking listing on Amazon for a Beelink SER6 Max (32GB RAM, 500GB SSD). The SER6 Max is a fairly new release, and Beelink is a relatively cheap, newish hardware supplier with some past quality issues. Anyway, I thought I’d stop dithering over it, buy it, and rely on Amazon’s returns policy if there were problems with the PC/hardware.

My reason for choosing the SER6 Max was that it had enough rear ports for all three of my monitors; most other mini-PC variants don’t. I did contemplate the Geekom AS6 (which is an ASUS PN53 with the same CPU as this Beelink, but with slower RAM, and I was concerned it might be noisy).

So, I “pulled the trigger” on https://www.amazon.co.uk/dp/B0C279T4P6 and on a whim I tried installing Siduction Linux…. so now I’ve got full disk encryption and what looks like a fairly up to date stack of stuff (with XFCE).

The SER6 has at least passed a token memory test, and some system tests – so I’m fairly optimistic about it, although I did have one hard lock up / crash yesterday which is unexplained.

(1 week later, and it seems well stable/reliable … )

Resizing a VM’s disk within Azure

Random notes on resizing a disk attached to an Azure VM …

Check what you have already –

az disk list --resource-group MyResourceGroup --query '[*].{Name:name,Gb:diskSizeGb,Tier:accountType}' --output table

might output something a bit like :

Name      Gb
--------  ----
foo-os    30
bar-os    30
foo-data  512
bar-data  256

So here, we can see the ‘bar-data’ disk is only 256GB.

Assuming you want to change it to be 512GB (Azure doesn’t support an arbitrary size; you need to choose a supported size…)

az disk update --resource-group MyResourceGroup --name bar-data --size-gb 512

Then wait a bit …

In my case, the VMs are running Debian Buster, and I see this within the ‘dmesg‘ output after the resize has completed (on the server itself).

[31197927.047562] sd 1:0:0:0: [storvsc] Sense Key : Unit Attention [current]
[31197927.053777] sd 1:0:0:0: [storvsc] Add. Sense: Capacity data has changed
[31197927.058993] sd 1:0:0:0: Capacity data has changed

Unfortunately the new size doesn’t show up straight away to the O/S, so I think you either need to reboot the VM or (what I do) –

echo 1 > /sys/class/block/sda/device/rescan

at which point the newer size appears within your ‘lsblk‘ output – and the filesystem can be resized using e.g. resize2fs
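Putting the on-VM steps together, it looks something like the below (assuming ext4 on the first partition of sda; growpart comes from the cloud-guest-utils package – if the filesystem sits directly on the disk you can skip that step):

# make the kernel re-read the disk size
echo 1 > /sys/class/block/sda/device/rescan

# confirm the new size is visible
lsblk /dev/sda

# grow the partition, then the ext4 filesystem within it
growpart /dev/sda 1
resize2fs /dev/sda1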

systemd-resolve (DNS is always to blame)

For the record, this is using systemd v247, from Debian’s buster-backports.

I think I was enticed by the Kool-Aid, hoping to be able to have DNSSEC or DNS-over-TLS …. and caching … and, to be fair, it appeared to work on all the servers I’d installed it on (although they were just ‘boring’ LAMP-style webservers).

Anyway, everything seemed to be going well, with the default /etc/resolv.conf like :

nameserver 127.0.0.53

options edns0

and /etc/systemd/resolved.conf looking like :

[Resolve]
DNS=8.8.8.8#dns.google 8.8.4.4#dns.google 1.1.1.1
FallbackDNS=1.1.1.1 8.8.4.4 9.9.9.9
LLMNR=no
DNSOverTLS=opportunistic
DNSSEC=no
Cache=yes

Unfortunately, on one relatively busy server which makes multiple outbound HTTP requests every second, I saw sporadic failures where curl would report a timeout for e.g. graph.facebook.com (>10s connect time).

The timeouts seemed to be grouped together (no timeouts for a number of hours, and then a load of requests would fail) and, to be maximally annoying, this only happened in production and wasn’t something I could reproduce.

As best I can tell, a failure to lookup was being cached, so all requests for a specific hostname would then fail until the cache expired (30 seconds?)

So I ended up with an /etc/resolv.conf looking a bit more like a traditional one, with 8.8.8.8 as the first nameserver and some custom options to lower the retry time and hopefully trigger multiple DNS lookup attempts.
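Something along these lines (the exact option values are illustrative rather than exactly what I settled on):

nameserver 8.8.8.8
nameserver 1.1.1.1
options timeout:2 attempts:3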

So, perhaps …. perhaps … systemd-resolve isn’t quite ready for production yet?

(re)building varnish modules

I’m using Varnish 6 LTS in some places, and need a way to rebuild dependent modules …. which seem to need recompiling even for a minor feature release (e.g. 6.0.1 to 6.0.2).

I use the dynamic (DNS routing), var and vsthrottle vmods.

Firstly, here’s a Dockerfile –

FROM debian:buster as builder

ARG VARNISH_VERSION=6.0.8-1~buster

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get -qy update && \
    apt-get -qy install eatmydata apt-transport-https lsb-release ca-certificates curl gnupg wget && \
    apt-get clean

# pin varnish + varnish-dev so minor upgrades don't sneak in; note that apt
# preferences stanzas must be separated by a blank line
RUN printf '%s\n' \
    "Package: varnish" \
    "Pin: version ${VARNISH_VERSION}" \
    "Pin-Priority: 1001" \
    "" \
    "Package: varnish-dev" \
    "Pin: version ${VARNISH_VERSION}" \
    "Pin-Priority: 1001" \
    > /etc/apt/preferences.d/varnish

RUN echo "deb https://packagecloud.io/varnishcache/varnish60lts/debian/ buster main" > /etc/apt/sources.list.d/varnish.list

RUN wget -qO /tmp/varnish.gpg https://packagecloud.io/varnishcache/varnish60lts/gpgkey && \
    apt-key add /tmp/varnish.gpg && \
    apt-get -q update && \
    eatmydata -- apt-get -qy install varnish varnish-dev automake libtool make libncurses-dev pkg-config python3-docutils unzip libgetdns10 libgetdns-dev

RUN apt-cache policy varnish

WORKDIR /tmp

RUN wget -qO /tmp/varnish.zip https://github.com/varnish/varnish-modules/archive/refs/heads/6.0.zip && \
    unzip /tmp/varnish.zip && \
    cd varnish-modules-6.0 && \
    bash bootstrap && \
    ./configure --disable-dependency-tracking && \
    make && \
    make check && \
    make install 

RUN wget -qO /tmp/dynamic.zip https://github.com/nigoroll/libvmod-dynamic/archive/refs/heads/6.0.zip && \
    unzip /tmp/dynamic.zip && \
    cd libvmod-dynamic-6.0 && \
    bash autogen.sh && \
    bash configure && \
    make && \
    make install


FROM debian:buster
    
WORKDIR /srv/export
COPY --from=builder /usr/lib/varnish/vmods/libvmod_dynamic.so /srv/export/
COPY --from=builder /usr/lib/varnish/vmods/libvmod_proxy.so /srv/export/
COPY --from=builder /usr/lib/varnish/vmods/libvmod_var.so /srv/export/
COPY --from=builder /usr/lib/varnish/vmods/libvmod_vsthrottle.so /srv/export/
COPY --from=builder /usr/lib/varnish/vmods/libvmod_header.so /srv/export/

and then I copy the files out of that build pipeline (dare I call it that?) with this shell script –

#!/bin/bash

set -eux

# Build a new set of varnish modules.

# Each version of varnish needs its own build of some modules - moving from e.g. varnish 6.0.7~1-stretch to 6.0.8~1-stretch
# isn't possible without these modules being rebuilt.

[ -d $(pwd)/tmp ] && rm -Rf $(pwd)/tmp

docker build --pull -f Dockerfile -t builder .

mkdir tmp

docker run -v $(pwd)/tmp:/srv/tmp -ti builder bash -c 'cp /srv/export/* /srv/tmp'

Then it’s just a case of running ‘build.sh’ and waiting …. and you’ll find the files you want in ‘tmp’.
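The .so files then just need copying onto the target server(s) – roughly as below (the vmods path is where the Debian varnish packages look):

# on the varnish server itself
cp tmp/*.so /usr/lib/varnish/vmods/
systemctl restart varnish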

docker-ce + Debian Buster + iptables

I found docker wouldn’t start for me on my Buster desktop.
journalctl -u docker -f showed :

Aug 15 09:35:50 walnut dockerd[28612]: failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: iptables failed: iptables -t nat -N DOCKER: iptables v1.8.2 (nf_tables):  CHAIN_ADD failed (No such file or 
Aug 15 09:35:50 walnut dockerd[28612]: (exit status 4)

The fix, yet again, seems to be a case of replacing the nft/nftables variants with the legacy iptables counterparts –

update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
update-alternatives --set iptables /usr/sbin/iptables-legacy
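and then starting docker again:

systemctl restart docker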