Excessive uptime(!?)

Somewhere on the internet there’s a mailserver with a larger uptime, I guess?

[root@xxxxxxxx ~]# uname -a
Linux xxxxxxxxxxxxxxx 2.6.18-419.el5 #1 SMP Fri Feb 24 22:47:42 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

[root@xxxxxxxx ~]# uptime
09:34:38 up 2290 days,  1:47,  ....

I don’t think anyone dares to reboot it... (this is a server the customer was going to migrate off about 5 years ago; somehow it’s still in use)

(2290 days is a little over 6 years)
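As a quick sanity check on that figure, the conversion is a couple of lines of Python. The 365.25 days per year (to average in leap years) is an assumption made here, as is the helper name.

```python
# Convert the uptime reported above into years.
# Assumption: an average year of 365.25 days (averages in leap years).
DAYS_PER_YEAR = 365.25


def days_to_years(days: float) -> float:
    """Return how many average-length years fit into `days`."""
    return days / DAYS_PER_YEAR


uptime_days = 2290  # taken from the `uptime` output above
print(f"{uptime_days} days is about {days_to_years(uptime_days):.2f} years")
```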

systemd-resolve (DNS is always to blame)

For the record, this is using systemd v247, from Debian’s buster-backports.

I think I was enticed by the Kool-Aid, hoping to get DNSSEC or DNS over TLS, plus caching. To be fair, it appeared to work on all the servers I’d installed it on (although they were just ‘boring’ LAMP-style webservers).

Anyway, everything seemed to be going well, with the default /etc/resolv.conf like:

nameserver 127.0.0.53
options edns0

and /etc/systemd/resolved.conf looking like:

[Resolve]
DNS=8.8.8.8#dns.google 8.8.4.4#dns.google 1.1.1.1
FallbackDNS=1.1.1.1 8.8.4.4 9.9.9.9
LLMNR=no
DNSOverTLS=opportunistic
DNSSEC=no
Cache=yes

Unfortunately, on one relatively busy server which makes multiple outbound HTTP requests every second, I saw sporadic failures where curl would report a timeout for e.g. graph.facebook.com (>10s connect time).

The timeouts seemed to be grouped together (no timeouts for a number of hours, and then a load of requests would fail) and, just to be annoying, this only happened in production and wasn’t something I could reproduce.

As best I can tell, a failed lookup was being cached, so all requests for a specific hostname would then fail until the cache entry expired (30 seconds?).

So I ended up with /etc/resolv.conf looking a bit more like a traditional one, with 8.8.8.8 as the first nameserver and some custom options to lower the retry time and hopefully trigger multiple DNS lookup attempts.
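For illustration only, here is a sketch of what such a file might look like, written as a small Python helper that renders it. `timeout:` and `attempts:` are standard resolv.conf(5) options; the values (1 second, 3 attempts), the second nameserver and the helper name `render_resolv_conf` are assumptions, not the exact settings used on that server.

```python
# Sketch: render a traditional resolv.conf with short resolver timeouts.
# `timeout:N` and `attempts:N` are standard resolv.conf(5) options;
# the default values below are illustrative assumptions.
from typing import Sequence


def render_resolv_conf(nameservers: Sequence[str],
                       timeout: int = 1,
                       attempts: int = 3) -> str:
    """Return resolv.conf text listing `nameservers` in order of preference."""
    lines = [f"nameserver {ns}" for ns in nameservers]
    # Wait `timeout` seconds per query, and try the list `attempts` times.
    lines.append(f"options timeout:{timeout} attempts:{attempts}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 8.8.8.8 first, as described above; 1.1.1.1 is a hypothetical second choice.
    print(render_resolv_conf(["8.8.8.8", "1.1.1.1"]), end="")
```

Bear in mind that the glibc resolver only uses the first three nameserver lines, so there is little point listing more.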

So, perhaps... perhaps... systemd-resolve isn’t quite ready for production yet?

Traefik + Azure Kubernetes

Just a random note or two …

At work we moved to Azure for most of our hosting, for ‘reasons’. We run much of our workload through Kubernetes.

The Azure portal has a nice integration to easily deploy a project from a GitHub repo into Kubernetes, and when it does, it puts each project in its own namespace.

In order to deploy some new functionality, I finally bit the bullet and tried to get some sort of Ingress router in place. I chose to use Traefik.

Some random notes ….

  1. You need to configure/run Traefik with --providers.kubernetescrd.allowCrossNamespace=true. Without this it’s not possible for e.g. Traefik (in the ‘traefik’ namespace) to use MyCoolApi in the ‘api’ namespace. The IngressRoute HAS to be in the same namespace that Traefik is running in, and the IngressRoute needs to reference a service in a different namespace.
  2. While you’re poking around, you probably want to run Traefik with --log.level=DEBUG
  3. Use cert-manager for Let’s Encrypt certificates (see https://www.andyroberts.nz/posts/aks-traefik-https/ for some details)
  4. You need to make sure you’re using a fairly recent Kubernetes version. Ours was on 1.19.something, where the cross-namespace stuff helpfully just silently “didn’t work”.
  5. Use k9s as a quick way to view logs/pods within the cluster.

Example Ingress Route

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  namespace: traefik
  name: projectx-ingressroute
  annotations:
    kubernetes.io/ingress.class: traefik
    cert-manager.io/cluster-issuer: my-ssl-cert

spec:
  entryPoints:
    - websecure    
  routes:
    - kind: Rule
      match: Host(`mydomain.com`) && PathPrefix(`/foo`) 
      services:
        - name: foo-api-service
          namespace: foo-namespace
          port: 80
  tls:
    secretName: my-ssl-cert-tls
    domains:
    - main: mydomain.com
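If you end up with several of these, the manifest can be generated rather than copy-pasted. Below is a rough Python sketch that builds the same structure as the YAML above and prints it as JSON (which kubectl also accepts). The function name `build_ingress_route` and its parameters are hypothetical; the field names and values simply mirror the example.

```python
# Sketch: build the IngressRoute above as a dict and emit it as JSON.
# The helper name and parameters are illustrative; the fields mirror
# the YAML example.
import json


def build_ingress_route(name, host, path_prefix, service, service_namespace,
                        port=80, tls_secret="my-ssl-cert-tls",
                        cluster_issuer="my-ssl-cert",
                        route_namespace="traefik"):
    """Return an IngressRoute manifest that routes to a service in another namespace."""
    return {
        "apiVersion": "traefik.containo.us/v1alpha1",
        "kind": "IngressRoute",
        "metadata": {
            # The route itself lives alongside Traefik ...
            "namespace": route_namespace,
            "name": name,
            "annotations": {
                "kubernetes.io/ingress.class": "traefik",
                "cert-manager.io/cluster-issuer": cluster_issuer,
            },
        },
        "spec": {
            "entryPoints": ["websecure"],
            "routes": [{
                "kind": "Rule",
                "match": f"Host(`{host}`) && PathPrefix(`{path_prefix}`)",
                # ... while the backing service is in a different namespace.
                "services": [{
                    "name": service,
                    "namespace": service_namespace,
                    "port": port,
                }],
            }],
            "tls": {"secretName": tls_secret, "domains": [{"main": host}]},
        },
    }


if __name__ == "__main__":
    route = build_ingress_route("projectx-ingressroute", "mydomain.com", "/foo",
                                "foo-api-service", "foo-namespace")
    print(json.dumps(route, indent=2))
```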

Initially I tried to use Traefik’s built-in Let’s Encrypt provider support, and wanted to have a shared filesystem (Azure storage, CIFS etc.) so multiple Traefik replicas could share the same certificate store. Unfortunately this just won’t work, as the CIFS share gets mounted with 777 perms, which Traefik refuses to put up with.

Postfix – qshape

Somehow I’ve only just found out about ‘qshape’.

It’s a nice little tool to help show what’s going on, on a Postfix-based mail server.

You can summarise by sender (-s) and choose the queue, which is a bit easier than trying to spot patterns in the ‘mailq’ output.
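As a rough illustration, here is a small Python wrapper that builds and runs such a command. The `-s` flag and the queue name argument are real qshape options; the helper names and the choice of the ‘deferred’ queue as a default are assumptions for the example.

```python
# Sketch: build (and optionally run) a qshape command line.
# `-s` switches qshape from recipient domains to sender domains; the
# queue name is passed as a positional argument.
import shlex
import subprocess
from typing import List


def qshape_command(queue: str = "deferred", by_sender: bool = False) -> List[str]:
    """Return the argv for qshape against a single queue."""
    cmd = ["qshape"]
    if by_sender:
        cmd.append("-s")
    cmd.append(queue)
    return cmd


def run_qshape(queue: str = "deferred", by_sender: bool = False) -> str:
    """Run qshape and return its table as text (usually needs root)."""
    result = subprocess.run(qshape_command(queue, by_sender),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Show the command rather than running it, so this works anywhere.
    print(shlex.join(qshape_command("deferred", by_sender=True)))
```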

Squid 3.4.x with transparent SSL proxying support for Debian Wheezy

I needed a variant of Squid which supported transparent SSL interception (i.e. via iptables redirection) so I could log outgoing HTTPS requests without the client being aware.

The stock Wheezy variant doesn’t support SSL (see: Debian Bug Report).

Even after recompiling Wheezy’s squid3 it didn’t seem to work (perhaps my stupidity) so I ended up moving to the latest-and-greatest squid (3.4.9 at the time of writing) and getting that to work. Brief notes follow.
