Checking varnish configuration syntax

If you’ve updated your varnish server’s configuration, there doesn’t seem to be an equivalent of ‘apachectl configtest’ for it, but you can do :

varnishd -C -f /etc/varnish/default.vcl

If everything is correct, varnish will then dump out the generated configuration. Otherwise you’ll get an error message pointing you to a specific line number.

Automated snapshot backup of an Amazon EBS volume

I found the following Python script online, but it didn’t really work too well :

http://aws-musings.com/manage-ebs-snapshots-with-a-python-script/

EBS – Elastic Block Storage …

I had to easy_install boto, to get it to work.

I’m not sure the Debian python-boto package in Lenny is up to date.

Anyway, $server now has :

from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo

from datetime import datetime
import sys

# Substitute your access key and secret key here
aws_access_key = 'MY_AWS_ACCESS_KEY'
aws_secret_key = 'MY_AWS_SECRET_KEY'
# Change to your region/endpoint...
region = RegionInfo(endpoint='eu-west-1.ec2.amazonaws.com', name='eu-west-1')

if len(sys.argv) < 3:
    print "Usage: python manage_snapshots.py volume_id number_of_snapshots_to_keep description"     
    print "volume id and number of snapshots to keep are required. description is optional"
    sys.exit(1) 

vol_id = sys.argv[1] 
keep = int(sys.argv[2]) 
conn = EC2Connection(aws_access_key, aws_secret_key, region=region) 
volumes = conn.get_all_volumes([vol_id]) 
print "%s" % repr(volumes) 
volume = volumes[0] 
description = 'Created by manage_snapshots.py at ' + datetime.today().isoformat(' ') 
if len(sys.argv) > 3:
    description = sys.argv[3]

if volume.create_snapshot(description):
    print 'Snapshot created with description: ' + description

snapshots = volume.snapshots()
snapshot = snapshots[0]

def date_compare(snap1, snap2):
    if snap1.start_time < snap2.start_time:
        return -1
    elif snap1.start_time == snap2.start_time:
        return 0
    return 1

snapshots.sort(date_compare)
delta = len(snapshots) - keep
for i in range(delta):
    print 'Deleting snapshot ' + snapshots[i].description
    snapshots[i].delete()

And then plonk something like the following in /etc/cron.daily/backup_ebs :

for volume in vol-xxxx vol-yyyyy vol-zzzz
do
	/path/to/above/python/script.py $volume 7 "Backup of $volume on $(date +%F-%H:%m)"
done

Which keeps 7 backups for each volume with a time/date stamp in each description.

Varnish + Zope – Multiple zope instances behind a single varnish cache

I run multiple Zope instances on one server. Each Zope instance listens on a different port (localhost:100xx). Historically I’ve just used Apache as a front end which forwards requests to the Zope instance.

Unfortunately there are periods of the year when one site gets a deluge of requests (for example; when hosting a school site, if it snows overnight, all the parents will check the site in the morning at around about 8am).

Zope is not particularly quick on it’s own – Apache’s “ab” reports that a dual core server with plenty of RAM can manage about 7-14 requests per second – which isn’t that many when you consider each page on a Plone site will have a large number of dependencies (css/js/png’s etc).

Varnish is a reverse HTTP proxy – meaning it sits in-front of the real web server, caching content.

So, as I’m using Debian Lenny….

  1. apt-get install -t lenny-backports varnish
  2. Edit /etc/varnish/default.vcl
  3. Edit Apache virtual hosts to route requests through varnish (rather than directly to Zope)
  4. I didn’t need to change /etc/default/varnish.

In my case there are a number of Zope instances on the same server, but I only wanted to have one instance of varnish running. This is possible – but it requires me to look at the URL requested to determine which Zope instance to route through to.

So, for example, SiteA runs on a Zope instance on localhost:10021/sites/sitea. My original Apache configuration would contain something like :

<IfModule mod_rewrite.c>
   RewriteEngine on
   RewriteRule ^/(.*) http://127.0.0.1:10021/VirtualHostBase/http/www.sitea.com:80/sites/sitea/VirtualHostRoot/$1 [L,P]
 </IfModule>

To use varnish, I’ll firstly need to tell Varnish how to recognise requests for sitea (and other sites), so it can forward a cache miss to the right place, and then reconfigure Apache – so it sends requests into varnish and not directly to Zope.

So, firstly, in Varnish’s configuration (/etc/varnish/default.vcl), we need to define the different backend server’s we want varnish to proxy / cache. In my case they’re on the same server –

backend zope1 {
.host = "127.0.0.1";
.port = "10021";
}
backend zope2 {
.host = "127.0.0.1";
.port = "10022";
}
Then, in the 'sub vcl_recv' section, use logic like :
if ( req.url ~ "/sites/sitea/VirtualHostRoot") {
   set req.backend = zope1;
}
if ( req.url ~ "/siteb/VirtualHostRoot") {
    set req.backend = zope2;
}

With the above in place, I can now just tell Apache to rewrite Sitea to :

RewriteRule ^/(.*) http://127.0.0.1:6081/VirtualHostBase/http/www.sitea.com:80/sites/sitea/VirtualHostRoot/$1 [L,P]

Instead….. and now we’ll find that our site is much quicker 🙂 (This assumes your varnish listens on localhost:6081).

There are a few additional snippets I found – in the vcl_fetch { … } block, I’ve told Varnish to always cache items for 30 seconds, and to also overwrite the default Server header given out by Apache etc, namely :

sub vcl_fetch {

    # ..... <snip> <snip>

    # force minimum ttl for objects

    if (obj.ttl < 30s) {

        set obj.ttl = 30s;

    }

    # ... <snip> <snip>

    unset obj.http.Server;

    set obj.http.Server = "Apache/2 Varnish";

    return (deliver);

}
I'm happy anyway. :)
Use 'varnishlog', 'varnishtop' and 'varnishhist' to monitor varnish.