Spring trip to Barcelona

A monster in the sand!

Well, sort of random. As I don’t normally bother to upload any pictures, I thought I might as well for once – and I said I’d send my father a postcard, but then failed to… perhaps this will make up for it. I didn’t notice any postcard-selling shops either – perhaps they’ve gone out of fashion?

Anyway, as I’m the boss, I left work at midday on Friday, drove up to Liverpool and flew to Barcelona with EasyJet. Thankfully I’m quite happy flying – but a minor bump triggered a load of men near me to exchange flying horror stories (“Once, over America, we hit some sort of air pocket and dropped 1000 feet!”). Anyway, I arrived in Barcelona at about 20:00 local time (I think flight time was about an hour and a half – apparently we had a 70mph tailwind, so were a bit early) and then fought through the cattle^h^h^hpassengers to get through security etc etc… train, tube, walk -> arrive at Anna’s flat. Fall asleep.

On Saturday we went shopping – so it was a boring day in that respect. But in the evening we went to my favourite Japanese restaurant to eat sushi stuff (nom-nom-nom) – but as I failed to take my camera/phone with me, there’s no photo(s).

On Sunday we went to Sitges and walked into some sort of old-car rally, which was nice to see. Unfortunately this made the town quite busy, so we had to wait ages for lunch (well, a 3pm lunch).

Then we went back home, I did some DIY and the weekend was pretty much over.

Monday involved flying back (uneventful) and then a long drive home. Liverpool seemed very dull and dreary compared to Spain.

WTFs per minute

I’m currently refactoring some legacy third-party PHP code, and as the old saying goes, the real metric is WTFs per minute.

So, just to entertain any readers, how about :

  1. Writing pagination links for a search form, but if there are more than 20 pages of results, add 20 onto whatever the maximum number of pages there are – so you get 20 invalid links at the end of the pagination list (clicking on them will show no results). I guess it looks like there are lots of results at least.
  2. if (isset($_GET['foo']) == 0) … (wouldn’t if (!isset($_GET['foo'])) be easier to read?).
  3. Presumably not knowing what a while(…) { … } loop is, and always using something like: $row = mysql_fetch_assoc($x); do { … } while ($row = mysql_fetch_assoc($x)) ….
  4. Always including mysql_free_result($foo) after every query…. why bother?
  5. Always having an //END IF comment, even if the if(..) { } statement is only 3 lines long.
  6. The write_out_the_header() function which consists of a switch statement nearly 2900 lines long, which is just responsible for setting things like the <title> and some meta tags for every page in the site.
  7. When doing results pagination, for even numbered page links, write out the ‘jump’ URL differently (starting with a &, instead of a ?). Some numbers are more even/equal than others…. I guess.
  8. Executing a separate query each time within a loop rather than doing a simple join to start with….
And don’t get me started on the lack of error checking…..

A week of fire fighting (aka why you should <3 unit tests).

I feel like I’ve spent most of this week debugging some PHP code, and writing unit tests for it. Thankfully I think this firefighting exercise is nearly over.

Perhaps the alarm bells should have been going off a bit more in my head when it was implied that the code was being written in a quick-and-dirty manner (“as long as it works….”) and the customer kept adding in additional requirements – to the extent it is no longer clear where the end of the project is.

“It’s really simple – you just need to get a price back from two APIs”

then they added in :

“Some customers need to see the cheapest, some should only see some specific providers …”

and then :

“We want a global markup to be applied to all quotes”

and then :

“We also want a per customer markup”

and so on….

And the customer didn’t provide us with any verified test data (i.e. for a quote consisting of X it should cost A+B+C+D=Y).

The end result is an application which talks to two remote APIs to retrieve quotes. Users are asked at least 6 questions per quote (so there are a significant number of variations).

Experience made me slightly paranoid about editing the code base – I was worried I’d fix a bug in one pathway only to break another. On top of which, I initially didn’t really have any idea of whether it was broken or not – because I didn’t know what the correct (£) answers were.

Anyway, to start with, it was a bit like :

  • Deploy through some weird copy+pasting manner due to Windows file permissions
  • No unit test coverage
  • No logging
  • Apparently finished
  • Apparently working (but the customer kept asking for more changes)

Now:

  • Deployment through git; Stupid Windows file permissions fixed.
  • Merged in changes by third party graphics designer – should be much easier to track any changes he makes in the future
  • ~80% test code coverage. I’ve had to modify the ‘real’ scripts to accept a new parameter, which would make them read a local XML file (rather than the remote API/service) – in order for the tests to be reproducible (and quick to run)
  • Logging in place, so there’s some traceability
  • Better error handling
  • Calculations manually checked and confirmed.

Interesting things I didn’t like about the code :

  • Premature attempt at dependency injection – the few functions there are/were have a $db variable being passed in – but often don’t use it.
  • There is significant duplication in the code base still.
  • The code didn’t (and still doesn’t really) support testing particularly well – I’m having to retrieve output through a Zend_Http_Client call (i.e. mimicking a browser) – which means I can’t get code test coverage stats easily.
  • In some places a variable was being set to 0 (e.g. $speed) which would be used to determine a special case (if $speed == 0). Having multiple meanings in/on the same variable makes it difficult to follow what’s going on – and is a pain when the customer requests it behaves a bit differently. Really a separate parameter should have been used.

exim + spamassassin subject rewriting on symbiosis

One customer of mine has a Bytemark Symbiosis based exim mailserver which uses SpamAssassin. It works pretty well – however the :

rewrite_header Subject *****SPAM*****

directive in SpamAssassin (/etc/spamassassin/local.cf) seemed to be ignored – the only effect of a mail being classified as spam was a couple of additional headers being added (X-Spam-Status: spam). For the customer in question this wasn’t of much use, as they’re reasonably non-technical and probably couldn’t create a client side mail filter. They also thought the spam filtering wasn’t working.

I found adding the following to /etc/exim4/system_filter results in the subject being appropriately modified :

if $h_X-Spam-Status: contains "spam"
then
    headers add "Old-Subject: $h_subject"
    headers remove "Subject"
    headers add "Subject: *** SPAM *** $h_old-subject"
    headers remove "Old-Subject"
endif

And if you want to tag virus-ey emails … add this in as well :

# X-Anti-Virus: infected
if $h_X-Anti-Virus: contains "infected"
then
    headers add "Old-Subject: $h_subject"
    headers remove "Subject"
    headers add "Subject: *** VIRUS *** $h_old-subject"
    headers remove "Old-Subject"
endif

Seeing as how that took about 2 hours to figure out – hopefully this will be of use to others.

I started looking at SpamAssassin and wondering why IT wasn’t doing it… I still don’t know why – but assume it’s an Exim ‘feature’.
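To check a filter like the ones above without waiting for real spam to turn up, Exim has a filter testing mode (-bF for system filters) which reads a message on stdin and reports what the filter would do, without delivering anything. The following is only a rough sketch: the addresses and subject are made up, and I’m assuming the binary is called exim4, as it is on Debian.

```shell
# Print a minimal test message carrying the header the filter looks for.
# The addresses and subject here are made-up placeholders.
spam_test_message() {
    printf '%s\n' \
        'From: sender@example.com' \
        'To: recipient@example.com' \
        'Subject: Cheap watches' \
        'X-Spam-Status: spam' \
        '' \
        'Test body.'
}

# Dry run against the system filter (nothing is delivered):
#   spam_test_message | exim4 -bF /etc/exim4/system_filter
```

The output should mention the headers being added and removed; if it doesn’t, the condition at the top of the filter isn’t matching.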

State of the union – sort of (my 2011).

Well, perhaps not quite a State of the Union Address, but here’s a random update on my life in general which perhaps sums up the last year (as we’re almost at the end of the year, it’s probably fitting I somehow, somewhere write something like this).

  • My children (Rowan and Anya) are both growing up rather too quickly. Anya’s about 18 months old, walking and almost talking (‘tree’, ‘cat’, ‘that’, ‘tasty’, ‘mum’, ‘daddy’ and so on) while Rowan (~4) is busy playing/asking questions/learning to write/bashing things with hammers and so on. As far as we can tell they’re unaffected by Katherine and I splitting up. I’m lucky to be able to see them regularly (4 times a week) and usually ‘good enough’ for hugs and cuddles after they fall over / chuck up or whatever depending on the star alignment or who ever told them off most recently…..
  • Pale Purple (work) wise – everything’s going well, we’re busy, have plenty of work lined up over the next few months and there are a number of interesting projects almost ready to start. Over the last year there has been a distinct increase in the amount of development we’re doing in mobile applications – specifically towards Android and business apps (e.g. for a delivery driver to use to see what jobs they have to do – rather than the traditional Microsoft Windows CE based thing). PHP is still the main focus of the company with other supplementary bits (training, security audits, systems administration and so on). We took on an industrial year student this year – so there are currently five of us full time.
  • I’ve moved house – so I no longer live in a one bedroom flat which was always cold [no central heating]. Now I’m in a 2 up, 2 down house like thing, so the children can have their own bedroom, or spread their toys over a wider area. Soon there will be a trip to Ikea and they’ll have a bunk bed and random other things no doubt.
  • I still run / cycle / exercise – although not as much as I might like to. Mr Patch, the Jack Russell, went to live with my aunt at the beginning of the year (I think?!) – where he’s apparently behaving well, and has become somewhat wider; likewise his absence here is partly to blame for my weight gain – but conversely not having him makes looking after the children / work / jetting off to Spain (I can’t think why…) so much easier.
This post was brought to you by two great bottles of beer from MyBreweryTap, who happen to be a customer of mine – and subscribed me to their 52 bottles a year package for free. “A++++ will drink their stuff again!” as people would say on eBay!
Enjoy 2012 readers. I don’t know what things will be like by the end of the year, but I’m pretty optimistic at the moment.

SQL Injection with added magic_quotes assistance (the joys of legacy code maintenance)

 

Sometimes you really have to laugh (or shoot yourself) when you come across legacy code / the mess some other developer(s) left behind. (Names slightly changed to protect the innocent)

class RocketShip {

    function rahrah() {
        $sql = "insert into foo (rah,rahrah,...) 
            values ( '" . $this->escape_str($this->meh) . "', ...... )";
        mysqli_query($this->db_link, $sql) or 
            die("ERROR: " . mysqli_error($this->db_link));
        $this->id = mysqli_insert_id($this->db_link);
    }

    function escape_str($str)
    {
        if(get_magic_quotes_gpc())
           { $str = stripslashes($str);}
        //echo $str;
        //$clean = mysqli_real_escape_string($this->db_link,$str);
        //echo $clean;
       return $str;
    }
// ....
    function something_else() {
         mysqli_query($this->db_link, 
            sprintf("insert into fish(field1,field2) values('%s', '%s')", 
            $this->escape_str($this->field1), 
            $this->escape_str($this->field2)));

    }
}

You’ve got to just love the :

  1. Lack of Error handling / logging.
  2. Functionality of the escape_str function which is only making matters worse (and could never have worked due to the variable names)
  3. Use of sprintf  and %s ….(obviously %d could be useful)
  4. Documentation?

Dare I uncomment the mysqli_real_escape_string and fix escape_str’s behaviour?

In other news, see this tweet – 84% of web apps are insecure; that’s a bit damning. But perhaps not surprising given code has a far longer lifespan than you expect….

 

Solr and WordPress (instructions/howto)

This is for Tomcat 5.5 (on Debian Lenny), WordPress 3.1 and Solr 3.4. The intention is to use the solr-for-wordpress plugin (see GitHub).

Lenny does include a Solr package (v1.2), but it is somewhat outdated and not supported by the upstream solr-for-wordpress plugin, hence we can’t use it.

Install Tomcat (and Java)

apt-get install sun-java6-jre

Edit /etc/profile and set a JAVA_HOME – so add in something like :

# Set up the Java 6 environment
export PATH=$PATH:/usr/lib/jvm/java-6-sun/bin
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export JRE_HOME=/usr/lib/jvm/java-6-sun/jre

And then do :

. /etc/profile
so that those settings are present within your environment (or log out and back in).

Next, install Tomcat :

apt-get install tomcat5.5
and then
apt-get install tomcat5.5-admin

Configure Tomcat

Edit /etc/tomcat5.5/tomcat-users.xml and define your own user; for the -admin apps you’ll need to give the user a role of admin and manager.

e.g.

<?xml version='1.0' encoding='utf-8'?>
<tomcat-users>
  <role rolename="manager"/>
  <role rolename="tomcat"/>
  <role rolename="admin"/>
  <role rolename="role1"/>
  <user username="palepurple" password="letmein" roles="admin,manager,tomcat"/>
</tomcat-users>

And then restart Tomcat. You should now be able to visit http://yourserver:8180/admin and see a login screen.

In my case, I also edited /etc/tomcat5.5/server.xml to disable the AJP connector on port 8009 and also to tell the remaining connector (port 8180) to listen only on 127.0.0.1. To connect to the admin interface, I just use SSH port forwarding from my desktop – this is just to improve security.

Finally, it seems necessary to grant permission for Java to log to /var/log/tomcat5.5. A dirty way of achieving this is to edit :

/etc/java-6-sun/security/java.policy

and add in (near the top)

grant {
	permission java.security.AllPermission;
};

(Yes, I know this is a bit like doing chmod -R 777 on a filesystem; but in my case Solr is running only on localhost, so I think it’s an acceptable fix. I’m sure Google can provide more eloquent fixes.)

 

Installing Solr

Download; unpack and install .war file :

cd /root
wget http://www.apache.org/dist/lucene/solr/3.4.0/apache-solr-3.4.0.tgz
tar -zxf apache-solr-3.4.0.tgz
cp apache-solr-3.4.0/dist/apache-solr-3.4.0.war /var/lib/tomcat5.5/webapps

If you now restart Tomcat, you’ll find some useful log files in /var/log/tomcat5.5 – looking in the catalina log file there you’ll see it moaning about not finding solrconfig.xml and so on. To fix this –

cp -a apache-solr-3.4.0/example/solr /var/lib/tomcat5.5/

And edit /etc/default/tomcat5.5 to contain :

export JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/var/lib/tomcat5.5/solr"

This tells Solr where to find its configuration and so on.
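Before restarting anything, it can save a trip to the logs to check that the directory you’ve pointed solr.solr.home at actually contains the files Solr will look for. A rough sketch follows; check_solr_home is just a helper name I’ve made up, and I’m assuming the layout copied from the example directory above (conf/solrconfig.xml and conf/schema.xml).

```shell
# Check that a directory looks like a usable Solr home, i.e. that it
# contains conf/solrconfig.xml and conf/schema.xml.
# Prints each missing file to stderr and returns non-zero if any are absent.
check_solr_home() {
    home=$1
    status=0
    for f in conf/solrconfig.xml conf/schema.xml; do
        if [ ! -f "$home/$f" ]; then
            echo "missing: $home/$f" >&2
            status=1
        fi
    done
    return $status
}

# e.g.
#   check_solr_home /var/lib/tomcat5.5/solr
```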

Then edit /var/lib/tomcat5.5/solr/conf/solrconfig.xml and fix the file paths to the various .jar files it includes. You might prefer to copy them out of the apache-solr-3.4.0 directory and into /var/lib/tomcat5.5/solr/lib; in my case the paths below assume the unpacked apache-solr-3.4.0 directory sits under /var/lib/tomcat5.5. The relevant part of my solrconfig.xml is :

  <lib dir="/var/lib/tomcat5.5/apache-solr-3.4.0/contrib/extraction/lib" />
  <!-- When a regex is specified in addition to a directory, only the
       files in that directory which completely match the regex
       (anchored on both ends) will be included.
    -->
  <lib dir="/var/lib/tomcat5.5/apache-solr-3.4.0/dist/" regex="apache-solr-cell-\d.*\.jar" />
  <lib dir="/var/lib/tomcat5.5/apache-solr-3.4.0/dist/" regex="apache-solr-clustering-\d.*\.jar" />
  <lib dir="/var/lib/tomcat5.5/apache-solr-3.4.0/dist/" regex="apache-solr-dataimporthandler-\d.*\.jar" />

  <!-- If a dir option (with or without a regex) is used and nothing
       is found that matches, it will be ignored
    -->
  <lib dir="/var/lib/tomcat5.5/apache-solr-3.4.0/contrib/clustering/lib/" />

Next create the data directory for solr to use :

mkdir /var/lib/tomcat5.5/solr/data
chown tomcat55 /var/lib/tomcat5.5/solr/data

And restart tomcat.

At this point you should be able to visit :

http://localhost:8180/apache-solr-3.4.0/admin/

If it fails, check out /var/log/tomcat5.5/*catalina.log* or /var/log/daemon.log

WordPress stuff

cd /path/to/wordpress/wp-content/plugins

git clone https://github.com/mattweber/solr-for-wordpress.git

cp solr-for-wordpress/schema.xml /var/lib/tomcat5.5/solr/conf/

Then restart Tomcat again :

/etc/init.d/tomcat5.5 restart

Now you just need to enable the plugin from within wordpress and tell wordpress to index your posts and you’re off.

  1. Enable the plugin.
  2. Go to Settings -> Solr Options -> select single server; tell it to use localhost, port 8180 and the path ‘/apache-solr-3.4.0’.
  3. Perform the ‘server ping’ check, and then tell WordPress to index your pages/posts etc as you see fit.

netstat --tcp -lp output not showing a process id

I often use ‘netstat --tcp -lpn’ to display a list of open ports on a server, so I can check things aren’t listening where they shouldn’t be (e.g. MySQL accepting connections from the world) and so on. Obviously I firewall boxes; but I like to have a reasonable default in case the firewall decides to flush itself randomly or whatever.

Anyway, I ran ‘netstat --tcp -lpn’ and saw something like the following :

tcp        0      0 127.0.0.1:3306          0.0.0.0:*               LISTEN      3355/mysqld     
tcp        0      0 0.0.0.0:54283           0.0.0.0:*               LISTEN      -               
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1940/portmap

Now ‘mysqld’ looks OK, as does portmap (well, I need it on this box). But what on earth was listening on port 54283, and why is there no process name/pid attached to it?

After lots of rummaging, and paranoia where I thought perhaps the box had been rooted, I discovered it was from an NFS mount (which explains the lack of a pid, as it’s kernel based).

lsof -i tcp:54283

Didn’t help either. Unmounting the NFS filesystem did identify the problem – and the entry went away.
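If you want to spot these quickly in future, something like the following picks out listening sockets that have no owning process. It’s only a sketch: no_pid_listeners is a name I’ve made up, and it assumes the column layout shown in the netstat output above.

```shell
# Print the local address of every listening TCP socket that has no
# owning process ("-" in the PID/Program name column).
# Reads `netstat --tcp -lpn` style output on stdin.
no_pid_listeners() {
    awk '$1 ~ /^tcp/ && $6 == "LISTEN" && $7 == "-" { print $4 }'
}

# Typical use (run as root, otherwise netstat hides other users' pids):
#   netstat --tcp -lpn | no_pid_listeners
```

If the port belongs to an RPC service such as the NFS lock manager, ‘rpcinfo -p’ may well list it (typically as nlockmgr), assuming portmap is running as it was on this box.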