Logging … and how not to do it.

One thing that really annoys me is when I come to look at the log file and I see something like :

blah blah did blah blah

blah foo blah random comment

fish blah some data

which spans many lines or does it?

This is bad, as I’ve got absolutely no idea where the messages are from (so have to grep around a code base), and I’ve no idea WHEN they were made. At best I can look at timestamps on this file and figure out a timeframe (assuming logrotate is in use so there is a definite (must be after X timestamp)).

What’s far better from a maintenance point of view :

2010/07/29 09:33 filewhatever.py:355 blah blah blah did blah blah

2010/07/29 09:34 filewhatever.py:355 blah blah blah did blah blah

2010/07/29 09:35 filewhatever.py:355 data received from x is {{{hello world…. }}}

Changes are :

Date and time stamps (in python: datetime.datetime.now())
Recording where the message came from (see the ‘inspect’ python module – inspect.stack()[1][1] for calling file, and inspect.stack()[1][2] for the line number, or debug_backtrace() in PHP)
Wrapping any interesting output (e.g. from a remote service) in obvious delimiters (e.g. {{{ and }}} ) – without e.g. timestamps or some other common line prefix, I’ve no way of knowing what’s from where, especially if the output spreads over many lines.

Other good ideas :

Different severities of log message (classic: debug, info, error type annotation with appropriate filtering).
Make sure logrotate is in use, or a simple shell script via cron, to stop the log file growing too large and causing problems.
Stop writing your own logging mechanisms and use ones provided by the system (e.g. Python has a logger built in which does all of the above and more)

EOR – EndOfRant