This afternoon I’ve been installing web log analysis software. I chose awstats, mainly because I know it already, but also because last time I looked it was the prettiest open source one. Installing it was mostly painless, but hampered by a couple of technical annoyances.
Firstly, we run everything on the mySociety server as different users, for privilege separation. This involves using suExec and FastCGI, neither of which is as mature as it might be. Now, the problem with suExec is that it is extremely paranoid. For example, it makes sure the owner and group of any file you try to run as CGI are the same as the user and group you set it to be run by. It also checks that the file is under the document root – a silly restriction, especially when I’ve just added an alias to httpd.conf pointing to the awstats.pl file, which FreeBSD installed elsewhere.
The solution to this? Make a short one-line bash script, sitting under the document root where suExec is happy, which merely calls through to the actual script. That the suExec restrictions are so easy to get round shows how silly they are – they just put everybody off using it at all, and instead people make everything world-readable. And less secure. So by being paranoid about security, they make everyone’s web server less secure. Nice.
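A minimal sketch of that kind of wrapper, written as a small shell function that generates it. The function name `make_wrapper` and the paths in the example call are hypothetical, and the wrapper uses /bin/sh rather than bash because it needs nothing bash-specific. This is not the exact script from the server.

```shell
#!/bin/sh
# make_wrapper REAL_SCRIPT WRAPPER
# Writes a tiny CGI wrapper at WRAPPER that hands control to REAL_SCRIPT.
# (Hypothetical helper for illustration; assumes paths without double quotes.)
make_wrapper() {
    real_script=$1
    wrapper=$2

    # The wrapper body is a single exec line: the shell replaces itself
    # with the real script, so the CGI environment passes through unchanged.
    cat > "$wrapper" <<EOF
#!/bin/sh
exec "$real_script" "\$@"
EOF

    # suExec rejects programs that are writable by group or others,
    # so keep write permission for the owner only.
    chmod 755 "$wrapper"
}

# Example call (both paths are assumptions, adjust to your layout):
# make_wrapper /usr/local/www/awstats/cgi-bin/awstats.pl /path/to/docroot/awstats.cgi
```

The generated file still has to be owned by the user and group that suExec is configured to run it as, since that check applies to the wrapper just as it would to any other CGI.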
Secondly, awstats has a strange view of the world. It insists on getting log files in order. That is, every row of log fed to it has to arrive in chronological order, and you can’t go back and feed in an older log file later. I’m not quite sure why this is – perhaps so that its very compact text file summary of the logs can be updated efficiently. So, to run analysis of old files from months back, you have to write a quick script to run through them all, oldest first, and call awstats with each one.
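Something along these lines would do for replaying old logs, as a rough sketch rather than the actual script. It assumes the rotated log file names sort chronologically (for example access_log.2005-01-31) and contain no whitespace, that awstats.pl lives at the path shown, and that it is driven with the `-config`, `-update` and `-LogFile` options. The function name `replay_logs` is made up for illustration.

```shell
#!/bin/sh
# Location of awstats.pl is an assumption; override AWSTATS to match your install.
AWSTATS="${AWSTATS:-/usr/local/www/awstats/cgi-bin/awstats.pl}"

# replay_logs CONFIG LOGFILE...
# Feeds each log file to awstats one at a time, oldest first.
# Assumes file names sort chronologically and contain no whitespace.
replay_logs() {
    config=$1
    shift

    # Sort the names so the oldest log is processed first, then stop at
    # the first failure rather than feeding later logs out of sequence.
    for logfile in $(printf '%s\n' "$@" | sort); do
        "$AWSTATS" -config="$config" -update -LogFile="$logfile" || return 1
    done
}

# Example call (config name and file pattern are hypothetical):
# replay_logs mysite /var/log/httpd/access_log.2005-*
```

Stopping at the first error matters here: once awstats has seen a newer log, an older one that failed earlier cannot simply be fed in afterwards.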
Anyway, I’ve now done that for NotApathetic, PledgeBank, WriteToThem and this very website that you’re reading. So Tom will stop hassling me every time a journalist asks “how many visitors do you get a day?” So far today we have 2807 visits to NotApathetic.com.