We’re all busy giving WriteToThem some tender loving care. That covers everything from installing the latest versions of BoundaryLine and CodePoint to generating yearly statistics, which we’d like to be as fun and pretty as this. Watch out for more announcements as the week goes on.
The new servers are mostly running happily now. Amongst other things we can much more easily make developer sandboxes. So if you want a quiet place to hack some improvements to any of our sites, let us know.
Chris and I are rapidly tiring ourselves out with server configuration. Well, I speak for myself, but he can correct for himself in the comments. We’re moving everything from the one old server it has run on for the last year and a quarter (very) onto two new identical servers (bitter and tea). And at the same time we’ve been putting everything in CVS – every little piece of configuration and cron job that there is, for the servers and for the sites.
It’s amazing how many systems there are, and how many things to worry about. Security, backups, redundancy (we’re not too hot at that – recommendations for PostgreSQL mirror/cluster/live-backup type things welcome), admin authorisation, SSL, templated /etc files, users, groups, packages, cron, anonymous cvs, web statistics, service monitoring… It just goes on and on and on.
And that’s without listing the sites – HearFromYourMP, PledgeBank, NotApathetic, WriteToThem, mySociety (.org). And the services – services.mysociety.org, gaze.mysociety.org, secure.mysociety.org, debian.mysociety.org, cvs.mysociety.org, rt.mysociety.org (a request tracker, which Matthew has been setting up recently).
But when we’re done, everything about the containment of our applications will be configured in CVS. We can install a new server in a trice (honestly!). New developer sandboxes can be configured in a few seconds. Everything is logged and backed up.
We’re gradually doing all of the above, but we’re not done yet.
(Shh, don’t tell anyone, but this post is really just so the bots find debian.mysociety.org, but I’m going to try and fill it with some other content so you don’t think I’m being too rude)
Debian’s software “packaging” system provides a big database of all the open source software in the world, and makes another smaller database of all the software installed on your computer. We’re using it on our new servers, which the sites are gradually migrating to now. When you’ve got security updates, multiple machines, and complex software dependencies, you need it.
Unfortunately, though it seems like the Debian people have packaged nearly all the software in the world, sometimes they miss things. Normally we’d just install them using the old Unix configure/make/make install. This time we’ve decided to do it properly, and make our own Debian packages. You can find them at debian.mysociety.org.
The advantage of this is that we can find out where any file on the system came from. We can easily upgrade multiple machines, and check that they all have the same software installed. This makes it much less likely that there’ll be bugs when you go to a corner of one of the websites, and get an error because a perl module wasn’t installed.
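As a rough illustration of the "check that they all have the same software installed" idea, here is a minimal Python sketch. It assumes you have already captured a package list from each machine as lines of name, tab, version (the default output shape of dpkg-query -W). The function names, the package names and the version numbers are all made up for the example; this is not the script we actually run.

```python
def parse_package_list(text):
    """Turn 'name<TAB>version' lines into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, version = line.partition("\t")
        packages[name.strip()] = version.strip()
    return packages


def compare_machines(a, b):
    """Return (only on a, only on b, on both but with different versions)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_a, only_b, differing


# Hypothetical package lists for the two new servers (invented versions).
bitter = parse_package_list(
    "libxapian\t0.9.2-1\nperl\t5.8.4-8\npython2.3-pyrtf\t0.45-1\n"
)
tea = parse_package_list("libxapian\t0.9.2-1\nperl\t5.8.4-6\n")

only_bitter, only_tea, differing = compare_machines(bitter, tea)
print("only on bitter:", only_bitter)
print("only on tea:", only_tea)
print("version mismatch:", differing)
```

If all three lists come back empty, the two machines agree, at least as far as the package database knows.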
So far there are a few perl module .deb files in our repository, which the handy dh-make-perl builds easily from a perl module tarball. There’s also Xapian (a search engine library), which we use for quick lookups in Gaze (our gazetteer). That had already been packaged by the Xapian people, but for some reason I had to recompile it. Finally there’s one Python module, PyRTF, which makes RTF documents, and which I just packaged (probably badly).
Anyway, this post is here to make sure anybody searching for python2.3-pyrtf on Google will find something…
Last night I should have gone to bed early, but these things being how they are I stayed up late having tea with my housemate and his friend. I wanted to get up early, because I knew a few things needed tidying before we started getting media coverage, so I set my alarm. I haven’t done that for work for years! So I’m a bit sleepy.
The most important early thing I did was make the front page featured pledges appear in a random order, for more fairness and serendipity. Late last night Chris had added code in to fuzzily find pledges which somebody has typed in. It uses the database to look for the number of common three letter substrings, so if you type in “http://www.pledgebank.com/suirname” it gives a nice error page leading you to go to “http://www.pledgebank.com/Suriname”. It’s pretty good, and all I had to do was tidy up the text a bit, and add it to the search page as well.
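For the curious, the three-letter-substring idea can be sketched in a few lines of Python. This is only an illustration: the real code does the counting in the database, whereas this version works on an in-memory list, and the function names and the list of pledge references are invented for the example.

```python
def trigrams(text):
    """Set of all three-letter substrings of text, ignoring case."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def common_trigrams(a, b):
    """Number of distinct three-letter substrings shared by a and b."""
    return len(trigrams(a) & trigrams(b))


def best_match(typed, known_refs):
    """Known reference sharing the most trigrams with what was typed,
    or None if nothing shares any trigram at all."""
    scored = [(common_trigrams(typed, ref), ref) for ref in known_refs]
    score, ref = max(scored, default=(0, None))
    return ref if score > 0 else None


# Hypothetical pledge references.
refs = ["Suriname", "no2id", "rights"]
print(best_match("suirname", refs))
```

Here "suirname" and "Suriname" share the substrings "nam" and "ame", which is enough to beat the other candidates. A real implementation would probably want a minimum score before suggesting anything.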
By that time everyone else was up, and the no2id people were publicising their pledge. We were all on IRC, and tailing various logfiles. There were quite a few minor tidy ups for us to make to the launch pledges that were made over the weekend, changing text and signup numbers for the creators a bit.
Someone spotted that the “all pledges” page had the wrong calculated count for one of the pledges. This was very odd, as it was right for all the others. I downloaded a fresh dump of the database to my local machine, where everything was fine. Meanwhile, Chris noticed the PHP server was crashing. After more investigation, we found a subtle bug was creating a corrupt PHP variable. Calling “gettype” on it caused the PHP process to stop with an error, and calling number_format crashed the whole thing. We’re still not sure quite what PHP bug caused this, and need to investigate it more. But we found a simple workaround which stopped it causing any more trouble.
You always find all the bugs when your traffic goes up! That’s why a staged beta which gets larger and larger, of which today is in many ways the next phase, is the way to go.
This afternoon I’ve been installing web log analysis software. I chose to use awstats, mainly because I know it already. But also because last time I looked it was the prettiest open source one. Installing it was mostly painless, but hampered by a couple of technical annoyances.
Firstly, we run everything on the mySociety server as different users, for privilege separation. This involves using suExec and FastCGI, neither of which is as mature as it might be. Now, the problem with suExec is that it is extremely paranoid. For example, it makes sure the owner and group of any files you try and run as CGI are the same as the user you set them to be run by. It also checks that they are under document root – a silly restriction, especially when I’ve just added an alias in httpd.conf pointing to the awstats.pl file which FreeBSD installed elsewhere.
The solution to this? Make a short one line bash script which merely calls through to the actual script. That it is so easy to get round them shows how silly the suExec restrictions are – they just put everybody off using it at all, and instead people make everything world-readable. And less secure. So by being paranoid about security, they make everyone’s web server less secure. Nice.
Secondly, awstats has a strange view of the world. It insists on getting log files in order. That is, every row of log fed to it has to arrive in chronological order, and you can’t go back and feed an older log file in later. I’m not quite sure why this is – perhaps so that its very compact text file summary of the logs can be updated efficiently. So, to run analysis of old files from months back you have to write a quick script to run through them all, oldest first, and call awstats with each of them.
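Something along these lines would do it. This is a sketch rather than the script I actually used: it assumes the old logs are named like access_log.2005-03 (a made-up naming scheme), that "pledgebank" is the name of the awstats config, and it only builds the command lines instead of running them. The -config, -update and -LogFile options are standard awstats.pl ones; everything else here is hypothetical.

```python
import re

# Assumed log naming scheme: access_log.YYYY-MM (hypothetical).
LOG_NAME = re.compile(r"access_log\.(\d{4})-(\d{2})$")


def chronological(paths):
    """Return the log paths sorted oldest first, by the date in the name.
    Paths that don't match the naming scheme are left out."""
    dated = []
    for path in paths:
        match = LOG_NAME.search(path)
        if match:
            year, month = int(match.group(1)), int(match.group(2))
            dated.append(((year, month), path))
    return [path for _, path in sorted(dated)]


def awstats_commands(site, paths, awstats="awstats.pl"):
    """One awstats update command per log file, oldest log first."""
    return [
        [awstats, f"-config={site}", "-update", f"-LogFile={path}"]
        for path in chronological(paths)
    ]


logs = [
    "logs/access_log.2005-03",
    "logs/access_log.2004-12",
    "logs/access_log.2005-01",
]
for command in awstats_commands("pledgebank", logs):
    print(" ".join(command))
```

To actually run them you could pass each list to subprocess.run, stopping at the first failure so a broken month doesn’t leave later months fed in with a gap.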
Anyway, I’ve now done that for NotApathetic, PledgeBank, WriteToThem and this very website that you’re reading. So Tom will stop hassling me every time a journalist asks “how many visitors do you get a day?” So far today we have 2807 visits to NotApathetic.com.
We’ve just got our first proper server all to ourselves, and Chris Lightfoot and Francis Irving are moving existing stuff over to it. The main news, though, is that you’re invited to our…