Back in November 2006 we launched Number 10’s petitions website. We were pretty proud of the usability-centred site we built – we can still lay a pretty good claim to it being one of the biggest democracy sites (measured in terms of people transacting) that the world’s ever seen.
Over 12 million signatures had been added to petitions by the time the site was switched off after the 2010 general election. We were particularly proud of developing a system that was highly load-tolerant: we once survived over 20,000 people signing within a single hour, all whilst running on a pair of cheap little servers. That performance on so little hardware was down to the raw brilliance represented by a coding team made up of Francis Irving, Matthew Somerville, and the late, great Chris Lightfoot.
We’re also pleased that the popularity of the site led to the irresistible rise of the belief that the public should be able to petition the government via the internet. So even though our site was mothballed, Parliament and DirectGov have taken over the idea, and the commitment has been upped a notch, from ‘we’ll send a reply’ to ‘we’ll talk about it’. To be clear, we are not, nor have we ever been, a community interested in replacing representative democracy with direct democracy, but anything that can squeeze any drop of change from Parliament is worth a small celebration.
What’s most pleasing, though, is that we’ve been able to take the open source code built for Number 10, improve and expand upon it to develop a hosted petitions service for local councils around the country, or the rest of the world. And this is where we found the most important lesson for us: local petitions can be awesome, and despite the much smaller numbers of signatories involved, we’ve been more widely and frequently impressed by local petitions and responses than at the more glamorous national level. We’re particular fans of Hounslow Borough Council who have given positive and detailed feedback on all sorts of genuine local issues, as well as working hard to let local residents know that the service exists.
Just recently we launched a site to make it really easy to find local council petition websites, because there are hundreds hidden away (we built some; most are supplied by other vendors). If we could see anything result from today’s huge explosion of interest in online petitions, it would be that people might start to look local, and explore what petitions in their community could mean.
Last night was the annual New Statesman New Media Awards, held in Westminster Abbey’s College Gardens. mySociety were finalists in two categories, Modernising Government and Contribution to Civic Society, with both Number 10 petitions and FixMyStreet nominated in both. Also, two other projects we host, PlanningAlerts and The Government Says, were both finalists in the Information & Openness category.
It was a lovely evening, seeing some people I haven’t seen for some time and meeting new people too. We ended up winning in both our categories – the Number 10 petitions site in Modernising Government, and FixMyStreet in Contribution to Civic Society, which is obviously fantastic for everyone involved. The judges were impressed at the open source nature of the petitions site, and the “deceptive simplicity” of FixMyStreet. This is now the third year in a row we’ve won the Civic Society award – TheyWorkForYou won in 2005, and WriteToThem in 2006, so we’re obviously doing something right.
It’s a shame that Chris could not be with us, but his mother did attend to see the projects he worked on recognised.
Thanks and congratulations to all the other winners and finalists.
It is with great sadness that I must report the death of Chris Lightfoot, mySociety’s first developer and a good friend to all of us. He was found by friends at his flat on February 11th. The main announcement can be read in this post on his blog.
Chris was perhaps the pre-eminent example so far of what polymath means in the Internet age. His contributions to the world are more than just a formidable legacy of computer code of the very highest quality, for mySociety and many others. They also include substantial contributions to applied statistics, geographic information systems, economics and a range of public policy issues from identity cards to speed cameras.
Everything Chris did in these fields combined an incredulity-inducing array of technical and analytical skills with a wickedly funny, savage turn of phrase. To understand what a remarkable intellectual outlier he was, simply sift through his blog and marvel at the quantity of primary research and original coding that went into it. Documenting and exploring his work would provide material for many years of research, and yet all this was accomplished by the age of 28.
Within mySociety he was involved right from the start through the development of WriteToThem, HearFromYourMP and PledgeBank, building some amazing underpinning geographic and political web services like Gaze, MaPit and DaDem. These components make all our sites work and make a raft of other tools and sites possible in the future.
For the last three or four months he was working at another employer, Media Molecule four days a week, but still helped the full time staff with the petitions work. The last major thing he built for us was the system that serves up the maps for Neighbourhood Fix-It, a site which was only just soft launched before he died, but of which he was apparently fond for its WriteToThem-like habit of getting simple things done that mattered to normal people.
Building mySociety’s major sites involved mighty team efforts, something which can obscure even huge individual talent. So perhaps the sort of work for which Chris will be most remembered is his wonderfully individualistic, virtuoso forays into scholastic areas in which he had no formal training. He wandered into differing disciplines, made a mark, and wandered on again like a giant that had no idea he’d just trodden on a village. The political survey work he did hugely illuminates our understanding of our own political world, whilst raising the question “how come none of the professional political analysts thought of this?” And his travel-time maps should make everyone in government wonder if they’re sitting on information which could be reused to such amazing, potentially life changing effect.
Chris’ intellect and appetite for knowledge were surpassed by only one aspect of his character: his integrity. If you’ve ever wondered why WriteToThem goes to such lengths to protect users’ data it is largely because of his rock solid belief in the dignity and social indispensability of privacy. Chris was an energetic campaigner in this field, notably for No2ID, who have posted a tribute.
It doesn’t stretch the truth an inch to say that with his death the whole of the UK’s citizenry, not just his family, friends and colleagues, will be worse off. Rest in peace, Chris.
So, I’ve just had a shower and I’m waiting for Matthew and Tom to turn up. As time goes on, mySociety seems to get more geographically disparate, and I look forward to meeting my coworkers. Then for 1pm we’ll be heading to CB2 for the mySociety developers meeting. Feel free to come along any time afternoon or evening, whatever your skills or interest in mySociety.
I haven’t posted on here for ages, since October. I’ve been away on holiday quite a lot, and when I’ve not been away I’ve been busy, partly with systems administration. We’ve set up lots of servers in the last month for the E Petitions site. When you go from 3 servers to 7 servers, there’s another step change in sorting out systems administration tools. For example, I had to change the monitoring script so every server wouldn’t monitor every other. And I had to work out the quirks and bugs in the system we have for storing config files for different classes of server in CVS. Because we only had one class of server before.
I’ve also had to learn lots about server monitoring and load balancing. Things have slowed down a bit now, to maybe 10 hits per second. But a few weeks ago the road pricing petition was often getting 50 hits per second. I’ve never worked on a site with that level of traffic before. You find all the bugs in your code, all the missing indices in PostgreSQL, all the badly tweaked FastCGI parameters. I’ve been sucking knowledge off Chris like a sponge, so tools like strace and vmstat begin to become instinctive. Seemingly nobody offers a book or a course which teaches this stuff well – every server setup is different, everyone knows different ways to tune and profile. But maybe you can tell me different in the comments.
Louise has been busily working away on lots of things. Amongst that is a major change to WriteToThem, to let you write to all the members in a multi-member constituency in one go. The last day or two, she’s been installing the WriteToThem test code on one of our servers, when it has only run on my laptop before. This will be fantastic – hopefully we can get Matthew to be bolder about making changes to WriteToThem, if he has a test script he can easily run (getting Matthew to be bold isn’t normally a problem, but he seems mildly less bold when it comes to the WriteToThem queue daemon).
Tom and I have also been busy on a second travel maps report for the DfT. More on that soon. Lots of running screen scraping jobs on TransportDirect which take days. On the subject of Tom, he seems to have got expert at “stacking meetings” next to each other. In one day last week he had 7 meetings!
Partly for our own internal documentation, and partly because it might be of interest to (some) readers, some notes on how the Number 10 petitions site works. On the face of it you’d imagine this would be very simple, but as usual it’s complicated by performance requirements (our design was motivated by the possibility that a large NGO might mail a very large mailing list inviting each member to sign a particular petition, so that we might have to sustain a very high signup rate for an extended period of time). Here’s a picture of the overall architecture:
(This style of illustration is unaccountably popular in the IT industry but unlike most examples of the genre, I’ve tried to arrange that this one actually contains some useful information. In particular I’ve tried to mark the direction of flow of information, and separate out the various protocols; as usual there are too many of the latter. The diagram is actually a slight lie because it misses out yet another layer of IPC—between the web server, apache, and the front-end FastCGI scripts.)
Viewing petition pages is pretty conventional. Incoming HTTP requests reach a front-end cache (an instance of squid, one per web server, cacheing in memory only); squid passes them to the petition browsing scripts (written in perl running under FastCGI) to display petition information. Those scripts consult the database for the current state of the relevant petition and pass it back to the proxy, and thence to the web client. This aspect of the site is not very challenging.
Signing petitions is harder. The necessary steps are:
- write a database record about the pending signature;
- send the user an email containing a unique link to confirm their signature;
- update the database record when the user clicks the link;
- commit everything to stable storage; and finally
- tell the user that their signature has been entered and confirmed.
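To make the steps above concrete, here is a minimal sketch in Python. It is only an illustration: the real site is written in perl against PostgreSQL, and every name here (the `signer` table, `start_signature`, `confirm_signature`) is hypothetical, with an in-memory SQLite database standing in for the real one and the email itself left out.

```python
import secrets
import sqlite3


def open_db():
    """Create a throwaway in-memory database with a hypothetical signer table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE signer ("
        " petition TEXT NOT NULL,"
        " email TEXT NOT NULL,"
        " token TEXT NOT NULL UNIQUE,"
        " confirmed INTEGER NOT NULL DEFAULT 0,"
        " UNIQUE (petition, email))"
    )
    return db


def start_signature(db, petition, email):
    """Record a pending signature and return the unique token to put in the email link."""
    token = secrets.token_urlsafe(16)
    with db:  # the connection context manager commits on success
        db.execute(
            "INSERT INTO signer (petition, email, token) VALUES (?, ?, ?)",
            (petition, email, token),
        )
    return token


def confirm_signature(db, token):
    """Mark the signature confirmed and commit; True only if a pending row was updated."""
    with db:
        cur = db.execute(
            "UPDATE signer SET confirmed = 1 WHERE token = ? AND confirmed = 0",
            (token,),
        )
    return cur.rowcount == 1


def count_confirmed(db, petition):
    """Number of confirmed signatures on a petition."""
    row = db.execute(
        "SELECT COUNT(*) FROM signer WHERE petition = ? AND confirmed = 1",
        (petition,),
    ).fetchone()
    return row[0]
```

Only once `confirm_signature` has returned, and so the commit has completed, would the user be told that their signature has been entered.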
The conventional design for this would be to have the web script, when it processes the HTTP request for a new signature, submit a message for sending by a local mail server and write a row into the database and commit it, forcing the data out to disk. The mail server would then write the message into its spool directory, and fsync it, forcing it out to disk. The mail server will then pick the mail out of its queue and send it to a remote server, at which point it will be erased from the queue. Later on the mail will arrive in the user’s inbox, at which point they will (presumably) click the link, resulting in another HTTP request which causes the web script to update the corresponding database row and commit the result. While this is admirably modular it requires far more disk writes than necessary to actually complete the task, which limits its potential speed. (In particular, there’s no reason to have a separate MTA spool directory and for the MTA to make its own writes to that directory.)
At times of high load, it is also extremely inefficient to do one commit per signature. It takes about as long to commit ten new or changed rows to the database as it does to commit one (because the time spent is determined by the disk seek time). Therefore to achieve high performance it is necessary to batch signatures. Unfortunately this is a real pain to implement because all the common web programming models use one process per concurrent request, and it is inconvenient to share database handles between different processes. The correct answer to this problem would of course be to write the signup web script as a single-process multiplexing implementation, but that’s a bit painful (we’d have had to implement our own FastCGI wire protocol library, or alternatively an HTTP server) and the deadlines were fairly tight. So instead we have a single-process server, petsignupd, which accepts signup and confirmation requests from the front-end web scripts over a simple UDP protocol, and passes them to the database in batches every quarter of a second. In theory, therefore, users should see a maximum latency of a bit over 0.25s, but we should achieve close to the theoretical best throughput of incoming requests. (Benchmarking more-or-less bears this out.)
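The batching idea can be sketched in a few lines of Python. This is not petsignupd itself: the UDP front end is omitted, SQLite stands in for PostgreSQL, and `SignupBatcher`, `make_db` and the `signer` table are hypothetical names chosen for the illustration. The point is only that many queued requests share one commit.

```python
import sqlite3
import time


def make_db():
    """In-memory stand-in for the real database, with a hypothetical table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE signer (petition TEXT NOT NULL, email TEXT NOT NULL)")
    return db


class SignupBatcher:
    """Collects signup requests and writes each batch with a single commit."""

    def __init__(self, db, interval=0.25):
        self.db = db
        self.interval = interval  # seconds between batches
        self.pending = []         # requests received since the last flush
        self.commits = 0          # how many commits we have issued
        self.last_flush = time.monotonic()

    def submit(self, petition, email):
        """Queue one request; nothing touches the database yet."""
        self.pending.append((petition, email))

    def flush(self):
        """Write everything queued so far in one transaction; return the row count."""
        if not self.pending:
            return 0
        batch, self.pending = self.pending, []
        with self.db:  # one commit for the whole batch
            self.db.executemany(
                "INSERT INTO signer (petition, email) VALUES (?, ?)", batch
            )
        self.commits += 1
        return len(batch)

    def tick(self, now=None):
        """Flush if at least `interval` seconds have passed since the last flush."""
        now = time.monotonic() if now is None else now
        if now - self.last_flush >= self.interval:
            self.last_flush = now
            return self.flush()
        return 0
```

A single-process server would call `submit` for each incoming request and `tick` on every pass through its event loop. As a rough illustration with an assumed (not measured) figure of 10ms per commit: one commit per signature caps out at around 100 signatures per second, whereas four batched commits per second spend only about 40ms of each second committing, for batches of modest size.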
Sending the corresponding email is also a bit problematic. General-purpose MTAs are not optimised for this sort of situation, and (for instance) exim can’t keep up with the sustained signup rate we were aiming for even if you put all of its spool directories on a RAM disk and accept that you have to repopulate its outgoing queue in the event of a crash. The solution was to write petemaild, a small multiplexed SMTP sending server; unlike a general-purpose MTA this manages its queue in memory and communicates updates directly to the database (when a confirmation email is delivered or delivery fails permanently).
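As a rough picture of what managing the queue in memory means, here is a small Python sketch. It is an assumption-laden illustration, not petemaild: the SMTP conversation is replaced by a `send` callable that returns "ok", "temporary" or "permanent", the database update is replaced by a `record_result` callable, and the `max_attempts` retry limit is invented for the example.

```python
from collections import deque


class MemoryMailQueue:
    """Keeps outgoing confirmation mails in memory and reports final outcomes."""

    def __init__(self, send, record_result, max_attempts=3):
        self.send = send                    # hypothetical: address -> outcome string
        self.record_result = record_result  # hypothetical: (signer_id, status) -> None
        self.max_attempts = max_attempts    # assumed retry limit, not from the real code
        self.queue = deque()

    def enqueue(self, signer_id, address):
        """Add a message to the in-memory queue with zero attempts so far."""
        self.queue.append((signer_id, address, 0))

    def run_once(self):
        """Try each currently queued message once."""
        for _ in range(len(self.queue)):
            signer_id, address, attempts = self.queue.popleft()
            outcome = self.send(address)
            if outcome == "ok":
                # Delivered: tell the database straight away.
                self.record_result(signer_id, "delivered")
            elif outcome == "permanent" or attempts + 1 >= self.max_attempts:
                # Permanent failure, or we have given up retrying.
                self.record_result(signer_id, "failed")
            else:
                # Temporary failure: keep it in memory for a later pass.
                self.queue.append((signer_id, address, attempts + 1))
```

Nothing here is written to disk, so after a crash the queue would have to be rebuilt from the pending rows in the database.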
It’s unfortunate that such a complex system is required to fulfil such a simple requirement. If we’d been prepared to write the whole thing ourselves, from processing HTTP requests down to writing signatures out to files on disk, the picture above would look much simpler (and there would be fewer IPC boundaries at which things could go wrong). On the other hand the code itself would be a lot more complex, and there’d be a lot more of it. I don’t think I’d describe this design as a “reasonable” compromise, but it’s at least an adequate one.
So, there are now over 600 petitions in the petitions system, and we’re getting a steady stream of appeals from our users to add categories.
I’m posting to ask how you all think we should handle this. It seems to me that there are a few options:
- Ask petition creators to pick one very basic top level category of no more than 10 or so, taken from a hierarchical taxonomy like the one the BBC uses.
- Ask petition creators to pick the top level and the subsequent sub-levels to be more specific.
- Go all web 2.0 and simply ask people to tag their petitions with some key words
More than just thinking about the overall philosophy I’d also appreciate thoughts on design. When you come to the homepage, how should the category system be presented to you? Tricky stuff, and I’d really appreciate your thoughts.
Since the petition system went out properly on Wednesday, we’ve been absolutely buried in an avalanche of changes, fixes, feature additions and massive massive amounts of email. I thought that you might be interested as to what sort of stuff has happened in the first two days:
- Email has taken over our lives. Matthew has responded to over 200 emails since yesterday morning, and I was up at 4am last night just trying to cope with the rate of incoming mail. Francis, who’s now in Canada, then heroically took up the baton and responded to mail all (UK) night! Many if not most of these mails are giving us suggestions, as well as bug reports, problems with email and bits of praise and the odd conspiracy theory.
- Changes made to cope with expats and overseas military personnel.
- Phoned Hotmail to stop their system from eating 95% of the confirmation messages being sent to Hotmail accounts!
- Redesigned the automated mails no10 get telling them there’s a new petition (they’ve had over 500 of these mails, so they need to be clear and easy to read!)
- Made the rejected petitions system more granular, so that if a petition has to be rejected, and part of it has to be hidden (say, if it is libellous), then it only hides that bit, not the whole thing. Maximum transparency is the goal, you see.
- New options added to sort the list of all petitions in different ways, by number of signatures being the most asked for.
- Limited the length of “more info” fields so people can only write long rants, rather than really really long rants
- Special cased people with AOL accounts, so that their, erm, nonstandard email clients can actually cope with the confirmation links.
- Made several fixes to the processes involved in sending out confirmation mails.
- Made RSS changes and improvements.
- Updated various bits of text, like providing examples of what “party political” means. The BBC initially wrote that this meant no pledges mentioning controversial issues like Iraq, which was grabbing quite the wrong end of the stick about the nature of the rules. Now we have some complaining emails saying we’re being too liberal!
- Compiled a big list of user suggestions and fixes on the wiki here.
- Made the rejection criteria in the Ts&Cs actually match the ones in the admin interface.
- Installed a stats package to watch what’s going on.
- Added facility to search petitions
- Improved/fixed logging
- Added link and text pointing to the open source code.
I’ve probably missed some – I’m sure Matthew, Chris, Francis and Ben will let me know!
I’m very pleased to announce that the petitions system we’ve built for 10 Downing Street has gone live today.
I’m very grateful for the hard and often inspired work put into this by Chris Lightfoot and Matthew Somerville, as well as the civil servants who have helped to build a petitions system which I believe is in a real class of its own.
The most notable features are:
1. Petitions are accepted and published, regardless of the political slant of the petition. However, if they break the Ts&Cs (a petition that doesn’t actually ask for any action, for example) then they are put on a special rejected petitions page: they don’t just vanish. We think this transparency feature is probably unique.
2. The site is being launched in beta, and will change over time. This might seem too commonplace to note for many of you, but it reflects a willingness to see a public IT service evolve in response to users, not simply fulfil a contract agreed in advance. mySociety exists partly to spread good practice in the public sector, and we think this is a nice example of that in action.
3. The code, including Chris’s amazing high-load optimised engine, is all open source.
Any questions? Come into our chat channel at www.irc.mysociety.org or mail us at firstname.lastname@example.org.
We just had our irregular weekly meeting, which we do most Mondays using a conference call. I thought I’d just write up what we’re all up to this week.
- I’m continuing to test the ePetitions site for 10 Downing Street, and developing an interesting branded version of PledgeBank for CAFOD (more when it launches).
- Matthew is going to look at various things that need doing on PledgeBank and WriteToThem. For PledgeBank more chivvying emails, I think something like this ticket but not exactly. For WriteToThem, various bits of code to do with how we handle error cases.
- Chris is making more pretty maps for the Department for Transport.
- Tom is working out in detail how we’re going to spend the money from DCLG which has finally come through. It’s mentioned in this post, look for “e-Innovations Product and Marketisation strand via Kirklees MBC”. Which means, we’re being paid to do proper marketing and sales of branded versions of our services, such as WriteToThem, PledgeBank, and Neighbourhood Fixit. He’s also chasing up some interesting people met at a conference in Eastern Europe (Bratislava, I think?) last week.
Please ask questions in the comments – for example, if you’d like us to post about particular things on this blog.
Much of my August seems to have been absorbed with maintenance tasks.
For example, Chris and I spent a few days tightening up WriteToThem’s privacy. I made sure the privacy statement correctly describes what happens with backup files, and failed messages. I reduced the timeouts on how long we keep the body of failed messages. I made sure we delete old backup files of the WriteToThem database. I wrote scripts to run periodically to check that no bugs in our queueing daemon can accidentally mean we keep the body of messages for longer than we say. I added a cron job to delete Apache log files older than a month for all our sites. As AOL know to their cost, the only really private data is deleted data.
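For the curious, a log clean-up job of that kind can be very small. The sketch below is in Python and is only a guess at the shape of such a job, not our actual cron script: the function name, the `*.log*` filename pattern and the 30-day default are all assumptions.

```python
import time
from pathlib import Path


def delete_old_logs(log_dir, max_age_days=30, now=None):
    """Delete log files in log_dir last modified more than max_age_days ago.

    Returns the names of the files removed, in sorted order.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    removed = []
    # Assumed naming scheme: access.log, access.log.1, error.log.2.gz and so on.
    for path in sorted(Path(log_dir).glob("*.log*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

Run daily from cron against each site’s log directory, this keeps at most about a month of logs on disk.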
Earlier in the month, I handled some WriteToThem support email for the first time in ages. We get a couple of hundred messages a week, which Matthew mainly slogs through. It’s good for morale to do it, as we get quite a lot of praise mail. It is also hard work, as you realise how complicated even our simple site and the Internet are, and it leads to fixing bugs and improving text on the site. I made a few improvements to our administration tools, and things like the auto-responder if people reply to the questionnaire, to try and reduce the amount of support email, and make it easier to handle.
I did some more work on the geographically cascading pledges (like this prototype one), but I’m still not happy with them. In the end, I realised that it is the structure of the wording of the pledge that is the key problem. Our format of “I will A but only if N others will B” just isn’t easily adapted to get across that the pledge applies separately in different geographical areas. Working out how to fix that is one of the things we’ll brainstorm about in the Lake District (see below).
The last couple of days I’ve been configuring one of our new servers who is called Balti, and getting the PledgeBank test harness working on it. Until now, it has only been run on my laptop. This is partly heading towards making a proper test harness for the ePetitions site, running on a server so we properly test nothing can be broken before deploying a new version.
Matthew has wrapped up the TheyWorkForYou API now, and is working on Neighbourhood Fixit next. Chris has been doing lots more performance work for the e
Tom’s in Berlin at the moment, he gave a talk last night, and I think has been to see some people from Politik Digital. As we’ve been discussing on the developers email list, there’s an EU grant we’re likely to apply for in collaboration with them.
On Friday, we’re all going to the Lake District for a week, with some of the trustees and volunteers joining us intermittently. We very conveniently and cheaply all work from home, so it’s good and necessary to meet up for a more sustained period of time at least once a year. Last year we were in Wales.