1. Behind the Scenes at WhatDoTheyKnow

    mySociety’s Freedom of Information website WhatDoTheyKnow is designed to appear simple and straightforward to users. That appearance belies the fact that behind the scenes a significant amount of effort goes into making sure both those making freedom of information requests and those answering them have a positive experience of the site. While the site is almost entirely automated, human involvement is sometimes necessary. This article highlights the key “edge cases” which are dealt with by the staff and volunteers who make up the WhatDoTheyKnow team.

    In the last year 15,233 freedom of information requests have been made via WhatDoTheyKnow.

    Message Delivery
    444 messages on 360 requests (2.3%) had to be manually placed on the correct request because authorities did not send their replies to the email address given. The errors are introduced as authorities apparently manually transcribe email addresses from incoming email into correspondence management systems. There have been suggestions that some may even print out emails and scan them into such systems. WhatDoTheyKnow’s code has been improved in light of experience: common errors are now detected automatically, and in many cases the system suggests which request the message was intended for.
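
    As a rough illustration of how a mistyped reply address might be matched to the request it was meant for, here is a minimal Python sketch. It is not WhatDoTheyKnow’s actual code: the address format, the `suggest_request` function and the 0.85 similarity cutoff are all assumptions made for the example, and it uses the standard library’s `difflib` for the fuzzy comparison.

```python
import difflib


def suggest_request(received_address, known_addresses, cutoff=0.85):
    """Return the known request address closest to received_address.

    Returns None when nothing is similar enough. known_addresses is
    assumed to hold lower-case addresses.
    """
    candidate = received_address.strip().lower()
    if candidate in known_addresses:
        return candidate
    matches = difflib.get_close_matches(candidate, known_addresses, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical per-request reply addresses, invented for this example.
known = [
    "request-1234-abcd1234@example.org",
    "request-5678-ef905678@example.org",
]

# One character mistyped during manual transcription.
print(suggest_request("request-1234-abcd1Z34@example.org", known))
```

    A suggestion like this would still want a human to confirm it before the message is filed, since two request addresses can differ by very little.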

    In terms of outgoing messages, just 52 requests (0.3%) over the course of the year were marked as receiving an error message in response, and users marked 94 (0.6%) as requiring administrator attention. These are generally either transient errors, which simply require a message to be resent, or errors which prompt us to check and update the contact details we hold for a particular organisation. There are regularly problems with authorities’ spam filters, and we have to encourage them to change the way their filters are set up to allow messages from WhatDoTheyKnow.com through.

    Gone Postal
    119 requests (0.8%) were at some point marked as “Handled by Post”. In many of these cases users eventually persuaded authorities to release the information in electronic form. Where information is supplied outside the site, users can add annotations describing the information released, they can link to copies of the data they have posted online, or, as has been done in respect of 14 requests (0.1% of the total, 11% of those handled by post), they can supply the information to WhatDoTheyKnow to upload manually. When the site was being designed there was a worry that authorities would reply to many requests by post. This has not occurred, in part perhaps because the Freedom of Information Act contains a provision (section 11) requiring the requestor’s preferred means of communication to be used where it is reasonable. A requestor using an @whatdotheyknow email address is clearly expressing a preference for a reply to be made electronically via the site.

    One of the major challenges facing the site is keeping it operating in the face of the UK’s libel laws. Unlike in other countries, such as the US, we cannot publish statements on our users’ behalf without taking the risk of being sued for libel ourselves. Even simply republishing FOI responses from public authorities is not without risk in the UK. While we don’t actively police the site, a lot of administrator time is taken up dealing with cases where potentially libellous or defamatory comments have been brought to our attention. Cases can be very complicated and involve a great deal of correspondence. mySociety is lucky to have the services of a specialist internet and technology barrister with expertise in libel who provides his services free of charge. We try to act in such a way as to maximise transparency while ensuring that the existence of WhatDoTheyKnow and mySociety is not threatened by legal risks.

    In the last year there have been only seven significant cases where requests have been hidden from public view on the site due to concerns relating to potential libel and defamation. Three of those cases have involved groups of twenty or so requests made by the same one or two users. While the actual number of requests we have had to hide is around 70 (0.4% of the total), even this small fraction overstates the situation due to the repetition of the same potentially libellous accusations and comments in different requests. In all cases we have kept as much information up on the site as possible. Our policy with respect to all requests to remove information from the site is that we only take down information in exceptional circumstances; generally only when the law requires us to do so.

    Personal Information
    Sometimes people accidentally post personal information to the site; for example, they make a request which is not a Freedom of Information request but a subject access request under the Data Protection Act. We are happy to remove such requests. On occasion we get requests from both our users and public sector employees asking us to remove their names from the site. As we are trying to build up an FOI archive we are very reluctant to remove information from the site; our policy is only to remove names in exceptional circumstances. Often information, such as an out of office reply, which a public body or civil servant considers irrelevant and asks to be removed is in fact critical to the correspondence thread and timeline of a response.

    Copyright and Control of Information Released
    The fact that information is subject to copyright and restrictions on re-use does not exempt it from disclosure under the Freedom of Information Act (though there is a closely related exemption relating to “commercial interest”). Occasionally public bodies will offer to reply to a request, but in order to deter wider dissemination of the material they will refuse to reply via WhatDoTheyKnow.com. Southampton University have released information in protected PDF documents and the House of Commons has refused to release information via WhatDoTheyKnow.com which it has said it would be prepared to send to an individual directly.

    Maintaining and Expanding The List of Authorities
    WhatDoTheyKnow lists around 3,000 public authorities, and there is a regular turnover of changes in contact details. Our coverage, while large, is not comprehensive, so we receive requests to add bodies such as parish councils, schools, and doctors’ surgeries, which we have not yet attempted to add in a systematic manner based on official sources of information.

    We have also had to consider carefully how we handle the various situations where an authority becomes defunct and its responsibilities are taken over by another body, for example as a result of reorganisations of local government and the creation and merging of government departments.

    Providing Advice and Assistance
    The team at WhatDoTheyKnow.com often provide advice to users. We encourage users to keep their requests focused so as to reduce the chance of any problems due to libel or requests being classed as vexatious. On occasion we suggest appropriate authorities for users to direct requests to, provide advice to those unhappy with the response to their request, and answer a broad range of other queries as they arise, such as whether particular bodies are subject to the Act or not. Increasingly we link to authorities’ publication schemes, which are intended to let people know what information an authority has and how it can be accessed.

    Lastly, like all websites which allow people to post content online, WhatDoTheyKnow.com occasionally suffers from spam in various forms. Most is dealt with automatically but some has to be removed by hand. With spam, like the other aspects of running the site, the site’s code and processes are constantly being developed and improved to reduce the fraction of cases requiring any manual intervention.

    This article was prompted in part by a team in New Zealand, who are considering launching their own version of the site, asking us what’s involved.

  2. Freedom on Rails

    This week has been quite bitty. I’ve been doing more work on the Freedom of Information site, and have been getting into the swing of Ruby on Rails. Once you’ve learnt its conventions, it is quite (but not super) nice.

    As far as languages are concerned, Ruby seems identical in all interesting respects to Python. It’s like learning Spanish and Italian. Both are super languages. Ruby has nice conventions like exclamation marks at the end of method names to indicate they alter the object in place, rather than returning a modified copy (e.g. .reverse!). But then Python has a cleaner syntax for function parameters. It is swings and roundabouts.
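
    For comparison, here is a small sketch of how the same in-place versus new-value distinction shows up in Python, which has no punctuation for it and relies on separate names instead. The variable names are made up for the example.

```python
words = ["swings", "and", "roundabouts"]

# reversed() (like sorted()) builds a new value and leaves the original
# list alone, much as Ruby's plain .reverse does.
backwards = list(reversed(words))

# list.reverse() (like list.sort()) alters the list in place and returns
# None; this is the behaviour Ruby flags with a trailing "!" as in .reverse!
in_place = list(words)
result = in_place.reverse()

print(words)      # the original list, unchanged
print(backwards)  # a reversed copy
print(in_place)   # reversed in place
print(result)     # None
```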

    Rails has lots of ways of doing things which we already have our own ways of doing for other sites. The advantage of relearning them is that other people know them too. So Louise was able to easily download and run the FOI site, and make some patches to it, which would have been much harder if it was done like our other sites. Making development easier is vital – for a long time I’ve wanted a web-based cleverly forking web application development wiki. But while I dream about that, it helps that Rails packages everything you need to run the app in a standard way, in one directory, which quite a few people know how to use.

    Other things… I’ve been helping Richard set up GroupsNearYou on our live servers; it should be ready for you to play with soon. It looks super nice, and is easy to use. I’ve had some work to do with recruitment, and I’ve been catching up on general customer support email for TheyWorkForYou and PledgeBank. I’ve also been updating the systems administration documentation on our internal wiki, so others can work out how to run our servers.

  3. Love and support

    I’m still busy beavering away at the Facebook / PledgeBank integration. It all works now, but will take a bit more polishing to get just right. Matthew is, I think, adding surveys to PledgeBank, so it finds out later if people have or have not done their pledge. Or is he updating to a new version of BoundaryLine at the moment, so our postcode lookup on WriteToThem and everywhere else gets better? Hard to keep track when he does so much at once.

    Keith is upgrading our internal documentation, so new people at mySociety can learn how to keep things going. Heather is stalking all of America, finding people to use and promote PledgeBank. Tom is on a much deserved holiday, after seemingly a zillion meetings per day for months.

    There’s lots of ongoing maintenance for all our sites. We’re lucky that large chunks of our customer support email are done by volunteers (thanks Anna, Louise, Tim and Tomski/James) and by Debbi (yay Debbi!). Much of this is routine – changing pledge text, updating council email addresses, giving MPs posting links for HearFromYourMP, putting new MP photos up on TheyWorkForYou etc. A lot of it is unique – handling new translations, answering questions from MPs and Lords about their voting record. I’ll let the others give some more examples of the kind of thing we answer.

    Speaking of which, do you know any good web developers who would like to work for mySociety? If so, put them in touch.

  4. How should we handle categorisation for petitions?

    So, there are now over 600 petitions in the petitions system, and we’re getting a steady stream of appeals from our users to add categories.

    I’m posting to ask how you all think we should handle this. It seems to me that there are a few options:

    • Ask petition creators to pick one very basic top level category of no more than 10 or so, taken from a hierarchical taxonomy like the one the BBC uses.
    • Ask petition creators to pick the top level and the subsequent sub-levels to be more specific.
    • Go all web 2.0 and simply ask people to tag their petitions with some keywords.
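
    To make the three options easier to compare, here is a minimal Python sketch of what each would store against a petition. It is purely illustrative and not how the petitions system is built: the `Petition` class, the `validate` function and the category names are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical taxonomy: each top-level category maps to its sub-levels.
TAXONOMY = {
    "Health": ["Hospitals", "Public health"],
    "Transport": ["Roads", "Rail"],
}


@dataclass
class Petition:
    title: str
    category: Optional[str] = None               # option 1: one top-level category
    subcategory: Optional[str] = None            # option 2: adds a sub-level
    tags: Set[str] = field(default_factory=set)  # option 3: free-form tags


def validate(petition):
    """Check that any chosen category and sub-level exist in the taxonomy.

    Tags are free-form, so they are never rejected.
    """
    if petition.category is None:
        return petition.subcategory is None
    if petition.category not in TAXONOMY:
        return False
    if petition.subcategory is None:
        return True
    return petition.subcategory in TAXONOMY[petition.category]


print(validate(Petition("Example petition", category="Transport", subcategory="Rail")))
```

    The trade-off shows up in `validate`: the first two options need a maintained taxonomy to check against, while tags need none but give no guarantee that two creators will describe the same topic in the same words.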

    More than just thinking about the overall philosophy, I’d also appreciate thoughts on design. When you come to the homepage, how should the category system be presented to you? Tricky stuff, and I’d really appreciate your thoughts.

  5. This is what Beta means: the first 48 hours of petitions

    Since the petition system went out properly on Wednesday, we’ve been absolutely buried in an avalanche of changes, fixes, feature additions and massive massive amounts of email. I thought that you might be interested as to what sort of stuff has happened in the first two days:

    • Email has taken over our lives. Matthew has responded to over 200 emails since yesterday morning, and I was up at 4am last night just trying to cope with the rate of incoming mail. Francis, who’s now in Canada, then heroically took up the baton and responded to mail all (UK) night! Many if not most of these mails are giving us suggestions, as well as bug reports, problems with email and bits of praise and the odd conspiracy theory.
    • Changes made to cope with expats and overseas military personnel.
    • Phoned Hotmail to stop their system from eating 95% of the confirmation messages being sent to Hotmail accounts!
    • Redesigned the automated mails no10 get telling them there’s a new petition (they’ve had over 500 of these mails, so they need to be clear and easy to read!)
    • Made the rejected petitions system more granular, so that if a petition has to be rejected, and part of it has to be hidden (say, if it is libellous), then it only hides that bit, not the whole thing. Maximum transparency is the goal, you see.
    • New options added to sort the list of all petitions in different ways, by number of signatures being the most asked for.
    • Limited the length of “more info” fields so people can only write long rants, rather than really really long rants 🙂
    • Special cased people with AOL accounts, so that their, erm, nonstandard email clients can actually cope with the confirmation links.
    • Made several fixes to the processes involved in sending out confirmation mails.
    • Made RSS changes and improvements.
    • Updated various bits of text, like providing examples of what “party political” means. The BBC initially wrote that this meant no pledges mentioning controversial issues like Iraq, which was grabbing quite the wrong end of the stick about the nature of the rules. Now we have some complaining emails saying we’re being too liberal!
    • Compiled a big list of user suggestions and fixes on the wiki.
    • Made the rejection criteria in the Ts&Cs actually match the ones in the admin interface.
    • Installed a stats package to watch what’s going on.
    • Added facility to search petitions.
    • Improved/fixed logging.
    • Added link and text pointing to the open source code.

    I’ve probably missed some – I’m sure Matthew, Chris, Francis and Ben will let me know!

  6. What we’re up to

    Much of my August seems to have been absorbed with maintenance tasks.

    For example, Chris and I spent a few days tightening up WriteToThem’s privacy. I made sure the privacy statement correctly describes what happens with backup files and failed messages. I reduced the timeouts on how long we keep the body of failed messages. I made sure we delete old backup files of the WriteToThem database. I wrote scripts to run periodically to check that no bugs in our queueing daemon can accidentally mean we keep the body of messages for longer than we say. I added a cron job to delete Apache log files older than a month for all our sites. As AOL know to their cost, the only really private data is deleted data.
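
    A clean-up job of that sort can be sketched in a few lines of Python. This is an illustration only, not the actual mySociety script: the `delete_old_files` name, the flat directory layout and the 30-day default are all assumptions.

```python
import time
from pathlib import Path


def delete_old_files(directory, max_age_days=30, now=None):
    """Delete regular files in directory last modified over max_age_days ago.

    Only looks at the top level of the directory. Returns the list of
    paths that were deleted, which is handy for logging.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    deleted = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

    Run once a day from cron against a log directory, a function like this keeps roughly a month of logs. The same age check, pointed at stored message bodies and set to report rather than delete, is one way a periodic retention check could be written.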

    Earlier in the month, I handled some WriteToThem support email for the first time in ages. We get a couple of hundred messages a week, which Matthew mainly slogs through. It’s good for morale to do it, as we get quite a lot of praise mail. It is also hard work, as you realise how complicated even our simple site and the Internet are, and it leads to fixing bugs and improving text on the site. I made a few improvements to our administration tools, and things like the auto-responder if people reply to the questionnaire, to try and reduce the amount of support email, and make it easier to handle.

    I did some more work on the geographically cascading pledges (like this prototype one), but I’m still not happy with them. In the end, I realised that it is the structure of the wording of the pledge that is the key problem. Our format of “I will A but only if N others will B” just isn’t easily adapted to get across that the pledge applies separately in different geographical areas. Working out how to fix that is one of the things we’ll brainstorm about in the Lake District (see below).

    The last couple of days I’ve been configuring one of our new servers, which is called Balti, and getting the PledgeBank test harness working on it. Until now, it has only been run on my laptop. This is partly heading towards making a proper test harness for the ePetitions site, running on a server so we can properly test that nothing is broken before deploying a new version.

    Matthew has wrapped up the TheyWorkForYou API now, and is working on Neighbourhood Fixit next. Chris has been doing lots more performance work for the ePetitions site. And he’s been making some funky monitoring thing to detect PostgreSQL database lock conflicts, which we get occasionally and are hard to debug.

    Tom’s in Berlin at the moment; he gave a talk last night, and I think has been to see some people from Politik Digital. As we’ve been discussing on the mySociety email list, there’s an EU grant we’re likely to apply for in collaboration with them.

    On Friday, we’re all going to the Lake District for a week, with some of the trustees and volunteers joining us intermittently. We very conveniently and cheaply all work from home, so it’s good and necessary to meet up for a more sustained period of time at least once a year. Last year we were in Wales.

  7. My second entry of 2006

    I haven’t posted on here for six months; quality over quantity is my motto. 🙂 In that time I have answered a lot of user support email (we’re up to Ticket 5741 in RT, the subject of my last post, and quite a lot of that isn’t spam or out of office replies…) and made sure that Hansard has parsed successfully into TheyWorkForYou’s database every morning. I get all the fun jobs. 😉 Oh, I also quit my day job to work for mySociety basically full time, and wrote or helped write WriteToThem Lords, TheyWorkForYou Lords, various features and improvements to our other sites, and much else I’ve probably forgotten. It’s been fun.

    Today, I gave a talk (my first!) at LUGRadio Live 2006 on mySociety in general, with various open source related ramblings in particular. I wrote the presentation in Slidy, Dave Raggett’s HTML slide generation software, which means anyone should be able to see it online at https://www.mysociety.org/2006/lugradio-live/ – I’ll try and add some speaking notes to explain some of the slides in more detail. The talks were all recorded, though the audio of the talk before mine didn’t work – I’m not sure if mine did, or if so, where they’re going to be uploaded, but I’ll post here if I find out.

    Lastly, as a minor rant, my laptop’s power supply just stopped working yesterday – annoying when you hoped to use it for a talk the next day. I had bought it because my old one had become very frayed and liable to cut out randomly; I’ve had to switch back to that one (cutting out sometimes is better than nothing), and it’s proving very annoying. Dell using differently shaped plugs for all their models is rather annoying too. Grr.