1. Local e-Petitions – See if mySociety is providing your local system

    Councils all around England have been busy getting ready to comply with the new duty to provide e-Petitions which kicks in today, 15th December. This means that on council sites across England you should now be able to make petitions which will be formally considered by the councils, in accordance with their chosen policies.

    At mySociety we’ve spent a lot of time over the last twelve months helping councils to cope with this new duty by offering them a commercial petitions service that is really good for users and easy to administer for councils. Some of the sites have been live for months, but many of the 35 council e-petitions sites we’re currently contracted to supply launch today.

    mySociety’s core developers Matthew Somerville and Dave Whiteland deserve huge credit for all the work they did re-purposing the No10 Petitions codebase and doing dozens of council customisations and rebrands. I’ve just seen one council officer email “Yippeee” at the prospect of launching, so I reckon they’ve done a pretty good job – well done gents, everyone in mySociety owes you a debt of gratitude for a time-consuming job well done.

    Here’s the current list of live local petitions sites. We’ll be adding more as they go up. Happy petitioning!

    Ashfield http://petitions.ashfield-dc.gov.uk/

    Barnet http://petitions.barnet.gov.uk

    Barrow http://petitions.barrowbc.gov.uk/

    Bassetlaw http://petitions.bassetlaw.gov.uk/

    Blackburn with Darwen http://petitions.blackburn.gov.uk/

    East Cambridgeshire http://petitions.eastcambs.gov.uk/

    East Northants http://petitions.east-northamptonshire.gov.uk

    Elmbridge http://petitions.elmbridge.gov.uk

    Forest Heath http://petitions.forest-heath.gov.uk

    Hounslow http://petitions.hounslow.gov.uk

    Ipswich http://petitions.ipswich.gov.uk

    Islington http://petitions.islington.gov.uk

    Lichfield http://petitions.lichfielddc.gov.uk

    Mansfield http://petitions.mansfield.gov.uk/

    Melton http://petitions.melton.gov.uk/

    New Forest http://petitions.newforest.gov.uk

    Nottinghamshire http://petitions.nottinghamshire.gov.uk

    Reigate & Banstead http://petitions.reigate-banstead.gov.uk

    Runnymede http://petitions.runnymede.gov.uk

    Rushcliffe http://petitions.rushcliffe.gov.uk/

    South Holland http://petitions.sholland.gov.uk

    Spelthorne http://petitions.spelthorne.gov.uk

    St Edmundsbury http://petitions.stedmundsbury.gov.uk

    Stevenage http://petitions.stevenage.gov.uk

    Suffolk Coastal http://petitions.suffolkcoastal.gov.uk/

    Surrey County Council http://petitions.surreycc.gov.uk

    Surrey Heath http://petitions.surreyheath.gov.uk

    Tandridge http://petitions.tandridge.gov.uk

    Waveney http://petitions.waveney.gov.uk

    Waverley http://petitions.waverley.gov.uk

    Wellingborough http://petitions.wellingborough.gov.uk

    Westminster http://petitions.westminster.gov.uk

    Royal Borough of Windsor and Maidenhead http://petitions.rbwm.gov.uk

    Woking http://petitions.woking.gov.uk

  2. Report submission edits

    A number of people report dog fouling through FixMyStreet, using slightly more… colloquial language. A number of councils have strict obscenity filters, blocking anything containing swearing. As I’m a pragmatist and not that interested in campaigning against councils blocking legitimate emails from their citizens (feel free!), FixMyStreet simply changes any “dog shit” reference to “dog poo”. This works well for everyone.

    Recently, the infamous Intellectual Property Manager from Portakabin™ Limited got in touch to complain about a couple of reports on FixMyStreet containing the words “portacabin” or “portaloo”. Again, as a pragmatist, I’m not really interested in whether users using trade marks or trade mark variants in a generic way on a problem report actually constitutes trade mark infringement (actually, I’d guess not); I just want legal people to go away and not waste our precious resources. So from now on, any report containing portakabin or similar will become [portable cabin], and portaloo will become [portable loo].

    For anyone who’s interested, this is accomplished through a simple regular expression, that looks for porta followed by 0 or more spaces, then cabin, kabin, or loo, and sticks “ble” in the middle.
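
    As a rough illustration only, here is how that substitution might look in Python. This is a sketch, not the FixMyStreet code itself: the function name soften and the case-insensitive matching are my assumptions, and the pattern is written straight from the description above.

```python
import re

# "porta", zero or more spaces, then "cabin", "kabin" or "loo".
# Case-insensitive matching is an assumption, not stated in the post.
PATTERN = re.compile(r"porta *([ck]abin|loo)", re.IGNORECASE)


def _replace(match):
    """Build the bracketed generic term for one matched trade mark."""
    word = match.group(1).lower()
    if word == "kabin":
        word = "cabin"  # normalise the trade mark spelling
    return f"[portable {word}]"


def soften(text):
    """Hypothetical helper: swap trade mark variants for generic terms."""
    return PATTERN.sub(_replace, text)


print(soften("There is a portakabin and a porta loo blocking the path"))
```

    Text that does not match the pattern passes through untouched.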

  3. RIP Angie Martin 1974-2009

    It is with overwhelming sadness that I write to tell our community that Angie Martin, mySociety’s fourth core developer, has died. She was taken from us by the cancer that she had been fighting since soon after we hired her less than two years ago.

    Possessed of an almost unbelievably upbeat personality, Angie brought not only her formidable Perl skills, but her blazing warmth of character to our team. In remission during our yearly retreat in January this year, she combined laughter with a typically tough line of questioning on ideas she thought insufficiently robust. With typical disregard for cool, her CV noted that she was “known to enjoy wrangling regular expressions on a Sunday Morning”. She didn’t see any contradiction between being a successful woman and a geek, throwing herself wholeheartedly into the Mac-toting, perlmonger ethos. She even brought her husband Tommy with her, who became a significant volunteer.

    Given her habit of plain speaking, it is pointless to pretend that Angie was able to make the contribution to mySociety’s users or codebase that she wanted to. What she achieved in terms of difficult coding during recovery from chemotherapy was incredible, breathtaking – but she wanted to change the world. It now falls to the rest of us, and our supporters, to live up to the expectations she embodied, to continue to push every day, using skills like those that she had to help people with everyday problems. We now have to ask ‘What would Angie do?’, as well as ‘What would Chris do?’. It is a lot to live up to.

    She was a mySociety core developer: I hope that meant as much to her as it meant for me to have her as one of my coders. Remember and Respect.

    Updated: Angie changed her surname upon getting married, a couple of months ago. I have just read she wanted to be remembered as Angie Martin, and so I have made that change.

    Updated 21 7 2009: Tommy has just told me that those wishing to make memorial donations should send them to Hospice at Home.

  4. Freedom on Rails

    This week has been quite bitty. I’ve been doing more work on the Freedom of Information site, and have been getting into the swing of Ruby on Rails. Once you’ve learnt its conventions, it is quite (but not super) nice.

    As far as languages are concerned, Ruby seems identical in all interesting respects to Python. It’s like learning Spanish and Italian. Both are super languages. Ruby has nice conventions like exclamation marks at the end of function names to indicate they alter the object, rather than return the value (e.g. .reverse!). But then Python has a cleaner syntax for function parameters. It is swings and roundabouts.

    Rails has lots of ways of doing things which we already have our own ways of doing for other sites. The advantage of relearning them is that other people know them too. So Louise was able to easily download and run the FOI site, and make some patches to it. Which would have been much harder if it was done like our other sites. Making development easier is vital – for a long time I’ve wanted a web-based cleverly forking web application development wiki. But while I dream about that, Rails packaging everything you need to run the app in a standard way in one directory that quite a few people know how to use, helps.

    Other things… I’ve been helping Richard set up GroupsNearYou on our live servers, it should be ready for you to play with soon. It looks super nice, and is easy to use. I’ve had some work to do with recruitment. And catching up on general customer support email for TheyWorkForYou and PledgeBank. I’ve also been updating the systems administration documentation on our internal wiki, so others can work out how to run our servers.

  5. Time shifting

    So, we’ve been a bit quiet on this blog, but naturally busy. I just did my invoice and timesheet for last month, and remembered how bitty it has been. In one day I often do things to 3 websites, and that is just CVS commit messages – no doubt I handled emails for more. This makes it quite hard to summarise what has been happening, and also quite hard to measure how much time we spend maintaining each website.

    We’ve recently made a London version of PledgeBank, which I’ll remind Tom to explain about on the main news blog. It is a PledgeBank “microsite”, with a special query for the front page and all pledges page that shows only pledges in Greater London. Which is conveniently almost exactly a circle radius 25km with centre at 51.5N -0.1166667E. I worked that out by dividing the area (found on the Greater London Wikipedia page) by pi and taking the square root, and rounding up a bit.
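
    For the curious, the arithmetic can be sketched in a few lines of Python. The area figure of 1572 square km is my assumption (the commonly quoted area of Greater London, not a number taken from this post), and the function names and the haversine distance check are purely illustrative, not the query PledgeBank actually runs.

```python
import math

GREATER_LONDON_AREA_KM2 = 1572.0  # assumed figure, not from the post
CENTRE = (51.5, -0.1166667)       # latitude, longitude from the post
RADIUS_KM = 25.0                  # the rounded-up radius from the post


def approx_radius_km(area_km2):
    """Radius of the circle that has the given area."""
    return math.sqrt(area_km2 / math.pi)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def within_london(lat, lon):
    """Hypothetical microsite filter: is the point inside the circle?"""
    return haversine_km(CENTRE[0], CENTRE[1], lat, lon) <= RADIUS_KM


print(round(approx_radius_km(GREATER_LONDON_AREA_KM2), 1))
```

    With that assumed area the radius comes out a little over 22km, which is where "rounding up a bit" to 25km comes in.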

    Yesterday we launched a new call for proposals – head on over, and tell us your ideas for new civic websites. It is another WordPress modification, but this time to the very blog that you’re reading now. The form for submitting proposals I made anew. It creates a new WordPress low-privileged user by directly inserting into the database, and then calls the function wp_insert_post to create a post by them in a special category. The rest of the blogging software then trivially does comments, RSS, search, email alerts and archiving.

    Meanwhile, Chris has written some monitoring software for our servers, to alert us of problems and potential problems. Perl modules do the tests, things like checking that there is enough disk space and that web servers are up. I’ve been tweaking it a bit, for example adding a test to watch for long-running PostgreSQL queries which indicate a deadlock. We’ve got a problem in the PledgeBank SMS code which causes deadlocks sometimes, which we’re still debugging.

  6. Detecting bad contacts

    We decided to be particularly careful about the new WriteToThem statistics, and did lots of checks on the data. In particular, we wanted to make sure we didn’t unfairly impugn MPs whom we had had bad contact details for. It is possible that for a period of time we thought we had their details but we got them wrong, that we were sending messages to an incorrect address, and their constituents were (unknowingly unfairly) reporting them as unresponsive in the questionnaire.

    So, I wrote a script which generates the statistics, and tries to spot such cases. During the 2005 period, it breaks each MP’s time up into intervals according to when we changed our contact details for them. We can do this, because every change we make to contact details is recorded in dadem’s database (see the representatives_edited table).

    I’m going to sound a bit like Donald Rumsfeld here, but keep with me. For each interval, we either have good or bad contact details, and we either know or we don’t know that they are good or bad. If we know that they’re bad (e.g. we have no details at all), then that interval isn’t a problem. No messages will have been sent, WriteToThem will have apologised to constituents trying to send messages, and no questionnaires will have been sent out. Any questionnaire results we have from good intervals can still be used and will be fine.

    The case when we think we have contact details is harder. The script does some simple checks to work out if they were really valid. For example, if there have been at least a couple of questionnaire responses, and none were affirmative, then it is a bit suspicious. The script applies a threshold to the total length of time of suspicious intervals, and outputs as “unknown” MPs which it thinks there may have been a problem with for long enough for it to matter.
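
    To make the idea concrete, here is a minimal Python sketch of that check. It is not the real script: the Interval fields, the "at least two responses" cut-off and the 30 day threshold are all made-up values for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Interval:
    """One period between changes to an MP's contact details (hypothetical shape)."""
    start: date
    end: date
    has_details: bool  # did we think we had contact details?
    responses: int     # questionnaire responses received
    affirmative: int   # responses saying the MP replied


def is_suspicious(iv, min_responses=2):
    """Details on file, a couple of responses, and none of them affirmative."""
    return iv.has_details and iv.responses >= min_responses and iv.affirmative == 0


def classify(intervals, threshold=timedelta(days=30)):
    """Return "unknown" if suspicious intervals add up to at least the threshold."""
    suspicious = sum(
        (iv.end - iv.start for iv in intervals if is_suspicious(iv)),
        timedelta(),
    )
    return "unknown" if suspicious >= threshold else "ok"


example = [Interval(date(2005, 1, 1), date(2005, 3, 1), True, 3, 0)]
print(classify(example))
```

    Intervals where we knew we had no details never count as suspicious, which matches the reasoning above that they are not a problem.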

    Tom then heroically checked all those MPs. Some we’ve marked as “WriteToThem had possibly bad contact details for this MP” in the league table. For others, we managed to verify that the questionable email or fax that we had (either via the MP’s own website, or by ringing up their office) was actually good. The script then spits out, of all things, a PHP file, which you might find useful on your own websites. It contains the complete detailed results. Make sure you look at the “category” for each MP. That indicates if we had too little data, or bad contact details, amongst other things.

    Why PHP? And why not update the stats in real time? We’ve decided to make new statistics just once a year. Firstly, this is much easier to describe, we can say for example on TheyWorkForYou (where the responsiveness data also appears) that it is for the ‘year 2005’. Secondly, it lets us do the manual checking, so we are more confident about our data. Thirdly, it’s good for publicity to announce the new statistics as a news story. And finally, it is much easier to manage an unchanging text file (e.g. the PHP file), stored forever in CVS, than it would be an ephemeral table in a database somewhere.

    After all that, we mailed or faxed all the still sitting MPs who scored 100% responsiveness, to congratulate them on a job well done. Greg Pope, Richard Page, Fraser Kemp, Thomas McAvoy, Bob Laxton, Mark Simmonds, Paul Stinchcombe, Dennis Turner, Nick Ainger, Alan Meale, Adrian Sanders, Tom Cox, Andrew Hunter, Robert Key, Andrew Selous, John Wilkinson, Paul Goodman, Gwyneth Dunwoody, David Evennett, Peter Atkinson, Andrew Bennett, George Young, Terry Lewis, Douglas Hogg, Patrick Cormack, Andrew Robathan, David Stewart, Colin Challen, Harry Barnes MPs and all your staff, congratulations! (That list includes ones who are no longer MPs, for example those who stood down at the General Election.)

  7. HassleMe

    So, HassleMe launched today (despite mostly having been written just before Christmas). Good work by Etienne getting it all together. Today Matthew and I have been working on adding “instant-messenger” functionality to the site, which turns out to be a bit painful. Right now it seems like the most robust solution will be to use bitlbee, a proxy which allows you to interact with the various and wretched instant messenger protocols through the less varied and marginally less wretched IRC protocol.

    Integrating a website with instant messenger is an interesting problem. I’m not yet sure how much of the experience of building sites which send and receive email will carry over. We’ll see….

  8. Working in Cambridge cafe CB2

    Lots of stuff happening here.

    Earlier in the week, I’ve been getting WriteToThem to update more of its data automatically. Two volunteers contributed useful screen scrapers. Richard George’s gathers data from the Welsh assembly, and Jonathan Hogg’s screen scrapes the Scottish Parliament. They both spit out CSV files with representatives, constituencies and contact emails/faxes. I’ve now updated the script that can load in those CSV files, and set it all running once a week on cron, along with another London Assembly scraper Chris wrote earlier in the year and some code to get MPs from parlparse.

    Today I’ve been doing other bits, including improving the link from WriteToThem to HearFromYourMP. When somebody has confirmed a message to be sent with WriteToThem, we know their email address is valid. So, why if they follow the link to HearFromYourMP do they have to confirm again? It’s bad user interface, and is probably reducing our signups to HearFromYourMP a bit.

    The fix is to pass a signed email address through from WTT, and check the signature on HFYMP. The magic of hashes and shared secrets does the job.
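
    For anyone wondering what that looks like, here is a minimal sketch in Python. It uses HMAC-SHA256 from the standard library, which is my choice for the example and not necessarily the hash the two sites use; the secret and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret, known to both sites and nobody else.
SHARED_SECRET = b"example-secret-change-me"


def sign_email(email, secret=SHARED_SECRET):
    """Sending side (WTT in this sketch): produce a signature for the address."""
    return hmac.new(secret, email.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_email(email, signature, secret=SHARED_SECRET):
    """Receiving side (HFYMP in this sketch): recompute and compare in constant time."""
    return hmac.compare_digest(sign_email(email, secret), signature)


sig = sign_email("constituent@example.org")
print(verify_email("constituent@example.org", sig))
```

    If the signature checks out, the receiving site can trust that the address was already confirmed and skip its own confirmation email. An address that has been tampered with in transit fails the check.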

  9. Electoral geography again

    So, it’s back to electoral geography for me, this time to get the new county and county electoral-division boundaries live on WriteToThem. This is a prerequisite for getting mail to county councillors working again after the election on May 5th, so we’re already three months behind the times. But more generally, electoral boundaries are revised all the time to account for changes of population within each ward, constituency and so forth; and at most (local and national) elections some set of boundary changes takes effect. So to keep WriteToThem running we need to incorporate such updates routinely.

    The way we handle electoral geography in general is to start with Ordnance Survey’s Boundary Line product, which, for each administrative or electoral area in Great Britain gives a polygon identifying that region. We then take a big list of all the postcodes in Britain (CodePoint) and figure out which polygons they lie in. Then when somebody comes along to WriteToThem and types in their postcode, we can figure out which ward, constituency etc. they are in, and tell them appropriate things about their representatives. (Technically this is a lie, of course, because postcodes represent regions, not points — we use the centroids of those regions — and each such region isn’t guaranteed to lie either wholly within or without all electoral and administrative regions. Unfortunately there isn’t a lot we can do about this beyond throwing our hands up and saying “oops, sorry”, so that’s what we do.)
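
    The "which polygons does this point lie in" step can be sketched with the standard ray-casting test. This is a toy Python version under simplifying assumptions (each area is a single simple polygon given as a list of (x, y) vertices, and the area names are invented); it is not our import code.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def areas_for_point(x, y, areas):
    """Names of every area whose polygon contains the postcode centroid."""
    return [name for name, poly in areas.items() if point_in_polygon(x, y, poly)]


# Invented example data: a square "ward" nested inside a larger "constituency".
areas = {
    "Example Ward": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "Example Constituency": [(-20, -20), (30, -20), (30, 30), (-20, 30)],
}
print(areas_for_point(5, 5, areas))
```

    Running every postcode centroid through a lookup like this once, and storing the answers, is what lets the site answer a postcode query without touching the polygons again.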

    As an aside, outside Great Britain (that is, in Northern Ireland) we don’t have the same sort of data, so instead we rely on another field in the CodePoint data which gives, for each postcode centroid, the ONS ward code for the ward in which that point lies. From that ward code you can find the enclosing local authority area, local electoral area (in Northern Ireland local councils are elected by STV over multimember regions, rather than by first-past-the-post as in Great Britain) and constituency. Happily it turns out that all of those other regions are composed of whole numbers of wards; this happy state of affairs does not necessarily prevail elsewhere.

    Now, twice a year, a new edition of Boundary Line is issued, taking account of recent changes in electoral geography. Usually this happens in May and October, though the schedule has been known to slip. In principle this should be easy to deal with: load up the new copy of Boundary Line, pass all the postcodes through it, and hey presto.

    Life, of course, is rarely that simple, and this isn’t one of those occasions. When the boundaries of a region don’t change between one year and the next, we don’t want to make any alteration to that region in our database (which uses ID numbers to identify each area). More specifically, when a new revision of Boundary Line comes along, we want to ensure that — let’s say — Cambridge Constituency in the new revision is identified with Cambridge Constituency in the old version. Now, in principle, this should be easy, because each area in the data set, in the words of the manual,

    … carries a unique identifier AI; this is the same identifier that was supplied in the previous specification of Boundary-Line. The same AI attribute is associated with every component polygon forming part of an administrative unit, irrespective of the number of polygons.

    Now, the first time that we did this, we worked from a copy of the Boundary Line data supplied in the form of “ShapeFiles” (a format used in various proprietary GIS systems, and with which our local government partners were able to supply us without having to order it specially from Ordnance Survey). Unfortunately in the ShapeFile version, the allegedly unique administrative area IDs were, in fact, not unique. After discussion with Ordnance Survey it was concluded that this was a problem which affected the translation of the data from NTF (“National Transfer Format”, their own preferred format) into ShapeFile; and that the problem would be fixed in the next release.

    So, taking no chances, we decided we’d work from the NTF format in future, since that seems to be closer to the authoritative source of the data, and anyway the ShapeFile format isn’t at all well-documented (for instance, many of the field names for the metadata about each area differ from those described in the manual for Boundary Line). So I’ve written code to parse the (slightly bonkers, natch) NTF files and modified our import scripts to use this code, with a view to then being able to keep up-to-date with future boundary revisions without too much trouble.

    You will not be surprised, therefore, to hear that this has not worked out exactly as planned. Unfortunately it appears that the May 2005 NTF release of Boundary Line suffers exactly the same problems of non-uniqueness as did the previous ShapeFile release. So unless some cleverer solution presents itself, I’ll have to revive the hack we intended to use with the ShapeFile data — try to construct unique IDs for areas from their geometry, and hope that the exact coordinates of the polygon vertices for unchanged areas do not change between revisions. We shall see. But right now I’m mostly worrying about why my parser script runs out of memory on my 1GB computer after reading a couple of hundred megs of input data.
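
    The geometry hack could look something like the following Python sketch. Everything here is an assumption for illustration: the function name, the choice of SHA-1, and sorting the component polygons so that their order in the file does not matter. It relies entirely on the hope stated above, that vertex coordinates of unchanged areas are identical between revisions.

```python
import hashlib


def geometry_id(polygons):
    """Hypothetical stable ID for an area, built from its exact vertex coordinates.

    polygons is a list of component polygons, each a list of (x, y) pairs.
    """
    digest = hashlib.sha1()
    # Sort the components so the ID does not depend on their order in the file.
    for ring in sorted(tuple(map(tuple, ring)) for ring in polygons):
        digest.update(repr(ring).encode("ascii"))
    return digest.hexdigest()


old = [[(0, 0), (10, 0), (10, 10)], [(20, 20), (30, 20), (30, 30)]]
new = [[(20, 20), (30, 20), (30, 30)], [(0, 0), (10, 0), (10, 10)]]
print(geometry_id(old) == geometry_id(new))
```

    Any change to a single vertex gives a different ID, so a revised area would show up as new. The sketch would also treat the same polygon listed from a different starting vertex as changed.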

  10. Boiling boiling hot

    Today is the sort of muggy, hot day, which makes you believe that old story. The one about the British having an empire because of the bad weather. They could actually concentrate, plot and do under the clouds, rather than just dozing in a field, lazing in the sun. How true it is I’m not sure, but it definitely seems quiet everywhere today. I’ve got fewer emails than ever, and I’m still amazed I’ve managed to think all day at all.

    I’m leaving PledgeBank behind for a bit, as most of the obvious bugs are fixed, and features added. Earlier this week, and at the end of last, I put in local email alerts. Now if you sign up to local alerts you’ll get mailed at most once a day when a new pledge is made within 25km of you. PledgeBank really is having a proper beta test, with about 100,000 visitors since launch a month ago. This has been fantastic, the feedback has made it a very stable and hopefully more usable site. Good software is software which is used lots, with a virtuous feedback loop from users into making it better.

    So this morning I went back to WriteToThem. We still haven’t updated county councils after the elections. So I’ve been writing some three way merging code, to import changed council data from GovEval. It is merging, rather than just loading, because we’ve also altered the data. This was to make the ward names match up consistently with those from our mapping data.

    Things are working pretty well for councils where neither GovEval nor we have added or deleted any representatives. There are unique identifiers, and very few clashes. There is only one in all the data, caused by one of us editing the ward name and the other editing the councillor name, so that the representative is really neither.
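
    Here is a minimal Python sketch of that kind of three way merge for a single representative, matched by unique identifier. It is not the real import code: the record fields, the function name and the rule of preferring our value when both sides changed the same field are assumptions. Following the clash described above, the record is flagged for a human whenever both sides have edited it differently, even if they touched different fields.

```python
def merge_record(base, ours, theirs):
    """Three way merge of one record; returns (merged, needs_review).

    base   -- the record as originally imported
    ours   -- the record after our own edits
    theirs -- the record in the new supplier data
    """
    merged = {}
    for key in base:
        if ours[key] == base[key]:
            merged[key] = theirs[key]  # we did not touch it: take their value
        else:
            merged[key] = ours[key]    # we edited it: keep ours (assumed policy)
    ours_changed = ours != base
    theirs_changed = theirs != base
    needs_review = ours_changed and theirs_changed and ours != theirs
    return merged, needs_review


base = {"ward": "Abbey", "name": "A Smith"}
ours = {"ward": "Abbey Ward", "name": "A Smith"}
theirs = {"ward": "Abbey", "name": "B Jones"}
print(merge_record(base, ours, theirs))
```

    In the example both sides edited the same representative, so the merged record combines our ward name with their councillor name and is flagged for checking by hand.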

    The hard case is additions and deletions. Obviously, I’d like to keep the additions that mySociety has made. But this is no use, as eventually one day they’ll be wrong. The councillor we thought should be there no longer will be. So how do I detect when?