mySociety blog » Technical

April Fools’ Day Council changes

Friday, April 3rd, 2009 by Matthew Somerville

They could perhaps have picked a better day, as it was quite serious - at the stroke of midnight on the 1st of April, 37 district councils and 7 county councils in England ceased to exist, replaced by 9 new unitary authorities. This means people in Durham, Northumberland, Cornwall, Shropshire, Wiltshire, Cheshire, and Bedfordshire only have one principal local authority to deal with now. The Wikipedia article on the changes has more information on the background to this change.

Obviously this meant some work for WriteToThem and FixMyStreet, both of which require up-to-date local council information. Our database of voting areas, MaPit, has “generations”, so we can keep old areas around for various historical purposes. So firstly, I created a new generation and updated all the areas that weren’t affected to the new generation. Next, six of the new unitary authorities (all the counties except Cheshire and Bedfordshire, plus Bedford) share their boundaries and wards with the coterminous councils they’re replacing, so for them it was a simple matter of updating those councils to be unitary authorities.
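The generation idea can be sketched roughly as follows. This is only an illustration, not MaPit's actual schema: the field names, the area type labels and the generation numbers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    area_type: str        # "CTY", "DIS", "MTD", "UTA" are assumed labels
    generation_low: int   # first generation in which the area exists
    generation_high: int  # last generation in which the area exists


def areas_in_generation(areas, generation):
    """Return the areas that are valid in the given generation."""
    return [a for a in areas if a.generation_low <= generation <= a.generation_high]


def start_new_generation(areas, current, abolished):
    """Extend every live, unaffected area into a new generation."""
    new = current + 1
    for area in areas:
        if area.generation_high == current and area.name not in abolished:
            area.generation_high = new
    return new


# Toy data: generation 10 is an invented number.
areas = [
    Area("Cornwall", "CTY", 1, 10),
    Area("Caradon", "DIS", 1, 10),
    Area("Leeds", "MTD", 1, 10),
]
new_generation = start_new_generation(areas, current=10, abolished={"Caradon"})
areas[0].area_type = "UTA"  # a coterminous county simply becomes unitary

current_names = [a.name for a in areas_in_generation(areas, new_generation)]
old_names = [a.name for a in areas_in_generation(areas, 10)]
```

Here the abolished district drops out of the new generation but can still be found by asking for the old one, which is the point of keeping generations around.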

That left Bedfordshire and Cheshire. I created areas for the three new councils (Cheshire West and Chester, Cheshire East, and Central Bedfordshire), and transferred across the relevant wards from the old county councils - basically a manual process of working out the list of correct ward IDs. april2009-update.sql has the gory SQL details if you care.

WriteToThem was now dealt with, but FixMyStreet needed a little more work. The councils that no longer existed had understandably disappeared from the all reports table, so I had to modify the function that fetches the list of councils to optionally return historical areas so they could be included. And lastly, FixMyStreet needs a way of mapping a point on a map to the relevant council. For this, it needs to know the area covered by a council, which was missing for the new authorities I’d manually created. Thankfully, each of the three new authorities is made up of the areas of either 2 or 3 district councils (e.g. Cheshire East is the area covered by Congleton, Macclesfield, and Crewe and Nantwich), so I just had to write a script, april2009-construct-new.pl, that stuck those areas together to create the area of the new council. It all seems to work, and I’m sure our users will be in touch if it doesn’t :)
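To illustrate the point-to-council lookup, here is a minimal sketch that treats a new council as the collection of its old districts’ polygons rather than merging them into one outline. It is not what april2009-construct-new.pl does internally; the square "districts" and their coordinates are made up for the example.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon (a list of (x, y) vertices)?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def council_contains(district_polygons, x, y):
    """A point is in the new council if it is in any constituent district."""
    return any(point_in_polygon(x, y, p) for p in district_polygons)


# Invented toy shapes standing in for the three old districts.
congleton = [(0, 0), (1, 0), (1, 1), (0, 1)]
macclesfield = [(1, 0), (2, 0), (2, 1), (1, 1)]
crewe_and_nantwich = [(0, -1), (1, -1), (1, 0), (0, 0)]
cheshire_east = [congleton, macclesfield, crewe_and_nantwich]
```

A real implementation would more likely dissolve the shared boundaries with a GIS library, but the membership test gives the same answer for any point not exactly on a boundary.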

So goodbye to Alnwick, Bedfordshire, Berwick-upon-Tweed, Blyth Valley, Bridgnorth, Caradon, Carrick, Castle Morpeth, Cheshire, Chester, Chester-le-Street, Congleton, Crewe and Nantwich, Derwentside, Durham City, Easington, Ellesmere Port and Neston, Kennet, Kerrier, Macclesfield, Mid Bedfordshire, North Cornwall, North Shropshire, North Wiltshire, Oswestry, Penwith, Restormel, Salisbury (which is getting a new town council), Sedgefield, Shrewsbury and Atcham, South Bedfordshire, South Shropshire, Teesdale, Tynedale, Vale Royal, Wansbeck, Wear Valley, and West Wiltshire. RIP.

FixMyStreet RSS

Thursday, October 2nd, 2008 by Matthew Somerville

FixMyStreet has a lot of RSS feeds. There’s one for every one-tier council (170), one for every ward of every one-tier council (another 5,044), two for every two-tier (county and district) council (544), and two for every ward of every two-tier council (20,296) – two per two-tier council because you might want either problems reported to one council of a two-tier set-up in particular, or all reports within the council’s boundary.

Then there’s an RSS feed every 162m across Great Britain in a big grid, returning all reports within a radius of that point, the radius by default being automatically determined by that point’s population density, but customisable to any distance if preferred. That’s, at a very rough approximation assuming Great Britain is a rectangle around its extremities, which it’s not, 19 million RSS feeds, lots of which will admittedly be very similar. :)

Every single one of those feeds can be subscribed to by email instead if that’s preferable to you, and they are all accessible through a simple interface at http://www.fixmystreet.com/alert.

However, none of these RSS feeds was suitable for the person who emailed from a Neighbourhood Watch site and said that all they had was a postcode and they wanted to display a feed of reports from FixMyStreet. Given you could obviously look up a FixMyStreet map by postcode, it did seem odd that I hadn’t used the same code for the RSS feeds. Shortly thereafter, this anomaly was fixed, and if you now go to a URL of the form http://www.fixmystreet.com/rss/pc/postcode you will be redirected to the appropriate local reports feed for that postcode (I could say that adds another 1.7 million RSS feeds to our lot, but given they’re only redirects, that’s not strictly true). And after a couple more emails, I also added pubDate fields to the feeds which should make displaying in date order easier.

It’s great to see our RSS feeds being used by other sites – other examples I’ve recently come across include Brent Council integrating FixMyStreet into their mapping portal (select Streets, then FixMyStreet), or the Albert Square and St Stephen’s Association listing the most recent Stockwell problems in their blog sidebar. If you’ve seen any notable examples, do leave them in the comments.

PledgeBank Facebook application disabled

Thursday, September 18th, 2008 by Francis Irving

Unfortunately, I’ve had to disable the PledgeBank Facebook application. It used to let you sign and share pledges from within Facebook.

Facebook recently changed their platform (again!), breaking our code for sending success/failure messages. Obviously, it is no good signing up to a pledge if you don’t get informed when it succeeds.

I tried to fix it, but couldn’t work out how to do so quickly. We don’t have the time and money at the moment to chase after this, so I’ve disabled the application entirely. Links to PledgeBank pages on Facebook now redirect to pledgebank.com.

Hopefully it’ll be back one day - do send us emails if you miss it (or money if you have a large pledge that really needs it!). I think there may be a better solution with a simpler interface - the current application tried too hard to reimplement all of PledgeBank within Facebook. And besides, we should be supporting OpenSocial now it exists. It’s an open standard, Facebook isn’t.

Technical details: We used infinite session keys to send notifications from cron jobs. Quite reasonably, this no longer works. However, I couldn’t find out what to use instead. I think Facebook should respect backwards compatibility of its APIs a lot more, and if it breaks it they should give clear instructions about what to use instead. This does put me off ever wanting to develop anything on their platform again.

Relentlessly into autumn

Monday, September 15th, 2008 by Francis Irving

I’m enjoying the weather at the moment, seems to be sunnier than the summer, but cool with an atmospheric autumnal taste in the air.

mySociety is changing as ever, leaping forward in our race to try and make it easier for normal people to influence, improve or replace functions of government. More on this as it happens.

Meanwhile, I’ve been continuing to hack away at WhatDoTheyKnow. A little while ago Google decided to deep index all our pages - causing specific problems (I had to tell it to stop crawling the 117th page of similar requests to another request), and also ones from the extra attention. There have been quite a few problems to resolve with authority spam filters (see this FOI officer using the annotation function), and with subtle and detailed privacy issues (when does a comment become personal? if you made something public a while ago, and it is now a shared public resource, can you modify it or take it down?).

Right, I’ve got to go and fix a bug to do with the Facebook PledgeBank app. It’s to do with infinite session keys, and how we send messages when a pledge has completed. Facebook seem to change their API without caring much that applications have to be altered to be compatible with it. This is OK if the Facebook application is your core job, but a pain when you just want your Facebook code to keep running as it did forever.

(the autumn photo thanks to Nico Cavallotto)

acts_as_xapian

Thursday, July 17th, 2008 by Francis Irving

One of the special pieces of magic in TheyWorkForYou is its email alerts, sending you mail whenever an MP says a word you care about in Parliament. Lots of sites these days have RSS, and lots have search, but surprisingly few offer search based email alerts. My Mum trades shares on the Internet, setting it to automatically buy and sell at threshold values. But she doesn’t have an RSS reader. So, it’s important to have email alerts.

So naturally, when we made WhatDoTheyKnow, search and search based email alerts were pretty high up the list, to help people find new, interesting Freedom of Information requests. To implement this, I started out using acts_as_solr, which is a Ruby on Rails plugin for Solr, which is a REST based layer on top of the search engine Lucene.

I found acts_as_solr all just that bit too complicated. Particularly, when a feature (such as spelling correction) was missing, there were too many layers and too much XML for me to work out how to fix it. And I had lots of nasty code to make indexing offline - something I needed, as I want to safely store emails when they arrive, but then do the risky indexing of PDFs and Word documents later.

The last straw was when I found that acts_as_solr didn’t have collapsing (analogous to GROUP BY in SQL). So I decided to bite the bullet and implement my own acts_as_xapian. Luckily there were already Xapian Ruby bindings, and also the fabulous Xapian email list to help me out, and it only took a day or two to write it and deploy it on the live site.

If you’re using Rails and need full text search, I recommend you have a look at acts_as_xapian. It’s easy to use, and has a diverse set of features. You can watch a video of me talking about WhatDoTheyKnow and acts_as_xapian at the London Ruby User Group, last Monday.

Internal links, and search engine crawlers

Thursday, July 17th, 2008 by Matthew Somerville

TheyWorkForYou now spots whenever an earlier Hansard is referenced (which speakers do by date and column number, e.g. Official Report, 29 February 2008, column 1425) and turns the citation into a link to a search for the speeches in that column on that date. This only really became feasible when we moved server, upgraded Xapian, and added date and column number metadata (among others), allowing much more advanced and focussed searching - the advanced search form gives some ideas. Perhaps in future we’ll be able to add some crowd-sourcing game to match the reference to the exact speech, much like our video matching (nearly 80% of our archive done!). :)

Kudos to Google and Yahoo! for spotting this change within a couple of days, as they’re now so busy crawling everything for changes that they’re slowing the whole website down… ;-)

Postcodes on TheyWorkForYou

Tuesday, July 8th, 2008 by Matthew Somerville

If you enter your postcode on TheyWorkForYou and it’s Scottish or Northern Irish, you’re now presented with your MSPs and MLAs as well as your MP, which makes sense given the site covers their Parliament and Assembly respectively. :-) You also get an extra tab in the navigation linking through to Your MSPs or MLAs. In order to do this, I needed a quick way of determining if a postcode was Northern Irish or Scottish. Northern Ireland was easy, as all postcodes there begin with BT. I assumed Scotland was also easy, which turned out to be true apart from the TD postcode area that straddled the border like a mail-sorting Niagara Falls. After some very dull investigation, I eventually worked out that e.g. most of TD15 is in England, but (amongst others) TD15 1X* is in Scotland, except for TD15 1XX which is apparently back in England. The final result was the postcode_is_scottish() function in postcode.inc, which (hopefully) correctly determines if a given postcode is Scottish or not - perhaps someone else will find it useful.
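As a rough illustration of the kind of check involved, here is a deliberately incomplete sketch. It is not the postcode_is_scottish() function from postcode.inc: the list of Scottish postcode areas is supplied here as an assumption, only the TD15 cases mentioned above are handled, and every other TD district is simply assumed to be Scottish, which is not true in general.

```python
import re

# Postcode areas treated as wholly Scottish (an assumption of this sketch).
SCOTTISH_AREAS = {"AB", "DD", "DG", "EH", "FK", "G", "HS", "IV",
                  "KA", "KW", "KY", "ML", "PA", "PH", "ZE"}


def split_postcode(postcode):
    """Return (area, outward, inward) for a full postcode, ignoring spaces and case."""
    pc = re.sub(r"\s+", "", postcode).upper()
    if len(pc) < 5:
        raise ValueError("expected a full postcode")
    outward, inward = pc[:-3], pc[-3:]
    area = re.match(r"[A-Z]{1,2}", outward).group(0)
    return area, outward, inward


def is_northern_irish(postcode):
    return split_postcode(postcode)[0] == "BT"


def looks_scottish(postcode):
    area, outward, inward = split_postcode(postcode)
    if area != "TD":
        return area in SCOTTISH_AREAS
    if outward == "TD15":
        # Mostly England, but TD15 1X* is Scottish, except TD15 1XX.
        return inward.startswith("1X") and inward != "1XX"
    return True  # simplification: other TD districts are not checked
```

For example, looks_scottish("TD15 1XA") is true and looks_scottish("TD15 1XX") is false, matching the two cases described above.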

Highlighting the current speech

Friday, June 13th, 2008 by Matthew Somerville

Debate pages that have at least one timestamped speech (such as the previously mentioned last week’s Prime Minister’s Questions) have a video fixed to the bottom right hand corner (if your browser is recent enough) showing that debate. While playing the video, the currently playing speech is highlighted with a yellow background, and you can start watching from any timestamped speech by clicking the “Watch this” link by any such speech. So how does all that work?

I’m very proud of this feature, I wasn’t sure it would be possible, and it’s very exciting. :-)

Flash has an ExternalInterface API, where JavaScript can call functions in the Flash, and vice-versa. When the video player loads, it requests an XML list from the server of all speech GIDs and timestamps for the current debate (here’s the file for the above debate). So when someone clicks a “Watch this”, it calls a moveVideo function in main.mxml with the GID of the speech, which loops through all the speeches and moves to the correct point if possible.

The highlighting works the other way - as the video is playing, it checks to see which speech we’re currently in, and if there’s been a change, it calls the updateSpeech function in TheyWorkForYou’s JavaScript, which finds the right row in the HTML and changes the class in order to highlight it. Quite straightforward, really, but it does make following the debate very simple and highlights the linking between the video and the text, all done by our excellent volunteers (join in! :) ).
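The "has the current speech changed?" check can be sketched like this. It is written in Python purely for illustration (the real code is ActionScript calling JavaScript), and the class name, the GIDs and the callback are all hypothetical.

```python
from bisect import bisect_right


class SpeechHighlighter:
    """Track which timestamped speech the playback position falls in."""

    def __init__(self, speeches, on_change):
        # speeches: iterable of (start_seconds, gid) pairs
        self.speeches = sorted(speeches)
        self.starts = [start for start, _ in self.speeches]
        self.on_change = on_change  # stands in for the updateSpeech call
        self.current = None

    def tick(self, position):
        """Call periodically with the current playback position in seconds."""
        index = bisect_right(self.starts, position) - 1
        gid = self.speeches[index][1] if index >= 0 else None
        if gid != self.current:
            self.current = gid
            self.on_change(gid)


highlighted = []
player = SpeechHighlighter(
    [(0.0, "speech-a"), (42.5, "speech-b"), (130.0, "speech-c")],
    highlighted.append,
)
for position in (1.0, 20.0, 43.0, 44.0, 200.0):
    player.tick(position)
```

The callback only fires when the speech actually changes, so the page is not asked to re-highlight the same row on every tick.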

Talking of our busy timestampers, I’ve also been busy making improvements (and fixing bugs) to the timestamping interface to make things easier for them. As well as warnings when it looks like two people are timestamping the same debate at the same time, various invisible things have been changed, such as using other people’s timestamps to make the start point for future timestamps on the same day more accurate. I also added a totaliser, using the Google Chart API, for which you simply have to provide image size and percentage complete.

Approaching 45% of our entire archive of video timestamped, with the totaliser approaching the chartreuse :-)

Previous articles

  1. The Flash player
  2. Seeking
  3. Highlighting the current speech

TheyWorkForYou video - seeking

Friday, June 13th, 2008 by Matthew Somerville

Our video is streamed via progressive HTTP, using lighttpd and mod_flv_streaming. This works by having keyframe metadata at the start of the FLV (Flash video) file (we add ours using yamdi as that doesn’t load the whole file into memory first), which maps times within the video to byte positions within the file. When someone drags the position slider, or presses a skip button, the player actually changes the source of the video to something like file.flv?start=<byte position> which starts a new download from that point in the video. This means you can seek to parts of the video not yet downloaded, which is definitely a required feature.
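A minimal sketch of that lookup, assuming the keyframe metadata is available as two parallel lists (times in seconds and byte positions); the function name and the sample numbers are invented for the example.

```python
from bisect import bisect_right


def seek_url(base_url, times, filepositions, target_seconds):
    """Build a ?start= URL for the last keyframe at or before target_seconds."""
    index = max(bisect_right(times, target_seconds) - 1, 0)
    return f"{base_url}?start={filepositions[index]}"


# Invented keyframe table: one keyframe every two seconds.
times = [0.0, 2.0, 4.0, 6.0]
filepositions = [1024, 250000, 510000, 770000]

url = seek_url("file.flv", times, filepositions, 4.7)
```

Seeking to 4.7 seconds picks the keyframe at 4.0 seconds, so the server is asked to start sending from byte 510000.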

The video is split up into programme chunks, according to BBC Parliament’s schedule, so each Oral Questions will (approximately) be its own video chunk, and the main debates will be a couple of chunks. By default, the video player will show a screengrab from the start of the video, as that’s all that’s available when it first loads (you have to load the start of the FLV file to fetch the keyframe metadata in order to move anywhere else :) ). I wanted the player to show a relevant screengrab before you hit Play, so came up with the slightly messy workaround of setting the volume to 0, seeking and playing the video for under a second in order to start it from the new point and show the video, then stopping it and resetting the volume. It works most of the time :-)

Some of our video chunks have jumps in them, due to problems in downloading the original WMV stream. The timestamping interface has a link for people to let us know of such problems, so that we can mark the relevant speeches as missing video and not have them be offered to future timestampers. One valiant volunteer, Tim, let us know about two such videos, but with the added oddity that if you let them play, they would happily carry on past their “end” point, but this made timestamping those speeches quite difficult.

I started investigating, firstly noting that both videos should have been 6 hours long, but were both listed as 1:20:24, which I thought was a bit of an odd coincidence. After reading the FLV file specification, it turned out that 32-bit millisecond timestamps in FLV are split into two - first the low 24 bits, then the high 8 bits. 2^24 = 16,777,216, which in milliseconds is 4 hours, 39 minutes, 37 seconds, which is pretty much exactly what the two videos’ durations were short by! All the timestamps in our FLV files were not setting the high byte, so after 4:39:37, they were wrapping round to 0 (and thus 6 hours became 1:20:24ish).
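The wrap-around is easy to reproduce. The sketch below packs a millisecond timestamp the way the FLV tag header lays it out (low 24 bits first, then the high 8 bits) and shows what happens if the high byte is left at zero; the function names are made up for the example.

```python
def pack_flv_timestamp(ms):
    """Encode a 32-bit millisecond timestamp as 3 low bytes then 1 high byte."""
    if not 0 <= ms < 2**32:
        raise ValueError("timestamp must fit in 32 bits")
    low24 = ms & 0xFFFFFF
    high8 = (ms >> 24) & 0xFF
    return low24.to_bytes(3, "big") + bytes([high8])


def unpack_flv_timestamp(raw):
    """Decode the 4 bytes produced by pack_flv_timestamp."""
    low24 = int.from_bytes(raw[:3], "big")
    return (raw[3] << 24) | low24


def format_hms(ms):
    seconds = ms // 1000
    return f"{seconds // 3600}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


six_hours = 6 * 3600 * 1000
good = pack_flv_timestamp(six_hours)
broken = good[:3] + b"\x00"  # a writer that never sets the high byte

wrap_point = format_hms(2**24)                        # "4:39:37"
seen_by_player = format_hms(unpack_flv_timestamp(broken))  # "1:20:22"
```

An exactly six-hour timestamp comes out as 1:20:22 once the high byte is dropped, which is in line with the 1:20:24 the two (slightly longer) videos reported.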

Our video processing consists of four major steps - the downloading script uses ffmpeg to convert each 75 minute chunk from WMV to MPEG; then nightly processing uses ffmpeg again to convert the right bits of these MPEG files to FLV, mencoder to join the relevant FLV files into one FLV chunk, then yamdi to add the metadata. My first try at a solution was to alter yamdi to increment the high byte itself, which fixed the duration display and let you seek to high times, but when you tried to go to e.g. 5 hours, the video started playing from the right point but the video thought it was playing from 20 minutes in. This would obviously confuse timestamping!

As the FLV files produced by ffmpeg were all under 75 minutes long, they couldn’t have the problem. It turned out we were running an old version of mencoder, and updating that and converting all our long video files fixed the problem. Phew :-)

Join us later today for my third short technical talk on TheyWorkForYou video, where I’ll explain how our Flash application talks to the HTML and vice-versa to enable the “Watch this” and highlighting of speeches.

  1. The Flash player
  2. Seeking
  3. Highlighting the current speech

TheyWorkForYou video - the Flash player

Thursday, June 12th, 2008 by Matthew Somerville

TheyWorkForYou video timestamping has been launched, over 40% of available speeches have already been timestamped, and (hopefully) all major bugs have been fixed, so I can now take a short breather and write this short series of more technical posts, looking at how the front end bits I wrote work and hang together.

Let’s start with the most obvious feature of video timestamping - the video player itself. :) mySociety is an open-source shop, so it was great to discover that (nearly all of) Adobe Flex is available under the Mozilla Public Licence. This meant I could simply download the compiler and libraries, write some code and compile it into a working SWF Flash file without any worries (and you can do the same!).

Writing a Flex program is split into three main areas - MXML that lays out your application, defines any web services you’re using and so on; CSS to define the style of the various components; and ActionScript to deal with things like events, or talking to the JavaScript in the parent HTML. My code is probably quite shoddy in a number of places - it’s my first application in Flex :-) - but it’s all available to view if you want to take a peek, and it’s obviously running on the live TheyWorkForYou site.

To put a video component in the player is no harder than including an <mx:VideoDisplay> element - set the source of that, and you have yourself a video player, no worrying about stream type, bandwidth detection, or anything else. :) You can then use a very useful feature called data binding to make lots of things trivial - for example, I simply set the value of a horizontal slider to be the current playing time of the video, and the slider is then automatically in the right place at all times. On the downside, VideoDisplay does appear to have a number of minor bugs (the most obvious one being where seeking can cause the video to become unresponsive and you have to refresh the page; it’s more than possible it’s a bug in my code, of course, but there are a couple of related bugs in Adobe’s bug tracker).

As well as the buttons, sliders and the video itself, the current MXML contains two fades (one to fade in the hover controls, one to fade them out), one time formatter (to format the display of the running time and duration), and three web services (to submit a timestamp result, delete a mistaken timestamp, and fetch an array of all existing timestamps for the current debate). These are all called from various places within the ActionScript when certain events happen (e.g. the Now button or the Oops button is clicked).

Compiling is a simple matter of running mxmlc on the mxml file, and out pops a SWF file. It’s all straightforward, although a bit awkward at first working again with a strongly-typed, compiled language after a long time with less strict ones :-) The documentation is good, but it can be hard to find - googling for [flex3 VideoDisplay] and the like has been quite common over the past few weeks.

Tomorrow I will talk about moving around within the videos and some bugs thrown up there, and then how the front end communicates with the video in order to highlight the currently playing speech - for example, have a look at last week’s Prime Minister’s Questions.

  1. The Flash player
  2. Seeking
  3. Highlighting the current speech

Lessons from mySociety conversion tracking

Thursday, March 13th, 2008 by Tom Steinberg

Matthew and I have been sitting next to each other today looking at the outputs of his lovely new custom built conversion tracking system, designed to ensure that the optimal number of users who just come to one of our services as a one off get signed up to something else longer lasting.

I’ve been banging on for ages about how government should seize on cross selling people who’ve just finished using one online service into using another of a more democratic nature, so it seems worth spelling out some of the lessons.

First, there’s some interesting data from the last few weeks, since our newest conversion tracking infrastructure has been running in its nice new format.

One of the adverts randomly served to users of WriteToThem (after they’ve finished sending their letter) encourages them to sign up to TheyWorkForYou email alerts - the service people use to get emailed whenever their MP speaks in Parliament. The advert features a slogan of encouragement, and a pre-populated email form containing the user’s email, and a ‘Subscribe me’ button. This advert was shown to 2328 users last month, of whom 676 became TheyWorkForYou email subscribers, which is a pretty cool 29.04% conversion rate. However, we also showed another advert for the same service, to the same WriteToThem users, which also had the same button and text, but which hid the form (and their address). That was shown to 2216 users of whom 390 signed up, a more modest 17.6%. So the impact of simply showing an email box with the user’s email address in it, versus hiding it, was worth about 11 percentage points more sign-ups. Why? Go figure!
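For anyone who wants to check the arithmetic, here it is worked through using the figures quoted above; the variable names are just for the example.

```python
def conversion_rate(shown, signed_up):
    """Percentage of people shown an advert who went on to sign up."""
    return 100.0 * signed_up / shown


with_form = conversion_rate(2328, 676)    # advert showing the pre-filled form
hidden_form = conversion_rate(2216, 390)  # advert hiding the form

difference_in_points = with_form - hidden_form
relative_uplift = with_form / hidden_form

print(f"with form:   {with_form:.2f}%")
print(f"hidden form: {hidden_form:.2f}%")
print(f"difference:  {difference_in_points:.1f} percentage points")
print(f"ratio:       {relative_uplift:.2f}x")
```

The gap is about 11.4 percentage points, or roughly 1.65 times as many sign-ups per advert shown.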

So now we’ve canned the advert that hides the address form, and instead we’re comparing two different adverts both of which feature the pre-populated signup form, but which use different words. It’s probably too early to judge, but the new ad appears to have a very similar conversion rate suggesting it might be hard to squeeze many more subscribers out of this page. We’ll keep trying though!

Another thing we learned of interest was that monthly subscribers to email alerts on TheyWorkForYou were down year on year in the month before we added this new advertising & conversion tracking system, even though the total number of visitors was clearly up on the same month last year. This appears to suggest that two things are happening. First, RSS is catching on, so some users who would previously have got email alerts are subscribing to RSS feeds instead. Second, it suggests that the TheyWorkForYou user audience might have been getting more saturated with regulars - proportionally fewer new users coming (although more visitors in absolute terms) so fewer people signing up to get alerts. The cross marketing and conversion tracking seems to have reversed that trend, which is awesome.

We also advertise several different services to people who just finish signing up to get email alerts on TheyWorkForYou itself. We’ve just noticed that a full 25% of people shown the advert to sign up for HearFromYourMP proceed to sign up. We’ve therefore just decided to dump other adverts shown on TheyWorkForYou (such as advertisements for other sorts of TheyWorkForYou email alert) and concentrate on just cross selling HearFromYourMP. A back of the envelope calculation suggests that by just advertising this one site from the completion page we should get an extra 10,000 subscribers to HearFromYourMP this year on top of the organic growth. Not bad for a few minutes’ analysis, and a number likely to make a fair few more MPs post messages to their patiently waiting constituents.

One last interesting thing (at least to me) is how some more demanding services are a much harder sell than others to users. So asking people to make new groups on GroupsNearYou.com or report a problem in a street on FixMyStreet tend to result in more traditional online marketing scale conversion rates of 0.1% to 2%. Still worth doing, and so we compare different versions of those ads too, to try and eke up those rates for these sites that arguably have more tangible, direct impacts on people and communities.

It will be a challenge for mySociety’s future to work out how to trade off impact against scale of service use - are 10 HearFromYourMP subscribers worth one pothole that doesn’t get fixed? Answers on a postcard…

mySociety’s Freedom of Information site goes live

Friday, February 22nd, 2008 by Tom Steinberg

There’s a lot left to do, but Francis Irving’s brilliant new mySociety Freedom of Information site is now live. You can file requests to central government departments (most of them), and browse what other people have been requesting (already fascinating). It doesn’t have a name yet, nor any slick design, nor half the features we want it to have, but it works and it gets things done.

And dammit, people, that’s what mySociety’s all about. Can we explain it any better?

Rails packages for Debian Sarge

Monday, November 26th, 2007 by Francis Irving

On our servers we only install software from Debian packages, or our own software with install scripts from our own CVS. This at first seems a bit mad, especially to Ruby on Rails people who love their gems. But it’s a sane way of managing lots of servers (we’ve got 7 Debian servers, and 2 FreeBSD servers to run at the moment).

Of course, you could install packages on them from CPAN, from Ruby Gems, or by compiling them yourself and putting them in /usr/local. But you’d need a separate system for each packaging system to keep track of what you’d installed and what version, and to worry about security updates. And you’d lose some of the benefits of dependency checking.

Most of our servers are, inevitably, still running Debian Sarge (the latest and greatest when we started them a few years ago). We’re going to gradually upgrade them to Debian Etch, but it is going to take a while. In the fast moving world of Rails this isn’t particularly helpful, so you have to backport packages. I couldn’t find any, so have made some myself.

You can find packages for Rails 1.2.5-1 on Sarge in our Debian package repository. Yeah, still an old version for you people “living on the edge”, but it’s the one in Etch (the latest Debian stable), and is way better than 0.13.1-1 that we had before :)

Freedom on Rails

Thursday, October 18th, 2007 by Francis Irving

This week has been quite bitty. I’ve been doing more work on the Freedom of Information site, and have been getting into the swing of Ruby on Rails. Once you’ve learnt its conventions, it is quite (but not super) nice.

As far as languages are concerned, Ruby seems identical in all interesting respects to Python. It’s like learning Spanish and Italian. Both are super languages. Ruby has nice conventions like exclamation marks at the end of function names to indicate they alter the object, rather than return the value (e.g. .reverse!). But then Python has a cleaner syntax for function parameters. It is swings and roundabouts.

Rails has lots of ways of doing things which we already have our own ways of doing for other sites. The advantage of relearning them is that other people know them too. So Louise was able to easily download and run the FOI site, and make some patches to it, which would have been much harder if it was done like our other sites. Making development easier is vital - for a long time I’ve wanted a web-based cleverly forking web application development wiki. But while I dream about that, it helps that Rails packages everything you need to run the app in a standard way, in one directory, that quite a few people know how to use.

Other things… I’ve been helping Richard set up GroupsNearYou on our live servers, it should be ready for you to play with soon. It looks super nice, and is easy to use. I’ve had some work to do with recruitment. And catching up on general customer support email for TheyWorkForYou and PledgeBank. I’ve also been updating the systems administration documentation on our internal wiki, so others can work out how to run our servers.



mySociety is a project of UK Citizens Online Democracy (UKCOD). UKCOD is a registered charity in England and Wales, no. 1076346.