1. Mobile operators altering (and breaking) web content

    We had a complaint that FixMyStreet maps weren’t displaying on someone’s computer. We hadn’t had any other complaints, and we quickly narrowed it down to the fact that the person was on the internet using a tethered T-Mobile phone.

    T-Mobile (and Orange, and quite possibly others) are injecting JavaScript and altering content served over their networks. Their reason for doing this, according to their websites (T-Mobile, Orange), is to compress images and video sent to your browser, so as to speed up your browsing. Seeing it in action, they also inline some CSS and JavaScript, though not all, and remove comments from external files.

    However, their implementation breaks things. In this particular instance, the T-Mobile JavaScript comment stripper appears to be searching for “/*” and “*/” and removing everything in between. This might work in most cases; however, in the jQuery library we find a string containing “*/*”, and further down the file, another string containing “*/*”. T-Mobile removes everything between what it thinks are comment markers, even though they’re actually contained within strings, leaving the jQuery library as invalid JavaScript and stopping anything using jQuery from running.
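
    To make the failure mode concrete, here is a minimal Python sketch of a comment stripper that behaves the way the T-Mobile one appears to. The regular expression and the sample JavaScript are my own assumptions for illustration; I don't know how their proxy is actually implemented.

```python
import re


def naive_strip_comments(js: str) -> str:
    """Remove everything between "/*" and the next "*/", ignoring strings.

    This is the (assumed) naive approach: it has no idea whether the
    markers sit inside a string literal or a real comment.
    """
    return re.sub(r"/\*.*?\*/", "", js, flags=re.DOTALL)


# Hypothetical JavaScript with two string literals containing "*/*",
# similar in spirit to what appears in jQuery.
source = 'var accept = "*/*";\nvar important = 1;\nvar fallback = "*/*";\n'

stripped = naive_strip_comments(source)
print(stripped)  # prints: var accept = "**";
```

    The "/*" inside the first string is taken as the start of a comment and the "*/" inside the second string as its end, so the line in the middle vanishes. A stripper that is safe to run on arbitrary scripts needs to tokenise the JavaScript (strings, regular expression literals, comments) rather than pattern-match on the raw text.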

    Their decision to inline lots of the CSS also seems a bit odd – sure, on a mobile this might be quicker, but even ignoring tethering nowadays plenty of mobiles have caches too and having the CSS download once and be cached would seem better than adding weight to every page download. But I’m sure they’ve studied their decision there, and it doesn’t make any difference to the actual browsing, as opposed to the comment removal.

    To turn off this feature on your mobile phone or broadband, visit accelerator.t-mobile.co.uk or accelerator.orange.co.uk on your connection and pick the relevant option – if anyone knows of similar on other networks, do leave updates in the comments.

    From a FixMyStreet point of view – whilst FixMyStreet functions just fine without JavaScript, I had made the (perhaps incorrect) decision to put the map inside a <noscript> element, to prevent a flash of map-oddity as the JavaScript map overlaid the non-JS one. However, this meant in this circumstance the map did not work, as JavaScript was enabled, but jQuery was unable to be loaded. I haven’t decided whether to change this behaviour yet; obviously it would help people in this situation as the map would still display and function as it does for all those without JavaScript, but for those with JavaScript it does look a bit jarring as the page loads. Any suggestions on a better approach welcome :)

  2. FixMyStreet iPhone app

    We’ve had reports that our FixMyStreet iPhone app is crashing on iPhone 3.0, and so have withdrawn it from the App Store until we are able to find out what’s wrong and fix it. I’m afraid I don’t know when that will be, as it’s all rather busy at present – if anyone has the skills and would like to volunteer to help, the code is available and should just import into Xcode. I can supply some crash logs too.

  3. Relentlessly into autumn

    I’m enjoying the weather at the moment, seems to be sunnier than the summer, but cool with an atmospheric autumnal taste in the air.

    mySociety is changing as ever, leaping forward in our race to try and make it easier for normal people to influence, improve or replace functions of government. More on this as it happens.

    Meanwhile, I’ve been continuing to hack away at WhatDoTheyKnow. A little while ago Google decided to deep index all our pages – causing specific problems (I had to tell it to stop crawling the 117th page of similar requests to another request), and also ones from the extra attention. There have been quite a few problems to resolve with authority spam filters (see this FOI officer using the annotation function), and with subtle and detailed privacy issues (when does a comment become personal? if you made something public a while ago, and it is now a shared public resource, can you modify it or take it down?).

    Right, I’ve got to go and fix a bug to do with the Facebook PledgeBank app. It’s to do with infinite session keys, and how we send messages when a pledge has completed. Facebook seem to change their API without caring much that applications have to be altered to be compatible with it. This is OK if the Facebook application is your core job, but a pain when you just want your Facebook code to keep running as it did forever.

    (the autumn photo thanks to Nico Cavallotto)

  4. TheyWorkForYou video – seeking

    Our video is streamed via progressive HTTP, using lighttpd and mod_flv_streaming. This works by having keyframe metadata at the start of the FLV (Flash video) file (we add ours using yamdi as that doesn’t load the whole file into memory first), which maps times within the video to byte positions within the file. When someone drags the position slider, or presses a skip button, the player actually changes the source of the video to something like file.flv?start=<byte position> which starts a new download from that point in the video. This means you can seek to parts of the video not yet downloaded, which is definitely a required feature.
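
    As a rough illustration of that lookup, the sketch below maps a requested time to the byte position of the nearest earlier keyframe and builds the new source URL. The parallel `times`/`positions` lists and the sample numbers are assumptions made up for the example; the real player does this in Flash from the metadata yamdi writes.

```python
from bisect import bisect_right


def seek_url(base_url, times, positions, target_seconds):
    """Return the URL that restarts the download at the keyframe
    at or just before target_seconds.

    times     -- keyframe timestamps in seconds, sorted ascending
    positions -- byte offset of each keyframe, same order as times
    """
    # Index of the last keyframe whose time is <= target_seconds.
    index = bisect_right(times, target_seconds) - 1
    index = max(index, 0)  # clamp requests before the first keyframe
    return f"{base_url}?start={positions[index]}"


# Hypothetical keyframe table: one keyframe every two seconds.
times = [0.0, 2.0, 4.0, 6.0]
positions = [13, 50000, 101000, 150500]

print(seek_url("file.flv", times, positions, 5.1))  # file.flv?start=101000
```

    Seeking always lands on a keyframe because that is the only place a decoder can start cleanly, which is why the table only needs keyframe positions.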

    The video is split up into programme chunks, according to BBC Parliament’s schedule, so each Oral Questions will (approximately) be its own video chunk, and the main debates will be a couple of chunks. By default, the video player will show a screengrab from the start of the video, as that’s all that’s available when it first loads (you have to load the start of the FLV file to fetch the keyframe metadata in order to move anywhere else :) ). I wanted the player to show a relevant screengrab before you hit Play, so came up with the slightly messy workaround of setting the volume to 0, seeking and playing the video for under a second in order to start it from the new point and show the video, then stopping it and resetting the volume. It works most of the time :-)

    Some of our video chunks have jumps in them, due to problems in downloading the original WMV stream. The timestamping interface has a link for people to let us know of such problems, so that we can mark the relevant speeches as missing video and not have them be offered to future timestampers. One valiant volunteer, Tim, let us know about two such videos, with the added oddity that if you let them play, they would happily carry on past their “end” point, which made timestamping those speeches quite difficult.

    I started investigating, firstly noting that both videos should have been 6 hours long, but were both listed as 1:20:24, which I thought was a bit of an odd coincidence. After reading the FLV file specification, it turned out that 32-bit millisecond timestamps in FLV are split into two – first the low 24 bits, then the high 8 bits. 2^24 = 16,777,216, which in milliseconds is 4 hours, 39 minutes, 37 seconds, which is pretty much exactly what the two videos’ durations were short by! All the timestamps in our FLV files were not setting the high byte, so after 4:39:37, they were wrapping round to 0 (and thus 6 hours became 1:20:24ish).
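
    The arithmetic is easy to check with a few lines of Python. This is only a sketch of the timestamp layout described above (three low bytes followed by one high byte); the function names are my own, and the six-hour figure is the nominal length rather than the exact duration of either video.

```python
def pack_flv_timestamp(ms: int) -> bytes:
    """Pack a 32-bit millisecond timestamp as low 24 bits, then high 8 bits."""
    low = ms & 0xFFFFFF
    high = (ms >> 24) & 0xFF
    return low.to_bytes(3, "big") + bytes([high])


def unpack_flv_timestamp(raw: bytes) -> int:
    """Read the timestamp back correctly, using both parts."""
    return int.from_bytes(raw[:3], "big") | (raw[3] << 24)


def unpack_low_24_only(raw: bytes) -> int:
    """Read the timestamp as a file with a never-set high byte would appear."""
    return int.from_bytes(raw[:3], "big")


def format_hms(ms: int) -> str:
    """Format milliseconds as h:mm:ss, dropping the fractional second."""
    seconds = ms // 1000
    return f"{seconds // 3600}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


SIX_HOURS_MS = 6 * 60 * 60 * 1000
raw = pack_flv_timestamp(SIX_HOURS_MS)

print(format_hms(2 ** 24))                   # 4:39:37, the wrap-around point
print(format_hms(unpack_flv_timestamp(raw)))  # 6:00:00
print(format_hms(unpack_low_24_only(raw)))    # 1:20:22
```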

    Our video processing consists of four major steps – the downloading script uses ffmpeg to convert each 75 minute chunk from WMV to MPEG; then nightly processing uses ffmpeg again to convert the right bits of these MPEG files to FLV, mencoder to join the relevant FLV files into one FLV chunk, then yamdi to add the metadata. My first try at a solution was to alter yamdi to increment the high byte itself, which fixed the duration display and let you seek to high times, but when you tried to go to e.g. 5 hours, the video started playing from the right point but the video thought it was playing from 20 minutes in. This would obviously confuse timestamping!

    As the FLV files produced by ffmpeg were all under 75 minutes long, they couldn’t have the problem. It turned out we were running an old version of mencoder, and updating that and converting all our long video files fixed the problem. Phew :-)

    Join us later today for my third short technical talk on TheyWorkForYou video, where I’ll explain how our Flash application talks to the HTML and vice-versa to enable the “Watch this” and highlighting of speeches.

    1. The Flash player
    2. Seeking
    3. Highlighting the current speech
  5. “Truth lies within a little and certain compass, but error is immense.”

    I’ve been working on PledgeBank quite a bit recently. As well as adding survey emails asking whether signers have done their pledge, and a feature for people to contact a pledge’s creator, I’ve been fixing numerous bugs that have sprung up along the way. For starters, people on the Isle of Man and the Channel Islands now get a much more helpful error if they try and enter their postcode anywhere on the site, rather than the confusing postcode not recognised they were getting previously.

    Other errors I found turned out not to be with our code. The PledgeBank test suite (that we run before deploying the site to check it all still works) was throwing lots of warnings about “Parsing of undecoded UTF-8 will give garbage” when it got to the testing of our other language pages. Our code wasn’t doing anything special, and there were multiple places the warning came from – upgrading our libwww-perl removed one, and I’ve submitted bug reports to CPAN for the rest (having patched our copies locally – hooray for open source).

    The Perl warnings were at least understandable, though. While tracking down why the site was having trouble sending a couple of emails, I discovered that we had a helper function splitting very long words up to help with word-wrapping – which when applied to some Chinese text was cutting a UTF-8 multibyte character in two and invalidating the text. No problem, I think, I simply have to add the “/u” modifier to PHP’s regular expression so that it matches characters and not bytes. This didn’t work, and after much playing I had to submit a bug report to PHP – apparently in PHP “non-space character followed by non-space character” isn’t the same as “two non-space characters in a row”…
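
    The underlying bytes-versus-characters problem can be shown without PHP. The Python sketch below is an illustration only, not our helper function: it splits the same (made-up) Chinese text once by byte count and once by character count, and checks which pieces are still valid UTF-8.

```python
def split_bytes(data: bytes, width: int) -> list:
    """Split raw bytes every `width` bytes, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def split_chars(text: str, width: int) -> list:
    """Split text every `width` characters, so no character is cut in two."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


text = "中文字符"  # four characters, three bytes each in UTF-8

byte_chunks = split_bytes(text.encode("utf-8"), 4)
char_chunks = split_chars(text, 2)

print([is_valid_utf8(chunk) for chunk in byte_chunks])  # [False, False, False]
print(char_chunks)                                      # ['中文', '字符']
```

    Splitting every four bytes cuts through the middle of a three-byte character each time, so none of the resulting pieces is valid on its own, whereas splitting by characters keeps every piece intact.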

  6. “When a rose dies, a thorn is left behind.”

    The first new councillor details have begun to automatically arrive in our database, thanks to GovEval. 34 councils were reactivated on WriteToThem today, from Alnwick to Wokingham. It would have been 38 but the other 4 councils have had boundary changes that we don’t have the data for yet.

    16 of the 40 Welsh Assembly constituencies did not change their boundaries at the election (this took some time to work out, as the Press Association said it was 18, and the official report from the Boundary Commission for Wales said it was 17 :) ). Those 16 Assembly Members are now also reactivated on WriteToThem, along with their regional AMs.

    Other than that, I’ve continued tweaking Neighbourhood Fix-It and started some work on TheyWorkForYou – the first step of which is to deal with the large backlog of mail that’s accumulated, leading to a number of bugfixes. Apologies to anyone who was trying to look at Brian Wilson MLA’s page and found themselves stuck in an infinite loop of being told there were two Brian Wilsons. We also had a couple of emails asking us why Gordon Brown didn’t have a voting record on equal gay rights like other MPs. This was easy to answer – he’s never voted in any division that is included by that PublicWhip policy – and so an MP’s page now states if they’ve been absent from every vote on a particular policy.

  7. “Now, here, you see, it takes all the running you can do, to keep in the same place.”

    Whereas new sites are lovely, and I talk about Neighbourhood Fix-It improvements further down, there’s still quite a bit of work that needs to go into making sure our current sites are always up-to-date, working, and full of the joys of spring. Here’s a bit of what I’ve been up to recently, whilst everyone else chats about database upgrades, server memory, and statistics.

    The elections last week meant much of WriteToThem has had to be switched off until we can add the new election results – that means the following aren’t currently contactable: the Scottish Parliament; the Welsh Assembly; every English metropolitan borough, unitary authority, and district council (bar seven); and every Scottish council. The fact that the electoral geography has changed a lot in Wales means there will almost certainly be complicated shenanigans for us in the near future so that our postcode lookup continues to return the correct results as much as possible.

    Talking of postcode lookups, I also noticed yesterday that some Northern Ireland postcodes were returning incorrect results, which was caused by some out of date entries left lying around in our MaPit postcode-to-area database. Soon purged, but that led me to spot that Gerry Adams had been deleted from our database! Odd, I thought, and tracked it down to the fact our internal CSV file of MLAs had lost its header line, and so poor Mr Adams was heroically taking its place. He should be back now.
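
    For anyone wondering how a missing header line makes a person vanish, here is a small Python sketch of the same kind of failure. Our own loader is not written like this, and the column names and the second row are invented for the example; the point is only that a reader which trusts the first line to be a header will silently swallow whatever is there instead.

```python
import csv
import io

# Hypothetical column names; the real file's layout is not shown here.
EXPECTED_HEADER = ["name", "constituency"]


def load_naive(text: str) -> list:
    """Trust the first line to be the header, whatever it contains."""
    return list(csv.DictReader(io.StringIO(text)))


def load_checked(text: str) -> list:
    """Refuse to load the file unless the first line is the expected header."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows or rows[0] != EXPECTED_HEADER:
        raise ValueError("CSV header line is missing or unexpected")
    return [dict(zip(EXPECTED_HEADER, row)) for row in rows[1:]]


# Two-row file whose header line has been lost.
headerless = (
    "Gerry Adams,Example Constituency A\n"
    "Example Member,Example Constituency B\n"
)

naive_rows = load_naive(headerless)
print(len(naive_rows))         # 1: the first person became the column names
print(list(naive_rows[0]))     # ['Gerry Adams', 'Example Constituency A']

try:
    load_checked(headerless)
    checked_error = None
except ValueError as error:
    checked_error = str(error)
print(checked_error)           # CSV header line is missing or unexpected
```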

    A Catalan news article about PledgeBank brought a couple of requests for new countries to be added to our list on PledgeBank. We’re sticking to the ISO 3166-1 list of country codes, but the requests led us to spot that Jersey, Guernsey and the Isle of Man had been given full entry status in that list and so needed adding to our own. I’m hoping the interest will lead to a Catalan translation of the site; we should hopefully also have Chinese and Belarusian soon, which will be great.

    Neighbourhood Fix-It update

    New features are still being added to Neighbourhood Fix-It.

    Questionnaires are now being sent out to people who create problems four weeks after their problem is sent to the council, asking them to check the status of their problem and thereby keep the site up-to-date. Adding the questionnaire functionality threw up a number of bugs elsewhere – the worst of which was that we would be sending email alerts to people whether their alert had been confirmed or not. Thankfully, there hadn’t yet been any such alert, phew.

    Lastly, the Fix-It RSS feeds now have GeoRSS too, which means you can easily plot them on a Google map.
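
    For the curious, GeoRSS in its simple form is just one extra element per item holding a latitude and longitude. The Python sketch below shows roughly what such an item looks like; the title, link and coordinates are placeholders, and this is not the code that generates our feeds.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"

# Make the serialiser use the conventional "georss" prefix.
ET.register_namespace("georss", GEORSS_NS)


def build_item(title, link, lat, lon):
    """Build an RSS <item> carrying a GeoRSS-Simple point ("lat lon")."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    ET.SubElement(item, f"{{{GEORSS_NS}}}point").text = f"{lat} {lon}"
    return item


# Placeholder problem report.
item = build_item(
    "Broken street light", "https://example.org/report/1", 51.5, -0.1166667
)
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```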

  8. This is what Beta means: the first 48 hours of petitions

    Since the petition system went out properly on Wednesday, we’ve been absolutely buried in an avalanche of changes, fixes, feature additions and massive massive amounts of email. I thought that you might be interested as to what sort of stuff has happened in the first two days:

    • Email has taken over our lives. Matthew has responded to over 200 emails since yesterday morning, and I was up at 4am last night just trying to cope with the rate of incoming mail. Francis, who’s now in Canada, then heroically took up the baton and responded to mail all (UK) night! Many if not most of these mails are giving us suggestions, as well as bug reports, problems with email and bits of praise and the odd conspiracy theory.
    • Changes made to cope with expats and overseas military personnel.
    • Phoned Hotmail to stop their system from eating 95% of the confirmation messages being sent to Hotmail accounts!
    • Redesigned the automated mails no10 get telling them there’s a new petition (they’ve had over 500 of these mails, so they need to be clear and easy to read!)
    • Made the rejected petitions system more granular, so that if a petition has to be rejected, and part of it has to be hidden (say, if it is libellous), then it only hides that bit, not the whole thing. Maximum transparency is the goal, you see.
    • New options added to sort the list of all petitions in different ways, by number of signatures being the most asked for.
    • Limited the length of “more info” fields so people can only write long rants, rather than really really long rants :)
    • Special cased people with AOL accounts, so that their, erm, nonstandard email clients can actually cope with the confirmation links.
    • Made several fixes to the processes involved in sending out confirmation mails.
    • Made RSS changes and improvements.
    • Updated various bits of text, like providing examples of what “party political” means. The BBC initially wrote that this meant no pledges mentioning controversial issues like Iraq, which was grabbing quite the wrong end of the stick about the nature of the rules. Now we have some complaining emails saying we’re being too liberal!
    • Compiled a big list of user suggestions and fixes on the wiki.
    • Made the rejection criteria in the Ts&Cs actually match the ones in the admin interface.
    • Installed a stats package to watch what’s going on.
    • Added a facility to search petitions.
    • Improved/fixed logging.
    • Added link and text pointing to the open source code.

    I’ve probably missed some – I’m sure Matthew, Chris, Francis and Ben will let me know!

  9. What we’re up to

    Much of my August seems to have been absorbed with maintenance tasks.

    For example, Chris and I spent a few days tightening up WriteToThem’s privacy. I made sure the privacy statement correctly describes what happens with backup files, and failed messages. I reduced the timeouts on how long we keep the body of failed messages. I made sure we delete old backup files of the WriteToThem database. I wrote scripts to run periodically to check that no bugs in our queueing demon can accidentally mean we keep the body of messages for longer than we say. I added a cron job to delete Apache log files older than a month for all our sites. As AOL know to their cost, the only really private data is deleted data.
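
    The log clean-up is the simplest of those to sketch. The Python below is an illustration of the idea rather than our actual cron job: it deletes files in a directory whose modification time is more than a given number of days ago, and the demonstration runs against a temporary directory with invented file names.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 24 * 60 * 60


def delete_old_files(directory, max_age_days=30, now=None):
    """Delete regular files in `directory` older than `max_age_days`.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


# Demonstration in a throwaway directory with one old and one fresh file.
with tempfile.TemporaryDirectory() as log_dir:
    old_log = Path(log_dir) / "access.log.old"
    new_log = Path(log_dir) / "access.log"
    old_log.write_text("old entries\n")
    new_log.write_text("recent entries\n")

    # Pretend the old file was last written 40 days ago.
    forty_days_ago = time.time() - 40 * SECONDS_PER_DAY
    os.utime(old_log, (forty_days_ago, forty_days_ago))

    removed = delete_old_files(log_dir, max_age_days=30)
    remaining = sorted(p.name for p in Path(log_dir).iterdir())

print(removed)    # ['access.log.old']
print(remaining)  # ['access.log']
```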

    Earlier in the month, I handled some WriteToThem support email for the first time in ages. We get a couple of hundred messages a week, which Matthew mainly slogs through. It’s good for morale to do it, as we get quite a lot of praise mail. It is also hard work, as you realise how complicated even our simple site and the Internet are, and it leads to fixing bugs and improving text on the site. I made a few improvements to our administration tools, and things like the auto-responder if people reply to the questionnaire, to try and reduce the amount of support email, and make it easier to handle.

    I did some more work on the geographically cascading pledges (like this prototype one), but I’m still not happy with them. In the end, I realised that it is the structure of wording of the pledge that is the key problem. Our format of “I will A but only if N others will B” just isn’t easily adapted to get across that the pledge applies separately in different geographical areas. Working out how to fix that is one of the things we’ll brainstorm about in the Lake District (see below).

    The last couple of days I’ve been configuring one of our new servers, which is called Balti, and getting the PledgeBank test harness working on it. Until now, it has only been run on my laptop. This is partly heading towards making a proper test harness for the ePetitions site, running on a server so we properly test nothing can be broken before deploying a new version.

    Matthew has wrapped up the TheyWorkForYou API now, and is working on Neighbourhood Fixit next. Chris has been doing lots more performance work for the ePetitions site. And he’s been making some funky monitoring thing to detect PostgreSQL database lock conflicts, which we get occasionally and are hard to debug.

    Tom’s in Berlin at the moment, he gave a talk last night, and I think has been to see some people from Politik Digital. As we’ve been discussing on the mySociety email list, there’s an EU grant we’re likely to apply for in collaboration with them.

    On Friday, we’re all going to the Lake District for a week, with some of the trustees and volunteers intermittently. We very conveniently and cheaply all work from home, so it’s good and necessary to meet up for a more sustained period of time at least once a year. Last year we were in Wales.

  10. Time shifting

    So, we’ve been a bit quiet on this blog, but naturally busy. I just did my invoice and timesheet for last month, and remembered how bitty it has been. In one day I often do things to 3 websites, and that is just CVS commit messages – no doubt I handled emails for more. This makes it quite hard to summarise what has been happening, and also quite hard to measure how much time we spend maintaining each website.

    We’ve recently made a London version of PledgeBank, which I’ll remind Tom to explain about on the main news blog. It is a PledgeBank “microsite”, with a special query for the front page and all pledges page that shows only pledges in Greater London, which is conveniently almost exactly a circle of radius 25km with its centre at 51.5N -0.1166667E. I worked that out by dividing the area (found on the Greater London Wikipedia page) by pi and taking the square root, and rounding up a bit.
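
    Here is that calculation as a short Python sketch, together with a distance check of the sort the microsite query needs. The area figure of about 1,572 square kilometres is an assumption on my part (it is the commonly quoted value), and the haversine function and test point are only for illustration; the real query runs in the database.

```python
import math

GREATER_LONDON_AREA_KM2 = 1572.0  # assumed area, commonly quoted figure
CENTRE_LAT, CENTRE_LON = 51.5, -0.1166667
RADIUS_KM = 25.0                  # the rounded-up radius used by the site
EARTH_RADIUS_KM = 6371.0


def equivalent_radius_km(area_km2: float) -> float:
    """Radius of the circle with the given area: sqrt(area / pi)."""
    return math.sqrt(area_km2 / math.pi)


def distance_km(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance between two points, by the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_greater_london(lat, lon) -> bool:
    """True if the point lies within the 25km circle around the centre."""
    return distance_km(CENTRE_LAT, CENTRE_LON, lat, lon) <= RADIUS_KM


print(round(equivalent_radius_km(GREATER_LONDON_AREA_KM2), 1))  # 22.4
print(in_greater_london(51.5, -0.1166667))  # True
print(in_greater_london(52.2, 0.12))        # False, well to the north
```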

    Yesterday we launched a new call for proposals – head on over, and tell us your ideas for new civic websites. It is another WordPress modification, but this time to the very blog that you’re reading now. The form for submitting proposals I made anew. It creates a new low-privileged WordPress user by directly inserting into the database, and then calls the function wp_insert_post to create a post by them in a special category. The rest of the blogging software then trivially does comments, RSS, search, email alerts and archiving.

    Meanwhile, Chris has written some monitoring software for our servers, to alert us of problems and potential problems. Perl modules do the tests, things like checking that there is enough disk space and that web servers are up. I’ve been tweaking it a bit, for example adding a test to watch for long-running PostgreSQL queries which indicate a deadlock. We’ve got a problem in the PledgeBank SMS code which causes deadlocks sometimes, which we’re still debugging.
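
    To give a flavour of what such tests look like, here is a sketch in Python rather than Perl. It is not Chris’s code: the thresholds, the function names and the shape of the query rows are all assumptions, and the long-running query check works on rows you would have to fetch from the database yourself.

```python
import shutil


def check_disk_space(path, min_free_fraction=0.1):
    """Return (ok, message) depending on how much of the disk is free."""
    usage = shutil.disk_usage(path)
    free_fraction = usage.free / usage.total
    ok = free_fraction >= min_free_fraction
    return ok, f"{path}: {free_fraction:.0%} free"


def long_running_queries(queries, now, max_seconds=300):
    """Return the queries that have been running longer than max_seconds.

    queries -- hypothetical rows of (pid, start_time_in_seconds, sql_text)
    now     -- current time in the same units as start_time_in_seconds
    """
    return [row for row in queries if now - row[1] > max_seconds]


# Made-up snapshot of running queries, with times in seconds.
snapshot = [
    (101, 1000.0, "SELECT ... FROM pledges"),
    (102, 1590.0, "UPDATE ... outgoing SMS"),
]
stuck = long_running_queries(snapshot, now=1600.0, max_seconds=300)

print(check_disk_space("."))
print(stuck)  # only the query started at 1000.0 has run for over 300 seconds
```

    A query that has simply been running for a long time is only a hint of a deadlock, not proof, so a check like this is best used to alert a human rather than to kill anything automatically.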