We sometimes have incorrect or out of date addresses for sending Freedom of Information requests to. Now anyone can check our addresses. Click “view FOI email address” on the page for any authority, and enter two of those squiggly words to prove you are not a robot.
If you are using WhatDoTheyKnow, and suspect problems with a request, please do check the address we are using is correct. If you are from an authority, or work closely with or know a particular authority, please also check the address.
The epic task of manually matching each of the 42,019 video clips of MPs was started way, way back, ooh, about 12 whole weeks ago. Two days ago the Number 1 rated volunteer timestamper in our league table, Abi Broom, completed the last clip in our database, bringing her personal tally to 8,543 clips.
Last night we went out and met with Abi and Robert Whittakker, one of the other super-timestampers who had done over 2,000 himself.
As a result of their efforts, and those of hundreds of other volunteers, we have put all the video that we have of the House of Commons sitting over the last year online, next to the text of the debates. The many thousands of people per day who visit TheyWorkForYou can, as a direct consequence of this work, now see video of most of the debates for the last year. When people embed clips on their own sites, that’ll also be thanks in part to our volunteers.
When Parliament starts again in the Autumn there’ll be another 300-400 clips a day to do, but we have a feeling the only problem doing them will be who gets to them first.
In the meantime, we’ll soon be working on another game-like toy to help create more data. Hint – it might have something to do with GroupsNearYou.
There’s been a lot written recently about the cognitive surplus, a phrase coined by Clay Shirky to describe the amount of human energy that can be deployed to create things if only barriers are lowered and incentives sharpened.
mySociety has recently been fortunate enough to see a little of this phenomenon through the explosion of volunteering activity which grew up around our TheyWorkForYou video timestamping ‘game’. For those of you not familiar, we needed video clips of politicians speaking matched with the text of their speeches, and in just a couple of months a gang of volunteers new and old have done almost all of the video in the archive. Other, much larger examples include reCAPTCHA and the ESP game.
Reflecting on this, my friend Tom Lynn suggested that there was a gap in the market for a service that would draw together different crowdsourcing games, ensure that their usability standards and social benefit were high, and then syndicate them out in little widgets, reCAPTCHA-style, to anyone who wanted to include one on a web page.
This is where Mozilla and Ubuntu come in. Anyone who uses Firefox knows what the home page is like, essentially the Google homepage with some Firefox branding. Ubuntu’s default browser homepage, post patch upgrade especially, is similarly minimalist and focused on telling you what’s changed.
Therein lies the opportunity – using pieces of these default home pages (maintained by organisations that claim to have a social purpose, remember) for more good than simply repeatedly reminding users about the brand of the product. Traditionally that would mean asking people to donate or become volunteers, but the new universe of ultra-easy crowdsourcing games is challenging that assumption.
Here’s a scenario. One time in ten when I load Firefox, the homepage contains a widget right under the search box with an almost entirely self-explanatory task that contributes to the public good in some way. This could be spotting an object on a fragment of satellite photo after a disaster, typing in a word that’s difficult to OCR, timestamping a video clip, or adding tags to an image or a paragraph of text. The widgets would be syndicated from the central repository of Cognitive Surplus Foundation ‘games’, and would help groups like Mozilla and Ubuntu to show themselves to millions of tech-disinterested users to be the true 21st century social enterprises that they want to be.
TheyWorkForYou now spots whenever an earlier Hansard report is referenced (which speakers do by date and column number, e.g. Official Report, 29 February 2008, column 1425) and turns the citation into a link to a search for the speeches in that column on that date. This only really became feasible when we moved server, upgraded Xapian, and added date and column number metadata (among others), allowing much more advanced and focussed searching – the advanced search form gives some ideas. Perhaps in future we’ll be able to add some crowd-sourcing game to match the reference to the exact speech, much like our video matching (nearly 80% of our archive done!). 🙂
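For the curious, here is a rough sketch of the citation-to-link idea in Python. It is not the code the site actually runs: the search URL, the `s` parameter and the `column:` and `date:` query prefixes are all assumptions made for illustration, and the pattern only covers citations written exactly like the example above.

```python
import re
from datetime import date
from urllib.parse import urlencode

# Assumed search endpoint and query syntax; the real site may differ.
SEARCH_URL = "https://www.theyworkforyou.com/search/"

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Matches e.g. "Official Report, 29 February 2008, column 1425".
CITATION = re.compile(
    r"Official Report, (\d{1,2}) ([A-Z][a-z]+) (\d{4}), column (\d+)"
)


def citation_to_link(match):
    """Turn one matched citation into an HTML link to a search."""
    day, month_name, year, column = match.groups()
    if month_name not in MONTHS:
        return match.group(0)  # not a month name: leave the text alone
    try:
        when = date(int(year), MONTHS.index(month_name) + 1, int(day))
    except ValueError:
        return match.group(0)  # impossible date: leave the text alone
    # Hypothetical query combining the column number and the date.
    query = "column:{} date:{}".format(column, when.strftime("%Y%m%d"))
    url = SEARCH_URL + "?" + urlencode({"s": query})
    return '<a href="{}">{}</a>'.format(url, match.group(0))


def link_citations(text):
    """Replace every recognised citation in text with a search link."""
    return CITATION.sub(citation_to_link, text)
```

Calling `link_citations` on a speech leaves everything untouched except recognised citations, which keep their original wording as the link text.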
Kudos to Google and Yahoo! for spotting this change within a couple of days, as they’re now so busy crawling everything for changes that they’re slowing the whole website down… 😉
TheyWorkForYou video timestamping has been launched, over 40% of available speeches have already been timestamped, and (hopefully) all major bugs have been fixed, so I can now take a short breather and write this short series of more technical posts, looking at how the front end bits I wrote work and hang together.
Let’s start with the most obvious feature of video timestamping – the video player itself. 🙂 mySociety is an open-source shop, so it was great to discover that (nearly all of) Adobe Flex is available under the Mozilla Public Licence. This meant I could simply download the compiler and libraries, write some code and compile it into a working SWF Flash file without any worries (and you can do the same!).
To put a video component in the player is no harder than including an <mx:VideoDisplay> element – set the source of that, and you have yourself a video player, no worrying about stream type, bandwidth detection, or anything else. 🙂 You can then use a very useful feature called data binding to make lots of things trivial – for example, I simply set the value of a horizontal slider to be the current playing time of the video, and the slider is then automatically in the right place at all times. On the downside, VideoDisplay does appear to have a number of minor bugs (the most obvious one being where seeking can cause the video to become unresponsive and you have to refresh the page; it’s more than possible it’s a bug in my code, of course, but there are a couple of related bugs in Adobe’s bug tracker).
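If data binding is new to you, the idea can be sketched in a few lines of Python. This is only an analogy, not Flex or ActionScript, and the `Observable` and `Slider` classes are invented for illustration: a value announces its changes, and anything bound to it is updated without further code.

```python
class Observable:
    """Holds a value and notifies bound listeners whenever it changes."""

    def __init__(self, value):
        self._value = value
        self._listeners = []

    def bind(self, listener):
        """Register a listener and push the current value to it at once."""
        self._listeners.append(listener)
        listener(self._value)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        for listener in self._listeners:
            listener(new_value)


class Slider:
    """Stand-in for a horizontal slider widget (hypothetical)."""

    def __init__(self):
        self.position = 0.0

    def set_position(self, seconds):
        self.position = seconds


# Bind the slider to the playing time: no manual updates needed afterwards.
play_head_time = Observable(0.0)
slider = Slider()
play_head_time.bind(slider.set_position)
play_head_time.value = 12.5  # the slider follows automatically
```

In Flex the framework does this wiring for you; the sketch just shows why the slider ends up in the right place whenever the playing time changes.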
As well as the buttons, sliders and the video itself, the current MXML contains two fades (one to fade in the hover controls, one to fade them out), one time formatter (to format the display of the running time and duration), and three web services (to submit a timestamp result, delete a mistaken timestamp, and fetch an array of all existing timestamps for the current debate). These are all called from various places within the ActionScript when certain events happen (e.g. the Now button or the Oops button is clicked).
Compiling is a simple matter of running mxmlc on the mxml file, and out pops a SWF file. It’s all straightforward, although a bit awkward at first working again with a strongly-typed, compiled language after a long time with less strict ones 🙂 The documentation is good, but it can be hard to find – googling for [flex3 VideoDisplay] and the like has been quite common over the past few weeks.
Tomorrow I will talk about moving around within the videos and some bugs thrown up there, and then how the front end communicates with the video in order to highlight the currently playing speech – for example, have a look at last week’s Prime Minister’s Questions.
Thank you for all the good work – and especially to our top timestampers. David Jones, Alex Hazell and Lee Maguire are currently the top three in the overall rankings, but there are five more people who have timestamped more than 500 clips, another seventeen people who have done 100 clips or more, and more than 100 people who’ve done anything from 1 to 100 clips. And of course, there’s also a fair few anonymous people who haven’t yet registered, so their individual contributions to the “anonymous” total of 3349 clips are not recorded on the league tables. Remember, we’ll be handing out prizes to the top timestampers, so get registered before you timestamp your next video!
We’re starting to collect a list of notable clips that we can use to compile a “best of parliament” video gallery – if you would like to nominate a particular speech, please leave a comment below or send an email to email@example.com – just tell us the name of the MP speaking, and the URL of the page where this speech appears on theyworkforyou.com. We’ll put the best of them together and publish a list later this summer.
We’re very excited to announce that our Parliamentary website TheyWorkForYou.com now includes video of debates in the House of Commons – but we need your help to match up each speech with the video footage.
It’s really easy to help out. We’ve built a really simple, rather addictive system that lets anyone with a few spare minutes match up a randomly-selected speech from Hansard against the correct snippet of video. You just listen out for a certain speech, and when you hear it you hit the big red ‘now’ button. Your clip will then immediately go live on TheyWorkForYou next to the relevant speech, improving the site for everyone. Yay!
You can start matching up speeches with video snippets right away, but if you take 30 seconds to register a username then we’ll log every speech that you match up and recognise your contribution on our “top timestampers” league table. We’ll send out mySociety hoodies to the top timestampers – they’re reserved exclusively for our volunteers as a badge of honour.
We think that this really easy approach to crowd-sourcing data about online video could come in useful in many different situations – not just for politics – and we hope that it gets used all over the place. It might even be a world first, we’re not sure. If you’d like us to create something similar for your local legislature, sports team, Am Dram group or anything else that can be audio or video recorded then please get in touch. We’d also really appreciate your feedback on the current beta system – please send your email to firstname.lastname@example.org.
Note to MPs, researchers, office staff, campaigners and bloggers – we know that you want to concentrate on matching up the speeches of a particular MP, or of a particular debate. If this sounds like you, please send an email to email@example.com with what you want, and we’ll help you do it.
This project was initially commissioned and funded by the BBC, who asked mySociety to create a searchable, online video archive of debates based on footage from BBC Parliament. We were thrilled to help out, because we think that it will enhance the public understanding of – and respect for – the work of Parliament. The initial goal of this project was to use the BBC’s captions to help chop up the video into different speeches. Tom Loosemore arranged for access to the BBC’s internal captions data, Etienne Pollard was commissioned to build an open source recording/transcoding/web-serving system (and then donated some of his wages back to pay for enough hard drive space for the video!), Stef Magdalinski donated a network storage array to hold the disks. However, after lots of hard work trying to get our computers to automatically slice up the video into chunks according to the BBC’s captions we concluded that this on its own wasn’t sufficiently accurate to reliably match up every speech in Hansard with the appropriate snippet in our video footage.
Adversity, however, is a great source of innovation. Matthew Somerville, working on a spec first sketched out by Tom Steinberg, customised the Flash interface substantially so that users could watch video and help add correct timestamps. Now that’s built, what remains is for you to do your part! What’s more, once we get a significant number of speeches timestamped we can start providing web feeds and APIs for MPs to embed video footage directly on their own websites, and video of your MP’s most recent speeches on their MP page on TheyWorkForYou.
There are some conflicting views about whether all this online video of Parliament is a good idea – for instance, this video snippet (created using the new system) shows that the Deputy Leader isn’t so keen on the idea of Parliamentary footage appearing on sites like YouTube. Or perhaps she’s just been misunderstood – now you can judge for yourself what she was saying, based on her appearance and intonation. On the other hand, the BBC seem to understand the benefit of putting video content online (and they’re a fully paid up member of ParBol, the Parliamentary Broadcasting group), and Parliament themselves have an alternative set of online video streams. Unfortunately the official Parliamentary video service can’t be integrated with Hansard, is only available in Windows Media format, and only has enough storage to keep the most recent 28 days of footage in archive. We originally wrote that it doesn’t even attempt to break up the video into individual speeches, but apparently you can search for speeches after all, although this capability isn’t actively advertised. It perhaps goes without saying that mySociety considers it an important public service for citizens to be able to find footage of their MPs doing their work, and we will resist attempts to deny this service to citizens.
One final thing – we’re currently trying to persuade the clerks in Parliament to tweak their internal processes a bit, and make it easier for people to see how laws are made. It’s called the Free Our Bills campaign, and we need as many people as possible to join the campaign, so that we can bring law-making into the 21st century. Please sign up now!
There are already over 1000 timestamps, and we’ve not even gone for any media coverage yet. Well done all!
Update 11.00AM on Thursday 5 June 2008
6769 speeches have now been timestamped, which is almost exactly 20% of the current total of 33838 speeches. Thanks for all your efforts, and keep up the good work!
A few weeks ago mySociety and Politik Digital held a small unconference in Berlin. The idea was to get together some of the best practitioners building and running democracy websites across Europe, regardless of their size or status.
I’ll try to write this up more fully soon, but for the moment I wanted to share some email interviews I did with some of the participants after the event. The first is with Guglielmo Celata from the Italian group D.E.P.P. We first came across them a couple of years ago when they borrowed some code from PublicWhip.org.uk (the independent volunteer vote analysis project run by Julian Todd and mySociety senior developer Francis Irving) for their website OpenPolis.
Anyway, enough for the context – D.E.P.P have some great, boundary-pushing work coming up and I thought people in the English speaking community would want to know.
What is the organisation you work for?
The name of the Association is D.E.P.P., which stands for Electronic Democracy and Public Participation.
It’s a relatively small group of people (four) who work on e-participation projects with local administrations (the municipality of Rome and the Regione Lazio, for example). We also have a self-financed project, named Openpolis, to map politicians, their charges, their declarations, both at a national and at a local level. More on this later.
What is the main purpose of the site(s) that you run?
We have a project named eDem 1.0 which has so far been installed twice: municipiopartecipato.it focuses on enabling e-participation of local communities in the “participatory budget”; and edem-regione focuses on the budget of the Regione Lazio (the link points to an alpha version).
I think the participatory budget for the local community is far more interesting. The site shows a list of issues categorized by theme and territory. Registered users can vote up issues and make them emerge as important. Issues are created by the users. Users can also create proposals related to issues and vote on them. The integration with Google Maps allows users to see how issues and proposals are distributed across their territory; it makes the user interface immediate (and of course makes the site sooo stylish).
The proposals emerging as the most voted are approved and follow a workflow to be actually financed and implemented.
Online activities and offline physical assemblies (which exist) are linked together by a group of paid people, called enablers. They also take care of moderating both offline and online activities.
The other project has almost the same features, but applied to the budget document of the Regione Lazio. Of course, the issues here are not created by the citizens, being the chapters (or sections) of the official budget document. The citizens can create and rate proposals, but such proposals are never going to be implemented.
This happens a lot: administrators are interested in e-participation projects, but they want to reduce the possibility of issues emerging directly from citizens, and of course they try to change the nature of the project from a participative one into a consultative one. A kind of Poll 2.0, if one wants to be cynical.
Can you tell us about your next site, the one you showed us in Berlin?
Openpolis is a project to gather information on our political class and make it transparent. How they vote once elected, what laws they propose, their charges in institutions, political parties and private organisations, public declarations, financial interests, judicial positions etc. The aim of the project is to revive the bond between the citizens and their representatives. We would like to give individuals or organized groups of citizens a set of tools to enable them to perform lobbying activities.
We want to work both at a national level and at a very local level, and to do this we plan to allow users to create part of the content on the site, and hope this way to create communities, wiki-style.
However, the site is not a wiki, since content has to be well-structured; we want to export statistics and run analyses on data added by users.
You are planning to combine information gathered from formal sources, and submitted by users. Can you tell us where you’re getting the formal information from, and how you are going to handle the information submitted by users?
We have different levels, and correspondingly different sources. At a national level, we are harvesting the official web sites of the Camera and Senato (the two houses of national representatives) and the web site of the European Parliament. At local levels we rely on official biographical data from the Ministero degli Interni (Interior Ministry). We double check politicians’ data for the 20 major cities in Italy, but of course can’t possibly dream of doing that for the 109 provinces and 8100 municipalities.
For data on charges, declarations, financial interests and judicial positions, and for a complete double check on details and biographical data, we plan to leverage the community of users. The more users, the more data and verification.
Of course, data inserted by the users must always be connected to sources (e.g. a web link, a reference to a book, an article in a newspaper, or a radio or television program). Data will be verified by moderators, and the community of moderators will grow on a trust basis (using a karma-based system, so that when a subscribed user reaches a certain threshold of trust, he is proposed as moderator to the board of administrators). We all know that this part is a real challenge and that handling a community online is a daunting task, but, hey, let’s try.
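To make the karma idea concrete, here is a minimal sketch in Python of how such a threshold might work. It is not D.E.P.P.’s design: the threshold value, the `User` fields and the `award_karma` helper are all hypothetical, and a real system would need rules for how trust is earned and lost.

```python
from dataclasses import dataclass

# Hypothetical trust threshold; the interview does not give a figure.
MODERATOR_THRESHOLD = 100


@dataclass
class User:
    name: str
    karma: int = 0
    proposed: bool = False  # already put forward to the administrators?


def award_karma(user, points, proposals):
    """Add karma and, on first reaching the threshold, propose the user.

    proposals is the list of names awaiting a decision by the board of
    administrators; reaching the threshold only proposes a user, it does
    not make them a moderator.
    """
    user.karma += points
    if user.karma >= MODERATOR_THRESHOLD and not user.proposed:
        user.proposed = True
        proposals.append(user.name)


proposals = []
anna = User("anna")
award_karma(anna, 60, proposals)  # below the threshold: nothing happens
award_karma(anna, 50, proposals)  # crosses it: proposed exactly once
award_karma(anna, 10, proposals)  # already proposed: no duplicate entry
```

Keeping the final decision with the administrators, rather than promoting automatically, matches the description above of users being proposed to the board.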
Users can be banned and content can be censored (after publishing), but any banning or censorship will be performed transparently, so that anyone, at any moment, will be able to know the reason why a user was banned.
Do you ever face claims that the effects you have on politicians aren’t entirely positive? If so, how do you respond?
We actually have not yet started, but we do plan on receiving a lot of such claims. Of course we are trying to create something that the politicians should use, as well, so the most interested and active users should be the politicians themselves.
Are there any other features of your site that you think are unusual or unique?
We plan to release an API, in order to make integration of our data and analysis possible directly from other web applications. Starting with RSS feeds and moving on to a proper API, it should be possible to integrate pieces of our applications directly into people’s blogs or other similar applications.
What other projects around the world excite you the most, and why?
Well, of course the TheyWorkForYou project was a real kick off, we just thought: “wow, we have to do that here in Italy!” Then I really appreciate the work at GovTrack.us, especially from the technical standpoint, for the innovative way of using RDF and the Semantic Web approach.
Here in Italy, a project I forgot to mention in Berlin is: http://fainotizia.radioradicale.it.
FaiNotizia means Make Your Own News; it’s a project by Radio Radicale, a historic radio station of the Italian Radical Party. It provides one of the first citizen journalism websites in Italy and we plan to integrate with them in the future.
Do you use the law to help you get information? If so, how have you gone about it, and what have you obtained?
We haven’t so far; all the information that we gathered was publicly available, we just wrote tons of parser code.
We plan to push for the release of data on financial interests and judicial positions, though. Those data are public, but poorly accessible (no electronic format, no scanning, photos or copies possible). This will require some legal action or some imagination to get them. We’ll see.
So there we are. If you’ve any further questions or clarifications, just post a comment here and I’ll update this post with Guglielmo’s help.