It’s official: there’s going to be a General Election in the UK on June 8th.
As you might suspect, mySociety has lots of tools and services that you might find useful during the campaign, whether you just want to find out the voting record of your current MP or you’re planning on building a website or app to cover the campaign.
First things first: TheyWorkForYou.com already covers in lots of detail who your MPs are and how they voted. This should be your first port of call so that you can evaluate your incumbent MP, especially when you’re thinking about who to vote for next.
Over the next couple of weeks we are going to make some changes here and there to make relevant parts of the voting record more prominent, and more clearly explain how we calculate the voting records themselves.
If you’re planning on using the data we have in TheyWorkForYou, you can access information on UK politicians, parliamentary debates, written answers, and written ministerial statements via our API at theyworkforyou.com/api.
Tomorrow we’ll share a blog post explaining in a little more technical detail how to access the API and some advice on how to get the most out of the service.
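In the meantime, here is a rough sketch of how a request URL for the API might be put together in Python. The function name getMP and the postcode, key and output parameters are our assumptions about the API here, so do check them against the documentation at theyworkforyou.com/api; the key shown is a placeholder and build_api_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Base address of the API, as given at theyworkforyou.com/api.
API_BASE = "https://www.theyworkforyou.com/api/"


def build_api_url(function, api_key, **params):
    """Build a request URL for one API function.

    The 'key' and 'output' parameter names are assumptions; check the
    API documentation before relying on them.
    """
    query = {"key": api_key, "output": "js", **params}
    return API_BASE + function + "?" + urlencode(query)


# 'getMP' and 'postcode' are assumed names; YOUR_API_KEY is a placeholder.
url = build_api_url("getMP", "YOUR_API_KEY", postcode="SW1A 1AA")
print(url)
```

Fetching that URL with the HTTP client of your choice should then return the data for you to parse.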
Building a service or website that covers all or part of the country and want an easy way to let your users identify which constituency they are in? Then MapIt is your friend.
It already powers most of our own services and is widely used by the likes of Government Digital Services and our friends at Democracy Club.
You can sign up for free at mapit.mysociety.org, and if you need more calls it’s easy to upgrade to a monthly plan. You can get 10,000 calls a month for free if you are a charity or working on an open project. If you think you are going to be busier than that, (a) congrats and (b) drop us an email at email@example.com
Helping Democracy Club
Speaking of Democracy Club, we’re going to be wholeheartedly supporting their efforts to crowdsource a full set of candidate data in the run up to the election. They are gathering all of their ideas together in this Google Doc: https://goo.gl/8WtZvc
We had planned to make some updates and amends to the YourNextRepresentative service that supports Democracy Club’s WhoCanIVotefor.co.uk site in the quiet period between major elections, ahem, but with the snap election called we’ll be doing what we can to make the site run faster and make whatever UI tweaks and fixes we can in the time available.
They will no doubt be looking for help in sourcing candidate data, so please do sign up to help and find out what you can do at democracyclub.org.uk/blog/2017/04/18/its-ge2017
In summary, and to make it easy, you can find all of our relevant #GE2017 datasets and APIs here: data.mysociety.org/datasets/?category=ge2017
It’s not too late to let your current MP know what you think on any subject of your choice via WriteToThem.com.
And finally, don’t forget to register to vote yourself at gov.uk/register-to-vote
When working with data that you didn’t set out to gather you have to be careful to think about what the data actually means, rather than what it seems to be saying. As an example, one of the “interesting” side effects of FixMyStreet is a database of places people have reported dog poop (or “dog fouling” as it tends to be called academically). We now have over 20,000 locations across the UK where nature’s call has both been heard, and reported.
My first thought when learning about this data was “that’s a lot of dog poop!” but it turns out 20,000 dog poops is not a lot of dog poop at all. There are an estimated 8.5 million dogs in the UK. Assuming that (on average) each one poops once a day, they’ll produce over 3.1 billion poops a year.
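For anyone who wants to check the sums, here is the back-of-the-envelope calculation in Python. The figures are the ones quoted above; the once-a-day rate is the same rough assumption, and a 365-day year is ours.

```python
# Figures quoted in the post; the once-a-day rate is a rough assumption.
DOGS_IN_UK = 8_500_000
POOPS_PER_DOG_PER_DAY = 1
DAYS_PER_YEAR = 365          # assumption: ignoring leap years
REPORTED_LOCATIONS = 20_000  # dog fouling reports in FixMyStreet

poops_per_year = DOGS_IN_UK * POOPS_PER_DOG_PER_DAY * DAYS_PER_YEAR
share_of_one_year = REPORTED_LOCATIONS / poops_per_year

print(f"{poops_per_year:,} poops a year")  # 3,102,500,000 poops a year
print(f"Reports as a share of one year's total: {share_of_one_year:.5%}")
```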
So actually, 20,000 poops over nine years is nothing compared to the amount of pooping going on. But just because our data is a drop in the bucket doesn’t mean we can’t learn interesting things from it. The first question to ask is if we have a representative sample of where all this dog fouling is going on. The answer, sadly, is no. But the reasons for that answer raise further questions – which is interesting!
When you map the location of dog poo complaints in England against the Index of Multiple Deprivation, you get this:
This tells us that reports about dog fouling are roughly parabolic – there are more in areas in the middle than those that are either very deprived or very not.
This is interesting because when Keep Britain Tidy actually went out into the world and checked (p. 14), they found this:
This graph tells a very different story, where dog fouling gets worse the more deprived the area. But why is this? And why doesn’t our data tell the same story?
One reason we would expect more dog poop in the most deprived areas is that the most deprived areas are more urban. Taking the same IMD deciles and using the ONS’s RUC categories to apply an eight-point ‘ruralness’ scale (where 1 is ‘Urban major conurbation’ and 8 is ‘Rural village and dispersed in a sparse setting’) lets us see the average ‘ruralness’ of each decile. While this shows that deprivation is spread across urban and rural areas, the most deprived areas tend to be more urban.
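As a sketch of that averaging step (not the analysis code we actually ran), the calculation looks something like this in Python. The rows below are invented purely to show the shape of the data, and we are assuming decile 1 is the most deprived.

```python
from collections import defaultdict
from statistics import mean

# (IMD decile, ruralness score 1-8) for each small area.
# These values are made up for illustration; they are not real data.
# Assumption: decile 1 is the most deprived.
areas = [(1, 1), (1, 2), (1, 1), (5, 3), (5, 6), (10, 4), (10, 7)]


def mean_ruralness_by_decile(rows):
    """Group ruralness scores by decile and average each group."""
    scores = defaultdict(list)
    for decile, ruralness in rows:
        scores[decile].append(ruralness)
    return {decile: mean(values) for decile, values in sorted(scores.items())}


averages = mean_ruralness_by_decile(areas)
for decile, average in averages.items():
    print(decile, round(average, 2))
```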
As urban areas have fewer natural places to dispose of dog waste, and the most deprived areas are more urban, we would expect the most deprived areas to have more dog fouling. We also know that measures that contribute to IMD scores (such as crime levels) are related to trust and social cohesion in an area. When social cohesion is lower, we would expect more dog fouling because owners feel less observed and are less concerned with the opinion of neighbours. The real world increase reported by the Keep Britain Tidy survey supports these relationships.
The drop off in our reported data compared to the real world can be explained by features of the general model for understanding FixMyStreet reports — some measures of deprivation are correlated with increased reports (because they relate to more problems) and others with decreased reports (because they hurt the ability or inclination of people to report). We would also expect areas with worse deprivation to have fewer reports because of disengagement with civic structures.
Quickly checking the English dog fouling data (so only 17,103 dog poops) against the same model confirms that significant relationships exist for the same deprivation indexes as the global dataset with the largest effect size of a measure of deprivation being for health – as health deprivation in an area goes up, reports of dog fouling increase.
What this tells us is that our dog data (and probably our data more generally) is clipped in areas of the highest deprivation. We’re not getting as many reports as the physical survey would suggest and so our data has very real limits in identifying the areas worst affected by a problem.
This is a lesson in being careful about interpreting datasets you pick up off the ground – if you used this data to conclude the most deprived areas had a similar dog poop problem to the least deprived areas you would be wrong. Because we have an independent source of the real world rate of problems, we can see there is a mismatch between distribution in reports and reality. Using this independent data of ‘actual problems’ for one of our categories makes us more aware that there is negative pressure on reports in highly deprived areas.
If you’d like to learn more about the history of dealing with dog poo on the street (and who wouldn’t want to learn more about that!) – I’ve very generously gone into more detail here.
Index of Multiple Deprivation: an index that combines thirty-seven indicators from seven domains (income, health, crime, etc.) to provide a single figure for an area that is indicative of its level of deprivation relative to other areas.
On urban versus rural dog fouling: this is relative. Rural areas still have problems with bagged dog poo (“the ghastly dog poo bauble” hanging from branches – as MP Anne Main put it). There is also a risk to the health of cows from dog fouling in farmland – so there are unique rural dog poo problems.
On trust and social cohesion: Ross et al. found “People who report living in neighborhoods with high levels of crime, vandalism, graffiti, danger, noise, and drugs are more mistrusting. The sense of powerlessness, which is common in such neighborhoods, amplifies the effect of neighborhood disorder on mistrust.”
Header image: https://www.flickr.com/photos/scottlowe/3931408440/
It’s that time of year again. Local elections are on the 4th of May and we have updated our boundary change checker. It also helpfully lets you know if your ward is not having elections (not that your author was unaware of course.)
On May 4th, elections will be taking place across English, Welsh and Scottish councils, as well as the elections for the new ‘combined authority’ Mayors.
Ward boundaries are changing
You might think you already know where to vote, and who’s standing for election in your area.
But both are dictated by which ward you live in — and that may not be the one you’re used to, thanks to ongoing changes in ward boundaries.
There’s no need to worry, though. As before, we’ve got the data that will tell you whether your ward has changed. Just enter your postcode here.
Our EveryPolitician project makes data on the world’s politicians available in a useful, consistent format for anyone to use. If you’ve been following our progress, you’ll know we’ve already collated a lot of data (over 72,000 politicians from 233 countries). The work on adding to the depth and breadth of that data is ongoing, but EveryPolitician data is already being used to do interesting things.
Previously we looked at Politwoops as an example of EveryPolitician data being used to augment existing data.
In that case, the useful data for Politwoops was the politicians’ party affiliation. But our team (a handful of humans and one very busy bot) collects richer data than just that. EveryPolitician data includes contact information for politicians.
At mySociety, we know how powerful this particular kind of data can be. For example, our WriteToThem site makes it easy for UK citizens to contact their representatives (WriteToThem grew out of the earlier online service FaxYourMP, and uses the now more common technology of email).
Of course, there’s nothing especially radical about collecting email addresses of politicians… or phone numbers, Twitter handles, or Facebook pages. Indeed, many individuals and groups do just that. But an important difference with EveryPolitician is that we’re not just collecting data (which happens to include those things, as well as a host of others) but also making it available so it’s easy to use. We do that by putting it out in consistent, useful formats.
For many projects, downloading a CSV of current politicians from EveryPolitician will be enough. That can be opened as a spreadsheet, and if one of those columns is called
Opening a spreadsheet is just one way of accessing the data. Our own use of EveryPolitician data to power the “Write in Public” MajlisNameh site for ASL19 (see this blog post for more about that) demonstrates a more programmatic approach.
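To give a flavour of what a programmatic approach can look like, here is a minimal Python sketch that reads a CSV and keeps only the politicians who have an email address. It is not the code behind MajlisNameh, and the column names (name, email and so on) and the sample rows are assumptions for illustration; check the header row of the file you download.

```python
import csv
import io

# Invented sample rows standing in for a downloaded CSV.
# The column names are assumptions; check the real file's header.
SAMPLE_CSV = (
    "id,name,email,twitter,group\n"
    "p1,Alex Example,alex@example.org,alexexample,Party A\n"
    "p2,Sam Sample,,samsample,Party B\n"
)


def contactable(csv_file):
    """Return (name, email) pairs for rows that have an email address."""
    return [
        (row["name"], row["email"])
        for row in csv.DictReader(csv_file)
        if row.get("email")
    ]


contacts = contactable(io.StringIO(SAMPLE_CSV))
print(contacts)  # [('Alex Example', 'alex@example.org')]
```

With a real download you would pass an open file to contactable() instead of the in-memory sample.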
But the whole point of making data available like this isn’t so that we can use it. It’s for other people, other groups. Anyone can build more nuanced or complex services with this data too.
For example, the people at Represent.me have built a sophisticated platform for gathering opinions and votes that can be shared with politicians and constituency MPs. It’s a system of information-gathering that has a network of citizens at one end feeding into their political representatives at the other. They use EveryPolitician’s data to populate their system with information about those representatives, including contact details, for each country they operate in.
And, because we make sure our data is consistently formatted, it’s a good general solution. As they cover more areas, they can expect the code they’ve written to ingest the EveryPolitician data in the countries they’re already operating in to also work as they expand into others.
If you’re running a project that needs such data, you could invest time and effort finding and collecting it all yourself. But it’s almost inevitable that you’d be using the same public sources that we are anyway (after all, we try to identify and use all the sources we can, merging them together into one collated whole), so really it makes sense to simply take the data from EveryPolitician. Remember, too, that once our bot has been told about a source, it checks it daily for changes and updates. So instead of replicating the effort we’re already making to gather the same data you need, you’re free to focus on developing the way your project uses that data… while we hunker on down and get on with collecting it.
Inevitably, as with all software projects, there’s always lots more to do, but already the value of providing useful data — and especially contact information — in a consistent format is clear.
Image: Telegraph Chambers (Montreal) CC BY-NC-ND 2.0 by Andre Vandal
Over the last two years, we’ve gathered data on the top-level politicians of almost every country in the world, and made it accessible to developers everywhere through our project EveryPolitician.
Now we’d like to take a step that we believe will benefit more people, and further extend the usefulness of this extensive dataset. We’re proposing to integrate more deeply with Wikidata, to fill the gaps in their coverage and provide consistent, linked data to their global community.
Wikidata is the central storage for the structured data of its Wikimedia sister projects, including Wikipedia, Wikivoyage, Wikisource, and others. Wikidata also provides support to many other sites and services beyond just Wikimedia projects, so the combination of EveryPolitician’s data with the reach of Wikidata’s community is pretty compelling.
So in many places, the aims of the EveryPolitician and Wikidata projects are already aligned. We already synchronise EveryPolitician data with the good quality data available in Wikidata where we find it, and we feed back our own additions. As our datasets improve, it seems prudent to combine efforts, and resources, in one place.
If you play an active part within the Wikidata community, or are someone who would benefit from this initiative, we’d very much appreciate your support. Please do add your endorsements or thoughts at the foot of that page if you’d like to see the project go ahead.
International Women’s Day seems like a good time to check in on our project Gender Balance, the crowdsourcing website that invites users to help gather gender data on the world’s politicians.
As you may recall, our aim was not simply to present top-level numbers: data already existed that allows us to, say, understand which legislatures have the most even-handed representation, genderwise.
No, Gender Balance seeks to go more in-depth: by attaching gender data to individual politicians, and making that data available via structured datasets, we hope to allow for more subtle comparisons to be made.
For example, researchers may like to test theories such as, ‘do women vote differently from men?’, or ‘do women politicians make different laws around childcare?’ — or a whole host of other questions, all of which can only be answered when gender data relates to specific public figures, or when it is viewed in combination with other data.
The data that is collected when you play Gender Balance goes, with data from other sources, into EveryPolitician, our project that seeks to provide structured, downloadable, open information across all the world’s legislatures.
Not right away, mind you. To ensure that the data really is accurate, we make sure that each politician on Gender Balance is presented to at least five different players, all of whom give the same answer, before we consider it verified.
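That rule is simple enough to sketch in a few lines of Python. This is only an illustration of the logic described above, not the code Gender Balance runs; the function name, the data shape and the treatment of repeat answers from one player are our assumptions.

```python
REQUIRED_PLAYERS = 5  # "at least five different players"


def verified_answer(answers):
    """Return the agreed answer, or None if it is not yet verified.

    answers: list of (player_id, answer) pairs for one politician.
    Assumption: if a player answers twice, only their latest answer counts.
    """
    by_player = dict(answers)
    if len(by_player) < REQUIRED_PLAYERS:
        return None  # not enough different players yet
    distinct = set(by_player.values())
    if len(distinct) != 1:
        return None  # players disagree
    return distinct.pop()


five_agree = [(f"player{i}", "female") for i in range(5)]
print(verified_answer(five_agree))      # female
print(verified_answer(five_agree[:4]))  # None
```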
EveryPolitician currently contains data for about 73,000 politicians in total. In some cases, that data came to us along with a trusted gender field, so we don’t need to run that through the Gender Balance mill, but the majority of parliamentary sites don’t provide this data.
We can sometimes obtain that information from other sources, but Gender Balance has been invaluable in filling in lots of the gaps. Thanks to our players, it has already provided us with gender information for over 30,000 politicians (and in some cases, pointed out discrepancies in the data we obtained from elsewhere).
There’s still plenty to go, though, if you’d like to help; and, as elections happen around the world, Gender Balance will continue to refresh with any politicians for whom we can’t find trusted gender data. As we speak, approximately 22,000 politicians still need sorting.
That might sound like quite a lot, but each politician need only take seconds — and every little helps. So, if you’d like to help contribute a little more gender data, just step this way.
Image: India’s Prime Minister Narendra Modi at the valedictory session of the National Conference of Women Legislators in Parliament House CC BY-SA 2.0, via Wikimedia Commons
Politwoops tracks politicians’ tweets, and reports the ones that are deleted.
Often those tweets are deleted because of a typo: everyone makes simple mistakes with the buttons on their devices, and politicians are no less human than the rest of us.
But Politwoops’s targets are public servants who use Twitter to communicate with that public. And sometimes the contents of the tweets they delete are not simply the result of bad typing. Those tweets can be especially interesting to people whom those politicians are representing: sometimes they may be evidence of a usually-suppressed prejudice, or an attempt to remove evidence of a previously held opinion that is no longer convenient.
In effect, Politwoops is a public archive of direct quotes that would otherwise be lost.
And also… EveryPolitician
Our EveryPolitician project is an ever-growing collection of data on every politician in the world (we’re not there yet, but we’re over 230 countries and 72,700 politicians in, and counting).
Like Politwoops, our data includes politicians’ Twitter handles. But also a lot more besides.
We make that data useful by putting it out in consistent, simple formats — the simplest of which is a comma-separated value (CSV) file for each term of a legislature. In practice, that means if you want a spreadsheet of the current politicians in your country’s parliament, then EveryPolitician is probably the place for you.
Put them together…
Now, Politwoops predates EveryPolitician by several years, and they’ve been doing their thing without needing our data just fine. In fact, Politwoops has been happily politwooping since 2010 (Politwoops is a project of the Open State Foundation, based in the Netherlands).
Behind the scenes, it works pretty much the way you’d expect: with a list of politicians’ Twitter handles for each country where it’s running.
But… who doesn’t want to add something extra for free? Our data also includes Twitter handles (mostly but not entirely from the same public sources Politwoops were using). So that meant they could take our CSVs and match each line—all that extra data!—with the Twitter handle.
Better, for free
So last year, they augmented their data with ours for one very simple win: they get to know party affiliation for the politician associated with each of those Twitter handles. Well, actually they get to know lots of other things besides party: gender, date of birth, or… well, all our other data, if they wanted it. But just party? That’s also fine.
This all means that Politwoops now shows the party of each tweet’s deleter, just because they merged our CSV with theirs. Lovely!
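We don’t know exactly how Politwoops wrote their merge, but the general idea of matching on Twitter handle can be sketched in Python like this. The column names (twitter, group), the sample rows and the list of tracked handles are all invented for illustration.

```python
import csv
import io

# Invented rows standing in for an EveryPolitician CSV.
# The 'twitter' and 'group' column names are assumptions.
EVERYPOLITICIAN_CSV = (
    "name,twitter,group\n"
    "Alex Example,AlexExample,Party A\n"
    "Sam Sample,samsample,Party B\n"
)

# Hypothetical list of handles a tweet tracker already follows.
TRACKED_HANDLES = ["alexexample", "SamSample", "unknown_handle"]


def party_by_handle(csv_file):
    """Map lower-cased Twitter handle to party for rows that have a handle."""
    return {
        row["twitter"].lower(): row["group"]
        for row in csv.DictReader(csv_file)
        if row["twitter"]
    }


parties = party_by_handle(io.StringIO(EVERYPOLITICIAN_CSV))

# Handles are compared case-insensitively; unmatched handles get None.
augmented = {handle: parties.get(handle.lower()) for handle in TRACKED_HANDLES}
print(augmented)
```

Lower-casing both sides before comparing is a design choice in this sketch, since the same handle is often written with different capitalisation in different sources.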
Although party affiliation was the detail Politwoops went for, it turns out the other data from EveryPolitician was a little too tempting for them to ignore… So recently they’ve been doing some playful analysis on their statistics using the gender breakdown that EveryPolitician data makes possible too. You can see more on the Politwoops website.
You can too
To be clear: Politwoops did this, not us. We’re committed to doing the groundwork of finding, collecting and collating the data, and making it available (and, additionally, endlessly checking for updates… if you’re interested in how this all works, you can read our bot’s own blog). We do this so people who want to get on with using the data can do just that. As did, in this case, Breyten and his team at Politwoops.
EveryPolitician’s data is available as plain CSVs for this kind of thing, but we also provide a richer JSON version too if that’s more useful to you. All the files are downloadable from the website. If you’re a coder who wants to dive in, there are libraries to make it even easier for you (the EveryPolitician team works in Ruby, so we wrote the everypolitician gem, but there are also ports to Python and PHP).
For more information, see docs.everypolitician.org.
The EveryPolitician bot wrote its own version of this blog post, which goes into a little more detail of the process.
Last time we updated you about Alaveteli Professional, the Freedom of Information toolset for journalists that we’re building, we were just coming out of our discovery phase.
Since then, we’ve made strides through the alpha and early beta part of our development process. In alpha, the idea is to build dummy versions of the tool that work in the minimum way possible — no bells and whistles — to test concepts, and our assumptions. Having thought hard about the potential problems of Alaveteli Professional, now is the time for us to try the approaches that we believe will solve them, by making prototypes of how the tool might work and testing them with a very small group of users.
In the early stages of beta, our priority has been to get to the point where a Freedom of Information request can go through all its various processes, from composition to response, with the features that a journalist user would need. Once that’s in place, it allows us to add other features on top and see how they would integrate.
This pattern — discovery, alpha, beta, release — is a well-tested method by which to produce a final product that works as it should, while avoiding costly mistakes.
Alpha and beta testing, perhaps unexcitingly, are all about the reduction of risk: in the words of the startup mantra, it’s good to ‘fail fast’— or rather, it’s better to know early on if something doesn’t work, rather than spend time and money on something that doesn’t fit the bill.
So, for Alaveteli Professional, what are the risks that have been keeping us awake at night?
We think the biggest priority is to ensure that there’s actually added value for journalists in using a service like this. Clearly, the Freedom of Information process is already available to all, whether via our own site WhatDoTheyKnow, or directly.
We need to be able to demonstrate tangible benefits: that Alaveteli Professional can save journalists time; help them be more efficient in managing their requests; maybe help them get information that otherwise wouldn’t be released; and give them access to rich data they wouldn’t otherwise be able to access.
For all we said about failing fast, the alpha phase also meant committing to some fairly big technical decisions that, ideally, we wouldn’t like to reverse.
Decisions like, do we build the service into the existing Alaveteli codebase, or go for a new standalone one (we went for the former)? From the user’s point of view, should Alaveteli Professional look like a totally different site, or like a registration-only part of WhatDoTheyKnow (we chose the latter)?
And onto beta
As we move from alpha to beta, we’re finding out what happens when real users make real requests through the service, and making adjustments based on their feedback.
What do they think of the way we’ve implemented the ability to embargo requests – does it make sense to them? Do they trust us to keep embargoed requests private? Are they able to navigate between different interfaces in a way that seems intuitive? mySociety designer Martin has been figuring out how to take the cognitive load off the user and give them just the information they need, when they need it.
We’re also returning to prototyping mode to work out how to implement new features, like the ability to send round robin requests to multiple authorities, in an effective and responsible way. The other half of our design team, Zarino, has been showing us that a slideshow in presentation mode can be an effective tool for demonstrating how users might interact with an interface.
As we continue to round out the feature set in the UK, we’re also cooking up plans in the Czech Republic so that later in the year we can present the tools to a new audience of journalists there and again, use their feedback to make the tools more flexible so that they can be used in different jurisdictions.
As you can see, there’s lots going on, and we’re all really excited to be finally getting the tools that we’ve been thinking about, and working on, for so many weeks in front of some real-life users. Don’t forget to sign up to the mailing list if you’d like to keep up with Alaveteli Professional as it develops.
About six million people a year visit mySociety’s Freedom of Information website WhatDoTheyKnow.com; there are well over 100,000 registered users, and over 385,000 requests have been made via the service.
Of course, it’s fantastic that WhatDoTheyKnow is so well used, but the growth and popularity of the site brings its own challenges, not least the day-to-day admin that keeps the site running.
Many aspects of the site’s operation are run by volunteers, supported by mySociety’s staff and trustees — and due to the site’s success we’re looking to expand the volunteer team.
What does volunteering involve?
The work is pretty varied, but there are some frequent and recurring tasks:
Dealing appropriately with requests to remove material from the website
This is one interesting challenge which arises fairly often. Sometimes these requests are from public bodies who’ve released information they didn’t mean to; and they can also come from individuals and companies who are named in correspondence on the site.
These decisions are not always as black and white as you might expect. Some recent examples where we had to carefully consider the balance on both sides were:
- Material which Transport for London were concerned could be used to steal a tube train. We considered: was this a genuine risk? Was our publication really increasing the risk? Was the information already available elsewhere? What was the potential value of publishing the information to tube staff and their representatives, travellers and the wider public?
- Sainsbury’s were concerned that published material didn’t reflect their corporate policy on “workfare”. That may have been the case, but we asked ourselves whether that made it inappropriate to continue to publish the information that had been released. Additionally, where did the public interest lie? What legal risks were there arising from continued publication?
Responding promptly and appropriately to accidental releases
Thankfully, the frequency with which public bodies accidentally release personal information in bulk via Freedom of Information responses is decreasing, but the WhatDoTheyKnow team still have to act promptly when this does occur.
We often help users on both sides of the FOI process. For requesters, we can answer questions about FOI and how to use it, and we also work with the staff of public bodies who are at the receiving end of requests.
And all the rest
There’s always more that can be done to promote the service, draw attention to interesting correspondence on the site, and lobby for improvements to our access to information laws.
The wider team at mySociety help people around the world to establish and run their own online Freedom of Information services; and new features are being added to the UK site to make it more attractive to professional users such as journalists and campaign groups. Volunteers have the opportunity to get involved in these activities, helping steer the direction of new projects, based on their frontline experience of being a site administrator.
Keeping the database of thousands of public bodies up to date is another challenge, especially given the frequency of reorganisations in the UK’s public sector.
We work primarily by email, with regular video conferencing meetings, and occasionally meet up in person.
As a volunteer, you can decide how much time you put in, and what aspects of running the service you decide to take part in — but ideally we’re looking for people who can spare at least an hour or two, a couple of days a week.
We understand that people’s external commitments vary over time, and of course, there’s a flexible approach if a team member needs to step away for a stretch now and then.
What makes a good WhatDoTheyKnow volunteer?
There’s one characteristic that all the WhatDoTheyKnow volunteers have in common: a belief in the value of Freedom of Information, or, more widely, the expectation of transparency and accountability from the bodies which citizens fund.
As for practical skills: perhaps you’ve been involved in moderating discussions on the web, or have experience with access to information, defamation, or data protection law. Or perhaps you have, or would like to gain, experience dealing with “customers” by email.
Primarily we’re looking for people capable of making good judgements, and who can communicate clearly online.
Before joining the team, new volunteers will have to agree to follow our policies covering subjects such as security and data protection. That said, part of the role may be, if desired, taking part in developing and refining these, and other, policies as the service grows and changes.
How to apply
If helping us run WhatDoTheyKnow sounds like the kind of thing you’d be interested in doing, then please do apply to join us.
We only have the capacity to bring on and train a few volunteers at a time, and it is important that those chosen to help administer the service are trustworthy and committed to its policies, direction and non-partisan stance. For these reasons, we are recruiting volunteers via a formal application process.
To apply please write to us before the 20th of March 2017, introducing yourself, and letting us know about any relevant interests or experience you have.
What do we offer in return?
As a volunteer, the main reward comes from the satisfaction of assisting users, making good decisions, and helping run what is fast becoming a key part of the country’s journalistic and democratic infrastructure.
Volunteers may be invited to mySociety events and meet-ups, providing a chance to take part in discussions about the future direction of the service and the organisation’s activities more generally. There have been a number of conferences held, where those running Freedom of Information sites around the world have got together to share experiences: one or more volunteers may be invited to join in, with travel expenses paid.
Other ways to help out
If volunteering to join the WhatDoTheyKnow team isn’t for you, perhaps there’s something on mySociety’s Get Involved page that is.
Image: MarkBuckawicki [CC0], via Wikimedia Commons
MapIt has had a bit of a refresh to bring the look into line with the rest of the mySociety projects. At the same time, we thought we’d take the opportunity to make it a bit easier for non-technical folk to understand what it offers, and to make the pricing a little less opaque.
You may not be familiar with MapIt, but all the same, if you’ve ever found your MP on TheyWorkForYou, written to your representatives on WriteToThem, or reported an issue through FixMyStreet, you’re a MapIt user!
That’s because MapIt does the heavy lifting in the background when you enter a postcode or location, matching that input to the boundaries it falls within (ward, constituency, borough, etc). It is, if you like, the geographic glue that holds mySociety services together.
Like most of mySociety’s software offerings, MapIt is available for others to use. So for example, the GOV.UK website uses it to put users in touch with the right council for a number of services, and Prostate Cancer UK uses it on their campaign site, using MapIt’s knowledge of CCG (Clinical Commissioning Group) region boundaries.
And you can use MapIt too: if your app or website needs to connect UK locations with areas like constituencies or counties, it will save you a lot of time and effort.
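As a taste of what that looks like in practice, here is a small Python sketch of a postcode lookup. The /postcode/ URL pattern, the shape of the response (an "areas" object whose entries have "name" and "type") and the area type codes are our assumptions here, and the sample response is invented, so do check the documentation at mapit.mysociety.org before building on it.

```python
import json
from urllib.parse import quote

MAPIT_BASE = "https://mapit.mysociety.org"


def postcode_url(postcode):
    """Build a lookup URL for a postcode (URL pattern is an assumption)."""
    return f"{MAPIT_BASE}/postcode/{quote(postcode.replace(' ', ''))}"


def areas_of_type(response, area_type):
    """Pick out the names of areas of one type from a parsed response."""
    return [
        area["name"]
        for area in response["areas"].values()
        if area["type"] == area_type
    ]


# Invented response for illustration; 'WMC' (constituency) and 'LBW' (ward)
# are assumed type codes.
SAMPLE_RESPONSE = json.loads("""
{
  "postcode": "SW1A 1AA",
  "areas": {
    "1": {"name": "Example Constituency", "type": "WMC"},
    "2": {"name": "Example Ward", "type": "LBW"}
  }
}
""")

print(postcode_url("SW1A 1AA"))
print(areas_of_type(SAMPLE_RESPONSE, "WMC"))  # ['Example Constituency']
```

In a real app you would fetch the URL, with your API key if your usage needs one, and pass the parsed JSON to areas_of_type().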
Pricing and payment are a lot slicker now: while they were previously managed manually, you can now purchase what you need online, quickly and without the need for human intervention. It’s also quite simple to see the pricing options laid out.
We hope that this will make it easier for people to make use of the service, and better understand what level of usage they need. But if you need to experiment, there’s a free ‘sandbox’ to play about with!
As ever, we’re happy to provide significant discounts for charity and non-profit projects: see more details on the licensing page.
If you have any questions or comments please do get in touch.