After over five years of active development we have decided to pause work on the EveryPolitician project for the foreseeable future.
In this post we’ll outline where we are leaving things, how you can make use of the data that does exist, and how you might be able to help migrate or transfer some of what we’ve collected over to services like Wikidata.
What’s in place today
The EveryPolitician project is, as its name suggests, based on the simple idea of gathering accurate and up-to-date data on every politician in the world, collated and shared in a consistent format for free download and use by researchers, democracy projects, campaigners and individual citizens.
Over the course of the project we have gathered, structured and shared data on 78,382 politicians from 233 countries and territories presented on EveryPolitician.org via hundreds of scrapers run on morph.io and hosted on GitHub, producing the data on everypolitician-data.
Mostly the data covers the main chambers of recent parliaments around the world, but it also includes thousands of entries for previous parliaments, in some cases going back decades.
This has been a sizeable undertaking, involving a handful of very talented developers and colleagues within mySociety, as well as contributions from dozens of other organisations and individuals, many of whom make use of the data within their own projects.
The reality is that this work is hugely time consuming, complex and requires not just expert knowledge but a commitment to go deep into the intricacies of parliamentary data in order to make it comprehensible to a wider group of users. And looking to the next couple of years this task is only ever going to increase in complexity — too much for one underfunded organisation.
We therefore intend to freeze the data as it currently stands, and it will continue to be available for download and reuse. We just can no longer commit to keeping this data up to date.
Always playing catch up
The challenge with data projects like EveryPolitician, beyond the complexity of understanding the structures and relationships within hundreds of individual parliaments (every parliament is an edge case in some way), is that the data is always steadily going out of date.
Across the world’s national parliaments there is an election somewhere roughly once a week, and that’s often when parliaments choose to update their websites, sometimes breaking our scrapers and changing the format of the data. Throughout the life of a parliament you might expect a few percent of MPs to change, sometimes more in different systems, so keeping on top of those individual changes is a sizeable task – especially where errors or duplications occur.
In addition to managing the hundreds of scrapers, we also included data from other sources — increasingly from Wikidata. Over the past 18 months we’ve been attempting to migrate more and more of what we’ve learned on EveryPolitician over to Wikidata via the WikiProject every politician.
Where the project goes next
EveryPolitician was built on the many years of work we had already delivered in this area, through PopIt, Poplus and working with Popolo. We knew what was needed, what worked and what didn’t.
We saw the potential to create an Open Corporates for political data, and hoped that EveryPolitician would be able to attract grant funding to grow, and potentially develop appropriate commercial services in support.
However, after five years of significant investment we just don’t have the funding to continue this work on our own.
In time we hope to be able to continue to contribute again to the wider availability of political data, and with hindsight it’s clear that Wikidata should be the natural global home for this type of data – benefitting from much greater reach, the contribution of motivated individuals in each country, and from the wider Wiki community.
As part of our contribution to Wikidata, we’ve created numerous tools to support the cross-referencing, verification, and supported update of data between EveryPolitician and the Wikiproject. This is still something of a work in progress, but we see it as a key way that others might contribute and take on aspects of the project in the future.
In the meantime we hope that many people continue to make use of the wealth of data that’s already been collected.
If you have a specific interest in a country, group of legislatures or some other combination, perhaps you can consider adding the kind of data that EveryPolitician has collected to Wikidata. We have no further resources to devote to this work; however if you do have an interest in taking some of this on then we will try to advise what options might best suit.
Image: Jelle van Leest
What we’ve done — and what we want to do
Wikidata now has up-to-date and consistent data on political position holders in current national legislatures for at least 39 countries (and work in progress for over 60 countries), thanks to work by volunteer community members on the Wikiproject every politician. mySociety worked as part of this project with a Wikimedia Foundation grant in 2017-18.
There is now a real possibility for Wikidata to become the definitive source of data about democracies worldwide — but only if that data can be maintained sustainably. A significant risk is that elections and other major political changes quickly render data on political position holders and legislatures in Wikidata out-of-date.
We’re proposing a Wikidata post-election updating toolkit project, which aims to ensure that data on elected representatives is substantially correct and complete within a month following an election, leading to improved quality and consistency of data in Wikidata over time. We’ll work as part of the Wikidata community to create and signpost tools and pathways that help contributors to quickly, easily and consistently update data following an election or other political change.
How community members can get involved in the project
If you’re already active around data relevant to political position holders, legislatures, or elections in Wikidata, we’d like your feedback and help to test the new tools and guidance and ensure that they are consistent with the emerging consensus around modelling these types of data.
In particular, if you live in a country or major region that has an upcoming election, please talk to us about piloting the tools! We’d like for you to test the project tools and guidance to update data following your country’s election, and to give us feedback on the value and appropriateness of the approach in your context and political system.
In general, we’re keen to encourage discussion and evaluation of Wikidata as a source of current position holder data.
Please review our proposal
If you’re interested in this, and are active on Wiki projects, please have a look and review our proposal here.
Image: Mike Alonzo
What do you want? An update on Democratic Commons! When do you want it? As regularly as possible!
…well, that’s what you’re getting, anyway. Whether or not you know that’s what you wanted is another matter — because you could be forgiven for having completely missed the Democratic Commons, the ambitious project that mySociety is helping to develop right now.
Even more than that — you might think the issues that the project is addressing were all done and dusted years ago. Not having open access to basic data on elected representatives? That sounds like a 2005 issue, especially somewhere like the UK with its thriving Civic Tech sector and a government that’s declared its commitment to open data. And by ‘basic data’, we mean the fundamentals — stuff as simple as the representatives’ names, the positions they hold and the areas they represent… not exactly rocket science, is it?
But, here we are, it is almost 2019, and the information on who our elected representatives are is still not easily available as structured, consistent and reusable public data.
And so, we have been busy working closely with Wikidata to support a change. Here’s a rundown of everything we’ve been doing:
- Supporting the gathering of lots of data on politicians internationally — including detailed electoral boundary data
We’ve been working with partners around the world to get the basic data on political systems, and who is currently elected into positions, into Wikidata. And we have the electoral boundary data to match the areas they represent.
From the national, down to the city and local level within these cities, this data is now openly available through Wikidata and our GitHub repositories (we’re just writing the documentation for the latter, so watch this space).
The countries where efforts have been focused to model and/or gather data so far are:
Australia, Argentina, Bangladesh, Brazil, Canada, Chile, Colombia, Hong Kong, Italy, India, Mexico, Nigeria, Pakistan, Paraguay, Peru, South Africa, Taiwan and the UK!
Our partners include Premium Times Centre for Investigative Journalism (PTCIJ), Fundación Conocimiento Abierto, Distintas Latitudes, g0v, Code for Pakistan, OpenUp, Open Knowledge Bangladesh and Factly.
- Building a tool to help you visualise Wikidata and discover what data on politicians exist for any country
Specifically, a visualisation tool that helps you explore what data exists that fits the Wikidata every politician data model (see this blog post). mySociety Developer, Alex Dutton, has been fiddling about in his spare time to create this tool, which runs on SPARQL queries. Take a look to see what structured data currently exists for any given country – and tell us what you think!
Or, if it shows you that there’s data missing, get on Wikidata, and make edits. You’re welcome to ask us for help on this and we’ll be very glad to give it, but you should also know that the Wikidata Facebook group is a great place to ask questions if you’re a newbie.
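If you’d like a feel for the kind of SPARQL query a tool like this might run, here is a minimal sketch in Python. It is not the code behind Alex’s tool: the function name `build_current_holders_query` is made up for illustration, and the item ID you pass in is whatever Wikidata item represents the position you care about. The only things it relies on are the standard Wikidata properties P39 (position held), P580 (start time) and P582 (end time), and it assumes that a position statement with no end time means the person still holds the position.

```python
import re


def build_current_holders_query(position_qid: str, limit: int = 50) -> str:
    """Build a SPARQL query listing people who hold a given position.

    position_qid is the Wikidata item ID of the position (an assumption
    supplied by the caller, e.g. the item for "member of parliament" in
    the country of interest).
    """
    # Only accept well-formed item IDs such as "Q42".
    if not re.fullmatch(r"Q[1-9][0-9]*", position_qid):
        raise ValueError(f"not a Wikidata item ID: {position_qid!r}")
    if limit < 1:
        raise ValueError("limit must be positive")

    # P39 = position held, P580 = start time, P582 = end time.
    # Assumption: a P39 statement without an end time is a current holder.
    return f"""
SELECT ?person ?personLabel ?start WHERE {{
  ?person p:P39 ?statement .
  ?statement ps:P39 wd:{position_qid} .
  OPTIONAL {{ ?statement pq:P580 ?start . }}
  FILTER NOT EXISTS {{ ?statement pq:P582 ?end . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # "Q123456" is a placeholder, not the ID of any particular position.
    print(build_current_holders_query("Q123456", limit=10))
```

The printed query can be pasted into the Wikidata Query Service. If it returns fewer people than you expect for a legislature, that is a good hint about where edits are needed.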
- Talking to lots of people about their need for structured, consistent and reusable data on elected representatives
It’s all very well having all this data, but it doesn’t count for much if people aren’t using it.
Over the past few months, I’ve been connecting with people and asking how they currently access and maintain data on politicians, and what implications this has for their work (you may have seen a recent post asking for more examples: this still stands!).
I’ve also been exploring how people think they could contribute and benefit from being part of a collaborative effort. Here’s a rundown of a few choice conversations:
- We’ve spent time with Democracy Club, Open Data Manchester and Open Council Data talking about possible approaches to making UK councillor data more accessible. Sym has nicely summarised where we’re at here. I recommend joining the Democracy Club Slack channel #councillors if this is something that interests you.
- Talking to UK focused organisations such as campaign organisation 38Degrees, the brain injury association Headway and the creator of the iparl campaigning tool from Organic Campaigns about how they currently gather and maintain data on elected politicians (ways range from paying for detailed data to supporting political students to maintain spreadsheets); and exploring what they need from data for it to be useful in their work, and the implications of not having this data up to date (small charities struggle to run e-campaigns, for example, that ensure their supporters can connect to representatives).
- Talking to international organisations who build software for nonprofits and campaigners — like New/Mode, Engaging Networks and The Action Network — about their data needs, the struggles of candidate data, and whether any of the new data we’ve been collecting can be helpful to them (it can!). In particular, it was great to hear how useful our EveryPolitician data is for New/Mode.
- Checking what support we can offer to our partners (as listed above) to increase reuse and maintenance of the data in the regions where they work. Also: if you know any further groups interested in reusing data on politicians for their work, please tell them about us.
- We met with staff at Global Witness and heard how they’re using EveryPolitician data on politicians to uncover potential corruption.
- And we checked in with the University of Colorado for an update on their project to model the biographies of members of Congress and see if a politician’s background affects voting behaviour.
- We’re also supporting editathon events to improve political data, being delivered by SMEX in Lebanon (read about their event here), France based F0rk and Wikimedia España.
- And last but very much not least: I attended the Code for All conference. It was really inspiring to meet people from our previous collaborations through Poplus such as Kharil from the Sinar Project, hear some amazing speakers and meet lots of new friends, whom we hope to see more of now that mySociety is a Code for All affiliate organisation. Also, I surprised myself with my enthusiasm for talking about unique identifiers over a glass of wine…!
Through November and December, we will be focusing on:
- Delivering changes to the EveryPolitician.org site to reflect our desire to source the data from Wikidata (not the current arrangement of 11,000 scrapers that keep breaking!) and offer more guidance on how to contribute political data to Wikidata.
- Working with Wikimedia UK to create some engaging ‘how to get started on Wikidata’ and ‘editing political data’ resources to share with you all.
- Making sure lots of people know this data exists, so they can use it (and hopefully maintain it). Got any ideas?
- Finding out what support is needed to continue this work internationally and keep gathering people who also think this work is important, and putting together funding bids so that we can keep supporting this work.
Want to get involved? Here’s how
- Contribute to the Wikidata community: if you are a Wikidata user, or keen to learn, the first step is to visit the Wikidata project page on political data. If you need guidance on tasks, do feel free to add to the Talk page to ask the community.
- Join the conversation on the Code for All Slack channel #democratic-commons: https://codeforall.org/ (scroll down and find the ‘Chat with us’ button).
- Tell us (and others) how you think you would use the data: this project can’t just be about collecting data for its own sake: it’s about it being used in a way that benefits us all. How would the Democratic Commons help your community? We’d love people to share any ideas, data visualisations, or theories, ideally in an open medium such as blog posts. Please connect with Georgie to share.
- Something missing from this list? Tell us! We’re @mySociety on Twitter or you can email email@example.com.