Since 2015, mySociety have collected and shared open data on the world’s politicians via the EveryPolitician project.
And while we receive emails from across the world pretty much on a weekly basis, asking us to update a dataset, we still can’t say exactly who uses the EveryPolitician data, and for what purpose.
This is largely because we want to place as few barriers as possible to using the data. Asking folk to fill in a form or register with us before they access data which we believe ought to be free and accessible? Well, that would be counter to the whole concept of Open Data.
But that said, it’s really useful for us to hear how the data is being put to use, so we were very pleased when Global Witness sent us their report, The Companies we Keep.
This fascinating read shares the results of their analysis of the UK’s People with Significant Control (PSC) register, in which Global Witness used EveryPolitician data to see whether any politicians are also beneficial owners of companies registered in the UK.
In order to interpret and compare datasets (which also included sources such as the Tax Justice Network Financial Secrecy Index), Global Witness and Datakind UK built two tools:
- An automated system for red-flagging companies
- A visual tool for exploring the PSC register and other associated public interest datasets
The red flagging tool can be used to uncover higher risk entries, which do not indicate any wrongdoing but could be in need of further investigation… such as the 390 companies that have company officers or beneficial owners who are politicians elected to national legislatures, either in the UK or in another country.
The report also highlights some of the challenges faced by Companies House that prevent the register from fulfilling its full potential to help in fighting crime and corruption. We recommend a full read: you’ll find it here.
It is very helpful for us to demonstrate the uses of EveryPolitician data, both for our own research purposes and to enable us to secure the funding that allows us to go on providing this sort of service.
If you have or know of more examples of the data being used, please get in touch with me, Georgie. And if you value open, structured data on currently elected politicians, you should get involved with the Democratic Commons: this is a developing community of individuals and organisations working to make information on every politician in the world freely available to all, through the collaborative database Wikidata.
Photo by G. Crescoli on Unsplash
You may remember that thanks to a grant from the Wikimedia Foundation, mySociety has been working to support increasingly authoritative data on the world’s politicians, to exist on Wikidata as a key part of developing the concept of the Democratic Commons.
And this summer, mySociety welcomed two members of staff to support the community work around both Wikidata and the Democratic Commons. In May, I (Georgie) joined in the role of ‘Democratic Commons Community Liaison’ and in late June I was joined by Kelly, mySociety’s first ever ‘Wikimedia Community Liaison’… and it’s about time you started to hear more from us!
I’ve been climbing the learning curve: exploring the potential moving parts of a global political data infrastructure, finding out how the communities of Wikidata and Wikipedia operate, attempting to take meaningful notes at our daily meetings for the tool the team developed to improve political data on Wikidata and making sense of the complexity in creating interface tools to interpret the political data already in Wikidata. Oh, and supporting a “side-project” with Open Knowledge International to try and find every electoral boundary in the world (can you help?).
And if you are in any of the relevant open Slack channels (what is Slack?), you may have seen my name on the general introduction pages, as I have been shuffling around the online community centres of the world — off Wikidata Talk that is — trying to find the people interested in, or with a need for, consistently and simply formatted data on politicians, but who aren’t already part of the Wikidata community.
That’s because the issue the Democratic Commons seeks to address is the time-consuming business of finding and maintaining data on politicians. We suspect this work is duplicated by multiple organisations in each country (often all with a similar aim), which slows down the delivery of the stuff that matters. This has certainly been mySociety’s experience when sharing our tools internationally.
And the solution we propose — the Democratic Commons — is that if people and communities worked together to find and maintain this data, it would be better for everyone… ah the paradox of simplicity.
Update on efforts to support the Democratic Commons concept
With each interaction and conversation that we’ve had about the Democratic Commons with partners, we’ve continued to learn about the best role for us to play. Here are some initial actions and thoughts that are shaping the work; please feel free to comment, or even better, get involved 🙂
Making sure the concept is a good fit through user research
We have set a goal to carry out user research on the concept of the Democratic Commons. So far, we have lined up calls with campaign staff (who are interested in using and supporting open political data through their UK campaigning work) and journalists in Nigeria (who have expressed a need for the data) and I am lining up more calls — if you have a need for or can contribute political data, let’s talk.
Bringing the Open Data/Civic Tech and Wiki communities together?
From my experience to date, the Civic Tech and Wiki communities appear to operate quite separately (I am very open to being proved wrong on this point!).
I am just getting started within the Wikidata/Wikimedia communities (that’s more for Kelly) but on the Open Data/Civic Tech side, there are questions about data vandalism and how far data from Wikidata can be trusted, arguments about the benefit of using Wikidata (especially where you already have a lot of useful data) and about whether it is worth investing time in learning SPARQL, the query language used to retrieve data from Wikidata.
Misconceptions are not unusual in communities, online or offline, but this is a gap that we hope our work, communications and tools will help to close. If you have ideas for blogs, video tutorials or articles that would help people read around these concepts, please get in touch.
Working openly in existing global communities (off Wiki)
We are aware that, off-Wiki, mySociety is leading the work to develop the Democratic Commons. However, we know that we need to deliver this work in the open for it to be owned by people outside of mySociety, and finding the right homes to talk about it (off Wiki) has been important. In order to work openly, we have a shared #DemocraticCommons Slack channel with mySociety and Code for All; see ‘Get involved’ below to find out how to join the conversation.
We also plan to document the learning involved in the process through blog posts and documentation, to be uploaded publicly.
And, supporting local communities to develop, where possible
A global network such as Code for All is very useful in supporting a concept like the Democratic Commons. However, the bulk of the need for the data will likely be country-specific. Together with our partners and collaborators, we are exploring what is needed and how to support local communities:
- Through the remainder of our Wikimedia Foundation Grant, we are supporting community events and editathons: in Lebanon with SMEX, in France with newly formed organisation F0rk, and in Spain with Wikimedia España.
- Some groups we are working with, such as Code for Pakistan, plan to set up a channel on their Slack instance and use their WhatsApp community to discuss how the data is used and maintained.
- In my own country, the UK, we are talking to mySociety’s community and collaborators to understand how the Democratic Commons could benefit organisations and work in practice here. If you want to be involved in this work, please contact me.
- We are listening to understand what support is needed with collaborators in the global South, as we’re well aware that it is a lot to ask people to work on a voluntary basis and that adequate support is needed. I hope we can share the learning and use it to shape any future projects that may emerge.
How to get involved in the Democratic Commons?
- Contribute to the Wikidata community: If you are a Wikidata user, or keen to learn, visit the Wikidata project page on political data. If you need guidance on tasks, do feel free to add to the Talk page to ask the community, or get in touch with Kelly, our Wikimedia Community Liaison: firstname.lastname@example.org.
- Join the conversation on Code for All Slack: If you would like to join the Slack conversation, join here: https://codeforall.org/ (scroll down and find the ‘Chat with us’ button).
- Look for electoral boundary data: We are working with Open Knowledge to find electoral boundary data for the whole world. See more about that here.
- Keep up to date and subscribe to our Medium blog: Sometimes these Democratic Commons posts are a bit too in-depth for the general mySociety readership, so for those who are really interested, we plan to share all we are learning here.
- Share the concept with contacts: Please share the message on your platforms and encourage potential users to take part in research and get involved. We recognise that our view — and reach — can only be anglo-centric, and we’d so appreciate any translations you might be able to contribute.
- Tell us (and others) how you think you would use the data: This can’t just be about collecting data; it’s about it being used in a way that benefits us all. How would the Democratic Commons help your community? We would love people to share any ideas, data visualisations, or theories, ideally in an open medium such as blog posts. Please connect with Georgie to share.
- Something missing from this list? Tell us! We’re @mySociety on Twitter or you can email email@example.com or firstname.lastname@example.org.
Image: Toa Heftiba
Last Saturday (August 19th) at Newspeak House in London, mySociety and Wikimedia UK held the “Wikifying Westminster” workshop, a day-long event to encourage people to get involved with Wikidata, but also to give a taste of what people can build with the data that is already there.
The vision: one day, complex investigations which currently take researchers a lot of time, such as “how many MPs are descended from people who were also MPs” or “how many people named X were MPs in year Y”, will be answerable with data from Wikidata using a single SPARQL query…
…but we’re not quite there yet. Currently, some data is scattered all over separate databases (which sometimes get shut down or disappear); some is just plain missing; and most frustrating of all, some is in place but there’s no apparent way to get it out of the database.
In order to make this vision a reality, we need to experiment with the data, find ways to check how complete it is, and explore what questions we can currently answer with it. Events like Wikifying Westminster are the perfect opportunity to do just that.
After a brief introduction to Wikidata and the EveryPolitician project, we split into two groups: one focused on learning how to use Wikidata, while the other focused on working on mini-projects.
Here’s a taste of what happened…
The learning track began by introducing new users to the basic Wikidata editing principles (or “getting data into Wikidata”). Participants were able to put their new skills into action immediately, by adding missing data on British MPs, who were mostly lacking dates and places of birth.
By the end of the first session, good progress had been made, particularly on obtaining dates of birth for current British parliamentarians. For some reason, though, it proved much harder to find these for women than for men: we can only speculate as to why that might be (do some still adhere to the idea that a woman shouldn’t reveal her age?!).
We were also given an introduction to SPARQL, the language used to query the data held in Wikidata (or “getting data out of Wikidata”). Lucas Werkmeister introduced the Wikidata query service and explained a few tricks to help with using it. Participants were later able to put this to the test by running progressively more difficult test queries such as “All current UK MPs” or “Who is the youngest current MP?”
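To give a flavour of what a query of the “who is the youngest current holder of this position?” kind can look like, here is a minimal Python sketch that builds the SPARQL text and the URL to send to the Wikidata query service. P39 (position held), P582 (end time) and P569 (date of birth) are real Wikidata properties, but the shape of the query is our own assumption rather than what was run on the day, and treating “no end time” as “currently in post” is only a heuristic. The item ID for the position is deliberately left as a parameter for you to look up.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def youngest_holder_query(position_qid: str, limit: int = 1) -> str:
    """Build a SPARQL query for the youngest current holders of a position.

    position_qid is the Wikidata item ID for the position. Look it up
    first: this sketch does not hard-code one.
    """
    if not (position_qid.startswith("Q") and position_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {position_qid!r}")
    return f"""
SELECT ?person ?personLabel ?born WHERE {{
  ?person p:P39 ?statement .             # a "position held" statement
  ?statement ps:P39 wd:{position_qid} .  # for the position we care about
  FILTER NOT EXISTS {{ ?statement pq:P582 ?end . }}  # assumed: no end time = still in post
  ?person wdt:P569 ?born .               # date of birth
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY DESC(?born)
LIMIT {limit}
""".strip()


def query_url(sparql: str) -> str:
    """Return a GET URL asking the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})
```

Dropping the ORDER BY and LIMIT lines would turn this into an “all current holders” query, which is a handy first check on how complete the data is.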
Also, Navino Evans showed us the potential of reusing data, talking about Histropedia, which he co-created with Sean McBirnie. Histropedia is an awesome tool that lets you visualise thousands of topics on interactive timelines: you can browse through existing ones or create a new one from scratch.
The projects group both worked on improving data and looked at how well we could answer some simple “stepping stone” queries (i.e. small questions to which we already knew some of the answers) as a rough measure of how good the data in Wikidata already is. You can see and contribute questions to the list of test queries here.
Some more details:
Improving data. The focus here was on the Northern Ireland Assembly, for which Wikidata now has full membership history back to the foundation of the Assembly, and on adding academic degrees of cabinet ministers. Starting from an excellent spreadsheet of the undergraduate universities and subjects of UK politicians and ministers (going back to John Major’s cabinets), we tried to upload that data on the relevant items, adding the qualifier “academic major” (P812) to the property “educated at” (P69). In this case, the key problem we found was that we weren’t sure how to model when people did joint subjects, like “Maths and Politics”, convincing us to concentrate on the more obvious subjects first.
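The “educated at plus academic major” modelling can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `Statement` class, the `is_joint` test and the plain-text names are all hypothetical (a real upload would use Wikidata item IDs), and the only things taken from the work described above are the two property IDs and the decision to set joint subjects aside.

```python
from dataclasses import dataclass, field

EDUCATED_AT = "P69"      # property named in the post
ACADEMIC_MAJOR = "P812"  # qualifier named in the post


@dataclass
class Statement:
    """One claim about a person, with optional qualifiers (hypothetical shape)."""
    subject: str   # the politician
    prop: str      # the property, e.g. P69
    value: str     # the university
    qualifiers: dict[str, str] = field(default_factory=dict)


def is_joint(subject_name: str) -> bool:
    """Crude, assumed test for joint degrees such as 'Maths and Politics'."""
    lowered = subject_name.lower()
    return " and " in lowered or "&" in lowered or "/" in lowered


def build_statements(rows):
    """Split spreadsheet rows into statements ready to add and rows to revisit."""
    ready, deferred = [], []
    for person, university, subject_name in rows:
        if is_joint(subject_name):
            # We were not sure how to model these, so keep them for later.
            deferred.append((person, university, subject_name))
            continue
        ready.append(
            Statement(person, EDUCATED_AT, university, {ACADEMIC_MAJOR: subject_name})
        )
    return ready, deferred
```

The joint-subject check is deliberately blunt: it would also set aside a single subject whose name happens to contain “and”, which is the safer mistake to make when the modelling question is still open.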
Answering some unusual and/or intriguing questions. Inspired by a prior finding that there are more FTSE 100 CEOs named John than there are female ones, and that John is historically the most common name of UK parliamentarians, we thought we’d find out when exactly the John-to-female balance was toppled amongst the UK’s MPs (hint: not until 1992).
Going back further in history, we queried the first time each given name was recorded in Parliament. This was inspired by a recent news article about an MP who claimed he was the first “Darren” in the Commons.
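Once the membership data has been pulled out, the “first time each given name appears” question is a small piece of bookkeeping. The Python sketch below is an assumption about how one might do it, not the query we ran: it takes (full name, year first elected) pairs and treats the first word of the name as the given name, which is a simplification.

```python
def first_appearances(members):
    """Map each given name to the earliest year it appears.

    members: iterable of (full_name, year_first_elected) pairs.
    Assumes the given name is the first word of the full name; titles
    and double-barrelled first names would need more care.
    """
    first = {}
    for full_name, year in members:
        parts = full_name.split()
        if not parts:
            continue  # skip blank names rather than guess
        given = parts[0]
        if given not in first or year < first[given]:
            first[given] = year
    return first
```

Checking a claim like the one in the news article then comes down to looking up one key and comparing the year.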
Some ideas were also born that we weren’t able to see through, for various reasons. For example, could we discover which, if any, MPs are descended from people listed in the UCL’s ‘Legacy of British Slave-owners’ database? An interesting question, but at the moment, the answer is ‘no’, partly because child-parent relationships are currently inconsistently modelled in Wikidata, and partly because of the nature of Wikidata and ancestry: if there is someone who doesn’t exist in Wikidata (e.g. Grandad Bob, the painter) in the family chain, Wikidata can’t bridge the gap between a present day MP and the slave owner who might be their ancestor.
This is just the beginning
Work, of course, is still ongoing: all pre-1997 UK data is still to be inserted or improved on Wikidata, and so much more is missing – family connections, academic degrees, links to other databases, and all sorts of “unusual stuff” that can be used for interesting queries.
This data is crucial if we want to be able to answer the really big questions which Wikidata should one day be capable of helping us explore, about what politicians do.
We can do that together!
We hope that events like this give people an easy way in to Wikidata and also show them what’s already possible to achieve with the data. Over the coming months, we are hoping to support more events of this type around the world. If you are interested in getting involved, here’s how:
- Want to improve your country’s data? Events like this can be a great way to help kickstart activities and find other people who share your goals. We are happy to help out and support people in other countries to do so.
- Are you already organising or planning to organise a similar workshop around Wikidata? Make sure it is listed on the Wikidata Event page!
- Do you want to attend future workshops? Follow us on Twitter to stay updated about events that we are running, and ones that other people are too!
Feature image credits: Mark Longair
mySociety was at MozFest again this year — we had a table at Friday evening’s “Science Fair”, showcasing our EveryPolitician project.
As ever with Mozilla’s annual, hands-on festival, there was a lot going on in London’s Ravensbourne, a venue that’s especially conducive to mixing and meeting.
MozFest attracts an active and positive crowd of digital people, ranging from junior-school coder kids right through to hoary old digital campaigners. So we were delighted to meet up with old friends and make new ones, especially as some of them had travelled from afar to be there. London was fortunate once again to be hosting the event, since Mozilla is of course an international organisation. And as our main focus at this year’s event was EveryPolitician (“data about every national legislature in the world, freely available for you to use”), that international aspect was especially welcome.
As a result of our being there, we hope that lots more people know about EveryPolitician’s data, and that some of them are going to build or do amazing things with it. We now have data on at least the current term of the top-level legislatures of most of the countries in the world, but we’re still adding to it. We’d love your help with finding good sources for the remaining few, as well as with our ongoing task of going wider (adding more details about the politicians we do have) and deeper (adding historic data from previous terms).
If, in the spirit of digital do-ism that infuses MozFest, you do make something useful or funky with EveryPolitician’s data, do please let us know. We make sure all this lovely data is available to you in a consistent way (that not only means the delivery formats of CSV or JSON Popolo, but also that we adopt reliable conventions about the way we use them). This maximises the likelihood that, when you share that thing you’ve built using the data for your country, people in other places will be able to easily adapt it to work with the data for theirs. And that’s why, if you’ve made something amazing, we’d like to know, so we can shout about it.
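Here is a small Python sketch of why consistent conventions matter: one function, written once, that counts members per party for any country’s CSV. The column name `group` is an assumption about the file layout, so check the header row of the file you download before relying on it.

```python
import csv
import io
from collections import Counter


def members_per_group(csv_text: str, column: str = "group") -> Counter:
    """Count rows per party/group in one country's CSV.

    The column name is an assumption; pass a different one if the
    file you are working with labels it differently.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Rows with an empty or missing value in the column are ignored.
    return Counter(row[column] for row in reader if row.get(column))
```

Because the function only depends on the column name, the same call works unchanged on the file for another legislature, which is exactly the kind of reuse we are hoping for.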
Finally: thanks to the people who made MozFest run so smoothly this year, and the spirit of the open web. See you next year!
Image: Mozilla Festival CC BY 2.0
Amazing—we did it!
When we decided to mark Global Legislative Openness Week with a drive to get the data for 200 countries up on EveryPolitician, in all honesty, we weren’t entirely sure it could be done.
And without the help of many people we wouldn’t have got there. But last night, we put live the data for North Korea and Sweden, making us one country over the target.
The result? There is now consistently-structured, reusable data representing the politicians in 201 countries, ready for anyone to pick up and work with. We hope you will.
That’s not to say that our job is over… far from it! There’s still plenty more to be done, as we’ll explain below.
Here’s how it happened
Getting the data for each country was a multi-step process, aided by many people. First, a suitable online source had to be located. Then, a scraper would be written: a piece of code that could visit that source and pull out the information we needed—names, districts, political parties, dates of office, etc—and put it all in the right format.
Because each country’s data had its own idiosyncrasies and formatting, we needed a different scraper for every country.
Once a scraper was written, we added it to EveryPolitician’s list. Crucially, scrapers aren’t just a one-off deal: ideally they’ll continue to work over time as legislatures and politicians change.
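For anyone who hasn’t met one before, a scraper can be very small. The Python sketch below is not one of the scrapers listed above: it assumes a hypothetical members page laid out as a simple HTML table with name, area and party columns, and turns it into CSV using only the standard library. A real scraper would first fetch the page and would need to cope with that source’s own quirks.

```python
import csv
import io
from html.parser import HTMLParser

FIELDS = ["name", "area", "group"]  # assumed column order on the hypothetical page


class MemberTableParser(HTMLParser):
    """Collect the text of each <td> in each <tr> of a members table."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            # Header rows made of <th> cells produce no <td> text and are skipped.
            if len(self._row) == len(FIELDS):
                self.rows.append(dict(zip(FIELDS, self._row)))
            self._row = None


def scrape(html: str) -> str:
    """Turn the members table in `html` into CSV text with a header row."""
    parser = MemberTableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()
```

The part that differs from country to country is the parser; the output step stays the same, which is what lets every scraper feed the same format.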
The map above shows our progress during GLOW week, from 134 countries, where we began, up to today’s count of 201.
mySociety’s Tony, Lead on the EveryPolitician project, worked non-stop this week to get as many countries as possible online. But this week we’ve seen EveryPolitician reach some kind of momentum, as it takes off as a community project. It’s an ambitious idea, and it can only succeed with the help of this kind of community effort. Thanks to everyone who helped, including (in no particular order):
Duncan Walker for writing the scraper for Uganda; Joshua Tauberer for helping with the USA data; Struan Donald for handling Ecuador, Japan, Hong Kong, Serbia and the Netherlands; Dave Whiteland, with ThaiNetizen helpfully finding the data source for Thailand; Team Popong for South Korean data; Jenna Howe for her work on El Salvador; Rubeena Mahato, Chris Maddock, Kätlin Traks, François Briatte, @confirmordeny, and @foimonkey for lots of help on finding data; Henare Degan and OpenAustralia who made the scraper for Ukraine; Matthew Somerville for covering the Falkland Islands and Sweden; Liz Conlan for lots of help with Peru and American Samoa; Jaroslav Semančík who provided data for, and assistance with, Slovakia; Mathias Huter who supplied current data for Austria while Steven Hirschorn wrote a scraper for the historic data; Andy Lulham who wrote a scraper for Gibraltar; Abigail Rumsey who wrote a scraper for Sri Lanka; everyone who tweeted encouragement or retweeted our requests for help.
But there’s more
There are still 40 or so countries for which we have no data at all: you can see them here. This week has provided an enormous boost to our data, but the site’s real target is, just like the name says, to cover every politician in the world.
And once we’ve done that, there’s still the matter of both historic data, and more in-depth data for the politicians we do have. Thus far, we mostly have only the lower houses for most countries which have two — and for many countries we only have the current politicians. Going into the future we need to include much richer data on all politicians, including voting records, et cetera.
Meanwhile, our first target, to have a list of the current members of every national legislature in the world, is starting to look like it’s not so very far away. If you’d like to help us reach it, here’s how you still can.
Just how quickly can we hit the 200 countries mark on EveryPolitician? That’s what we’ll be finding out this week, and one thing’s for sure — we’ll get a lot further with your help.
This week is GLOW, the Global Legislative Openness Week, and we’re marking it with a concerted drive for more data.
Tony, the project lead, has consistently added one new country every day since EveryPolitician launched four months ago, and now it’s time to put a rocket behind our efforts.
The site currently contains data for 134 countries. We’ll be going flat out to see how quickly we can reach 200, and as the excitement ramps up, we hope you will help spread the word and get involved, too. Tony will carry on working as hard as he can to fill in the gaps, but we need your help to get further, faster.
What is EveryPolitician?
EveryPolitician is our project to store and share open data on every national-level legislator in the world — all in a standard format that can be used by anyone. We wrote about it here.
How can I help?
- Help us find data for more countries! We don’t currently know where to find the politician data for many countries. Here’s a list of the ones we need and here’s a page about how to contribute. If you get stuck, give us a shout.
- Write a scraper: If you have the know-how, you can help us enormously by helping scrape the data from the places we do know about. See this page for guidance on how to go about writing a scraper. You’ll find lots of examples here.
- You can also help by spreading the word – tell your friends, tweet, blog, get up on a platform and talk, and just generally share this post. Thank you!
Why do we need this data?
Politician data is readily available for most countries, but it comes in a massive variety of inconsistent formats. Most of those formats aren’t ‘machine readable’, that is to say, the data can’t readily be extracted and re-used by programmers, and pretty much every country differs on what information it provides about each politician.
That being the case, anyone who wants to build an online tool that deals with politicians from more than one country, or who would like their tool to be available to people in other places, or would like to adapt an existing tool to be used elsewhere, would first have to adapt their tool to cope with the data.
EveryPolitician saves them the trouble, and the structured format also means that the tools they build will be compatible with any other tools that use it.
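As a rough sketch of what that compatibility buys you, the Python below reads a Popolo-style JSON document and produces a simple list of politicians with their party and email address. The key names (`persons`, `organizations`, `memberships`, `person_id`, `on_behalf_of_id`) follow the Popolo standard as we understand it, but treat the exact layout as an assumption and check it against the file you are using.

```python
import json


def contact_list(popolo_text: str):
    """Return one dict per membership: name, email and party.

    Assumes a Popolo-style layout in which each membership points at a
    person via `person_id` and at a party via `on_behalf_of_id`.
    """
    data = json.loads(popolo_text)
    orgs = {org["id"]: org.get("name", "") for org in data.get("organizations", [])}
    people = {person["id"]: person for person in data.get("persons", [])}
    result = []
    for membership in data.get("memberships", []):
        person = people.get(membership.get("person_id"))
        if person is None:
            continue  # membership refers to someone not in the file
        result.append({
            "name": person.get("name", ""),
            "email": person.get("email", ""),
            "group": orgs.get(membership.get("on_behalf_of_id"), ""),
        })
    return result
```

Nothing in the function mentions a particular country, so a tool built on it for one legislature can be pointed at the file for another without changes.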
What kind of tools?
EveryPolitician data will be useful for all kinds of projects.
It’ll be much easier to build a website that shows people how to contact a politician. Or one that holds a government to account and educates people about what politicians are doing. Or one that helps voters make choices by displaying facts about what their politicians believe.
It can go further than that, though — with these building blocks in place, developers can really use their imagination to put together all kinds of projects, many of which we haven’t even begun to imagine. And don’t forget that, if a tool has been built to use the standardised data, it’ll also be easy for others to redeploy elsewhere.
If you’d like to see a concrete way in which the data’s already being used, check out Gender Balance.
How can I keep up to date?
We’ll be putting out regular updates via Twitter as the number of countries covered increases — plus you can watch the map turn green on http://everypolitician.org/countries.html as we progress.