Earlier this year, we were fortunate enough to be contacted by Brian Keegan, Assistant Professor in Information Science at the University of Colorado Boulder, who specialises in the field of network analysis.
Brian and his team were planning to mine the official biographies of every legislator published by the Library of Congress – going back to the first Congress in 1789 – and add the information as structured data to Wikidata. Having heard of our involvement with WikiProject Every Politician, they wanted to understand more about contributing.
The research team, which included professors from the Libraries, Political Science and Information Science departments, planned to combine this biographical data with more common data in political science about voting and co-sponsorship, so that interesting questions could be asked, such as “Do Ivy League graduates form cliques?” or “Are medical doctors more likely to break with their party on votes concerning public health?”. Their hypothesis was that the biographical backgrounds of legislators could play an important role in legislative behaviours.
However, the first big step before questions could be asked (or SPARQL queries made) was supporting undergraduate students to enter biographical data for every member of Congress (going right back to the first) on Wikidata. Such biographical data has not generally made it into the datasets that political scientists use to study legislative behaviour, and as students began to enter data about these historical figures, it quickly became apparent why: non-existent nations, renamed cities, and archaic professions all needed to be resolved and mapped to Wikidata’s contemporary names and standardised formats.
Nine months on, the team and ten undergraduates have revised over 1,500 Wikidata items about members of Congress, from the 104th to the 115th Congresses (1995-2018) and the 80th–81st Congresses (1947-1951), which is 15% of the way through all members dating back to the first Congress in 1789!
They started running SPARQL queries this summer.
Joe Zamadics, a political science PhD student who worked on the project, explained the potential of combining these data: “One example we tried was looking at House member ideology by occupation. The graph below shows the ideology of three occupations: athletes, farmers, and teachers (in all, roughly 130 members). The x-axis shows common ideology (liberal to conservative) and the y-axis shows member’s ideology on non-left/right issues such as civil rights and foreign policy. The graph shows that teachers split the ideological divide while farmers and athletes are more likely to be conservative.”
The team are keen to highlight the potential that semantic web technology such as Wikidata offers to social scientists.
For the full Q + A with Brian and Joe see the mySociety Medium post.
For a large part of next week, mySociety team members Mark, Rebecca and I will be attending the Code for All Summit, this year to be held in beautiful Bucharest.
Code for All, for those who don’t know, is the largest international network of Civic Tech organisations who “believe that digital technology opens new channels for citizens to more meaningfully engage in the public sphere and have a positive impact on their communities”.
And, as a collective, they have achieved some brilliant things: developing open source toolkits for disaster relief, schooling future generations and making laws and governments more accessible to citizens around the world.
This year, for the first time, Code for All takes the form of an open doors summit, as the aim is to get “every great mind of Civic Tech” together, not just the official Code for All affiliates. Under the title of “The Heroes of Tech”, it will focus on the barriers, challenges and the future of Civic Technology and explore the conference themes of The Power and Impact of Civic Tech, Scaling Civic Tech, Civic Tech & the Wider Context and Sustainable Civic Tech.
Our Head of Research, Dr Rebecca Rumbul, will be leading a workshop to explore how best to evidence Civic Tech, how to disseminate learning in the impact field, and what themes and activities will be featured in mySociety’s next Impacts of Civic Technology conference (TICTeC) in Paris, March 2019. Our Chief Executive Mark will be delivering a presentation and workshop to co-define and develop a declaration for the Democratic Commons.
We will be sure to report back!
Since 2015, mySociety have collected and shared open data on the world’s politicians via the EveryPolitician project.
And while we receive emails from across the world pretty much on a weekly basis, asking us to update a dataset, we still can’t say exactly who uses the EveryPolitician data, and for what purpose.
This is largely because we want to place as few barriers as possible to using the data. Asking folk to fill in a form or register with us before they access data which we believe ought to be free and accessible? Well, that would be counter to the whole concept of Open Data.
This fascinating read shares the results of Global Witness’s analysis of the UK’s Persons of Significant Control Register (PSC), in which they used EveryPolitician data to see if there are politicians who are also beneficial owners of a company registered in the UK.
- An automated system for red-flagging companies
- A visual tool for exploring the PSC register and other associated public interest datasets
The red-flagging tool can be used to uncover higher-risk entries, which do not indicate any wrongdoing but could be in need of further investigation… such as the 390 companies that have company officers or beneficial owners who are politicians elected to national legislatures, either in the UK or in another country.
The report also highlights some of the challenges faced by Companies House that prevent the register from fulfilling its full potential to help in fighting crime and corruption. We recommend a full read: you’ll find it here.
It is very helpful for us to demonstrate the uses of EveryPolitician data, both for our own research purposes and to enable us to secure the funding that allows us to go on providing this sort of service.
If you have or know of more examples of the data being used, please get in touch with me, Georgie. And if you value open, structured data on currently elected politicians, you should get involved with the Democratic Commons; this is a developing community of individuals and organisations working to make information on every politician in the world freely available to all, through the collaborative database Wikidata.
You may remember that, thanks to a grant from the Wikimedia Foundation, mySociety has been working to help increasingly authoritative data on the world’s politicians exist on Wikidata, as a key part of developing the concept of the Democratic Commons.
And this summer, mySociety welcomed two members of staff to support the community work around both Wikidata and the Democratic Commons. In May, I (Georgie) joined in the role of ‘Democratic Commons Community Liaison’ and in late June I was joined by Kelly, mySociety’s first ever ‘Wikimedia Community Liaison’… and it’s about time you started to hear more from us!
I’ve been climbing the learning curve: exploring the potential moving parts of a global political data infrastructure, finding out how the communities of Wikidata and Wikipedia operate, attempting to take meaningful notes at our daily meetings for the tool the team developed to improve political data on Wikidata and making sense of the complexity in creating interface tools to interpret the political data already in Wikidata. Oh, and supporting a “side-project” with Open Knowledge International to try and find every electoral boundary in the world (can you help?).
And if you are in any of the relevant open Slack channels (what is Slack?), you may have seen my name on the general introduction pages, as I have been shuffling around the online community centres of the world — off Wikidata Talk that is — trying to find the people interested in, or with a need for, consistently and simply formatted data on politicians, but who aren’t already part of the Wikidata community.
That’s because the issue the Democratic Commons seeks to address is the time-consuming business of finding and maintaining data on politicians: work that we suspect is duplicated by multiple organisations in each country (often all of them having a similar aim), and that slows down the delivery of the stuff that matters. This has certainly been mySociety’s experience when sharing our tools internationally.
And the solution we propose — the Democratic Commons — is that if people and communities worked together to find and maintain this data, it would be better for everyone… ah the paradox of simplicity.
Update on efforts to support the Democratic Commons concept
With each interaction and conversation that we’ve had about the Democratic Commons with partners, we’ve continued to learn about the best role for us to play. Here are some initial actions and thoughts that are shaping the work; please feel free to comment, or even better, get involved 🙂
Making sure the concept is a good fit through user research
We have set a goal to carry out user research on the concept of the Democratic Commons. So far, we have lined up calls with campaign staff (who are interested in using and supporting open political data through their UK campaigning work) and journalists in Nigeria (who have expressed a need for the data) and I am lining up more calls — if you have a need for or can contribute political data, let’s talk.
Bringing the Open Data/Civic Tech and Wiki communities together?
From my experience to date, the Civic Tech and Wiki communities appear to operate quite separately (I am very open to being proved wrong on this point!).
I am just getting started within the Wikidata/Wikimedia communities (that’s more for Kelly) but on the Open Data/Civic Tech side, there are questions about data vandalism and how far data from Wikidata can be trusted, arguments about the benefit of using Wikidata (especially where you already have a lot of useful data) and about whether there is a need to invest time in learning SPARQL, the query language used to retrieve and manipulate the data held in Wikidata.
Misconceptions are not unusual in communities online or offline, but it is a gap that our work focus, communications and tools hope to help close. If you have ideas for blog posts, video tutorials or articles that explain these concepts, please get in touch.
Working openly in existing global communities (off Wiki)
We are aware that, off-Wiki, mySociety is leading the work to develop the Democratic Commons. However, we know that we need to deliver this work in the open for it to be owned by people outside of mySociety, and finding the right homes to talk about it (off Wiki) has been important. In order to work openly, we have a shared #DemocraticCommons Slack channel with mySociety and Code for All; see ‘Get involved’ below to find out how to join the conversation.
We also plan to document the learning involved in the process through blog posts and documentation, to be uploaded publicly.
And, supporting local communities to develop, where possible
A global network such as Code for All is very useful in supporting a concept like the Democratic Commons. However, the bulk of need for the data will likely be country-specific. Together with our partners and collaborators, we are exploring what is needed and how to support local communities:
- Through the remainder of our Wikimedia Foundation Grant, we are supporting community events and editathons: in Lebanon with SMEX, in France with newly formed organisation F0rk, and in Spain with Wikimedia España.
- Some groups we are working with, such as Code for Pakistan, plan to set up a channel on their Slack instance and use their WhatsApp community to discuss data use and maintenance.
- In my own country, the UK, we are talking to mySociety’s community and collaborators to understand how the Democratic Commons could benefit organisations and work in practice here. If you want to be involved in this work, please contact me.
- We are listening to understand what support is needed with collaborators in the global South, as we’re well aware that it is a lot to ask people to work on a voluntary basis and that adequate support is needed. I hope we can share the learning and use it to shape any future projects that may emerge.
How to get involved in the Democratic Commons?
- Contribute to the Wikidata community: If you are a Wikidata user, or keen to learn, visit the Wikidata project page on political data. If you need guidance on tasks, do feel free to add to the Talk page to ask the community, or get in touch with Kelly, our Wikimedia Community Liaison: firstname.lastname@example.org.
- Join the conversation on Code for All Slack: If you would like to join the Slack conversation, join here: https://codeforall.org/ (scroll down and find the ‘Chat with us’ button).
- Look for electoral boundary data: We are working with Open Knowledge to find electoral boundary data for the whole world. See more about that here.
- Keep up to date and subscribe to our Medium blog: Sometimes these Democratic Commons posts are a bit too in-depth for the general mySociety readership, so for those who are really interested, we plan to share all we are learning here.
- Share the concept with contacts: Please share the message on your platforms and encourage potential users to take part in research and get involved. We recognise that our view — and reach — can only be anglo-centric, and we’d so appreciate any translations you might be able to contribute.
- Tell us (and others) how you think you would use the data: This can’t just be about collecting data; it’s about it being used in a way that benefits us all. How would the Democratic Commons help your community? We would love people to share any ideas, data visualisations, or theories, ideally in an open medium such as blog posts. Please connect with Georgie to share.
- Something missing from this list? Tell us! We’re @mySociety on Twitter or you can email email@example.com or firstname.lastname@example.org.
Image: Toa Heftiba
We, and Open Knowledge International, are looking for the digital files that hold electoral boundaries, for every country in the world — and you can help.
Yeah, we know — never let it be said we don’t know how to party.
But seriously, there’s a very good reason for this request. When people make online tools to help citizens contact their local politicians, they need to be able to match users to the right representatives.
So head on over to the Every Boundary survey and see how you can help — or read on for a bit more detail.
Data for tools that empower citizens
If you’ve used mySociety’s site TheyWorkForYou — or any of the other parliamentary monitoring sites we’ve helped others to run around the world — you’ll have seen this matching in action. Electoral boundary data is also integral in campaigning and political accountability, from Surfers Against Sewage’s ‘Plastic Free Parliament’ campaign, to Call your Rep in the US.
These sites all work on the precept that while people may not know the names of all their representatives at every level — well, do you? — people do tend to know their own postcode or equivalent. Since postcodes fall within boundaries, once both those pieces of information are known, it’s simple to present the user with their correct constituency or representative.
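To make that matching step concrete, here is a minimal sketch in Python of how a postcode might be resolved to a constituency. Everything in it is an assumption for illustration: the postcodes, coordinates and constituency names are invented, and the functions `point_in_polygon` and `constituency_for_postcode` are hypothetical helpers, not code from any mySociety site. It assumes each postcode has already been reduced to a single point and each boundary to a simple polygon, and it uses the standard ray-casting test to check which polygon contains the point.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]   # (x, y), e.g. longitude and latitude
Polygon = List[Point]         # boundary vertices, in order

# Invented sample data: real tools would load these from boundary files
# and a postcode lookup table.
POSTCODE_CENTROIDS: Dict[str, Point] = {
    "AB1 2CD": (1.0, 1.0),
    "EF3 4GH": (5.0, 1.5),
}

CONSTITUENCY_BOUNDARIES: Dict[str, Polygon] = {
    "Westshire": [(0.0, 0.0), (3.0, 0.0), (3.0, 3.0), (0.0, 3.0)],
    "Eastshire": [(3.0, 0.0), (7.0, 0.0), (7.0, 3.0), (3.0, 3.0)],
}


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: count how many edges a rightward ray from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def constituency_for_postcode(postcode: str) -> Optional[str]:
    """Return the name of the constituency containing the postcode, if any."""
    key = " ".join(postcode.upper().split())  # normalise case and spacing
    point = POSTCODE_CENTROIDS.get(key)
    if point is None:
        return None
    for name, boundary in CONSTITUENCY_BOUNDARIES.items():
        if point_in_polygon(point, boundary):
            return name
    return None


print(constituency_for_postcode("ab1 2cd"))
```

Real boundary files are far more detailed than these rectangles (with holes, islands and many thousands of vertices), so a production tool would more likely rely on a geospatial library or database than on a hand-written test like this one.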
So the boundaries of electoral districts are an essential piece of the data needed for such online tools. As part of mySociety’s commitment to the Democratic Commons project, we’d like to be able to provide a single place where anyone planning to run a politician-contacting site can find these boundary files easily.
And here’s why we need you
Electoral boundaries are the lines that demarcate where constituencies begin and end. In the old days, they’d have been painstakingly plotted on a paper map, possibly accessible to the common citizen only by appointment.
These days, they tend to exist as digital files, available via the web. Big step forward, right?
But, as with every other type of political data, the story is not quite so simple.
There’s a great variety of organisations responsible for maintaining electoral boundary files across different countries, and as a result, there’s little standardisation in where and how they are published.
How you can help
We need the boundary files for 231 countries (or as we more accurately — but less intuitively — refer to them, ‘places’), and for each place we need the boundaries for constituencies at national, regional and city levels. So there’s plenty to collect.
As we so often realise when running this sort of project, it’s far easier for many people to find a few files each than it would be for our small team to try to track them all down. And that, of course, is where you come in.
Whether you’ve got knowledge of your own country’s boundary files and where to find them online, or you’re willing to spend a bit of time searching around, we’d be so grateful for your help.
Fortunately, there’s a tool we can use to help collect these files — and we didn’t even have to make it ourselves! The Open Data Survey, first created by Open Knowledge International to assess and display just how much governmental information around the world is freely available as open data, has gone on to aid many projects as they collect data for their own campaigns and research.
Now we’ve used this same tool to provide a place where you can let us know where to find that electoral boundary data we need.
Where to begin
Thanks for your help — it will go on to improve citizen empowerment and politician accountability throughout the world. And that is not something everyone can say they’ve done.
Image credit: Sam Poullain