Tl;dr: We’re now releasing our register of interests data as a spreadsheet.
High-quality data about the external interests of our MPs and ministers is vital to identifying conflicts of interest, and to discouraging politicians from having conflicts of interest in the first place.
Lack of clarity on the interests and income streams of MPs is a corruption risk. The problem with second jobs and outside interests is less that MPs might be distracted from their main job, and more that when they stand in Parliament they may be representing groups beyond their constituents, asking questions (or not asking questions) depending on their outside work.
When outside interests exist, it’s vital they are clear and transparent. The Register of Members’ Interests contains a list of disclosures MPs are required to make of financial interests or benefits which “others might reasonably consider to influence his or her actions or words as a Member of Parliament”. Following the Owen Paterson scandal, there was renewed interest in this data, as it was clear that there were a number of potential stories and scandals hidden in plain sight – just requiring someone to join up the data.
Building a data ecosystem
A key problem is that the data is not easy to work with. The data is released (roughly fortnightly) on the parliament.uk website as an HTML document for each MP. This process technically releases the information, but makes it hard to compare releases of the same MP over time, or to make comparisons between different MPs.
TheyWorkForYou improves on this by creating structured data from the HTML release. Using this we can highlight the changes in each release from the previous release. This is useful for journalists and campaigners in quickly understanding what has changed in each release. For instance, the change in Rishi Sunak’s register over time can be seen here.
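To illustrate the idea of highlighting what changed between two releases, here is a minimal sketch. It is not TheyWorkForYou's actual code: the function name `diff_releases`, the representation of a release as a list of entry strings, and the sample entries are all assumptions made for the example.

```python
def diff_releases(previous, latest):
    """Compare two releases of one MP's register.

    Each release is assumed to be a list of entry strings that have
    already been extracted from the HTML. Order is preserved in the output.
    """
    prev_set, latest_set = set(previous), set(latest)
    return {
        # Entries that appear only in the newer release
        "added": [e for e in latest if e not in prev_set],
        # Entries that were in the older release but have gone
        "removed": [e for e in previous if e not in latest_set],
    }


# Invented sample data, for illustration only
previous_release = [
    "Employment: adviser to Example Ltd",
    "Gift: two tickets from Example Sports Club",
]
latest_release = [
    "Employment: adviser to Example Ltd",
    "Visit: flights paid by Example Foundation",
]

changes = diff_releases(previous_release, latest_release)
for kind, entries in changes.items():
    for entry in entries:
        print(f"{kind}: {entry}")
```

An exact string match is the simplest possible comparison. In practice, entries that are reworded slightly between releases would show up as one removal plus one addition, so a real comparison would likely need some fuzzier matching.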
We want to avoid people doing the same work of cleaning the data over and over. We make our version of the data available publicly, so other people can use our work to do things that we haven’t done ourselves. For instance, Open Innovations have built on top of the data we publish to link the data to other datasets and create a Register of Members’ Financial Interests Explorer.
While projects like the Tortoise/Sky News Westminster Accounts create new value in joining up datasets and cleaning the data for their own work – ultimately the new datasets they have created are only usable by those organisations. That’s their right as the people doing the work – but we think there is a bigger (and more sustainable) impact to be had in improving the data in public.
Making our data more accessible
Previously, we have published our interests data as a series of XML files, which is useful for programmers, but harder for other specialists to work with. We did some thinking with OpenDemocracy last year to explore whether there were small changes we could make that would make the work we already do more useful.
As well as the XML files, we now publish an experimental spreadsheet version of all data since 2000, and the register for the current 2019 Parliament.
These sheets show the earliest and latest disclosure of an interest, and include some (very) basic NLP analysis to extract mentioned orgs from the free text and make it easier to quickly parse when scrolling.
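As a rough sketch of how an "earliest and latest disclosure" column could be derived, the example below collapses repeated appearances of the same entry across releases into a single row with a first-seen and last-seen date. The row layout `(mp, entry_text, release_date)`, the function name and the sample data are assumptions for illustration, and the NLP step that extracts organisations is not covered here.

```python
from datetime import date


def first_and_last_seen(rows):
    """Collapse repeated entries into (earliest, latest) release dates.

    rows: iterable of (mp, entry_text, release_date) tuples, one per
    appearance of an entry in a release (a hypothetical layout).
    Returns a dict keyed by (mp, entry_text).
    """
    spans = {}
    for mp, text, released in rows:
        key = (mp, text)
        if key not in spans:
            spans[key] = (released, released)
        else:
            earliest, latest = spans[key]
            spans[key] = (min(earliest, released), max(latest, released))
    return spans


# Invented sample data: the same entry appears in three releases
sample_rows = [
    ("Example MP", "Adviser to Example Ltd", date(2023, 2, 13)),
    ("Example MP", "Adviser to Example Ltd", date(2023, 1, 16)),
    ("Example MP", "Adviser to Example Ltd", date(2023, 1, 30)),
    ("Example MP", "Tickets from Example Club", date(2023, 1, 30)),
]

spans = first_and_last_seen(sample_rows)
for (mp, text), (earliest, latest) in spans.items():
    print(mp, "|", text, "|", earliest.isoformat(), "to", latest.isoformat())
```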
This data can also be explored through Datasette, which can be used to query the datasets in the browser, and save the queries as links that can be shared.
For instance, the following links go to specific queries (we’re using an in-browser version for prototyping and this might take a minute to load):
- Paid visits to outside UK mentioning the UAE
- Gifts from England Lawn Tennis Club
- Declarations involving a helicopter
- Declarations new in latest release
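Datasette runs SQL queries against a SQLite database, so queries like the ones above boil down to a short `SELECT`. The sketch below shows the general shape using Python's built-in `sqlite3` module. The table name `interests`, its columns and the sample rows are assumptions for the example, not the schema of the published dataset.

```python
import sqlite3

# In-memory database with a hypothetical, simplified schema
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interests (mp TEXT, category TEXT, free_text TEXT)")
conn.executemany(
    "INSERT INTO interests VALUES (?, ?, ?)",
    [
        # Invented rows, for illustration only
        ("Example MP A", "Gifts", "Helicopter flight provided by Example Aviation"),
        ("Example MP B", "Visits", "Flights and hotel paid by Example Foundation"),
        ("Example MP C", "Gifts", "Two tickets from Example Sports Club"),
    ],
)


def search_declarations(conn, term):
    """Return declarations whose free text mentions the given term."""
    query = (
        "SELECT mp, category, free_text FROM interests "
        "WHERE free_text LIKE ? ORDER BY mp"
    )
    # The search term is passed as a parameter rather than pasted into the SQL
    return conn.execute(query, (f"%{term}%",)).fetchall()


helicopter_rows = search_declarations(conn, "helicopter")
for mp, category, free_text in helicopter_rows:
    print(mp, "|", category, "|", free_text)
```

SQLite's `LIKE` is case-insensitive for ASCII text by default, which is why searching for "helicopter" also matches "Helicopter".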
We want to continue to improve our approach here – and welcome feedback from anyone this spreadsheet helps.
Parliament can do better data publication
A key problem that everyone working with the data runs into is that it’s broken to start with. MPs fill things out in inconsistent ways that make the overall data difficult to analyse without cleaning first (see both the Open Innovations and Tortoise/Sky News methodology notes). Fixing this up is a key first step towards aggregate analysis – and the easiest place to fix it is with validation when the data is collected at the start.
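To make "validation when the data is collected" concrete, here is a minimal sketch of the kind of check a submission form could run before accepting an entry. The field names (`donor`, `value_gbp`, `received`) and the rules are hypothetical and do not reflect Parliament's actual forms or requirements.

```python
from datetime import date


def validate_entry(entry):
    """Return a list of problems with a submitted entry (empty if none).

    entry: dict with hypothetical keys 'donor', 'value_gbp' and 'received'.
    """
    errors = []

    # A named donor is needed to link entries across MPs and datasets
    if not str(entry.get("donor", "")).strip():
        errors.append("donor is required")

    # Values should be plain numbers, not free text like "£1,500 approx"
    try:
        if float(entry.get("value_gbp")) < 0:
            errors.append("value_gbp must not be negative")
    except (TypeError, ValueError):
        errors.append("value_gbp must be a number")

    # Dates in one fixed format can be sorted and compared reliably
    try:
        date.fromisoformat(str(entry.get("received", "")))
    except ValueError:
        errors.append("received must be a YYYY-MM-DD date")

    return errors


good = {"donor": "Example Ltd", "value_gbp": "1500", "received": "2023-03-12"}
bad = {"donor": " ", "value_gbp": "£1,500", "received": "12 March 2023"}

print(validate_entry(good))
print(validate_entry(bad))
```

Rejecting free-text amounts and dates at the point of entry means nobody downstream has to guess what "£1,500" or "12 March" was meant to be.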
While work can be done to improve the data after the fact (and experiments with Generative AI have found it to be quite good at fixing inconsistent formatting), improving the initial data collection is the most effective way of improving the quality of the data. There are active moves in Parliament to fix some of these problems. Producing more information in machine readable formats, and adding methods to make sure the data is correct to start with, will make the transparency process simpler at every stage.
Similar issues apply to the register published for All-Party Parliamentary Groups (APPGs), which should publish, as “machine readable” data, the full range of information that the groups are formally supposed to make publicly available. APPGs are semi-official groups that MPs can form around specific interests or issues. Many of these are useful ways of having discussions, but they can also be an avenue for corruption, with outside interests supporting the group and its activities. The register includes the officers of groups and financial assistance and gifts received by the group – but not the overall membership. APPGs are separately required to disclose their wider membership on their website (or if they don’t have a website, if someone asks) but this isn’t included in the register, and so can’t be consistently scraped to produce data. While MPs are supposed to disclose benefits from groups on their individual disclosure, clearer data on memberships that are officially “public” would help ensure that nothing is missed between these two datasets.
Separately, there is a register of ministerial interests that applies to MPs who also have government positions. This is in principle more strict, requiring disclosures of relevant interests of family members, and avoiding even perceived conflicts of interest. However, in practice the information does not contain the specific financial value of gifts or benefits, just that they exist. The disclosure cycle is also longer, being published every six months rather than roughly fortnightly. In practice, this means that relevant interests may not be public for a significant time after a minister is appointed (and potentially never published, if the minister has again moved on by then).
There is a lot of work that can be done from the outside to build on official data. But the more Parliament does things that it is uniquely able to do, the more we can focus on analysis and data comparisons that are best done outside.
What mySociety can do
A very basic thing we can do is beat the drum (and work with those who have been doing this for ages) for better publication of data from Parliament.
Whether or not this happens, we can do work to make the data better. If it looks like Parliament’s data is unlikely to be fixed at the source, then a project of improving the data in public, in a way that multiple projects could then build on, would be useful. But if the data gets better, then we can better spend our time doing more work on top of this data. This might include joining up the official data with other datasets (including those of the UK’s other Parliaments and Assemblies) to draw out connections and better analysis.
But our work here isn’t just about producing good data – it’s about displaying it in a way that’s useful and understandable to people. Chris Bryant MP (former Chair of the Standards Committee) has argued that Parliament’s own display of the history of registers should match what’s provided by TheyWorkForYou. If Parliament improved its own display to the public of registers of members’ interests this would be fantastic news – and we in turn would need to think about whether there are new approaches that would be useful on top of that.
One approach we are thinking about would be to find out what people want to know about their MPs’ interests, and then use volunteers to answer a set of common questions. This is the kind of editorialising that Parliament itself would find much harder to do – while providing something different from aggregate analysis of the data all together. This is something we could do with the data as it exists, but better data would let us create new tools so volunteers could answer more complicated questions.
Making MPs’ interests clearer and easier to understand is key to spotting conflicts of interest and keeping politicians accountable. We hope our new spreadsheet version of the data helps make the work we’re already doing more useful and accessible – while we think about the road we want to take in future to improve TheyWorkForYou and the project of a transparent democracy.
Image: Wilhelm Gunkel on Unsplash.