This blog post is part of our Repowering Democracy series. We are publishing a series of short pieces of writing from mySociety staff and guest writers who are thinking about how our democracy works and are at the frontlines of trying to improve it.
This week, we’re re-publishing a blog post from Anna Powell-Smith at the Centre for Public Data, which is a new, non-partisan non-profit working for stronger public data. We’ve previously worked together on recommendations to avoid fragmented public data. This blog post touches on several issues close to our hearts: Parliamentary written questions, and where there isn’t enough data to understand what’s going on.
—
Data gaps are under-reported, because it’s hard to write about data that doesn’t exist.
As we’ve written about before, newspapers publish endless stories on house prices, where there’s lots of data – but few on rental costs, even though millions of people rent. That’s partly because the Office for National Statistics doesn’t collect much data on rentals.
To tackle this problem, I’ve been thinking about how to map data gaps, and make them more visible.
And I think the best way is actually to think about questions, instead of data. What are the important questions that the government can’t answer?
Obviously, ‘important’ is subjective! But one source of clearly important questions is Parliamentary written questions, which are the formal questions that MPs and peers ask the government. Where the government doesn’t have the data to answer them, it has to say so.
So this post introduces new research: a data analysis of 200,000 Parliamentary written questions, and what they tell us about the UK’s missing numbers.
Our modest goal: to find the UK’s biggest data gaps.
What we did
Building on some previous research of ours, we strapped on our coding hats 🪖, and did the following:
- First, we scraped all the written questions in Parliament from December 2019 to February 2023, from TheyWorkForYou, which gave us about 200,000 questions.
- Next, we flagged questions asking for quantitative information, with phrases like “how many” or “how much” – which showed that about a fifth of questions wanted data, just under 40,000.
- Then we flagged questions where the government apparently said the data was “not held”, “not collected”, etc. About a quarter of quantitative questions were answered like this.
And we ended up with a dataset of around 10,000 questions where MPs apparently both (i) asked for data, and (ii) were told it was not available. So: missing numbers.
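As a rough illustration of the two flagging steps above, here is a minimal Python sketch. It is not the code used in the research: the phrase lists contain only the examples quoted in this post (the real lists were presumably longer), and the record layout with `question` and `answer` keys, the function names and the sample records are all made up for illustration.

```python
import re

# Assumed phrase lists: only the examples quoted in the post.
# A real analysis would need many more variants.
QUANT_PATTERN = re.compile(r"\bhow (?:many|much)\b", re.IGNORECASE)
NOT_HELD_PATTERN = re.compile(r"\bnot (?:held|collected)\b", re.IGNORECASE)


def is_quantitative(question_text: str) -> bool:
    """True if the question appears to ask for a number."""
    return bool(QUANT_PATTERN.search(question_text))


def is_data_unavailable(answer_text: str) -> bool:
    """True if the answer appears to say the data is not available."""
    return bool(NOT_HELD_PATTERN.search(answer_text))


def find_missing_numbers(records: list[dict]) -> list[dict]:
    """Keep records where data was asked for and reported unavailable."""
    return [
        r for r in records
        if is_quantitative(r["question"]) and is_data_unavailable(r["answer"])
    ]


# Invented sample records, for illustration only.
sample = [
    {"question": "To ask how many GP vacancies there are in England.",
     "answer": "This information is not held centrally."},
    {"question": "To ask how much was spent on the programme last year.",
     "answer": "The department spent a total of two million pounds."},
    {"question": "To ask what plans the department has for reform.",
     "answer": "Data on this is not collected."},
]

missing = find_missing_numbers(sample)
```

Simple phrase matching like this will mis-tag some questions, which is why a manual spot-check matters.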
Then we spot-checked the questions to check our method. It wasn’t perfect, but it was very decent. (It helps that Parliament uses formal, consistent language.) You can download the full dataset here.
Sometimes, MPs ask about strange things, like jobs for clowns. But most are extremely serious, covering the issues that affect MPs’ constituents. And overall, they tell us what MPs need to know.
Data gaps by department
Firstly, we looked at how often each government department said that data wasn’t available. (See the code.) And there were huge differences:
- At the Department of Health & Social Care, around 40% of quantitative requests were unanswered (though we can cut them some slack, as this was during the Covid pandemic).
- At the Home Office and the Department for Work & Pensions, around a third were; at the Ministry of Justice the proportion of unanswered quantitative requests was 30%, and the Department for Education 27%.
- But the proportion was much lower at other big departments – almost all others were below 20%.
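To show the kind of calculation behind these percentages, here is a small Python sketch. It is an illustration only, not the linked analysis code: the function name, the input layout (one pair of department name and "data unavailable" flag per quantitative question) and the toy rows are all assumptions.

```python
from collections import Counter


def unanswered_rate_by_department(rows):
    """Share of quantitative questions per department flagged as unanswered.

    `rows` is an iterable of (department, data_unavailable) pairs,
    one per quantitative question. This layout is an assumption.
    """
    asked = Counter()
    unanswered = Counter()
    for department, data_unavailable in rows:
        asked[department] += 1
        if data_unavailable:
            unanswered[department] += 1
    return {dept: unanswered[dept] / asked[dept] for dept in asked}


# Toy rows with invented departments, not real figures.
toy_rows = [
    ("Department A", True), ("Department A", False),
    ("Department A", True), ("Department A", False),
    ("Department B", True), ("Department B", False),
    ("Department B", False), ("Department B", False),
    ("Department B", False),
]

rates = unanswered_rate_by_department(toy_rows)

# List departments from highest to lowest unanswered share.
for dept, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{dept}: {rate:.0%}")
```

Comparing shares rather than raw counts matters here, because some departments receive far more questions than others.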
Of course, we need to be cautious here, as the numbers are approximate. Without reading each question, we can’t be sure that we’ve tagged it correctly, or if the MP was asking something impossible. It’s probably most useful to consider the differences between departments.
Given that, it’s not surprising that the health, benefits, justice and education departments would get requests for data, since they run massive operational services that affect people’s lives. (The Foreign Office, by contrast, largely seems to get asked about wine.) It’s more surprising that they seem to struggle to answer them more than other departments.
Now let’s dive into what these unanswered questions were about.
The topics with the biggest data gaps
Each question scraped has a title. We can use this to see which topics were least likely to get an answer.
Other than Covid-related topics, the major topics with the highest proportion of unanswered questions were:
- Benefits – grouping together benefits like Universal Credit and PIP
- Asylum, refugees and migrants
- Child maintenance
- Energy meters
- Armed forces housing
This seems plausible. The DWP Select Committee has repeatedly criticised the government for the lack of visibility over the benefits system; the statistics regulator has expressed concerns about the use of asylum statistics, while the National Audit Office has noted gaps in the data available on smart meters.
We also used GPT-4 to try tagging questions, which worked quite well. We used it to tag questions to the Department of Health & Social Care. This helped us identify major clusters of unanswered questions in these areas.
In healthcare, MPs often struggled to get basic prevalence information, whether:
- the number of people diagnosed with particular conditions, like silicosis
- diagnoses broken down by characteristics, like the number of women with meningitis, or region, like autism in the East of England
- or diagnoses for particular (important) groups, like prisoners with mental health conditions.
Funding is also surprisingly difficult to get information about, e.g.:
- funding for particular conditions, like endometriosis
- funding on social care by local authority
- or funding at a local level, especially per hospital, which MPs often care about.
Following on from this, hospital-level information in general often seems to be poor, e.g.:
- how many A&E visits there are per hospital
- what waiting times are per hospital.
And finally, workforce is a huge one, with topics like:
- vacancies – how many current GP vacancies are there?
- retention – how many dentists are still working 5 years after qualifying?
- skills – how many specialists are there with particular skills, like Parkinson’s nurses or motor neurone nurses?
You can see the tagged questions here – there are many more examples under each topic.
This gets really worrying when you look at the dataset over time. It’s immediately clear that MPs often ask the same thing over and over again – yet the information doesn’t seem to improve.
What next?
We think statistics producers should be monitoring Parliamentary questions, to tell them where data needs to be better. After all, MPs deserve answers to their questions, and so do we all.
If you can help us make this happen, we’d love to talk.
If you’re interested in this research – or even better, if you can fund us to do more of it! – please do get in touch.
Image: Tom Chen on Unsplash.