We’ve been working over the last few years to make our research as easy to read and explore as we can. However, because we release a lot of open data (and are usually open to sharing other data with researchers) there’s also been a lot of research written by researchers outside mySociety, which of course also forms part of the knowledge base about our services.
As such, we’re expanding the scope of the research store to include work about mySociety’s services that has been produced by researchers beyond our own team.
Where papers have been released under a Creative Commons licence but there is only a PDF file available, we will sometimes create more accessible versions. For instance, we have already done so with Emily Shaw’s research into Civic Tech Cities and Frederik M Sjoberg, Jonathan Mellon, & Tiago Peixoto’s exploration of how receiving a response through FixMyStreet affects the probability of making future reports.
This isn’t yet a comprehensive collection, but we plan to add new research as it is published, and retrospectively add older research on a rolling basis. Sign up for our newsletter to hear when new research is added.
While we’re making things easier to find, we’ve also started including mySociety’s responses to calls for evidence and consultations on the research portal, and you can see those here.
Research Mailing List
Sign up to our mailing list to hear about future research.
With funding from the Consumer Data Research Centre (CDRC) we’ve been working with researchers from the University of Sheffield and University of Stirling to open up FixMyStreet data for researchers.
For an example of the kind of thing that can be done with this data, this group have produced maps for every local authority in the UK, mapping FixMyStreet reports against indices of deprivation (a few examples: Sheffield, Harrogate and Cardiff). These can be explored on our mini-site, where for each authority you can also download a printable poster with additional statistics.
If you’d like to know more about what these maps mean and what we learned from the process, there’s a report exploring this here.
The Inter-Parliamentary Union releases a report each year detailing changes in the representation of women across the world. In 2017, women represented 23.4% of all MPs – which is less than half of the proportion of women in the population at large.
While the picture for the last decade shows a positive trend, there is nothing inevitable about ever-increasing representation of women. The IPU report notes that while Albania and France’s representation of women rose by 10% and 12% respectively, other countries saw a decline. Improved representation of women is often a result of decisions deliberately taken to improve representation, rather than being a natural outcome of unstoppable social forces.
One of the pitfalls of international comparisons is that they obscure some of the drivers of good and poor representation. Increased representation of women is often uneven, and concentrated more in some parties than in others. As Miki Caul points out, international comparisons of relative representation of women overlook “the fact that individual parties vary greatly in the proportion of women MPs within each nation”. Similarly, Lena Wängnerud argues “cross-country studies tend to miss variations between parties within a single system. Variations in the proportion of women to men are even greater across parties than across nations”.
To understand more about this, we’ve built an experimental mini site to examine the roles of parties in driving the representation of women. Using data from EveryPolitician.org (which contains gender and party information for a number of countries), we can explore the respective contributions of different parties to representation of women.
For this it’s not enough to look at the gender ratios of all the parties individually, as those with the best proportional representation of women are often quite small — for instance, the Green Party in the UK has 100% female representation, in the form of its one MP.
Instead, what we look at is the respective contributions to the total gender ratio. For each party we look at how much better or worse the proportional representation of women would be if you ignored that party’s MPs.
For instance in the UK, while the gender ratio of the current House of Commons is around 32%, the Labour Party’s ratio is around 44%. If you take out the Labour Party the representation of women in the House of Commons as a whole drops to 23%.
For our purposes, the Labour Party is the UK’s Most Valuable Party (MVP): ignoring it leads to the largest reduction in the representation of women. For each country, the gap between the ‘gender ratio’ and the ‘gender ratio ignoring the MVP’ gives a new metric for understanding the gap in gender representation. Where this number is high, it means that the role of individual parties is very important; where it is lower it means that the ratio is not strongly driven by party effects. For instance, the gender ratio in the United States is strongly driven by party effects, while in Bolivia it is not.
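To make the calculation concrete, here is a minimal Python sketch of the ‘ignore one party’ approach described above. The party names and seat counts are invented for illustration and are not EveryPolitician data, and the function names are our own; the code behind the mini-site may well differ.

```python
# Hypothetical (party, gender) records; not real EveryPolitician data.
members = (
    [("A", "female")] * 45 + [("A", "male")] * 55
    + [("B", "female")] * 20 + [("B", "male")] * 80
    + [("C", "female")] * 1
)


def gender_ratio(records):
    """Share of records marked female, as a fraction between 0 and 1."""
    if not records:
        return 0.0
    women = sum(1 for _, gender in records if gender == "female")
    return women / len(records)


def ratio_ignoring_each_party(records):
    """For each party, the chamber-wide ratio with that party's members removed."""
    parties = {party for party, _ in records}
    return {
        party: gender_ratio([r for r in records if r[0] != party])
        for party in parties
    }


def most_valuable_party(records):
    """Party whose removal lowers the overall ratio most, and the size of that drop."""
    overall = gender_ratio(records)
    without = ratio_ignoring_each_party(records)
    mvp = min(without, key=without.get)
    return mvp, overall - without[mvp]


mvp, gap = most_valuable_party(members)
print(f"MVP: {mvp}, gap: {gap:.1%}")
```

In this made-up chamber, party C is 100% women but removing its single member barely moves the overall ratio, while removing party A lowers it the most. That is why the MVP is chosen on contribution to the total rather than on each party’s own ratio.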
Countries with a wide gap between the ‘ratio ignoring the best party’ and ‘ratio ignoring the worst party’ tend to be countries that use majoritarian electoral systems, like the UK. Pippa Norris shows that countries using majoritarian electoral systems tend to have a poorer representation of women than those using proportional representation, but also that there is a lot of variation within each family of electoral systems and “the basic type of electoral system is neither a necessary nor a sufficient condition to guarantee women’s representation”.
Our analysis shows that parties have different levels of agency to improve the overall representation of women depending on the party structure created by the electoral system. Countries that use proportional representation tend to show smaller party effects because there are usually more parties with fewer MPs — and so the ability of any one party to shift the overall representation is reduced. Conversely, in FPTP parliaments with only a few major parties, a large amount of change can happen by only one of these major parties taking measures to improve their internal representation of women.
For example, while Germany’s CDU and the UK’s Conservative party have a similar representation of women at the national level (20.5% and 21.14% respectively), the Conservative party has more than twice the leverage to affect the overall representation of women simply by changing their own policy.
There are limits to using the proportional representation of women as a single measure for the political representation of women. As mySociety’s Head of Research Rebecca Rumbul has previously shown, even bodies with relatively good representation of women like the National Assembly for Wales can then fall down on other areas – with a low proportion of oral evidence to consultations and committees coming from women. While the UK’s Conservative party performs poorly on the proportion of MPs, it has conversely selected more female party leaders and Prime Ministers.
Importantly, looking at the representation of women as a single figure also obscures the important role of social factors such as class or race in shaping which women are represented. Creating a metric for comparison across many different countries is inherently reductive and discards important information about local context in every instance.
Our goal with this website has been to re-complicate the international comparison by moving away from a single national statistic for representation in a way that assigns agency to political actors within each country. Variations among these parties (and international variations in this variation) reflect that representation of currently under-represented groups isn’t a natural fact of life in a given country, but reflects choices made – and that other choices can lead to different outcomes.
This is still a work in progress and we acknowledge there will be holes in how this data has been applied. Lack of gender information for all countries means that some countries that have high representation of women (such as Rwanda) are not addressed. This means that it shouldn’t be taken as a comprehensive ranking — but we hope it is useful as a jumping off point for thinking about the representation of women in parliaments across the world.
We have detailed our methodology here, including known issues with the data. This is an early experiment with the data and we welcome feedback on the website here, or you can get in touch through the contact details here.
The data the site is built on can be downloaded from everypolitician.org.
A key part of mySociety’s research agenda is understanding how Civic Technology is (or isn’t) helping under-represented groups in society access government services and their representation. In 2015 we released a report, Who Benefits from Civic Technology, that explored variations in usage of Civic Tech across various countries and demographics. You can read or download it here.
In this blog post I’m going to talk a bit about how we’ve internally tried to apply our data to understanding the under-representation of women in politics and as users of our services, as well as some interesting things that external researchers have found using our data.
Our EveryPolitician dataset contains information on current (and in some cases historical) politicians for a large number of countries around the world. For a large number of representatives, this includes gender information.
However, a key problem of international comparisons of the representation of women is, as Miki Caul points out, that it “overlooks the fact that individual parties vary greatly in the proportion of women MPs within each nation”. Similarly, Lena Wängnerud argues “cross-country studies tend to miss variations between parties within a single system. Variations in the proportion of women to men are even greater across parties than across nations”.
Fortunately, this is exactly the kind of problem that an international dataset like EveryPolitician is well placed to examine – on Thursday we’ll be using a new mini-site to explore the gender and party information contained in EveryPolitician to give a sense of the international picture and the party-level differences within each country. Stay tuned! Or you can download the data yourself (there are APIs for Python, Ruby and R) and try and beat us to it.
TheyWorkForYou makes it easy to search through the history of what has been said in Parliament, and we make the data (based on the Hansard dataset but more consistently formatted) freely available to download. As essentially a download of a very large amount of text, getting insights from this dataset is a bit more complicated, but potentially very rewarding.
Jack Blumenau has a paper based on TheyWorkForYou data using language to analyse whether appointing female ministers changes how other female MPs participate in debates. Looking at “half a million Commons’ speeches between 1997 and 2017, [he demonstrates] that appointing a female minister increases the participation of women MPs in relevant debates by approximately one third over the level of female participation under male ministers” – and that “female MPs also became more influential in debates under the purview of female ministers […] female ministers respond in a systematically different fashion to the speeches of female MPs.” In this case, influence is a measure of whether the language an individual used is then taken up by others, and this kind of analysis shows how the TheyWorkForYou dataset can be used to demonstrate not just counts of how many women were in Parliament, but the substantive effects of women holding office on the political process.
As Myf talked about yesterday, TheyWorkForYou’s Commons content now extends back to 1918, and so includes every speech ever made by a female MP. We hope this is a useful resource for anyone interested in exploring the history of the representation of women in the UK and have plans for a small project in the upcoming months to show in a simple way how this data can be used (please sign up to our mailing list if you’re interested in hearing about this when it’s completed).
FixMyStreet and WriteToThem
Understanding the under-representation of women is important across our services. Where men and women are experiencing different issues and concerns, imbalances in access (or use of access) potentially lead to differences in resource allocation.
The majority of reports on FixMyStreet.com are made by men – but to make things more complicated, it’s not just that women make fewer reports: women also report substantively different kinds of problems.
Reka Solymosi, Kate Bowers and Taku Fujiyama investigated FixMyStreet reports and found (by determining gender from the names of problem reporters) that men and women tend to report different kinds of problems – they suggest that at “first glance it appears that men are more likely to report in categories related to driving (potholes and road problems), whereas women report more in categories related to walking (parks, dead animals, dog fouling, litter)”.
If different kinds of reports are differently gendered, this complicates thinking about how to improve how women use the website – as potential users are having substantially different experiences of problems in the real world well before they interact with the site. We have to engage with the nuance of this kind of finding to understand how to redress issues of access to services.
We’re currently in the process of extending this kind of analysis to our other services. For WriteToThem, we’ve learned that while the majority of people using the service to write to MPs are male (around 60%), this picture is different depending on the level of government – for instance the gender balance for people writing to councils is pretty close to 50/50.
As part of this, we’re investigating whether having the same gender as their representative makes people more likely to make contact. This has some interesting preliminary findings, and we hope to have more to say about this towards the end of the year.
Our research in this area is ongoing, and we’re keen to help people use our data to investigate under-representation – especially where you have expertise or knowledge that we don’t. If you’d like to discuss potential uses of the data please get in touch, or sign up to our mailing list to hear about future research releases.
Two weeks after you write to a representative on WriteToThem we send you a survey asking if they wrote back. We’ve traditionally used the data from these surveys to compare the responsiveness of individual MPs – but something we’re interested in at the moment is understanding more about systematic drivers of responsiveness. What features of a representative’s position or background make them more or less likely to respond to messages?
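As a rough illustration of how survey answers turn into a responsiveness figure, here is a small Python sketch. The representatives and answers are invented, and the `min_surveys` cut-off is our own assumption for the example rather than a description of how WriteToThem calculates its published statistics.

```python
from collections import Counter

# Hypothetical survey answers: (representative, got_reply). Not real WriteToThem data.
survey_answers = [
    ("Rep One", True), ("Rep One", True), ("Rep One", False),
    ("Rep Two", False), ("Rep Two", True),
    ("Rep Three", False),
]


def response_rates(answers, min_surveys=1):
    """Fraction of surveys reporting a reply, per representative.

    Representatives with fewer than min_surveys answers are left out,
    because a rate based on very few surveys is unreliable.
    """
    surveyed = Counter(rep for rep, _ in answers)
    replied = Counter(rep for rep, got_reply in answers if got_reply)
    return {
        rep: replied[rep] / total
        for rep, total in surveyed.items()
        if total >= min_surveys
    }


rates = response_rates(survey_answers, min_surveys=2)
for rep, rate in sorted(rates.items()):
    print(f"{rep}: {rate:.0%}")
```

Looking for systematic drivers then means grouping these rates by a feature of the representative (for example, how they were elected) instead of comparing individuals.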
The first fruit of that research is a paper in Parliamentary Affairs talking about using WriteToThem data to explore differences in responsiveness between representatives elected from constituencies and those elected from party lists in the Scottish Parliament, National Assembly for Wales, and the London Assembly.
We understand that most readers will not have journal access, so we’ve also written a summary for Democratic Audit that everyone can read here.
We’re actively investigating other factors that affect responsiveness (especially at the Westminster Parliament) and will write more in the coming months. If you’d like to make sure you don’t miss our findings, you can sign up to the research mailing list here.
mySociety services produce a lot of useful (and interesting!) data. Over the years we’ve often made components or the results of mySociety services available through APIs (like our MapIt service) or as open data to download (such as our EveryPolitician data).
What we haven’t been good at is showing you the full breadth of what we have available, or how component parts can be used together. Sometimes we find users of one aspect of mySociety data being unaware of other relevant datasets.
To fix this problem, we’ve created a new data portal – data.mysociety.org – to bring all the data we publish into one place. From the politicians of Albania to data about all ministerial and parliamentary roles UK MPs have held, everything can be found on one site.
Our research team will also use this site to publish supplementary materials to papers and blog posts that might be of use to others (such as a lookup table for the different codes used for UK Local Authorities). So we plan to keep adding data whenever we can!
How many Freedom of Information requests are sent through WhatDoTheyKnow as compared to those made directly to public bodies? Our new mini-site lets you explore Cabinet Office statistics in comparison to numbers from WhatDoTheyKnow.
Every quarter, the Cabinet Office releases Freedom of Information stats for a collection of central government ministries, departments and agencies. This provides a good benchmark for understanding how requests made from WhatDoTheyKnow relate to requests made through other routes. Back in 2010 we ran several blog posts about this, though we haven’t released any comparisons in recent years — and we’re now making up for lost time.
In 2016, WhatDoTheyKnow was the source of 17.14% of requests to audited public bodies. On the other hand, most WhatDoTheyKnow requests (88.51%) went to public bodies that the Cabinet Office figures don’t cover.
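The two percentages above have different denominators, which is why both can be true at once. A short Python sketch shows the shape of the calculation; the counts are invented round numbers chosen for easy arithmetic, not the real figures behind the statistics above.

```python
def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Hypothetical counts for one year (assumptions for illustration only).
wdtk_to_audited = 600    # WhatDoTheyKnow requests to bodies in the official statistics
all_to_audited = 4000    # all requests those bodies reported receiving
wdtk_total = 5000        # all WhatDoTheyKnow requests, to any public body

# Share of officially counted requests that came through WhatDoTheyKnow.
share_of_official = share(wdtk_to_audited, all_to_audited)

# Share of WhatDoTheyKnow requests that went to bodies outside the official statistics.
share_outside_official = share(wdtk_total - wdtk_to_audited, wdtk_total)

print(f"{share_of_official:.2f}% of requests to audited bodies")
print(f"{share_outside_official:.2f}% of WhatDoTheyKnow requests went elsewhere")
```

The first figure is measured against everything the audited bodies received; the second is measured against everything sent through WhatDoTheyKnow.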
One interesting conclusion from this is that most FOI activity in the UK is not immediately visible from the official statistics. You can read more about what we learned from the numbers, or explore the data for yourself on the mini-site.
Image: Jerry Kiesewetter (Unsplash)
For the last few years mySociety’s research output has been living in its own little area of the main website. At the start this was fine, but as we’ve produced more research (which is good!) the website was not good at making clear what we had previously released and why you should read it (which is bad!).
To fix that we’ve brought all our research reports, papers and blog posts together in one place. We also wanted to take the opportunity to make our research easier to access. For all our research going back to 2015, we now have a nice, mobile-responsive, easy-to-read version, as well as a text and a Kindle .mobi file to go along with that. In several cases papers that had been published externally were released by the publisher under a Creative Commons licence – meaning these could be converted to the new format.
And don’t forget that you can sign up for our research newsletter for exciting research updates!
Image: Nico Kaiser
When working with data that you didn’t set out to gather you have to be careful to think about what the data actually means, rather than what it seems to be saying. As an example, one of the “interesting” side effects of FixMyStreet is a database of places people have reported dog poop (or “dog fouling” as it tends to be called academically). We now have over 20,000 locations across the UK where nature’s call has both been heard, and reported.
My first thought when learning about this data was “that’s a lot of dog poop!” but it turns out 20,000 dog poops is not a lot of dog poop at all. There are an estimated 8.5 million dogs in the UK. Assuming (on average) each one poops once a day, they’ll produce over 3.1 billion poops a year.
So actually, 20,000 poops over nine years is nothing compared to the amount of pooping going on. But just because our data is a drop in the bucket doesn’t mean we can’t learn interesting things from it. The first question to ask is if we have a representative sample of where all this dog fouling is going on. The answer, sadly, is no. But the reasons for that answer raise further questions – which is interesting!
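For anyone who wants to check the sums, the comparison can be worked through in a few lines of Python. The inputs are the estimates quoted above (8.5 million dogs, one poop per dog per day, 20,000 reports over nine years); the variable names are just ours.

```python
DOGS_IN_UK = 8_500_000       # estimated number of dogs, as quoted above
POOPS_PER_DOG_PER_DAY = 1    # the simplifying assumption made above
REPORTS = 20_000             # FixMyStreet dog fouling reports
YEARS = 9                    # period the reports cover

poops_per_year = DOGS_IN_UK * POOPS_PER_DOG_PER_DAY * 365
poops_over_period = poops_per_year * YEARS
poops_per_report = poops_over_period // REPORTS

print(f"{poops_per_year:,} poops a year")
print(f"one report for every {poops_per_report:,} poops")
```

That works out at about 3.1 billion poops a year, and roughly one report for every 1.4 million poops over the nine years.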
When you map the location of dog poo complaints in England against the Index of Multiple Deprivation, you get this:
This tells us that reports about dog fouling are roughly parabolic – there are more in areas in the middle of the deprivation range than in those that are either very deprived or not deprived at all.
This is interesting because when Keep Britain Tidy actually went out into the world and checked (p. 14), they found this:
This graph tells a very different story, where dog fouling gets worse the more deprived the area. But why is this? And why doesn’t our data tell the same story?
One reason we would expect more dog poop in the most deprived areas is that the most deprived areas are more urban. Taking the same IMD deciles and using the ONS’s RUC categories to apply an eight-point ‘ruralness’ scale (where 1 is ‘Urban major conurbation’ and 8 is ‘Rural village and dispersed in a sparse setting’) lets us see the average ‘ruralness’ of each decile. While this shows that deprivation is spread across urban and rural areas, the most deprived areas tend to be more urban.
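The averaging step described above can be sketched in Python as follows. The areas and scores are invented, and treating decile 1 as the most deprived is an assumption made for the example; the real analysis used the full set of English areas.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical areas: (IMD decile, RUC ruralness score from 1 to 8).
# Decile 1 is assumed here to be the most deprived.
areas = [
    (1, 1), (1, 2), (1, 1),
    (5, 3), (5, 5), (5, 4),
    (10, 6), (10, 4), (10, 5),
]


def mean_ruralness_by_decile(rows):
    """Average ruralness score for each deprivation decile, in decile order."""
    scores = defaultdict(list)
    for decile, ruralness in rows:
        scores[decile].append(ruralness)
    return {decile: mean(values) for decile, values in sorted(scores.items())}


for decile, average in mean_ruralness_by_decile(areas).items():
    print(f"decile {decile}: average ruralness {average:.2f}")
```

A lower average means a more urban decile, so in this toy data the most deprived decile is also the most urban one.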
As urban areas have fewer natural places to dispose of dog waste, and the most deprived areas are more urban, we would expect the most deprived areas to have more dog fouling. We also know that measures that contribute to IMD scores (such as crime levels) are related to trust and social cohesion in an area. When social cohesion is lower, we would expect more dog fouling because owners feel less observed and are less concerned with the opinion of neighbours. The real world increase reported by the Keep Britain Tidy survey supports these relationships.
The drop off in our reported data compared to the real world can be explained by features of the general model for understanding FixMyStreet reports — some measures of deprivation are correlated with increased reports (because they relate to more problems) and others with decreased reports (because they hurt the ability or inclination of people to report). We would also expect areas with worse deprivation to have fewer reports because of disengagement with civic structures.
Quickly checking the English dog fouling data (so only 17,103 dog poops) against the same model confirms that significant relationships exist for the same deprivation indexes as the global dataset with the largest effect size of a measure of deprivation being for health – as health deprivation in an area goes up, reports of dog fouling increase.
What this tells us is that our dog data (and probably our data more generally) is clipped in areas of the highest deprivation. We’re not getting as many reports as the physical survey would suggest and so our data has very real limits in identifying the areas worst affected by a problem.
This is a lesson in being careful about interpreting datasets you pick up off the ground – if you used this data to conclude the most deprived areas had a similar dog poop problem to the least deprived areas you would be wrong. Because we have an independent source of the real world rate of problems, we can see there is a mismatch between distribution in reports and reality. Using this independent data of ‘actual problems’ for one of our categories makes us more aware that there is negative pressure on reports in highly deprived areas.
If you’d like to learn more about the history of dealing with dog poo on the street (and who wouldn’t want to learn more about that!) – I’ve very generously gone into more detail here.
Index of Multiple Deprivation: an index that combines thirty-seven indicators from seven domains (income, health, crime, etc) to provide a single figure for an area that is indicative of its level of deprivation relative to other areas.
Urban areas and dog waste: this is relative. Rural areas still have problems with bagged dog poo (“the ghastly dog poo bauble” hanging from branches – as MP Anne Main put it). There is also a risk to the health of cows from dog fouling in farmland – so there are unique rural dog poo problems.
Trust and social cohesion: Ross et al. found “People who report living in neighborhoods with high levels of crime, vandalism, graffiti, danger, noise, and drugs are more mistrusting. The sense of powerlessness, which is common in such neighborhoods, amplifies the effect of neighborhood disorder on mistrust.”
Header image: https://www.flickr.com/photos/scottlowe/3931408440/
I’m just a few weeks into my position of Research Associate at mySociety and one of the things I’m really enjoying is the really, really interesting datasets I get to play with.
Take FixMyStreet, the site that allows you to report street issues anywhere in the UK. Councils themselves will only hold data for the issues reported within their own boundaries, but FixMyStreet covers all local authorities, so we’ve ended up with probably the most comprehensive database in the country. We have 20,000 reports about dog poop alone.
Now if you’re me, what to do with all that data? Obviously, you’d want to do something with the dog poop data. But you’d try something a bit more worthy first: that way people won’t ask too many questions about your fascination there. Misdirection.
How does it compare?
So, starting with worthy uses for that massive pile of data, I’ve tried to see how the number of reports in an area compares against other statistics we know about the UK. Grouping reports into ONS-defined areas of around 1,500 people, we can match the number of reports within an area each year against other datasets.
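As a sketch of what that grouping and matching step looks like, here is a small Python example. The area codes, years and density values are invented, and the field names are hypothetical; the real analysis worked from the full FixMyStreet dataset and official statistics.

```python
from collections import Counter

# Hypothetical reports already matched to an area code: (area_code, year).
reports = [
    ("E0001", 2014), ("E0001", 2014), ("E0001", 2015),
    ("E0002", 2015),
]

# Hypothetical second dataset keyed by the same area codes.
population_density = {"E0001": 42.0, "E0002": 7.5}

# Count reports for every (area, year) pair that has at least one report.
reports_per_area_year = Counter(reports)

# Join the counts to the other dataset, one row per area and year.
rows = [
    {
        "area": area,
        "year": year,
        "reports": count,
        "population_density": population_density.get(area),
    }
    for (area, year), count in sorted(reports_per_area_year.items())
]

for row in rows:
    print(row)
```

One thing this simple version leaves out is areas and years with no reports at all, which never appear in the counts. A fuller analysis would need to add those back in as zeros before modelling.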
To start with I’m just looking at English data (Scotland, Wales and Northern Ireland have slightly different sets of official statistics that can’t be combined) for the years 2011-2015. I used population density information, how many companies registered in the area, if there’s a railway station, OFCOM stats on broadband and mobile-internet speeds, and components from the indices of multiple deprivation (various measures of how ‘deprived’ an area is, such as poor health, poor education prospects, poor air quality, etc) to try and build a model that predicts how many reports an area will get.
The good news: statistically we can definitely say that some of those things have an effect! Some measures of deprivation make reports go up, others make them go down. Broadband and mobile access make them go up! Population density and health deprivation make them go down.
The bad news: my model only explains 10% of the actual reports we received, and most of this isn’t explained by the social factors above but by aspects of the platform itself. Just telling the model that the platform has got more successful over time, which councils use FixMyStreet for Councils as their official reporting platform (and so gather more reports), and where our most active users are (who submit a disproportionate amount of the total reports) accounts for 7-8% of what the model explains.
What that means is that most of the reasons people are and aren’t making reports are unexplained by those factors. So for the moment this model is useful for building a theory, but is far from a comprehensive account of why people report problems.
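One common way to put a number on how much a model “explains” is R-squared: the share of the variation in the outcome that the fitted model accounts for. Whether that is the exact statistic behind the 10% figure is an assumption on our part. The Python sketch below shows the idea for a single predictor, using invented numbers; the real model had many predictors.

```python
from statistics import mean


def r_squared(xs, ys):
    """R-squared of a one-variable least-squares line fitted to (xs, ys)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Variation left over after the fitted line has made its predictions.
    residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of the outcome around its mean.
    total = sum((y - y_bar) ** 2 for y in ys)
    return 1 - residual / total


# Hypothetical predictor values and report counts for eight areas.
predictor = [1, 2, 3, 4, 5, 6, 7, 8]
report_counts = [3, 9, 2, 8, 1, 9, 4, 8]

print(f"R-squared: {r_squared(predictor, report_counts):.3f}")
```

A value close to 0 means the predictor says little about how the counts vary, and a value of 1 means it predicts them exactly.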
Here’s my rough model for understanding what drives areas to submit a significantly higher number of reports to FixMyStreet:
- An area must have a problem
Measures of deprivation like the ‘wider barriers to housing deprivation’ metric (this includes indicators on overcrowding and homelessness) as well as crime are associated with an increase in the number of reports. The more problems there are, the more likely a report is — so deprivation indicators we’d imagine would go alongside other problems are a good proxy for this.
- A citizen must be willing or able to report the problem
Areas with worse levels of health deprivation and adult skills deprivation are correlated with lower levels of reports. These indicators might suggest citizens less able to engage with official structures, hence fewer reports in these areas.
People also need to be aware of a problem. The number of companies in an area, or the presence of a railway station both increase the number of reports. I use these as a proxy for foot-traffic – where more people might encounter a problem and report it.
Population density is correlated with decreased reports which might suggest a “someone else’s problem” effect – a slightly decreased willingness to report in built-up areas where you think someone else might well make a report.
- A citizen must be able to use the website
As an online platform, FixMyStreet requires people to have access to the website before they can make a report. The less friction there is in this experience, the more likely it is that a report will be made.
This is consistent with the fact that an increased number of slow and fast home broadband connections (and fast more than slow ones) increases reports. This is also consistent with the fact that increased 3G signal in premises is correlated with increased reports.
Reporting problems on mobile will sometimes be easier than turning on the computer, and we’d expect areas where people more habitually use mobiles for internet access to have a higher number of reports than broadband access alone would suggest. If it’s slightly easier, we’d expect slightly more – which is what this weak correlation suggests.
Not all the variables my model includes are significant or fit neatly into this model. These are likely working as proxy indicators for related factors that are currently unaccounted for.
I struggle, for instance, to come up with a good theory why measures of education deprivation for young people are associated with an increase in reports. I looked to see if there was a connection between an area having a school and having more reports on the basis of foot-traffic and parents feeling protective over an area – but I didn’t find an effect for schools like I did for registered companies.
So at the moment, these results are a mix of “a-hah, that makes sense” and “hmm, that doesn’t”. But given that we started with a dataset of people reporting dog poop, that’s not a terrible ratio at this point. Expanding the analysis into Scotland and Wales, analysing larger areas, or focusing on specific categories of reports might produce models that explain a bit more about what’s going on when people report what’s going wrong.
I’ll let you know how that goes.