Header image: Photo by Toa Heftiba on Unsplash
Devolution in the UK means that some core datasets are produced separately by the statistics authorities in each nation. The statistics are generated according to local definitions that are better tailored to the needs of that area (for instance, Scottish data is far more concerned with geographically isolated communities than data from other nations). A problem for organisations that run UK-wide services (like mySociety) is that there is often no single benchmark for the data they require. The data for different nations often measures similar concepts, but diverges in incompatible ways.
Our previous analysis of deprivation data used English data. England accounts for a majority of our users, but far from all of them, so other users are either excluded from analysis or analysed separately in much smaller pools of data (making it harder to draw firm conclusions). To address this problem, we have started to construct datasets that let us compare geographic data from across the UK. The datasets described below are available on GitHub, with methodology and analysis of the resulting dataset.
The first dataset is a simple measure of whether an area is rural or urban. This is required because Scotland and Northern Ireland have different thresholds for whether an area is urban. These have been adjusted to use the England and Wales definition, with the creation of a third ‘More Rural’ category to roughly match the Scottish definition. For more precise analysis, this dataset also assigns a ‘density decile’ to every area of the UK, depending on where it falls on a scale from ‘most densely populated’ to ‘least densely populated’.
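As a rough illustration of the density decile idea, the sketch below ranks areas by population density and splits them into ten equally sized groups. The function name, the input shape and the choice to put an equal number of areas in each decile are assumptions made for this example; the published dataset may define its deciles differently (for instance, by population rather than by number of areas).

```python
def density_deciles(areas):
    """Assign each area a density decile (1 = most dense, 10 = least dense).

    `areas` maps a hypothetical area code to (population, area in km2).
    Assumption: deciles hold an equal number of areas, not equal population.
    """
    # Population density for each area.
    density = {code: pop / km2 for code, (pop, km2) in areas.items()}
    # Rank areas from most to least densely populated.
    ranked = sorted(density, key=density.get, reverse=True)
    n = len(ranked)
    # Integer arithmetic maps rank 0..n-1 onto deciles 1..10.
    return {code: (rank * 10) // n + 1 for rank, code in enumerate(ranked)}


# Made-up areas: same size, increasing population.
example = {f"A{i}": (1000 * (i + 1), 10.0) for i in range(20)}
deciles = density_deciles(example)
```

With this made-up input, the most populous area lands in decile 1 and the least populous in decile 10.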
The second dataset is more complicated and tries to create a single UK-wide measure of multiple deprivation. All four nations produce different indexes of multiple deprivation. These use different indicators, and weight different kinds of deprivation differently, to better highlight the needs of each area. These indexes are incompatible, as they are not based on the same underlying indicators and all the scores have been through multiple transformations before ending up in the final index. The nearest equivalent in England to a given deprived area in Wales cannot be determined exactly.
That said, you can make useful comparisons without needing to be exact. A 2016 paper by Abel and colleagues provided a possible approach to this, but is now based on out-of-date measures of deprivation. Their key insight was that measures of income and employment deprivation are broadly compatible between nations. These only account for 50% of the overall construction of the index, but as other measures of deprivation are highly correlated with them, the common elements explain the large majority of the variation in multiple deprivation. Abel and colleagues used this to map scores from one index into another (their methodology and the refinements made are explained in the readme). The more recent index for Northern Ireland uses an incompatible measure of income. This means that a UK-wide model can only be standardised using the employment score (which is still fairly predictive of overall multiple deprivation scores). A separate GB index that only includes England, Scotland and Wales can use both of the original measures. Both are included in the dataset.
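The general idea of mapping one index onto another through a shared indicator can be sketched as follows. This is a simplified illustration and not the method from the paper or the readme: it assumes a single shared predictor (the employment score), a straight-line fit within each nation, and that each area's residual is carried over unchanged. All function and variable names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def map_to_reference(ref_emp, ref_score, other_emp, other_score):
    """Re-express another nation's scores on the reference nation's scale.

    Assumption: the employment score is comparable between the two nations.
    """
    ref_slope, ref_int = fit_line(ref_emp, ref_score)
    oth_slope, oth_int = fit_line(other_emp, other_score)
    mapped = []
    for emp, score in zip(other_emp, other_score):
        # How far the area sits from its own nation's fitted line.
        residual = score - (oth_slope * emp + oth_int)
        # Prediction on the reference scale, plus the area's own residual.
        mapped.append(ref_slope * emp + ref_int + residual)
    return mapped


# Made-up numbers purely to show the call.
mapped = map_to_reference(
    ref_emp=[0.1, 0.2, 0.3], ref_score=[10.0, 20.0, 30.0],
    other_emp=[0.1, 0.2, 0.3], other_score=[5.0, 10.0, 15.0],
)
```

In this toy case both nations sit exactly on their own fitted lines, so the mapped scores are simply the reference nation's predictions for the same employment scores.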
There are significant caveats to using this measure, but where the analysis is broad and inexact anyway, it makes it possible to compare locations from across the UK. Postcode or point information can be translated into a single UK-wide measure of deprivation. Non-English data is no longer trapped in pools too small to be useful, and can simply be added to larger sets of analysis. We have already started using this approach on our demographics explorer minisite for additional measures of national deprivation.
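Translating a postcode into a UK-wide decile is essentially a two-step lookup: postcode to small area, then small area to decile. The sketch below assumes both lookups have already been loaded into dictionaries; the area code, the postcode and the decile value are placeholders invented for the example, not values from the dataset.

```python
def normalise_postcode(postcode):
    """Strip all whitespace and upper-case, so 'ab1 2cd' matches 'AB12CD'."""
    return "".join(postcode.split()).upper()


def lookup_decile(postcode, postcode_to_area, area_to_decile):
    """Return the deprivation decile for a postcode, or None if unknown."""
    area = postcode_to_area.get(normalise_postcode(postcode))
    if area is None:
        return None
    return area_to_decile.get(area)


# Placeholder lookups (hypothetical values).
postcode_to_area = {"AB12CD": "AREA001"}
area_to_decile = {"AREA001": 3}

found = lookup_decile("ab1 2cd", postcode_to_area, area_to_decile)
missing = lookup_decile("ZZ9 9ZZ", postcode_to_area, area_to_decile)
```

Returning `None` for unmatched postcodes keeps missing data visible instead of silently assigning a default decile.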
For more information see the technical details for: