It’s a more common problem than you might think: given a list of postcodes, how can you match them to the administrative and electoral areas, such as wards or constituencies, that they sit within?
MapIt’s data mapping tool gives a quick, easy and cheap solution: just upload your spreadsheet of postcodes, tell it which type of area you want them matched to, and the data is returned to you — complete with a new column containing the information you need.
The tool can match your postcodes to every type of data that MapIt offers in its API, including council areas, Westminster constituencies, parish wards and even NHS Clinical Commissioning Groups (CCGs).
If that doesn’t sound like something you can imagine being useful, let’s look at a few hypothetical use cases (and if you have an actual case that you’d like to tell us about, please do let us know — we’re always keen to hear how our tools are being used).
Organisations, charities and campaigns sometimes need to match postcodes to administrative areas
Membership organisations, charities and campaigns usually collect the addresses of supporters, but don’t commonly ask them who their MP is (even if they did ask, most people in the UK don’t actually know the name of their MP).
But when a campaign asks followers to contact their MPs, it’s helpful to be able to suggest an angle based on whether the MP is known to be sympathetic to their cause, or not — indeed, there’s arguably no point in contacting MPs who are already known to be firmly on board.
So: input a spreadsheet of supporters’ postcodes, and get them matched to the associated Westminster constituencies.
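For anyone who would rather script this step than upload a spreadsheet, here is a minimal sketch of the same lookup against the MapIt API. It assumes the postcode endpoint returns an "areas" dictionary whose entries carry "type" and "name" fields, and that "WMC" is the type code for Westminster constituencies. The function names and the sample response (including its area IDs) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

MAPIT_BASE = "https://mapit.mysociety.org"  # the public MapIt service


def constituency_from_response(data, area_type="WMC"):
    """Return the name of the first area of the given type, or None.

    Assumes `data` is shaped like a MapIt postcode response: a dict with
    an "areas" dict mapping area IDs to area records.
    """
    for area in data.get("areas", {}).values():
        if area.get("type") == area_type:
            return area.get("name")
    return None


def lookup_postcode(postcode):
    """Fetch the MapIt record for one postcode (makes a network request)."""
    compact = urllib.parse.quote(postcode.replace(" ", ""))
    with urllib.request.urlopen(f"{MAPIT_BASE}/postcode/{compact}") as resp:
        return json.load(resp)


# Offline illustration: a hand-written response, not real MapIt output.
sample_response = {
    "postcode": "SW1A 1AA",
    "areas": {
        "1": {"type": "LBO", "name": "Westminster City Council"},
        "2": {"type": "WMC", "name": "Cities of London and Westminster"},
    },
}
print(constituency_from_response(sample_response))
```

For a whole list of supporters you would call lookup_postcode once per postcode, ideally with a pause between requests to stay within your usage allowance.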
For more advanced usages, organisations might match the MapIt tool’s output of postcodes with other datasets to discover the answers to questions like:
- Which members in a disability group have fewest GPs in their area, and might be finding it difficult to get help for their condition?
- Which supporters of a transport charity live in regions less served by public transport, and would be likely to take action to campaign for improved bus and train services?
- Which people affiliated to an ecological organisation live in predominantly rural areas and could help with a wildlife count?
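Questions like these come down to joining the tool's output to a second dataset on the area name. The sketch below shows one way that might look. The column name "constituency", the postcodes, the area names and the GP figures are all invented for illustration, so substitute whatever your own files contain.

```python
import csv
import io

# Hypothetical output from the data mapping tool: the original postcodes
# plus the new column of matched areas.
TOOL_OUTPUT = """postcode,constituency
AB1 2CD,Northtown
EF3 4GH,Southvale
IJ5 6KL,Northtown
"""

# Hypothetical second dataset: GPs per 10,000 residents in each area.
GPS_PER_10K = {"Northtown": 4.1, "Southvale": 7.3}


def attach_area_stat(csv_text, stats, area_column="constituency",
                     stat_name="gps_per_10k"):
    """Add one statistic to every row, looked up by the row's area."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row[stat_name] = stats.get(row[area_column])
        rows.append(row)
    return rows


def fewest_first(rows, stat_name="gps_per_10k"):
    """Sort rows so the least well served areas come first."""
    known = [r for r in rows if r[stat_name] is not None]
    return sorted(known, key=lambda r: r[stat_name])


ranked = fewest_first(attach_area_stat(TOOL_OUTPUT, GPS_PER_10K))
for row in ranked:
    print(row["postcode"], row["constituency"], row["gps_per_10k"])
```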
Researchers sometimes need to match postcodes to administrative areas
Researchers often need to correlate people, institutions or locations with the boundaries they fall within.
They might have a list of postcodes for, say, underperforming schools, and want to find out whether they are clustered within authorities that have similar characteristics, like cuts to their funding or an administration that has a political majority one way or the other.
Teamed with other datasets, MapIt can help towards answering important questions like the number of people each CCG serves, how unemployment rates vary in different European regions, or average house prices within parliamentary constituencies.
Journalists sometimes need to match postcodes to administrative areas
Investigative or data journalists may obtain long spreadsheets full of postcodes in the course of their work, perhaps as a result of having submitted Freedom of Information requests to one or more authorities.
Perhaps they have the address of every university in the country, and there’s an election coming up — during the summer holidays. Knowing that students will mostly be in their home constituencies, they might be able to make informed predictions about how votes in the university towns will be affected.
Or let’s say that a journalist has gathered, from local councils, an address for every library scheduled to close. This could be compared with another dataset — perhaps literacy or crime rates — to draw conclusions over what impact the closures would have.
Part of a wider service
The one-off data mapping tool is just one service from mySociety’s MapIt, which is best known for its API.
This provides an ongoing service, typically for those running websites that ask users to input geographical points such as postcodes or lat/longs, and return tailored results depending on the boundaries those points fall within.
MapIt powers most mySociety sites, for example:
- When you drop a pin on the map while using FixMyStreet, MapIt provides the site with the administrative boundaries it falls within, so that the site can then match your report with the authority responsible for fixing it.
- When you type your postcode into WriteToThem, MapIt gives the site the information it needs to display a list of every representative, from local councillor up to MEP, who represents your area.
- If you search for your postcode on TheyWorkForYou, MapIt tells the site what your Westminster constituency is and the site matches that to your MP. You can then be taken to their page with a record of how they have voted and everything they’ve said in Parliament.
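As a rough sketch of the kind of call a site makes when a user drops a pin, the snippet below builds a MapIt point lookup URL and summarises the areas that come back. It assumes the point endpoint takes the form /point/SRID/x,y (longitude first for SRID 4326) and returns a dictionary of area records keyed by ID. The helper names and the sample response are hypothetical.

```python
MAPIT_BASE = "https://mapit.mysociety.org"


def point_url(lat, lon, base=MAPIT_BASE):
    """Build the lookup URL for a latitude/longitude pair.

    SRID 4326 is WGS84; the path takes x,y order, so longitude goes first.
    """
    return f"{base}/point/4326/{lon},{lat}"


def names_by_type(areas):
    """Map each area type code in a point response to the area's name."""
    return {area["type"]: area["name"] for area in areas.values()}


# Offline illustration: a hand-written response, not real MapIt output.
sample_areas = {
    "1": {"type": "LBO", "name": "Example Borough Council"},
    "2": {"type": "LBW", "name": "Example Ward"},
    "3": {"type": "WMC", "name": "Example Constituency"},
}

print(point_url(51.5, -0.12))
print(names_by_type(sample_areas)["LBO"])
```

A site like FixMyStreet would then pick out the council-type areas and route the report accordingly; which type codes count as "the responsible authority" depends on the country and the problem category.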
Give it a try
Image: Thor Alvis
It’s official, there’s going to be a General Election in the UK on June 8th.
As you might suspect, mySociety has lots of tools and services that you might find useful during the campaign, whether you just want to find out the voting record of your current MP or you’re planning on building a website or app to cover the campaign.
First things first: TheyWorkForYou.com already covers in lots of detail who your MPs are and how they voted. This should be your first port of call so that you can evaluate your incumbent MP, especially when you’re thinking about who to vote for next.
Over the next couple of weeks we are going to make some changes here and there to make relevant parts of the voting record more prominent, and more clearly explain how we calculate the voting records themselves.
If you’re planning on using the data we have in TheyWorkForYou, you can access information on UK politicians, parliamentary debates, written answers, and written ministerial statements via our API at theyworkforyou.com/api.
Tomorrow we’ll share a blog post explaining in a little more technical detail how to access the API and some advice on how to get the most out of the service.
Building a service or website that covers all or part of the country and want an easy way to let your users identify which constituency they are in? Then MapIt is your friend.
It already powers most of our own services and is widely used by the likes of Government Digital Services and our friends at Democracy Club.
You can sign up for free at mapit.mysociety.org, and if you need more calls it’s easy to upgrade to a monthly plan. You can get 10,000 calls a month for free if you are a charity or working on an open project; if you think you are going to be busier than that, (a) congrats and (b) drop us an email at email@example.com
Helping Democracy Club
Speaking of Democracy Club we’re going to be wholeheartedly supporting their efforts to crowdsource a full set of candidate data in the run up to the election – they are gathering all of their ideas together in this Google Doc https://goo.gl/8WtZvc
We had planned to make some updates and amends to the YourNextRepresentative service that supports Democracy Club’s WhoCanIVotefor.co.uk site in the quiet period between major elections, ahem, but with the snap election called we’ll be doing what we can to make the site run faster and make whatever UI tweaks and fixes we can in the time available.
They will no doubt be looking for help in sourcing candidate data, so please do sign up to help and find out what you can do at democracyclub.org.uk/blog/2017/04/18/its-ge2017
In summary, and to make it easy, you can find all of our relevant #GE2017 datasets and APIs at data.mysociety.org/datasets/?category=ge2017
It’s not too late to let your current MP know what you think on any subject of your choice via WriteToThem.com.
And finally, don’t forget to register to vote yourself at gov.uk/register-to-vote
MapIt has had a bit of a refresh to bring the look into line with the rest of the mySociety projects. At the same time, we thought we’d take the opportunity to make it a bit easier for non-technical folk to understand what it offers, and to make the pricing a little less opaque.
You may not be familiar with MapIt, but all the same, if you’ve ever found your MP on TheyWorkForYou, written to your representatives on WriteToThem, or reported an issue through FixMyStreet, you’re a MapIt user!
That’s because MapIt does the heavy lifting in the background when you enter a postcode or location, matching that input to the boundaries it falls within (ward, constituency, borough, etc). It is, if you like, the geographic glue that holds mySociety services together.
Like most of mySociety’s software offerings, MapIt is available for others to use. So for example, the GOV.UK website uses it to put users in touch with the right council for a number of services, and Prostate Cancer UK uses it on their campaign site, using MapIt’s knowledge of CCG (Clinical Commissioning Group) region boundaries.
And you can use MapIt too: if your app or website needs to connect UK locations with areas like constituencies or counties, it will save you a lot of time and effort.
Pricing and payment are a lot slicker now: while they were previously managed manually, you can now purchase what you need online, quickly and without the need for human intervention. The pricing options are also laid out clearly.
We hope that this will make it easier for people to make use of the service, and better understand what level of usage they need. But if you need to experiment, there’s a free ‘sandbox’ to play about with!
As ever, we’re happy to provide significant discounts for charity and non-profit projects: see more details on the licensing page.
If you have any questions or comments please do get in touch.
There’s a new piece of data on MapIt, and it wasn’t added by us. It’s tiny but useful, and it’s slightly esoteric, so bear with us and we’ll explain why it’s worth your attention.
Local Authority codes come from the government’s set of canonical registers. They may not look much, but they’re part of a drive to bring consistency across a wide range of data sets. That’s important, and we’ll try to explain why.
One name can refer to more than one thing
If you try to buy a train ticket to Gillingham in the UK, and you are lucky enough to be served by a conscientious member of staff, they will check whether you are going to the Gillingham in Kent (GIL), or the one in Dorset (GLM).
The names of the two towns might be identical, but their three-letter station codes differ, and quite right too — how, otherwise, would the railway systems be able to charge the right fare? And more importantly, how many people would set off confidently to their destination, but end up in the wrong county?
I mention this purely to illustrate the importance of authoritative, consistent data, the principle that is currently driving a government-wide initiative to ensure that there’s a single canonical code for prisons, schools, companies, and all kinds of other categories of places and organisations.
Of particular interest to us at mySociety? Local authorities. That’s because several of our services, from FixMyStreet to WriteToThem, rely on MapIt to connect the user to the correct council, based on their geographical position.
One thing can have more than one name
I live within the boundaries of Brighton and Hove City Council.
That’s its official name, but when talking or writing about my local authority, I’m much more likely to call it ‘Brighton’, ‘Brighton Council’, or at a push, ‘Brighton & Hove Council’. All of which is fine in everyday conversation, but it is an approach that could cause mayhem for the kind of data that digital systems need (“machine readable” data, which is consistent, structured and in a format that can be ‘understood’ by computer programs).
Registers of Open Data
The two examples above go some way towards explaining why the Department for Local Government & Communities, with Government Digital Services (GDS), are in the process of creating absolute standards, not just for councils but for every outpost of their diverse and extensive set of responsibilities, from the Food Standards Agency to the Foreign & Commonwealth Office, the Land Registry and beyond.
Where possible, these registers are published and shared as Open Data that anyone building apps, websites and systems can use. It’s all part of GDS’ push towards ‘government as a platform’, and in keeping with the work being done towards providing Open Data throughout the organisation.
And now we come to those Local Authority codes that you can find on MapIt.
Anyone can contribute to Open Source code
Like most mySociety codebases, MapIt is Open Source.
That means that not only can anyone pick up the code and use it for their own purposes, for free, but that they’re also welcome to submit changes or extensions to the existing code.
And that’s just how GDS’ Sym Roe submitted the addition of the register.
What it all means for you
If you’re a developer, the addition of these codes means that you can use MapIt in your app or web service, and be absolutely sure that it will integrate with any other dataset that’s using the same codes. So, no more guessing whether our ‘Plymouth’ is the same as the ‘Plymouth’ in your database; the three-letter code tells you that it is.
Plus, these register codes identify a local authority as an organisation, or a legal entity, as opposed to setting out the boundary, so that’s an extra layer of information which we are glad to be able to include.
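To make the point concrete, here is a small sketch of why a shared code beats a name when combining datasets. The two record lists, their field names and the "open_reports" figures are invented, and the three-letter codes are illustrative values rather than ones checked against the register.

```python
# One dataset uses official names, the other uses everyday ones, so a
# join on the name would silently miss every row.
OUR_COUNCILS = [
    {"name": "Brighton and Hove City Council", "code": "BNH"},
    {"name": "Plymouth City Council", "code": "PLY"},
]
YOUR_COUNCILS = [
    {"label": "Brighton & Hove Council", "code": "BNH", "open_reports": 12},
    {"label": "Plymouth", "code": "PLY", "open_reports": 7},
]


def join_by_code(left, right, code_field="code"):
    """Merge records from two lists that share the same register code."""
    right_index = {record[code_field]: record for record in right}
    return [
        {**record, **right_index[record[code_field]]}
        for record in left
        if record[code_field] in right_index
    ]


merged = join_by_code(OUR_COUNCILS, YOUR_COUNCILS)
for row in merged:
    print(row["code"], row["name"], "=", row["label"])
```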
Census data: there’s lots of it. It contains fascinating insights.
But as with many huge datasets, those insights are not always easy to find at first glance — nor is it easy for the untrained observer to see which parts are relevant to their own lives.
Wazimap in South Africa takes the country’s census data and turns it into something the user can explore interactively. Originally conceived as a tool for journalists, it turned out to be so accessible that it’s used by a much wider range of the population, from school children to researchers. It’s a great example of how you can transform dry data into something meaningful online, and it’s all done using free and open source tools.
Our points-to-boundaries mapping software MapIt is part of that mix, putting the data in context and ensuring that visitors can browse the data relevant to specific provinces, municipalities or wards.
We asked Greg Kempe of Code for South Africa to fill us in on a bit more.
What exactly is Wazimap?
Wazimap helps South Africans understand where they live, through the eyes of the data from our 2011 Census. It’s a research and exploration tool that describes who lives in South Africa, from a country level right down to a ward, including demographics such as age and gender, language and citizenship, level of education, access to basic services, household goods, employment and income.
It has helped people understand not just where they work and live, but also that data can be presented in a way that’s accessible and understandable.
Users can explore the profile of a province, city or ward and compare them side-by-side. They can focus on a particular dataset to view just that data for any place in the country, look for outliers and interesting patterns in the distribution of an indicator, or draw an indicator on a map.
Of course Wazimap can’t do everything, so you can also download data into Excel or Google Earth to run your own analysis.
Wazimap is built on the open source software that powers censusreporter.org, which was built under a Knight News Challenge grant, and is a collaboration between Media Monitoring Africa and Code for South Africa.
Due to demand from other groups, we’ve now made Wazimap a standalone project that anyone can re-use to build their own instance: details are here.
How did it all begin?
Media Monitoring Africa approached Code for South Africa to build a tool to help journalists get factual background data on anywhere in South Africa, to help encourage accurate and informed reporting.
Code for South Africa is a nonprofit that promotes informed decision-making for positive social change, so we were very excited about collaborating on the tool.
Could MapIt be useful for your project? Find out more here
How exactly does MapIt fit into the project?
MapIt powers all the shape boundaries in Wazimap. When we plot a province, municipality or ward boundary on a map in Wazimap, or provide a boundary in a Google Earth or GeoJSON download, MapIt is giving Wazimap that data.
We had originally built a home-grown solution, but when we met mySociety’s Tony Bowden at a Code Camp in Italy, we learned about MapIt. It turned out to offer better functionality.
What level of upkeep is involved?
Wazimap requires only intermittent maintenance. We had municipal elections in August 2016, which means that a number of municipal boundaries have changed. We’re waiting on Statistics South Africa to provide us with the census data mapped to these new boundaries so that we can update it. Other than that, once the site is up and running it needs very little maintenance.
What’s the impact of Wazimap?
We know that Wazimap is used by a wide range of people, including journalists, high school geography teachers, political party researchers and academics.
Code for South Africa has been approached a number of times, by people asking if they might reuse the Wazimap platform in different contexts with different data. Most recently, youthexplorer.org.za used it to power an interactive web tool providing a range of information on young people, helping policy makers understand youth-critical issues in the Western Cape.
We also know that it’s been used as a research tool for books and numerous news articles.
The success of the South African Wazimap has driven the development of similar projects elsewhere in Africa which will be launching soon, though MapIt won’t be used for those because their geography requirements are simpler.
What does the future hold?
As we’re building out Wazimap for different datasets, we’re seeing a need for taking it beyond just census data. We’re making improvements to how Wazimap works with data to make this possible and make it simpler for others to build on it.
Each new site gives us ideas for improvements to the larger Wazimap product. The great thing is that these improvements roll out and benefit anyone who uses it across every install.
Thanks very much to Greg for talking us through the Wazimap project and its use of MapIt. It’s great to hear how MapIt is contributing to a tool that, in itself, aids so many other users and organisations.
Need to map boundaries? Find out more about MapIt here
If you’ve visited the MapIt site this week, you might have noticed a change: we’ve introduced key-based authentication for API users.
This enables us to be more flexible about how we provide our service, which means you can be more flexible about how you serve your users.
MapIt is both an open source application and, via https://mapit.mysociety.org, a web service. Use of the API is free for low-volume, charitable use, while all other uses require a licence.
For the moment API keys are optional. We’ll always offer a free level of service to support independent developers and charities.
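If you do have a key, attaching it to a request is a one-line change. The sketch below assumes the key is passed as an api_key query parameter; check the MapIt documentation for the exact parameter or header name your account should use. The helper name is our own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_api_key(url, api_key=None):
    """Return the URL with an api_key parameter added, if a key is given.

    Keys are optional for now, so a missing key leaves the URL untouched.
    """
    if not api_key:
        return url
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("api_key", api_key))  # assumed parameter name
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_api_key("https://mapit.mysociety.org/postcode/SW1A1AA", "YOUR-KEY"))
```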
We’ll have more details soon about the increased flexibility this change will bring.
The Universal Credit system is replacing many other welfare benefits… but slowly. Its roll-out won’t be complete until 2022, meaning that many are, understandably, confused about just what applies within their own local area.
Now Lasa, in collaboration with the Low Incomes Tax Reform Group (LITRG), have launched a tool to help with that problem. Just input a postcode, and it displays information about which benefits apply — and, crucially, where to go for advice in your area.
It’s part of a suite of offerings, also available as widgets that can be placed onto any website. All fall within Lasa’s remit to support organisations in the delivery of social welfare law advice to the disadvantaged communities they serve.
We’re always glad to see MapIt used in other people’s projects, especially those that make a complex system easier to understand.
Apparently advice workers are already expressing their gratitude for the fact that they can have this information at their fingertips — so hats off to Lasa.
Are you still in the same ward? Check whether your ward boundaries have changed here.
May 5 is election day
If you’re a UK citizen, you have an election in your near future. We can say that with confidence.
May 5 sees elections not only for the Scottish Parliament, the National Assembly for Wales and the Northern Ireland Assembly, but also for many local councils. Londoners will be picking their London Assembly representatives and their Mayor. As if all that weren’t enough, there are also Police and Crime Commissioner elections.
Ward boundaries are changing
You might think you already know where to vote, and who’s standing for election in your area.
But both are dictated by which ward you live in — and that may not be the one you’re used to, thanks to ongoing changes in ward boundaries.
It’s great to see the launch of SocialCareInfo, a new website which helps people in the UK find local & national social care resources.
All the more so because it uses one of our tools, MapIt, to match postcodes with the relevant local authorities. The site’s builders, Lasa, came to us when it became clear that MapIt did exactly what they needed.
Socialcareinfo.net covers the whole of the UK. Users begin by typing in their postcode, whereupon they are shown the range of services available to them.
That’s also how many of our own projects (think FixMyStreet, WriteToThem or TheyWorkForYou) begin, and there’s a good reason for that: users are far more likely to know their own postcode than to be certain about which local authority they fall under, or even who their MP is.
MapIt is really handy for exactly this kind of usage, where you need to match a person to a constituency or governing body. It looks at which boundaries the geographic input falls within, and it returns the relevant authorities.
We’re glad to see it working so well for SocialCareInfo, and we feel sure that the site will prove a useful resource for the UK.
A few of mySociety’s developers are at DjangoCon Europe in Cardiff this week – do say hello 🙂 As a contribution to the conference, what follows is a technical look (with bunny GIFs) into an issue we had recently with serving large amounts of data in one of our Django-based projects, MapIt, how it was dealt with, and some ideas and suggestions for using streaming HTTP responses in your own projects.
MapIt is a Django application and project for mapping geographical points or postcodes to administrative areas, that can be used standalone or within a Django project. Our UK installation powers many of our own and others’ projects; Global MapIt is an installation of the software that uses all the administrative and political boundaries from OpenStreetMap.
A few months ago, one of our servers fell over, due to running entirely out of memory.
Looking into what had caused this, it was a request for /areas/O08, information on every “level 8” boundary in Global MapIt. This turned out to be just under 200,000 rows from one table of the database, along with associated data in other tables. Most uses of Global MapIt are for point lookups, returning only the few areas covering a particular latitude and longitude; it was rare for someone to ask for all the areas, but previously MapIt must have managed to respond within the server’s resources (indeed, the HTML version of that page had been requested okay earlier that day, though had taken a long time to generate).
Using Python’s resource module, I manually ran through the steps of this particular view, running print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024 after each step to see how much memory was being used. Starting off with only 50MB, it ended up using 1875MB (500MB fetching and creating a lookup of associated identifiers for each area, 675MB attaching those identifiers to their areas (this runs the query that fetches all the areas), 400MB creating a dictionary of the areas for output, and 250MB dumping the dictionary as JSON).
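For anyone wanting to repeat this kind of measurement, here is a minimal Python 3 sketch of the same idea. The peak_rss_mb helper and the throwaway list are our own illustration, not MapIt code. Note that ru_maxrss is a high-water mark, so it shows growth but never falls, and that its unit differs between platforms.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident memory of this process so far, in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports kilobytes; macOS reports bytes.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


before = peak_rss_mb()
# Stand-in for one step of a view: build something sizeable in memory.
rows = [str(i) * 10 for i in range(200_000)]
after = peak_rss_mb()

print(f"before: {before:.0f}MB, after: {after:.0f}MB")
```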
The associated identifiers were added in Python code because doing the join in the database (with e.g. select_related) was far too slow, but I clearly needed a way to make this request using less memory. There’s no reason why this request should not be able to work, but it shouldn’t be loading everything into memory, only to then output it all to the client asking for it. We want to stream the data from the database to the client as JSON as it arrives; we want in some way to use Django’s StreamingHttpResponse.
The first straightforward step was to sort the areas list in the database, not in code, as doing it in code meant all the results needed to be loaded into memory first. I then tweaked our JSONP middleware so that it could cope when given a StreamingHttpResponse as well as an HttpResponse. The next step was to use the json module’s iterencode function to have it output a generator of the JSON data, rather than one giant dump of the encoded data. We’re still supporting Django 1.4 until it end-of-lifes, so I included workarounds in this for the possibility of StreamingHttpResponse not being available (though then if you’re running an installation with lots of areas, you may be in trouble!).
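As a small illustration of that step, here is how iterencode behaves in the standard library. The stream_json name is ours. The output is identical to a single json.dumps call, but it arrives in pieces that can be handed to a streaming response one at a time.

```python
import json


def stream_json(obj):
    """Return an iterator of JSON text chunks rather than one big string."""
    return json.JSONEncoder().iterencode(obj)


payload = {"areas": [{"id": 1, "name": "Example"}, {"id": 2, "name": "Other"}]}

chunks = list(stream_json(payload))
print(len(chunks), "chunks")
print("".join(chunks))
```

The encoder still walks an object that already exists in memory, so on its own this only avoids building the final string.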
But having a StreamingHttpResponse is not enough if something in the process consumes the generator, and as we’re outputting a dictionary, when I pass that dictionary to the json module’s iterencode, it will suck everything into memory upon creation, only then iterating for the output – not much use! I need a way to have it be able to iterate over a dictionary…
The solution was to invent the iterdict, a subclass of dict that isn’t really a dict: it only puts an iterable (of key/value tuples) on items and iteritems. This tricks Python’s json module into iterating over such a “dictionary”, producing dictionary output without requiring the dict to be created in memory; just what we want.
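Here is a rough Python 3 re-creation of the idea, written for this post rather than copied from MapIt. It leans on details of CPython’s pure-Python JSON encoder (the isinstance check against dict, the truthiness test, and the call to items()), so treat it as a sketch that may need adjusting on other interpreters or versions.

```python
import json


class IterDict(dict):
    """A dict in name only: items() hands back a one-shot iterable."""

    def __init__(self, pairs):
        super().__init__()
        self._pairs = pairs  # iterable of (key, value) tuples

    def __bool__(self):
        # The real dict is empty, and the encoder emits "{}" for a falsy
        # dict, so claim to be non-empty.
        return True

    def items(self):
        return self._pairs


def area_pairs():
    """Stand-in for rows arriving lazily from the database."""
    for area_id, name in [("1", "Example"), ("2", "Other")]:
        yield area_id, {"name": name}


for chunk in json.JSONEncoder().iterencode(IterDict(area_pairs())):
    print(chunk, end="")
print()
```

Because the underlying iterable can only be consumed once, an object like this is good for exactly one pass through the encoder.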
I then made sure that the whole request workflow was lazy and evaluated nothing until it would reach the end of the chain and be streamed to the client. I also stored the associated identifiers on the area directly in another iterator, not via an intermediary of (in the end) unneeded objects that just take up more memory.
I could now look at the new memory usage. Starting at 50MB again, it added 140MB attaching the associated codes to the areas, and actually streaming the output took about 25MB. That was it 🙂 Whilst it took a while to start returning data, it also let the data stream to the client when the database was ready, rather than wait for all the data to be returned to Django first.
But I was not done. Doing the above then revealed a couple of bugs in Django itself. We have GZip middleware switched on, and it turned out that if your StreamingHttpResponse contained any Unicode data, it would not work with any middleware that set Content-Encoding, such as GZip. I submitted a bug report and patch to Django, and my fix was incorporated into Django 1.8. A workaround in earlier Django versions is to run your iterator through map(smart_bytes, content) before it is output (that’s six’s iterator version of map, for Python 2/3 compatibility).
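The shape of that workaround, stripped of Django, looks something like the sketch below. The as_bytes generator is a stand-in we have written for illustration; in a Django project you would use smart_bytes as described above. The important property is that the conversion stays lazy, so nothing is pulled from the source until the response is actually being sent.

```python
def as_bytes(chunks, encoding="utf-8"):
    """Lazily encode text chunks to bytes, passing bytes through untouched."""
    for chunk in chunks:
        yield chunk.encode(encoding) if isinstance(chunk, str) else chunk


content = iter(["{", '"name"', ": ", '"Caf\u00e9"', "}"])
encoded = as_bytes(content)

print(b"".join(encoded).decode("utf-8"))
```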
Now GZip responses were working, I saw that the size of these responses was actually larger than not having the GZip middleware switched on?! I tracked this down to the constant flushing the middleware was doing, again submitted a bug report and patch to Django, which also made it into 1.8. The earlier version workaround is to have a patched local copy of the middleware.
Lastly, in all the above, I’ve ignored the HTML version of our JSON output. This contains just as many rows, is just as big an output, and could just as easily cripple our server. But sadly, Django templates do not act as generators, they read in all the data for output. So what MapIt does here is a bit of a hack – it has in its main template a “!!!DATA!!!” placeholder, and creates an iterator out of the template before/after that placeholder, and one compiled template for each row of the results.
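In outline, the hack works like the sketch below. This is a simplified stand-in using plain strings rather than Django templates, and the stream_page and render_row names are ours: the page is split around the placeholder, then the top, each row, and the bottom are yielded in turn.

```python
PLACEHOLDER = "!!!DATA!!!"


def stream_page(page, rows, render_row):
    """Yield the page in pieces, substituting rendered rows for the placeholder."""
    before, _, after = page.partition(PLACEHOLDER)
    yield before
    for row in rows:
        yield render_row(row)
    yield after


page = "<ul>!!!DATA!!!</ul>"
rows = iter(["Example", "Other"])  # could equally be a lazy database cursor

for piece in stream_page(page, rows, lambda name: f"<li>{name}</li>"):
    print(piece, end="")
print()
```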
Now Django 1.8 is out, the alternate Jinja2 templating system supports a generate() function to render a template iteratively, which would be a cleaner way of dealing with the issue (though the templates would need to be translated to Jinja2, of course, and it would be more awkward to support less than 1.8). Alternatively, creating a generator version of Django’s Template.render() is Django ticket #13910, and it might be interesting to work on that at the Django sprint later this week.
Using a StreamingHttpResponse is an easy way to output large amounts of data with Django, without taking up lots of memory, though I found it does involve a slightly different style of programming thinking. Make sure you have plenty of tests, as ever 🙂 Streaming JSON was mostly straightforward, though needed some creative encouragement when wanting to output a dictionary; if you’re after HTML streaming and are using Django 1.8, you may want to investigate Jinja2 templates now that they’re directly supported.
[ I apologise in the above for every mistaken use of generator instead of iterator, or vice-versa; at least the code runs okay 🙂 ]