If you’ve visited the MapIt site this week, you might have noticed a change: we’ve introduced key-based authentication for API users.
This enables us to be more flexible about how we provide our service, which means you can be more flexible about how you serve your users.
MapIt is both an open source application and, via https://mapit.mysociety.org, a web service. Use of the API is free for low-volume, charitable use, while all other uses require a licence.
For the moment API keys are optional. We’ll always offer a free level of service to support independent developers and charities.
We’ll have more details soon about the increased flexibility this change will bring.
The Universal Credits system is replacing many other welfare benefits… but slowly. Its roll-out won’t be complete until 2022, meaning that many are, understandably, confused about just what applies within their own local area.
Now Lasa, in collaboration with the Low Incomes Tax Reform Group (LITRG), have launched a tool to help with that problem. Just input a postcode, and it displays information about which benefits apply — and, crucially, where to go for advice in your area.
It’s part of a suite of offerings, also available as widgets that can be placed onto any website. All fall within Lasa’s remit to support organisations in the delivery of social welfare law advice to the disadvantaged communities they serve.
We’re always glad to see MapIt used in other people’s projects, especially those that make a complex system easier to understand.
Apparently advice workers are already expressing their gratitude for the fact that they can have this information at their fingertips — so hats off to Lasa.
Are you still in the same ward? Check whether your ward boundaries have changed here.
May 5 is election day
If you’re a UK citizen, you have an election in your near future. We can say that with confidence.
May 5 sees elections not only for the Scottish Parliament, the National Assembly of Wales and the Northern Ireland Assembly, but also for many local councils. Londoners will be picking their London Assembly representatives and their Mayor. As if all that isn’t enough, there are also Police and Crime Commissioner Elections.
Ward boundaries are changing
You might think you already know where to vote, and who’s standing for election in your area.
But both are dictated by which ward you live in — and that may not be the one you’re used to, thanks to ongoing changes in ward boundaries.
It’s great to see the launch of SocialCareInfo, a new website which helps people in the UK find local & national social care resources.
All the more so because it uses one of our tools, MapIt, to match postcodes with the relevant local authorities. The site’s builders, Lasa, came to us when it became clear that MapIt did exactly what they needed.
Socialcareinfo.net covers the whole of the UK. Users begin by typing in their postcode, whereupon they are shown the range of services available to them.
That’s also how many of our own projects (think FixMyStreet, WriteToThem or TheyWorkForYou) begin, and there’s a good reason for that: users are far more likely to know their own postcode than to be certain about which local authority they fall under, or even who their MP is.
MapIt is really handy for exactly this kind of usage, where you need to match a person to a constituency or governing body. It looks at which boundaries the geographic input falls within, and it returns the relevant authorities.
We’re glad to see it working so well for SocialCareInfo, and we feel sure that the site will prove a useful resource for the UK.
A few of mySociety’s developers are at DjangoCon Europe in Cardiff this week – do say hello 🙂 As a contribution to the conference, what follows is a technical look (with bunny GIFs) into an issue we had recently with serving large amounts of data in one of our Django-based projects, MapIt, how it was dealt with, and some ideas and suggestions for using streaming HTTP responses in your own projects.
MapIt is a Django application and project for mapping geographical points or postcodes to administrative areas, that can be used standalone or within a Django project. Our UK installation powers many of our own and others’ projects; Global MapIt is an installation of the software that uses all the administrative and political boundaries from OpenStreetMap.
A few months ago, one of our servers fell over, due to running entirely out of memory.
Looking into what had caused this, it was a request for
/areas/O08, information on every “level 8” boundary in Global MapIt. This turned out to be just under 200,000 rows from one table of the database, along with associated data in other tables. Most uses of Global MapIt are for point lookups, returning only the few areas covering a particular latitude and longitude; it was rare for someone to ask for all the areas, but previously MapIt must have managed to respond within the server’s resources (indeed, the HTML version of that page had been requested okay earlier that day, though had taken a long time to generate).
resourcemodule, I manually ran through the steps of this particular view, running
print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024after each step to see how much memory was being used. Starting off with only 50Mb, it ended up using 1875Mb (500Mb fetching and creating a lookup of associated identifiers for each area, 675Mb attaching those identifiers to their areas (this runs the query that fetches all the areas), 400Mb creating a dictionary of the areas for output, and 250Mb dumping the dictionary as JSON).
The associated identifiers were added in Python code because doing the join in the database (with e.g.
select_related) was far too slow, but I clearly needed a way to make this request using less memory. There’s no reason why this request should not be able to work, but it shouldn’t be loading everything into memory, only to then output it all to the client asking for it. We want to stream the data from the database to the client as JSON as it arrives; we want in some way to use Django’s StreamingHTTPResponse.
The first straightforward step was to sort the areas list in the database, not in code, as doing it in code meant all the results needed to be loaded into memory first. I then tweaked our JSONP middleware so that it could cope when given a StreamingHTTPResponse as well as an HTTPResponse. The next step was to use the json module’s
iterencodefunction to have it output a generator of the JSON data, rather than one giant dump of the encoded data. We’re still supporting Django 1.4 until it end-of-lifes, so I included workarounds in this for the possibility of StreamingHTTPResponse not being available (though then if you’re running an installation with lots of areas, you may be in trouble!).
But having a StreamingHTTPResponse is not enough if something in the process consumes the generator, and as we’re outputting a dictionary, when I pass that dictionary to the json’s
iterencode, it will suck everything into memory upon creation, only then iterating for the output – not much use! I need a way to have it be able to iterate over a dictionary…
The solution was to invent the iterdict, which is a subclass of dict that isn’t actually a dict, but only puts an iterable (of key/value tuples) on items and iteritems. This tricks python’s JSON module into being able to iterate over such a “dictionary”, producing dictionary output but not requiring the dict to be created in memory; just what we want.
I then made sure that the whole request workflow was lazy and evaluated nothing until it would reach the end of the chain and be streamed to the client. I also stored the associated identifiers on the area directly in another iterator, not via an intermediary of (in the end) unneeded objects that just take up more memory.
I could now look at the new memory usage. Starting at 50Mb again, it added 140Mb attaching the associated codes to the areas, and actually streaming the output took about 25Mb. That was it 🙂 Whilst it took a while to start returning data, it also let the data stream to the client when the database was ready, rather than wait for all the data to be returned to Django first.
But I was not done. Doing the above then revealed a couple of bugs in Django itself. We have GZip middleware switched on, and it turned out that if your StreamingHTTPResponse contained any Unicode data, it would not work with any middleware that set Content-Encoding, such as GZip. I submitted a bug report and patch to Django, and my fix was incorporated into Django 1.8. A workaround in earlier Django versions is to run your iterator through
map(smart_bytes, content)before it is output (that’s six’s iterator version of map, for Python 2/3 compatibility).
Now GZip responses were working, I saw that the size of these responses was actually larger than not having the GZip middleware switched on?! I tracked this down to the constant flushing the middleware was doing, again submitted a bug report and patch to Django, which also made it into 1.8. The earlier version workaround is to have a patched local copy of the middleware.
Lastly, in all the above, I’ve ignored the HTML version of our JSON output. This contains just as many rows, is just as big an output, and could just as easily cripple our server. But sadly, Django templates do not act as generators, they read in all the data for output. So what MapIt does here is a bit of a hack – it has in its main template a “!!!DATA!!!” placeholder, and creates an iterator out of the template before/after that placeholder, and one compiled template for each row of the results.
Now Django 1.8 is out, the alternate Jinja2 templating system supports a
generate()function to render a template iteratively, which would be a cleaner way of dealing with the issue (though the templates would need to be translated to Jinja2, of course, and it would be more awkward to support less than 1.8). Alternatively, creating a generator version of Django’s Template.render() is Django ticket #13910, and it might be interesting to work on that at the Django sprint later this week.
Using a StreamingHTTPResponse is an easy way to output large amounts of data with Django, without taking up lots of memory, though I found it does involve a slightly different style of programming thinking. Make sure you have plenty of tests, as ever 🙂 Streaming JSON was mostly straightforward, though needed some creative encouragement when wanting to output a dictionary; if you’re after HTML streaming and are using Django 1.8, you may want to investigate Jinja2 templates now that they’re directly supported.
[ I apologise in the above for every mistaken use of generator instead of iterator, or vice-versa; at least the code runs okay 🙂 ]
You might not be in the ward you think you are. Due to ongoing boundary changes, many people will be voting within new wards this year. Confused? Fortunately, you can use our new MapIt-powered ward comparison site to see whether you’ll be affected by any new boundaries.
Pop your postcode in at 2015wards.mysociety.org and if your boundary is changing you’ll see your old and new wards.
These alterations are generally put in place by the Local Government Boundary Commissions, in an aim to even out the number of constituents represented by each councillor.
All well and good, but mySociety’s developers don’t just roll out the big stuff. Smaller releases are happening all the time, and, as a bunch of them have all come at once, we’ve put together a round-up.
Oh – and it’s worth saying that your feedback helps us prioritise what we work on. If you’re using any of our software, either as an implementer or a front-end user, and there’s something you think could be better, we hope you’ll drop us a line.
Here’s what we’ve been doing lately:
We’ve just released version 1.2 of our postcodes-to-boundaries software.
The new version adds Django 1.7 and Python 3 support, as well as other minor improvements.
The latest version of our transcript-publishing software, 1.3, adds mainstream support for import from fellow component PopIt (or any Popolo data source). That’s key to making it a truly interoperable Poplus Component.
SayIt is now also available in Spanish. Additionally, there are improvements around Speakers and Sections, plus this release includes OpenGraph data.
Many thanks to James of Open North, who contributed improvements to our Akoma Ntoso import.
Our software for storing, publishing and sharing lists of politicians now has multi-language support in the web-based editing interface as well at the API level.
Release 0.20 of our Freedom of Information platform sees improvements both to the Admin interface and to the front-end user experience.
Administrators will be pleased to find easier ways to deal with spammy requests for new authorities, and manage the categories and headings that are used to distinguish different types of authority; users should enjoy a smoother path to making a new request.
Version 1.5 of the FixMyStreet platform fully supports the new Long Term Support (LTS) version of Ubuntu, Trusty Tahr 14.04.
Four new languages – Albanian, Bulgarian, Hebrew, and Ukranian – have been added. There are also some improvements across both admin and the front-end design, and a couple of bugs have been fixed.
Whatever mySociety or Poplus software you’re deploying, we hope these improvements make life easier. Please do stay in touch – your feedback is always useful, whether it’s via the Poplus mailing list (MapIt, PopIt, SayIt), the FixMyStreet community or the Alaveteli community.
One of the projects we’ve been working on at mySociety recently is that of making it easier for people to set up new versions of our sites in other countries. Something we’ve heard again and again is that for many people, setting up new web applications is a frustrating process, and that they would appreciate anything that would make it easier.
You can use these AMIs to create a running version of one of these sites on an Amazon EC2 instance with just a couple of clicks. If you haven’t used Amazon Web Services before, then you can get a Micro instance free for a year to try this out. (We have previously published an AMI for Alaveteli, which helped many people to get started with setting up their own Freedom of Information sites.)
Each AMI is created from an install script designed to be used on a clean installation of Debian squeeze or Ubuntu precise, so if you have a new server hosted elsewhere, you can use that script to similarly create a default installation of the site without being dependent on Amazon:
In addition, we’ve launched new sites with documentation for FixMyStreet and MapIt, which will tell you how to customize those sites and import data if you’ve created a running instance using one of the above methods.
These documentation sites also have improved instructions for completely manual installations of either site, for people who are comfortable with setting up PostgreSQL, Apache / nginx, PostGIS, etc.
We hope that these changes make it easier than ever before to reuse our code, and set up new sites that help people fix things that matter to them.
Photo credit: duncan
All of us at mySociety love the fact that there are so many interesting new civic and democratic websites and apps springing up across the whole world. And we’re really keen to do what we can to help lower the barriers for people trying to build successful sites, to help citizens everywhere.
Today mySociety is unveiling MapIt Global, a new Component designed to eliminate one common, time-consuming task that civic software hackers everwhere have to struggle with: the task of identifying which political or administrative areas cover which parts of the planet.
As a general user this sort of thing might seem a bit obscure, but you’ve probably indirectly used such a service many times. So, for example, if you use our WriteToThem.com to write to a politician, you type in your postcode and the site will tell you who your politicians are. But this website can only do this because it knows that your postcode is located inside a particular council, or constituency or region.
Today, with the launch of MapIt Global , we are opening up a boundaries lookup service that works across the whole world. So now you can lookup a random point in Russia or Haiti or South Africa and find out about the administrative boundaries that surround it. And you can browse and inspect the shapes of administrative areas large and small, and perform sophisticated lookups like “Which areas does this one border with?”. And all this data is available both through an easy to use API, and a nice user interface.
We hope that MapIt Global will be used by coders and citizens worldwide to help them in ways we can’t even imagine yet. Our own immediate use case is to use it to make installations of the FixMyStreet Platform much easier.
We’re able to offer this service only because of the fantastic data made available by the amazing OpenStreetMap volunteer community, who are constantly labouring to make an ever-improving map of the whole world. You guys are amazing, and I hope that you find MapIt Global to be useful to your own projects.
The developers who made it possible were Mark Longair, Matthew Somerville and designer Jedidiah Broadbent. And, of course, we’re also only able to do this because the Omidyar Network is supporting our efforts to help people around the world.
From Britain to the World
For the last few years we’ve been running a British version of the MapIt service to allow people running other websites and apps to work out what council or constituency covers a particular point – it’s been very well used. We’ve given this a lick of paint and it is being relaunched today, too.
MapIt Global is also the first of The Components, a series of interoperable data stores that mySociety will be building with friends across the globe. Ultimately our goal is to radically reduce the effort required to launch democracy, transparency and government-facing sites and apps everywhere.
If you’d like to install and run the open source software that powers MapIt on your own servers, that’s cool too – you can find it on Github.
About the Data
The data that we are using is from the OpenStreetMap project, and has been collected by thousands of different people. It is licensed for free use under their open license. Coverage varies substantially, but for a great many countries the coverage is fantastic.
The brilliant thing about using OpenStreetMap data is that if you find that the boundary you need isn’t included, you can upload or draw it direct into Open Street Map, and it will subsequently be pulled into MapIt Global. We are planning to update our database about four times a year, but if you need boundaries adding faster, please talk to us.
If you’re interested in the technical aspects of how we built MapIt Global, see this blog post from Mark Longair.
Commercial Licenses and Local Copies
MapIt Global and UK are both based on open source software, which is available for free download. However, we charge a license fee for commercial usage of the API, and can also set up custom installs on virtual servers that you can own. Please drop us a line for any questions relating to commercial use.
Ah, summer: walks in the park, lazing in the long grass, and the sound of chirping crickets – all overlaid with the clatter of a thousand keyboards.
That may not be your idea of summer, but it’s certainly the ways ours is shaping up. We’re participating in Google’s Summer of Code, which aims to put bright young programmers in touch with Open Source organisations, for mutual benefit.
What do the students get from it? Apart from a small stipend, they have a mentored project to get their teeth into over the long summer hols, and hopefully learn a lot in the process. We, of course, see our code being used, improved and adapted – and a whole new perspective on our own work.
Candidates come from all over the world – they’re mentored remotely – so for an organisation like mySociety, this offers a great chance to get insight into the background, politics and technical landscape of another culture. Ideas for projects that may seem startlingly obvious in, say, Latin America or India would simply never have occurred to our UK-based team.
This year, mySociety were one of the 180 organisations participating. We had almost 100 enquiries, from countries including Lithuania, India, Peru, Georgia, and many other places. It’s a shame that we were only able to take on a couple of the many excellent applicants.
We made suggestions for several possible projects to whet the applicants’ appetite. Mobile apps were popular, in particular an app for FixMyTransport. Reworking WriteToThem, and creating components to complement MapIt and PopIt also ranked highly.
It was exciting to see so many ideas, and of course, hard to narrow them down.
In the end we chose two people who wanted to help improve our nascent PopIt service. PopIt will allow people to very quickly create a public database of politicians or other figures. No technical knowledge will be needed – where in the past our code has been “Just add developers”, this one is “Just add data”. We’ll host the sites for others to build on.
Our two successful applicants both had ideas for new websites that would use PopIt for their datastore, exactly the sort of advanced usage we hope to encourage. As well as making sure that PopIt actually works by using it they’ll both be creating transparency sites that will continue after their placements ends. They’ll also have the knowledge of how to set up such a site, and in our opinion that is a very good thing.
We hope to bring you more details as their projects progress, throughout the long, hot (or indeed short and wet) summer.
PS: There is a separate micro-blog where we’re currently noting some of the nitty gritty thoughts and decisions that go into building something like PopIt. If you want to see how the project goes please do subscribe!
Top image by Elaine Millan, used with thanks under the Creative Commons licence.