Mapumental can turn vast datasets into visual tools that everyone understands. Faced with highly complex, yet crucial data from the Fire Protection Association, we had a chance to really put our technology through its paces.
Just how quickly could fire engines reach a given postcode in case of a fire? It’s a question that’s pivotal to decisions made by both the emergency services and the insurance industry.
But previously, it has been a challenge to present the data simply, because it involves so many variables.
Every region has its own factors, each of which will impact on fire engine response time. The number of vehicles at each station, the hours during which the station is manned, and the response policy of each individual fire authority will all play a part – and that’s before you even consider how geography might affect things.
Dr. Jim Glockling is Technical Director at the Fire Protection Association and Head of the Risk Insight, Strategy and Control Authority (RISCAuthority), an organisation for the advancement of risk management within the fire and security sectors. Jim approached mySociety with this question: how could we map this crucial, yet complicated data in a way that could be understood by RISCAuthority members at a glance?
It was clearly a job for Mapumental. Our transit-time mapping software was originally built to visualise public transport journey times, but its beauty is that ‘layers’ of data can be swapped out, allowing it to be used for all kinds of purposes.
Read more about mySociety’s data visualisation services here.
Assessing a property or postcode
And here’s the result of our pilot project. The maps on the right answer the following questions (click each image to see it at full size):
How quickly could 4 fire engines get to AL10 0XR ?
How does that change if the severity of the fire just requires one fire engine?
A user inputs a postcode, and can assess exactly how quickly a fire could be tackled in that area. The different levels of severity are measured by how many response vehicles are required, and changes in this number are immediately reflected on the map.
Assessing the general area
Which areas can four fire engines get to within 9 minutes 30 seconds at midday on a Saturday?
It’s also possible to assess the region’s overall response capability, without inputting a postcode. The user sets severity levels (number of fire engines, or High Volume Pumping or Aerial Appliance (ladder) is needed), the time and day of the week.
Where can an aerial appliance get to within 15 minutes at 2am on a weekday?
The FPA tool immediately highlights the areas that are accessible within the chosen parameters, drawing on the underlying data of journey times and information such as vehicle numbers and hours of operation for each individual fire station in the region.
With RISCAuthority, we tested the concept using data from one fire authority – Hertfordshire. mySociety’s task was to create a usable, elegant web interface that was as simple as possible to use, while still giving insurers the key data they needed.
The project called on everything we knew about clean design, usability and data structures. A key part of what makes Mapumental’s data visualisation so intuitive are its sliders: this enables the user to quickly explore variables on a map.
A tool with purpose
Dr Glockling explains: “Whilst not necessarily used as a component of insurance pricing, this information helps insurers administer risk control and fire protection advice to their customers in the context of what the Fire and Rescue Services will be able to achieve on their behalf.”
The response time is just one factor that insurance surveyors will take into account when they are assessing a building. “Where response and arrival times are not coherent with protecting the viability of the business in the event of fire, additional forms of in-built protection and control might be recommended, such as the installation of sprinkler systems.”
“In the longer term it is hoped such information will impact beneficially on the annual cost of fire in the UK.”
The pilot tool was well received by the FPA community, and the plan is now to work with RISCAuthority to roll it out to more fire authorities shortly, and then nationwide.
Dr Glockling explains the pilot study helped them to understand two factors:
Would they get buy-in from both insurers and Fire and Rescue services on the viability and usefulness of the project?
Was it possible to present such a massive amount of data in a format that was readily palatable to the intended audience?
He says, “Mapumental’s team displayed an immediate understanding of our requirement. Delivery was to time and the result has perfectly satisfied the de-risking ambition. The working relationships were very good throughout and we intend now to extend the pilot to full UK rollout.”
During this phase, we will be inputting still more detail to the data, including information on the types of fire engine available to each region, and the plotting of fire stations on the map.
The tool will be a valuable resource for the FPA and the insurance industry, and we really look forward to the roll-out later this year.
Mapumental specialises in visualising complex geographic data sets on intuitive, easy to use map tools. If you have a data visualisation project that will benefit from Mapumental, just get in touch. Or read more about mySociety’s data visualisation services here.
Photo by William Murphy (CC)
I am Duncan Parkes, a developer for mySociety, a non-profit full of web geeks. One of the things we try to do well here is to take complicated data and turn it into really usable tools – tools which are attractive to people who aren’t web (or data) geeks.
For some considerable time I’ve been working on Mapumental – a project that is about turning public transport timetable data into pretty, interactive maps featuring isochrones, shapes that show people where they can live if they want to have a commute of a particular time. You can play with the new version we just launched here. That particular map shows the commuting options to where the Queen lives. Slide the slider for full effect.
There are a couple of hard problems that need solving if you want to build a service with an interactive journey times overlay like this. You need to be able to calculate a *huge* number of journeys extremely quickly, and you need to be able to make custom map layers so that it all looks nice. But what I think might be most interesting for you is the way in which the contours get rendered on top of the maps.
It all started about three years ago, when the first version of the app – co-developed with the geniuses at Stamen – used Flash/Flex to draw contours on the maps, and to let people play with them. You can still play with a couple of versions of that technology from way back in 2007, that is, unless you’re using an iPad or iPhone, which of course don’t do Flash.
What was going on inside this Flash app was as follows. We needed to show the user any one of hundreds of different combinations of journey times (5 minutes, 12 minutes, 56 minutes, etc) depending on where they set the slider. Sending each one from the server as a tiled map overlay would be dead slow. Even Google – who have chosen to send new tiles each time – end up with a service which is surprisingly slow (try choosing a different time on this map).
With some help from Stamen, we decided that the way of making it possible to show many different contours very quickly was send the client just one set of tiles, where each tile contained all the data for a variety of journey times. What does that mean? Simple: each colour in the tile represented a different number of minutes travelling on the map. So a batch of pixels that are colour X, all show places that are 15 minutes from the centre of the map.
So, in this old Flash system, when you slide the slider along, the Flash app makes some of the coloured pixels opaque, and the others transparent. It was, in short, a form of colour cycling, familiar to lovers of 8 and 16 bit computer games.
However, from about 2010 onwards, the march of iOS spelt the end of Flash. And that meant that we couldn’t launch a shiny new site based on this technology, as lovely as it was. We had to work out some approach that would use modern web standards instead.
The Death of Flash Makes Life Difficult – for a while
How do we replicate the experience of dragging a slider and seeing the map change like in the original Mapumental demo, but without Flash? One of the things that made the original Mapumental nice to use was how smooth the image changes were when you dragged the slider. Speed really matters to create that sort of organic effect that makes the demo so mesmerising.
So as we started to tackle the question “How do we make this work in a post-Flash world?”. And the first thought was “Let’s do away with those map tiles, filled with all that journey time data!”. After all – why send any tiles to a modern browser, if it can just render nice shapes on the fly?
So we had a go. Several goes. At first we tried rendering SVG circles around each public transport stop – but that was too slow, particularly when zoomed out. Then we tried rendering circles in Canvas, and whilst that was OK in sparsely populated places it sucked in the cities, where people would actually want to use it.
Back to Colour Cycling – Using Web Standards
So, I had a bit of a look at the waterfall. It seems to work by holding in memory a structure which has all the pixels which change and all the colours they should change to and when. This works beautifully for the waterfall picture, but only a limited number of the pixels in that image actually change colour, and the image is quite small. For a full screen web browser with a big map in, this didn’t seem promising, although I’d love to see someone try.
Unfortunately, there is no way to change the palette of an image that you’ve put on the canvas. In fact, there’s no way to change the palette of an HTML img element: all you can do is assign it a new src attribute.
But this gets back to the original problem – we don’t want to download new mapping for every different position on the time slider. We definitely can’t afford to have the client downloading a new image source for every tile whenever the slider is moved, so we had to find a way to make that src at the client end and get that into the src attribute.
The Breakthrough – Data URIs and Base64 encoding
So we started trying data URIs. For those of you not familiar, these allow you to put a whole object into your HTML or CSS, encoded in Base64. They’re commonly used to prevent pages having to make extra downloads for things like tiny icons.
My new plan was that the client, having downloaded each palette-based image, would make a Base64 encoded version of it, which it could then use to build a version with the right palette and assign this as a data URI of the tile.
So in summary, what we built does this:
- The server calculates the journey times and renders them to palette-based tiles.
- It sends these to the client, encoded in Base64, and with the initial bits up to the palette and transparency chunks removed.
- At the client end, we have a pre-prepared array of 255 ‘starts’ of PNGs that we combine with the later parts of the ’tiles’ from the tile server to make data URIs.
- When you drag the slider it combines the appropriate ‘start’ of a PNG with the bulk of the tile that has been downloaded from the server, and assigns that to the src attribute of the tile.
And that’s how the nice overlays on Mapumental work. But as so often in coding, the really interesting devil is in the detail – read on if you’re interested.
Diving into Base64 and the PNG file format – The Gnarly Bits
So – why are there 255 of these ‘starts’ of these PNGs, and what do I mean by a ‘start’ anyway?
PNG files are divided up into an 8 byte signature (the same for every PNG file) and a number of chunks, where each chunk consists of 4 bytes to tell you its length, 4 bytes of its name, some data, and 4 bytes of cyclic redundancy check. In this case, what I call a ‘start’ of a PNG is the 8 byte signature, the 25 byte of the IHDR chunk, and the PLTE (palette) and tRNS (transparency) chunks. The PLTE chunk has 12 bytes of overhead and 3 bytes per colour, and the tRNS chunk has 12 bytes of overhead and 1 byte per colour.
Base64 encoding is a way of representing binary data in text so that it can be used in places where you would normally expect text – like URIs. Without going into too much detail, it turns groups of 3 bytes of binary gumpf into 4 bytes of normal ASCII text without control characters in it, which can then be put into a URI.
Why do we have 255 colours, rather than the maximum 256 which are available in a palette? Because we need the break between the end of the tRNS chunk and the start of the IDAT chunk in the PNG file to align with a break between groups of three bytes in the Base64 encoded image. We need the length of these starts to be a multiple of 3 bytes in the original PNG format, which translates into a multiple of 4 bytes in the Base64 encoded version, so we can cut and shut the images without corruption.
Which just goes to show that even though web GIS technologies may feel like they are approaching a zenith of high level abstraction, there’s still some really gnarly work to be done to get the best out of current browsers.
If you’ve been following mySociety for a while, you’ll know that we have been interested in making maps that show commuting times for several years.
However, we’ve never made public a simple, free, useful version of our slidy-swooshy Mapumental journey times technology. Until today.
Today we pull the wraps off Mapumental Property , a house-hunting service covering England, Scotland and Wales, designed to help you work out where you might live if you want a public transport commute of a particular maximum duration. Have a go, and we guarantee you’ll find it an oddly compelling experience.
We think it’s a genuinely useful tool – especially since unlike some of the other players in this space, we’ve got all the different kinds of public transport, right across the whole of Great Britain. We hope that some of you will find it helpful when deciding where to live.
However, this launch doesn’t mean mySociety is bent on taking over the property websites sector. Mapumental Property isn’t a challenger to the likes of Rightmove, it’s a calling card – an advertisement for our skills – which we hope will help mySociety to attract people and organisations who want beautiful, useful web tools built for them.
In particular we’d like people interested in Mapumental to note that:
- We like to build attractive, usable web tools for clients of all kinds.
- We know how to use complex data to make simple, lovely things.
- We can do some mapping technology that others haven’t worked out yet.
If any of that is of interest, please get in touch, or read about our software development and consulting services.
I’d like to thank quite a few people for helping with this launch. Duncan Parkes was the lead developer, Matthew Somerville ably assisted. Jedidiah Broadbent did the design. The idea originally came from the late Chris Lightfoot, and me, Tom Steinberg. Francis Irving built the first version, and Stamen came up with the awesome idea of using sliders in the first place (and built some early tech). Kristina Glushkova worked on business development, and Zoopla’s API provides the property data. I’m also grateful to Ed Parsons of Google for very kindly giving us a hat tip when they built some technology that was inspired by Mapumental. Thanks to everyone – this has been a long time coming.
We’ll follow up soon with a post about the technology – and in particular how we got away from using Flash. It has been an interesting journey.
It’s high time we updated you on Mapumental, our journey-time mapping project. For those who may not remember, Mapumental is based on a simple idea: to visualise transit times, by public transport, from or to any postcode in Great Britain.
It all began in 2006, when the Department for Transport approached us to see what we might do with public transport data; in 2009 we won an investment loan from Channel 4 and Screen West Midlands which enabled us to build a beta tool – you might have played with it. If not, go on, have a go. It’s fun!
It’s been quite a long journey to where we are today. Unlike many mySociety projects, funding for Mapumental’s development came from a commercial investment loan, with a condition that we set it up as a business. For that reason, it’s not enough that it’s beautiful and useful – we need to find ways for it to be profitable, too. All revenues are set to come back to fund our not-for-profit activities.
We could tell from very early on in the project that Mapumental would be a sought-after tool for all sorts of purposes, from business to personal use. For example, you can see commute times at a glance, so it’s great for house-hunters and job-seekers. Consequently, it’s also great for the property and recruitment industries.
“Your maps look amazing, such a great way of representing what could be really boring data, but isn’t.” – A jobseeker
We can see loads of other possibilities too – like urban planning. This sort of analysis would have been far more expensive in the past; with Mapumental, planners can see at a glance how accessible a new development would be by public transport. Its potential uses are wide-ranging, answering questions for businesses, organisations, charities, and public facilities – especially those wanting to maximise accessibility or encourage use of greener transport options.
“The maps are a fantastic, a great tool and should be used for every planning application. I will be using Mapumental for all of our projects!” – Lee Taylor, Veridis Design
We’ve recently refined a product that’s pared down from the dynamic maps you may remember from that beta tool: static maps. These are simple, non-interactive maps which show transit time in bands. They’re flexible in that they can be generated for any postcode, with any maximum travel time, and depict travel at any given time of day.
We can provide a one-off map for personal use, or batches of many thousands of maps – as we have done for estate agents Foxtons, who now have a Mapumental map on every property listing.
As we generate more and more maps for different uses, showing different parts of the country, we’re really enjoying digging out all sorts of surprising facts – like how it’s quicker to travel from Watford to Westminster than it is from some parts of Harringay. Or how Cardiff University students might sensibly live at all points east as far as Newport, but will be stymied for transport in the west if they live anywhere other than Barry or Bridgend.
In fact, our very favourite use so far has come from an individual who centred his map around his home postcode. He tells us he has printed it off and put it up by the front door, so that on his way out of the house, he can find a new and surprising destination for day-trips.
Find out more on the Mapumental website – and please do spread the word among friends and colleagues who might benefit from a Mapumental map.
I’ve been doing lots of research around “cloud computing” recently, so we can change how Mapumental works and take it out of private beta.
One thing that’s struck me is that there doesn’t seem to be a proper, industry standard name to distinguish what to me are two fundamentally different sorts of “cloud computing”. I’m focusing here entirely on cloud services for programmers (let’s leave what it means to end users or businesses for another day).
Here are my own names and descriptions of them:
1) Cloud hardware server provision (Cloud HSP)
Low level APIs for making and destroying (virtual) servers, and loading machine images onto them. e.g. Amazon Elastic Compute Cloud, Rackspace Cloud Servers, Eucalyptus’s EC2 bits. Basically, what Eucalyptus v 1.5 can do and what libcloud should do. (By analogy, this is the assembly language of cloud computing)
2) Cloud developer service provision (Cloud DSP) A service that a developer accesses with one name and a simple API, and behind the scenes it scales for him, automatically. e.g. Amazon Queue Service, Rackspace Cloud Files. (By analogy, this layer is the C programming language of cloud computing)
[as an aside, Google AppEngine is an interesting one. It is definitely in the Cloud DSP category, but I think it is larger than that - it is a whole set of APIs all in that category. Something like Google DataStore is a single Cloud DSP, albeit one apparently only accessible within AppEngine apps]
It’s possible to use a Cloud HSP (assembly language), along with a bunch of your own software or open source software, to build new Cloud DSPs (C code). Right now this is pretty hard – even quite well known open source distributed datasbases like CouchDB still need scripting to even make them replicate. The code that makes and destroys servers and gives the service one name, needs manually stringing with quite new bits of wire (things like scalr and Wackamole).
For this reason, I’m reluctant for mySociety to get into the “making our own Cloud DSP out of Cloud HSP” game. It feels to me like a suck of time, and like we wouldn’t be able to guarantee without lots of careful and expensive testing that it would scale. I’m more tempted to use the commercial Cloud DSP services where possible, even though they are proprietary. But use them via our own abstraction layer, so we can change as we need to. Of course, we have some C++ code (the public transport route finder), so will have to use the Cloud HSP API to get that going, perhaps with Amazon’s Auto Scaling. But it can jolly well use AQS and S3 to talk to other services.
So, what do you think about the names Cloud HSP/DSP? Are there already existing names for the distinction that I’m making? Is it a useful distinction for you? Can you think of better names?
Here is a diagram of how the backend of Mapumental works. Take it in the spirit that Chris Lightfoot set when he made a similar diagram for the No. 10 petitions site – although many such diagrams are useless, hopefully this one contains useful information.
If you haven’t seen Mapumental yet, first take a look at the video, and sign up for the private beta.
(Click on the diagram for a large version)
Below, I’ve explained what the main components are, and some interesting things about them.
Everything can, at least in theory, run on lots of servers. Currently we are only actually using one server for web requests, because of problems with HAProxy. We’re runnning isodaemons on two different servers.
Basic web application – it started out as raw Python, but the more Matthew hacks on it the more Django libraries he pulls in. Soon it’ll be indistinguishable from a Django app. When someone enters a new postcode, it adds it to the work queue in the PostgreSQL database, then refreshes waiting for the job to be finished. Then it displays the flash application (made by Stamen), set up to load the appropriate tile layers.
Tile server and cache – This uses the Python-based TileCache, calling Geospatial Data Abstraction Library (GDAL) to help render the tiles from points. It was originally written by Stamen, and expanded by mySociety. GDAL isn’t perfect, it doesn’t have fancy enough algorithms for my liking. e.g. Using a median rather than a weighted mean.
Isodaemons – These are controlled by a Python script, but the bulk of the code is custom written in C++. Slightly crazily, this can find the quickest route by public transport for each of 300,000 journeys from every station in the UK to a particular station, arriving at a particular time, in 10 to 30 seconds.
I had no idea how to do this, but luckily I live in Cambridge, UK. It’s a city fit to bursting with computer scientists. Many of the jobs are dull, and need little computing, never mind science – like writing interface layers for SQL server. So if you have a real interesting problem it’s easy to get help!
The universal advice was to use Dijkstra’s algorithm, which needed a bit of adaptation to work efficiently over space-time, rather than just space. Normally it is used for planning routes round a map, but public transport isn’t like that, you have to arrive in time for each particular train, so time affects what journeys you can take.
I originally wrote it in Python, which was not only too slow, but used up far far too much RAM. It could never have loaded the whole dataset in. However, the old Python code is still run by the test script, to double check the C++ code against. It is also still used to make the binary timetable files, see below.
Travel times, 1 binary file / postcode – I briefly attempted to insert 300,000 rows into PostgreSQL for each postcode looked up, but it was obvious it wasn’t going to scale. Going back to basics, it now just saves the time taken to travel to each station in a simple binary file – two bytes for each station, 600k in total. The tile server then does random access lookups into that file, as it renders each tile. It only needs to look up the values for the stations it knows are on/near the tile.
There’s various other bits:
Thanks to everyone who helped make Mapumental, we couldn’t have done it without lots of clever people.
I realise the above is a sketchy overview, so please ask questions in the comments, and I’ll do my best to answer them.