Mapumental can turn vast datasets into visual tools that everyone understands. Faced with highly complex, yet crucial data from the Fire Protection Association, we had a chance to really put our technology through its paces.
Just how quickly could fire engines reach a given postcode in case of a fire? It’s a question that’s pivotal to decisions made by both the emergency services and the insurance industry.
But previously, it has been a challenge to present the data simply, because it involves so many variables.
Every region has its own factors, each of which will impact on fire engine response time. The number of vehicles at each station, the hours during which the station is manned, and the response policy of each individual fire authority will all play a part – and that’s before you even consider how geography might affect things.
Dr. Jim Glockling is Technical Director at the Fire Protection Association and Head of the Risk Insight, Strategy and Control Authority (RISCAuthority), an organisation for the advancement of risk management within the fire and security sectors. Jim approached mySociety with this question: how could we map this crucial, yet complicated data in a way that could be understood by RISCAuthority members at a glance?
It was clearly a job for Mapumental. Our transit-time mapping software was originally built to visualise public transport journey times, but its beauty is that ‘layers’ of data can be swapped out, allowing it to be used for all kinds of purposes.
Read more about mySociety’s data visualisation services here.
Assessing a property or postcode
And here’s the result of our pilot project. The maps on the right answer the following questions (click each image to see it at full size):
How quickly could 4 fire engines get to AL10 0XR?
How does that change if the severity of the fire just requires one fire engine?
A user inputs a postcode, and can assess exactly how quickly a fire could be tackled in that area. The different levels of severity are measured by how many response vehicles are required, and changes in this number are immediately reflected on the map.
Assessing the general area
Which areas can four fire engines get to within 9 minutes 30 seconds at midday on a Saturday?
It’s also possible to assess the region’s overall response capability, without inputting a postcode. The user sets the severity level (the number of fire engines, or whether High Volume Pumping or an Aerial Appliance (ladder) is needed), the time and the day of the week.
Where can an aerial appliance get to within 15 minutes at 2am on a weekday?
The FPA tool immediately highlights the areas that are accessible within the chosen parameters, drawing on the underlying data of journey times and information such as vehicle numbers and hours of operation for each individual fire station in the region.
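One way to picture the calculation behind these maps is sketched below. It is a minimal illustration under stated assumptions, not the FPA tool's actual logic: the `Station` fields, the function name and the sample figures are all hypothetical, and the sketch simply assumes that the response time for a fire needing N engines is the moment the Nth available engine arrives.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Station:
    name: str               # hypothetical station label
    drive_minutes: float    # assumed journey time from the station to the postcode
    engines_available: int  # assumed engines crewed at the chosen day and time


def minutes_until_engines_arrive(stations: List[Station], engines_needed: int) -> Optional[float]:
    """Return the minutes until `engines_needed` engines have arrived, or None if too few exist."""
    arrived = 0
    # The nearest stations respond first, so walk through them in order of journey time.
    for station in sorted(stations, key=lambda s: s.drive_minutes):
        arrived += station.engines_available
        if arrived >= engines_needed:
            return station.drive_minutes
    return None


# Made-up figures, purely for illustration.
stations = [
    Station("A", 6.0, 2),
    Station("B", 9.5, 1),
    Station("C", 14.0, 2),
]
one_engine = minutes_until_engines_arrive(stations, 1)
four_engines = minutes_until_engines_arrive(stations, 4)
```

With these invented numbers a single engine arrives from the nearest station, while a four-engine response has to wait for the third station, which is why the map changes as the severity level goes up.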
With RISCAuthority, we tested the concept using data from one fire authority – Hertfordshire. mySociety’s task was to create a usable, elegant web interface that was as simple as possible to use, while still giving insurers the key data they needed.
The project called on everything we knew about clean design, usability and data structures. A key part of what makes Mapumental’s data visualisation so intuitive is its sliders: these enable the user to quickly explore variables on a map.
A tool with purpose
Dr Glockling explains: “Whilst not necessarily used as a component of insurance pricing, this information helps insurers administer risk control and fire protection advice to their customers in the context of what the Fire and Rescue Services will be able to achieve on their behalf.”
The response time is just one factor that insurance surveyors will take into account when they are assessing a building. “Where response and arrival times are not coherent with protecting the viability of the business in the event of fire, additional forms of in-built protection and control might be recommended, such as the installation of sprinkler systems.”
“In the longer term it is hoped such information will impact beneficially on the annual cost of fire in the UK.”
The pilot tool was well received by the FPA community, and the plan is now to work with RISCAuthority to roll it out to more fire authorities shortly, and then nationwide.
Dr Glockling explains the pilot study helped them to understand two factors:
Would they get buy-in from both insurers and Fire and Rescue services on the viability and usefulness of the project?
Was it possible to present such a massive amount of data in a format that was readily palatable to the intended audience?
He says, “Mapumental’s team displayed an immediate understanding of our requirement. Delivery was to time and the result has perfectly satisfied the de-risking ambition. The working relationships were very good throughout and we intend now to extend the pilot to full UK rollout.”
During this phase, we will be inputting still more detail to the data, including information on the types of fire engine available to each region, and the plotting of fire stations on the map.
The tool will be a valuable resource for the FPA and the insurance industry, and we really look forward to the roll-out later this year.
Mapumental specialises in visualising complex geographic data sets on intuitive, easy to use map tools. If you have a data visualisation project that will benefit from Mapumental, just get in touch. Or read more about mySociety’s data visualisation services here.
Photo by William Murphy (CC)
If you’re searching for a new home, give Mapumental Property a try. It narrows property results down, only showing you houses that fall within a decent commute time from the places you visit regularly – like work, school, or the shops. Here, have a go – it’s fun.
Irritation is the mother of invention
Several years ago, some of our colleagues were looking for a house to rent.
They weren’t set on a particular town. There were two important factors: that it was within a reasonable commute from central London, where they frequently attended meetings; and that the rent was affordable.
Faced with these requirements, most of us would sift through property sites and cross-reference the listings manually with public transport information. It’s rather time-consuming, and slightly irritating, but hey-ho, it has to be done.
But mySociety is in the business of building useful web tools, so when something irritates us like this, we look to see if we can solve the problem through the magic of code. In a stroke of good timing, it was at just around this time that the Department for Transport approached us to ask us to work with their public transport data – and Mapumental was conceived.
The key was to combine Ordnance Survey postcodes with the DfT’s data about journey times, NPTDR (National Public Transport Data Repository). This data set takes a ‘snapshot’ of every public transport journey in Great Britain for a selected week in October each year.
Sounds simple? The process was not without its challenges. Prime among them was the problem of displaying map tiles, plus the vast quantities of transport data, within a reasonable amount of time, no matter which postcode or zoom level the user chose. As we know, a ‘reasonable amount of time’ for a page to load is a metric which is forever shrinking.
By 2006, we had created Mapumental’s first iteration. Users could input a postcode and see all areas of the country that could be reached by public transport, divided into coloured travel-time bands. In 2009, Francis Irving, the mySociety coder behind Mapumental’s early endeavours, explained the technology he’d used. It was Flash-dependent, and a few years later, developer Duncan wrote about some of the technical hurdles he overcame replacing the Flash elements, in view of the rise of the iPhone, which famously doesn’t ‘do’ Flash.
Hoorah! Now our colleagues could type in a central London postcode and see everywhere that fell within a 40-minute journey from there. It wasn’t long before we added median house price data, too.
Beauty is in the eye of the crowd
We even added a ‘scenicness’ rating: if the beauty of your surroundings was important to you, you could rule out anywhere below a certain level of attractiveness.
How did we assess how scenic every area in the UK is? By crowdsourcing the information – our ScenicOrNot website displays a random photograph from every square mile of the British Isles, inviting people to rate them. It is surprisingly compulsive.
A showcase tool
Mapumental may have been born from our own needs, but we knew from the beginning that it would have wider applications. It has always been the sort of project that got people excited, once they saw it in action.
We wanted to show how elegantly Mapumental can handle all kinds of data, starting with houses for sale and rent – so we developed Mapumental Property. It’s not intended as a serious competitor to the giant property websites out there. Rather, it’s an all-singing, all-dancing demonstration of Mapumental’s strengths.
In this case, the data is from the property website Zoopla, and you can narrow it down to show rental or sales property within your chosen price bands and commute distances. You can even add multiple destination points, so that households of two or more people can find their optimum location.
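The multiple-destination idea can be sketched as a set intersection. This is only an illustration: the function name, the data layout and the place names are hypothetical, and it assumes a journey time is already known from each candidate place to each destination.

```python
from typing import Dict, Set


def places_for_household(times: Dict[str, Dict[str, int]], limits: Dict[str, int]) -> Set[str]:
    """Return the places that are within the commute limit for every destination.

    times[destination][place] is an assumed journey time in minutes;
    limits[destination] is the longest commute that person will accept.
    """
    candidate_sets = [
        {place for place, minutes in times[destination].items() if minutes <= limit}
        for destination, limit in limits.items()
    ]
    return set.intersection(*candidate_sets) if candidate_sets else set()


# Made-up journey times for two commuters in one household.
times = {
    "office": {"place_a": 25, "place_b": 40, "place_c": 55},
    "school": {"place_a": 35, "place_b": 20, "place_c": 15},
}
limits = {"office": 45, "school": 30}
shared = places_for_household(times, limits)
```

Only places that satisfy every commuter survive the intersection, so adding a destination can only shrink the highlighted area.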
But Mapumental is not just about property: swap out that Zoopla layer, and you could put in anything else you can imagine – hospital locations, supermarkets, schools, job vacancies… you name it.
The beauty of Mapumental is that now we’ve done the really hard part, incorporating new data layers is relatively simple. Recent work for the Fire Protection Association and the Welsh Government, among others, has shown its versatility.
Now how about you?
We believe that Mapumental’s possibilities are pretty much endless. Have you got an unloved, difficult-to-navigate dataset that Mapumental could breathe new life into? Or would your stakeholders benefit from being able to see your data displayed on a map? Let us know.
I am Duncan Parkes, a developer for mySociety, a non-profit full of web geeks. One of the things we try to do well here is to take complicated data and turn it into really usable tools – tools which are attractive to people who aren’t web (or data) geeks.
For some considerable time I’ve been working on Mapumental – a project that is about turning public transport timetable data into pretty, interactive maps featuring isochrones, shapes that show people where they can live if they want to have a commute of a particular time. You can play with the new version we just launched here. That particular map shows the commuting options to where the Queen lives. Slide the slider for full effect.
There are a couple of hard problems that need solving if you want to build a service with an interactive journey times overlay like this. You need to be able to calculate a *huge* number of journeys extremely quickly, and you need to be able to make custom map layers so that it all looks nice. But what I think might be most interesting for you is the way in which the contours get rendered on top of the maps.
It all started about three years ago, when the first version of the app – co-developed with the geniuses at Stamen – used Flash/Flex to draw contours on the maps, and to let people play with them. You can still play with a couple of versions of that technology from way back in 2007, that is, unless you’re using an iPad or iPhone, which of course don’t do Flash.
What was going on inside this Flash app was as follows. We needed to show the user any one of hundreds of different combinations of journey times (5 minutes, 12 minutes, 56 minutes, etc) depending on where they set the slider. Sending each one from the server as a tiled map overlay would be dead slow. Even Google – who have chosen to send new tiles each time – end up with a service which is surprisingly slow (try choosing a different time on this map).
With some help from Stamen, we decided that the way to show many different contours very quickly was to send the client just one set of tiles, where each tile contained all the data for a variety of journey times. What does that mean? Simple: each colour in the tile represented a different number of minutes travelling on the map. So a batch of pixels that are colour X all show places that are 15 minutes from the centre of the map.
So, in this old Flash system, when you slid the slider along, the Flash app made some of the coloured pixels opaque, and the others transparent. It was, in short, a form of colour cycling, familiar to lovers of 8- and 16-bit computer games.
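The colour-cycling trick can be sketched as a lookup table from palette entry to opacity. This is a minimal sketch under two assumptions that the text does not state: that palette entry i stands for a journey of i minutes, and that places within the chosen time are the ones shown. The function name is hypothetical.

```python
def alpha_table(slider_minutes: int, palette_size: int = 255) -> bytes:
    """Return one alpha byte per palette entry for a given slider position.

    Entries at or below the slider value are fully opaque (255);
    everything further away is fully transparent (0).
    """
    return bytes(
        255 if minutes <= slider_minutes else 0
        for minutes in range(palette_size)
    )


table = alpha_table(15)
```

Moving the slider never touches the pixel data; only this small table changes, which is what makes the effect fast.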
However, from about 2010 onwards, the march of iOS spelt the end of Flash. And that meant that we couldn’t launch a shiny new site based on this technology, as lovely as it was. We had to work out some approach that would use modern web standards instead.
The Death of Flash Makes Life Difficult – for a while
How do we replicate the experience of dragging a slider and seeing the map change like in the original Mapumental demo, but without Flash? One of the things that made the original Mapumental nice to use was how smooth the image changes were when you dragged the slider. Speed really matters to create that sort of organic effect that makes the demo so mesmerising.
So we started to tackle the question “How do we make this work in a post-Flash world?” Our first thought was “Let’s do away with those map tiles, filled with all that journey time data!” After all – why send any tiles to a modern browser, if it can just render nice shapes on the fly?
So we had a go. Several goes. At first we tried rendering SVG circles around each public transport stop – but that was too slow, particularly when zoomed out. Then we tried rendering circles in Canvas, and whilst that was OK in sparsely populated places it sucked in the cities, where people would actually want to use it.
Back to Colour Cycling – Using Web Standards
So, I had a bit of a look at the waterfall. It seems to work by holding in memory a structure which has all the pixels which change and all the colours they should change to and when. This works beautifully for the waterfall picture, but only a limited number of the pixels in that image actually change colour, and the image is quite small. For a full screen web browser with a big map in, this didn’t seem promising, although I’d love to see someone try.
Unfortunately, there is no way to change the palette of an image that you’ve put on the canvas. In fact, there’s no way to change the palette of an HTML img element: all you can do is assign it a new src attribute.
But this gets back to the original problem – we don’t want to download new mapping for every different position on the time slider. We definitely can’t afford to have the client downloading a new image source for every tile whenever the slider is moved, so we had to find a way to make that src at the client end and get that into the src attribute.
The Breakthrough – Data URIs and Base64 encoding
So we started trying data URIs. For those of you not familiar, these allow you to put a whole object into your HTML or CSS, encoded in Base64. They’re commonly used to prevent pages having to make extra downloads for things like tiny icons.
My new plan was that the client, having downloaded each palette-based image, would make a Base64 encoded version of it, which it could then use to build a version with the right palette and assign this as a data URI of the tile.
So in summary, what we built does this:
- The server calculates the journey times and renders them to palette-based tiles.
- It sends these to the client, encoded in Base64, and with the initial bits up to the palette and transparency chunks removed.
- At the client end, we have a pre-prepared array of 255 ‘starts’ of PNGs that we combine with the later parts of the ‘tiles’ from the tile server to make data URIs.
- When you drag the slider it combines the appropriate ‘start’ of a PNG with the bulk of the tile that has been downloaded from the server, and assigns that to the src attribute of the tile.
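The steps above can be sketched roughly as follows. The real client runs in the browser, so this Python version only illustrates the string handling; the function names are hypothetical, the 1077-byte start length is an assumption (based on a 255-colour palette), and the sample bytes are stand-ins rather than a real PNG.

```python
import base64

START_LEN = 1077  # assumed length in bytes of the stripped 'start'; a multiple of 3
PREFIX = "data:image/png;base64,"


def tail_for_client(tile_png: bytes) -> str:
    """Server side: drop the start of the tile and Base64-encode the rest."""
    return base64.b64encode(tile_png[START_LEN:]).decode("ascii")


def encode_start(start: bytes) -> str:
    """Client side, done once: Base64-encode one pre-prepared start."""
    if len(start) != START_LEN:
        raise ValueError("every start must be exactly START_LEN bytes")
    return base64.b64encode(start).decode("ascii")


def tile_src(start_b64: str, tail_b64: str) -> str:
    """Client side, on every slider move: plain string concatenation."""
    return PREFIX + start_b64 + tail_b64


# Stand-in bytes, not a real PNG: enough to show the splice is lossless.
server_tile = bytes(range(256)) * 5
tail_b64 = tail_for_client(server_tile)
starts = [encode_start(bytes([position]) * START_LEN) for position in range(3)]
src = tile_src(starts[2], tail_b64)
```

Because the tail is downloaded once and each slider move is only a string concatenation, no further requests to the tile server are needed.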
And that’s how the nice overlays on Mapumental work. But as so often in coding, the really interesting devil is in the detail – read on if you’re interested.
Diving into Base64 and the PNG file format – The Gnarly Bits
So – why are there 255 of these ‘starts’ of these PNGs, and what do I mean by a ‘start’ anyway?
PNG files are divided up into an 8 byte signature (the same for every PNG file) and a number of chunks, where each chunk consists of 4 bytes to tell you its length, 4 bytes of its name, some data, and 4 bytes of cyclic redundancy check. In this case, what I call a ‘start’ of a PNG is the 8 byte signature, the 25 bytes of the IHDR chunk, and the PLTE (palette) and tRNS (transparency) chunks. The PLTE chunk has 12 bytes of overhead and 3 bytes per colour, and the tRNS chunk has 12 bytes of overhead and 1 byte per colour.
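To make the chunk layout concrete, here is a small sketch that assembles a ‘start’ by hand. The helper names, the 256 by 256 tile size and the example palette are assumptions for illustration; the chunk format itself (length, name, data, CRC) is as described above.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the same 8 bytes open every PNG file


def png_chunk(name: bytes, data: bytes) -> bytes:
    """Build one chunk: 4-byte length, 4-byte name, data, 4-byte CRC of name and data."""
    crc = zlib.crc32(name + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + name + data + struct.pack(">I", crc)


def png_start(width, height, palette, alphas) -> bytes:
    """Signature + IHDR + PLTE + tRNS for a palette-based (colour type 3) image."""
    # IHDR data is 13 bytes: width, height, bit depth 8, colour type 3,
    # then compression, filter and interlace methods, all 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 3, 0, 0, 0)
    plte = bytes(channel for rgb in palette for channel in rgb)
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"PLTE", plte)
        + png_chunk(b"tRNS", bytes(alphas))
    )


# An arbitrary 255-colour gradient, every entry opaque.
palette = [(i, 0, 255 - i) for i in range(255)]
start = png_start(256, 256, palette, [255] * 255)
```

Swapping the `alphas` list is all it takes to produce a different start for the same tile.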
Base64 encoding is a way of representing binary data in text so that it can be used in places where you would normally expect text – like URIs. Without going into too much detail, it turns groups of 3 bytes of binary gumpf into 4 bytes of normal ASCII text without control characters in it, which can then be put into a URI.
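A quick sketch using Python's standard `base64` module shows the 3-bytes-to-4-characters grouping, and the `=` padding that appears when the input is not a multiple of 3 bytes. The helper `b64_length` is a hypothetical name introduced for this example.

```python
import base64


def b64_length(n_bytes: int) -> int:
    """Length of the Base64 text for n_bytes of input, padding included."""
    return 4 * ((n_bytes + 2) // 3)


three = base64.b64encode(b"\x89PN")        # exactly one group of 3 bytes
six = base64.b64encode(b"\x89PNG\r\n")     # two complete groups
four = base64.b64encode(b"\x89PNG")        # one group plus one leftover byte
```

Here `three` is 4 characters and `six` is 8, with no padding, while `four` is also 8 characters but ends in `==` because the last group is incomplete.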
Why do we have 255 colours, rather than the maximum 256 which are available in a palette? Because we need the break between the end of the tRNS chunk and the start of the IDAT chunk in the PNG file to align with a break between groups of three bytes in the Base64 encoded image. We need the length of these starts to be a multiple of 3 bytes in the original PNG format, which translates into a multiple of 4 bytes in the Base64 encoded version, so we can cut and shut the images without corruption.
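The arithmetic can be checked directly from the sizes given earlier: 8 bytes of signature, 25 bytes of IHDR, 12 + 3n bytes of PLTE and 12 + n bytes of tRNS, which is 57 + 4n bytes for n colours. The sketch below uses hypothetical helper names and arbitrary zero bytes in place of real PNG data.

```python
import base64


def start_length(colours: int) -> int:
    """Bytes in a 'start': signature + IHDR + PLTE + tRNS."""
    signature, ihdr = 8, 25
    plte = 12 + 3 * colours
    trns = 12 + colours
    return signature + ihdr + plte + trns


def splices_cleanly(head: bytes, tail: bytes) -> bool:
    """True if encoding the two parts separately gives the same text as encoding them together."""
    joined = base64.b64encode(head) + base64.b64encode(tail)
    return joined == base64.b64encode(head + tail)


with_255 = start_length(255)  # 1077 bytes, a multiple of 3
with_256 = start_length(256)  # 1081 bytes, leaves a remainder of 1
aligned = splices_cleanly(bytes(with_255), b"IDAT")
misaligned = splices_cleanly(bytes(with_256), b"IDAT")
```

With 256 colours the encoded start would end in `==` padding, so gluing a tile's tail onto it would no longer decode to a valid PNG.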
Which just goes to show that even though web GIS technologies may feel like they are approaching a zenith of high level abstraction, there’s still some really gnarly work to be done to get the best out of current browsers.
If you’ve been following mySociety for a while, you’ll know that we have been interested in making maps that show commuting times for several years.
However, we’ve never made public a simple, free, useful version of our slidy-swooshy Mapumental journey times technology. Until today.
Today we pull the wraps off Mapumental Property, a house-hunting service covering England, Scotland and Wales, designed to help you work out where you might live if you want a public transport commute of a particular maximum duration. Have a go, and we guarantee you’ll find it an oddly compelling experience.
We think it’s a genuinely useful tool – especially since unlike some of the other players in this space, we’ve got all the different kinds of public transport, right across the whole of Great Britain. We hope that some of you will find it helpful when deciding where to live.
However, this launch doesn’t mean mySociety is bent on taking over the property websites sector. Mapumental Property isn’t a challenger to the likes of Rightmove, it’s a calling card – an advertisement for our skills – which we hope will help mySociety to attract people and organisations who want beautiful, useful web tools built for them.
In particular we’d like people interested in Mapumental to note that:
- We like to build attractive, usable web tools for clients of all kinds.
- We know how to use complex data to make simple, lovely things.
- We can do some mapping technology that others haven’t worked out yet.
If any of that is of interest, please get in touch, or read about our software development and consulting services.
I’d like to thank quite a few people for helping with this launch. Duncan Parkes was the lead developer, Matthew Somerville ably assisted. Jedidiah Broadbent did the design. The idea originally came from the late Chris Lightfoot, and me, Tom Steinberg. Francis Irving built the first version, and Stamen came up with the awesome idea of using sliders in the first place (and built some early tech). Kristina Glushkova worked on business development, and Zoopla’s API provides the property data. I’m also grateful to Ed Parsons of Google for very kindly giving us a hat tip when they built some technology that was inspired by Mapumental. Thanks to everyone – this has been a long time coming.
We’ll follow up soon with a post about the technology – and in particular how we got away from using Flash. It has been an interesting journey.
After a great deal of hard work we are pleased to announce that Mapumental Property will be launching on the 8th of November 2012.
Mapumental Property uses public transport open data from across the country to show you areas where you can live that are an acceptable commute from your office, school or other destination. We have data on buses, trains, trams, tubes – so we look into all the combinations that might help you get to work quickly.
We’ve built the site to solve a problem that these other big sites don’t quite get right – commuting. Nobody likes to commute a minute further than strictly necessary. But in a world of complex public transport networks, especially in our big cities, it can be highly unclear where you might be able to live and still get to work in 30 minutes. Mapumental Property will help, and it works anywhere in Britain. It works in Aberdeen just as well as Shoreditch.
From next week people in Britain can easily see areas that are less than a specified amount of time away from a place of work, study or other importance, by public transport. So if you’ve ever thought “I wish I could see a map of everywhere less than half an hour’s commute from this office”, this is your answer.
Look for more updates next week!
It’s high time we updated you on Mapumental, our journey-time mapping project. For those who may not remember, Mapumental is based on a simple idea: to visualise transit times, by public transport, from or to any postcode in Great Britain.
It all began in 2006, when the Department for Transport approached us to see what we might do with public transport data; in 2009 we won an investment loan from Channel 4 and Screen West Midlands which enabled us to build a beta tool – you might have played with it. If not, go on, have a go. It’s fun!
It’s been quite a long journey to where we are today. Unlike many mySociety projects, funding for Mapumental’s development came from a commercial investment loan, with a condition that we set it up as a business. For that reason, it’s not enough that it’s beautiful and useful – we need to find ways for it to be profitable, too. All revenues are set to come back to fund our not-for-profit activities.
We could tell from very early on in the project that Mapumental would be a sought-after tool for all sorts of purposes, from business to personal use. For example, you can see commute times at a glance, so it’s great for house-hunters and job-seekers. Consequently, it’s also great for the property and recruitment industries.
“Your maps look amazing, such a great way of representing what could be really boring data, but isn’t.” – A jobseeker
We can see loads of other possibilities too – like urban planning. This sort of analysis would have been far more expensive in the past; with Mapumental, planners can see at a glance how accessible a new development would be by public transport. Its potential uses are wide-ranging, answering questions for businesses, organisations, charities, and public facilities – especially those wanting to maximise accessibility or encourage use of greener transport options.
“The maps are a fantastic, a great tool and should be used for every planning application. I will be using Mapumental for all of our projects!” – Lee Taylor, Veridis Design
We’ve recently refined a product that’s pared down from the dynamic maps you may remember from that beta tool: static maps. These are simple, non-interactive maps which show transit time in bands. They’re flexible in that they can be generated for any postcode, with any maximum travel time, and depict travel at any given time of day.
We can provide a one-off map for personal use, or batches of many thousands of maps – as we have done for estate agents Foxtons, who now have a Mapumental map on every property listing.
As we generate more and more maps for different uses, showing different parts of the country, we’re really enjoying digging out all sorts of surprising facts – like how it’s quicker to travel from Watford to Westminster than it is from some parts of Harringay. Or how Cardiff University students might sensibly live at all points east as far as Newport, but will be stymied for transport in the west if they live anywhere other than Barry or Bridgend.
In fact, our very favourite use so far has come from an individual who centred his map around his home postcode. He tells us he has printed it off and put it up by the front door, so that on his way out of the house, he can find a new and surprising destination for day-trips.
Find out more on the Mapumental website – and please do spread the word among friends and colleagues who might benefit from a Mapumental map.
We released our new service yesterday, which allows anyone to order personalised travel commuter maps for any location in Great Britain. Those of you who’ve followed this project for a while might be interested to know how we came to take this route.
Having finished working on the backend and hosting infrastructure of the Mapumental technology last year, we started thinking about the products that should be built with it. To help us work this out, we talked to lots of people in sectors where journey times matter a lot: residential and commercial property, job search, tourism and public services. What we found is that while everyone loved the dynamic location search technology, there were many situations when people wanted to have a simple static map of commuting times.
We heard that these maps would be useful to individuals looking for jobs or property – but also organisations, from property sites to providers of public services, businesses and entertainment venues who’d like a map to put on their website and brochures, or to use in internal analysis.
At first we were surprised, but the more we thought about it, the more sense it made. Our search tool, which we are currently working on updating, serves a different purpose: it shows a combination of search criteria, including travel times, and lets the user play with different parameters interactively. But it did not provide a simple snapshot of travel times for a location, divided into bands, which are very helpful in assessing commuting times. So we set out to make the map image service, which is what we launched yesterday.
This was not particularly straightforward to make, and there were many things to consider: how exactly should the shop work, and what should it offer people? We have settled on four core options for the standard maps: total time mapped, direction of travel (whether the location is where one arrives at, or departs from), arrival or departure time, and custom map title. These maps are really easy to order from the website, and we can make them very quickly.
Online ordering works really well for small quantities, but is not ideal for high-volume clients. So we also created a new API – a URL fetcher which allows maps to be created in high quantities, as and when needed. These maps can be fully customised, from the choice of colours to number of bands and zoom levels.
The very first user of our API is Foxtons, the estate agent, who added commuter maps to their property listings last week. The API is suitable for any property, jobs or hotels site that holds location information (postcodes, or latitude and longitude) for its listings. It can equally be used by those needing maps for internal purposes, such as city planners, public services and businesses with multiple branches.
We are really excited that the service has gone live, and we hope that it helps people and organisations in all sorts of ways. A big thank you to Channel 4 and Screen West Midlands, who have provided the commercial investment to enable the development of Mapumental technology and the new service.
If you have any feedback or comments, we’d love to hear them.
Sample map: travel times to Wembley Stadium
We’re delighted to announce that leading London estate agent Foxtons has become the first property player to use Mapumental maps on its website. Visitors to Foxtons.co.uk will now see that every property listed includes a travel time map, highlighted in Foxtons’ brand colours.
Foxtons, whose website just won an award for Best Interface Design at the 2011 International Business Awards, were quick to see the value of travel time maps for house-hunters. Thousands of listings now display a simple, beautiful map showing how long a commute to work or visit to friends will take on public transport – vital pieces of information to consider when looking for a new home.
The property sector is not the only area of business that stands to benefit from Mapumental’s ground-breaking mapping technology. Mapumental is already talking to major players in the travel industry and recruitment sectors. Virtually any business that needs to show users how much time it takes to travel to or from a given spot will find these maps very valuable.
One of Mapumental’s core strengths is its flexibility when it comes to volume – it can provide anything from a single map at a great price to tens of thousands at a significant volume discount.
The service utilises travel-time mapping technology developed by mySociety, drawing journey data from the NPTDR dataset. The same data also drives mySociety’s newest project FixMyTransport.com, which launched just last week, and covers all modes of public transport within GB.
For the maps service, our algorithm calculates journey times from any given point (postcode or latitude and longitude) to every other point in Great Britain. These journey times are displayed as a heatmap, on a background from OpenStreetMap.
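Turning a journey time into a coloured band can be sketched as below. This is an illustration only, not the service's real code: the function name is hypothetical, and equal-width bands are an assumption.

```python
from typing import Optional


def time_band(minutes: float, max_minutes: int, bands: int) -> Optional[int]:
    """Return the 0-based band for a journey time, or None if it is beyond the maximum."""
    if minutes < 0 or minutes > max_minutes:
        return None  # outside the mapped area, so left uncoloured
    band_width = max_minutes / bands
    # A journey of exactly max_minutes belongs to the last band.
    return min(int(minutes // band_width), bands - 1)


# A 60-minute map split into 4 bands of 15 minutes each.
nearby = time_band(14.9, 60, 4)
boundary = time_band(15, 60, 4)
furthest = time_band(60, 60, 4)
too_far = time_band(61, 60, 4)
```

Each band index would then be mapped to one colour from the chosen colour scheme.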
Foxtons has made use of the new Mapumental API which enables clients to define the maps’ appearance precisely according to their company preferences. Parameters for choice include:
- maximum travel time
- number of time bands to show
- colour scheme
- the direction of travel (to or from the chosen location)
- target arrival or departure time
- other information (such as title and legend) that goes on the map.
The image is then automatically created and can be published on a website and/or included in printed materials. Website owners can publish the maps themselves, or we can create bespoke integration solutions for them.
To find out more about how Mapumental might work for you, please drop us a line.
Here are some samples of our maps:
Travel times from a residential development in Sevenoaks, departing at 7am
Travel times from St Pancras Renaissance Hotel, departing at 8am
Travel times to reach Cardiff University by 10am
I’ve been doing lots of research around “cloud computing” recently, so we can change how Mapumental works and take it out of private beta.
One thing that’s struck me is that there doesn’t seem to be a proper, industry standard name to distinguish what to me are two fundamentally different sorts of “cloud computing”. I’m focusing here entirely on cloud services for programmers (let’s leave what it means to end users or businesses for another day).
Here are my own names and descriptions of them:
1) Cloud hardware server provision (Cloud HSP)
Low level APIs for making and destroying (virtual) servers, and loading machine images onto them. e.g. Amazon Elastic Compute Cloud, Rackspace Cloud Servers, Eucalyptus’s EC2 bits. Basically, what Eucalyptus v 1.5 can do and what libcloud should do. (By analogy, this is the assembly language of cloud computing)
2) Cloud developer service provision (Cloud DSP)
A service that a developer accesses with one name and a simple API, and behind the scenes it scales for him, automatically. e.g. Amazon Queue Service, Rackspace Cloud Files. (By analogy, this layer is the C programming language of cloud computing)
[as an aside, Google AppEngine is an interesting one. It is definitely in the Cloud DSP category, but I think it is larger than that - it is a whole set of APIs all in that category. Something like Google DataStore is a single Cloud DSP, albeit one apparently only accessible within AppEngine apps]
It’s possible to use a Cloud HSP (assembly language), along with a bunch of your own software or open source software, to build new Cloud DSPs (C code). Right now this is pretty hard – even quite well known open source distributed databases like CouchDB still need scripting to even make them replicate. The code that makes and destroys servers and gives the service one name needs stringing together manually with quite new bits of wire (things like scalr and Wackamole).
For this reason, I’m reluctant for mySociety to get into the “making our own Cloud DSP out of Cloud HSP” game. It feels to me like a suck of time, and like we wouldn’t be able to guarantee without lots of careful and expensive testing that it would scale. I’m more tempted to use the commercial Cloud DSP services where possible, even though they are proprietary, but to use them via our own abstraction layer, so we can change as we need to. Of course, we have some C++ code (the public transport route finder), so will have to use the Cloud HSP API to get that going, perhaps with Amazon’s Auto Scaling. But it can jolly well use Amazon’s queue service (SQS) and S3 to talk to other services.
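To make the “own abstraction layer” idea concrete, here is a minimal sketch in Python. It is an illustration only, not code we actually run: the names `MessageQueue`, `InMemoryQueue` and `request_route` are made up for this example. The application talks to a small interface, and each commercial service would get its own class implementing the same two methods.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional


class MessageQueue(ABC):
    """Hypothetical interface that application code depends on."""

    @abstractmethod
    def send(self, body: str) -> None:
        """Put one message on the queue."""

    @abstractmethod
    def receive(self) -> Optional[str]:
        """Take the oldest message off the queue, or return None if empty."""


class InMemoryQueue(MessageQueue):
    """Stand-in backend for tests. A backend for a hosted queue service
    would implement the same two methods using that service's client."""

    def __init__(self) -> None:
        self._items = deque()

    def send(self, body: str) -> None:
        self._items.append(body)

    def receive(self) -> Optional[str]:
        return self._items.popleft() if self._items else None


def request_route(queue: MessageQueue, postcode: str) -> None:
    """Application code only ever sees the MessageQueue interface."""
    queue.send(postcode)
```

Swapping providers then means writing one new subclass rather than touching every caller, which is the whole point of going via a layer of our own.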
So, what do you think about the names Cloud HSP/DSP? Are there already existing names for the distinction that I’m making? Is it a useful distinction for you? Can you think of better names?
Here is a diagram of how the backend of Mapumental works. Take it in the spirit that Chris Lightfoot set when he made a similar diagram for the No. 10 petitions site – although many such diagrams are useless, hopefully this one contains useful information.
(Click on the diagram for a large version)
Below, I’ve explained what the main components are, and some interesting things about them.
Everything can, at least in theory, run on lots of servers. Currently we are only actually using one server for web requests, because of problems with HAProxy. We’re running isodaemons on two different servers.
Basic web application – it started out as raw Python, but the more Matthew hacks on it the more Django libraries he pulls in. Soon it’ll be indistinguishable from a Django app. When someone enters a new postcode, it adds it to the work queue in the PostgreSQL database, then refreshes waiting for the job to be finished. Then it displays the flash application (made by Stamen), set up to load the appropriate tile layers.
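The queue-then-poll pattern described above can be sketched roughly as follows. This is not the real schema: the `job` table, its columns and the function names are assumptions for illustration, and the sketch uses Python’s built-in `sqlite3` instead of PostgreSQL so that it runs on its own.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical job table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE job ("
        " id INTEGER PRIMARY KEY,"
        " postcode TEXT UNIQUE,"
        " state TEXT NOT NULL DEFAULT 'queued')"
    )
    return db


def enqueue(db, postcode):
    """Web side: add a postcode to the queue unless it is already there."""
    db.execute("INSERT OR IGNORE INTO job (postcode) VALUES (?)", (postcode,))
    db.commit()


def claim_next(db):
    """Worker side: take the oldest queued job, or return None."""
    row = db.execute(
        "SELECT id, postcode FROM job WHERE state = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute("UPDATE job SET state = 'working' WHERE id = ?", (row[0],))
    db.commit()
    return row[1]


def finish(db, postcode):
    """Worker side: mark the job as done once the results are saved."""
    db.execute("UPDATE job SET state = 'done' WHERE postcode = ?", (postcode,))
    db.commit()


def is_done(db, postcode):
    """Web side: what the refreshing page checks on each reload."""
    row = db.execute(
        "SELECT state FROM job WHERE postcode = ?", (postcode,)
    ).fetchone()
    return row is not None and row[0] == "done"
```

With several workers on different servers, `claim_next` as written could hand the same job to two of them; in PostgreSQL the select would need row locking (for example `SELECT ... FOR UPDATE`) to avoid that.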
Tile server and cache – This uses the Python-based TileCache, calling the Geospatial Data Abstraction Library (GDAL) to help render the tiles from points. It was originally written by Stamen, and expanded by mySociety. GDAL isn’t perfect: it doesn’t have fancy enough algorithms for my liking, e.g. using a median rather than a weighted mean.
Isodaemons – These are controlled by a Python script, but the bulk of the code is custom written in C++. Slightly crazily, this can find the quickest route by public transport for each of 300,000 journeys from every station in the UK to a particular station, arriving at a particular time, in 10 to 30 seconds.
I had no idea how to do this, but luckily I live in Cambridge, UK. It’s a city fit to bursting with computer scientists. Many of the jobs are dull, and need little computing, never mind science – like writing interface layers for SQL Server. So if you have a really interesting problem it’s easy to get help!
The universal advice was to use Dijkstra’s algorithm, which needed a bit of adaptation to work efficiently over space-time, rather than just space. Normally it is used for planning routes round a map, but public transport isn’t like that: you have to arrive in time for each particular train, so time affects what journeys you can take.
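To give a flavour of what “Dijkstra over space-time” means, here is a toy sketch in Python. It is not the production C++ code, and the data layout is an assumption: each timetable entry is a tuple `(origin, depart, dest, arrive)` with times in minutes since midnight. It searches backwards from the destination, and for every station works out the latest moment you could leave and still arrive by the deadline. It ignores interchange times and walking, and assumes no service arrives before it departs.

```python
import heapq
from collections import defaultdict


def latest_departures(connections, target, deadline):
    """Return {station: latest time you can leave it and still reach
    `target` by `deadline`}. Stations that cannot make it are absent."""
    # Index services by where they arrive, because we search backwards.
    into = defaultdict(list)
    for origin, depart, dest, arrive in connections:
        into[dest].append((origin, depart, arrive))

    latest = {target: deadline}
    heap = [(-deadline, target)]  # max-heap on time, via negated keys
    settled = set()
    while heap:
        neg_time, station = heapq.heappop(heap)
        if station in settled:
            continue  # stale heap entry
        settled.add(station)
        must_be_here_by = -neg_time
        for origin, depart, arrive in into[station]:
            # A service only helps if it gets here in time for the onward
            # journey, and if it leaves later than the best found so far.
            if arrive <= must_be_here_by and depart > latest.get(origin, float("-inf")):
                latest[origin] = depart
                heapq.heappush(heap, (-depart, origin))
    return latest


timetable = [
    ("A", 540, "B", 560),
    ("A", 550, "B", 570),
    ("B", 565, "C", 590),
    ("B", 575, "C", 600),
    ("A", 500, "C", 595),
]
by_ten = latest_departures(timetable, "C", 600)      # arrive by 10:00
by_five_to = latest_departures(timetable, "C", 595)  # arrive by 09:55
```

Moving the deadline five minutes earlier changes which trains are usable, so the answer for station A drops from 550 to 540. The travel time that ends up on the map would then be the deadline minus the latest departure.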
I originally wrote it in Python, which was not only too slow, but used up far far too much RAM. It could never have loaded the whole dataset in. However, the old Python code is still run by the test script, to double check the C++ code against. It is also still used to make the binary timetable files, see below.
Travel times, 1 binary file / postcode – I briefly attempted to insert 300,000 rows into PostgreSQL for each postcode looked up, but it was obvious it wasn’t going to scale. Going back to basics, it now just saves the time taken to travel to each station in a simple binary file – two bytes for each station, 600k in total. The tile server then does random access lookups into that file, as it renders each tile. It only needs to look up the values for the stations it knows are on/near the tile.
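Here is a rough sketch of that file layout in Python. The details are guesses for illustration: I assume unsigned 16-bit little-endian values, a travel time in minutes, and `0xFFFF` as a marker for “can’t get there”; the function names are made up too. It uses an in-memory buffer so it runs anywhere, but the same calls work on a file opened in binary mode.

```python
import io
import struct

RECORD = struct.Struct("<H")  # assumed: one unsigned 16-bit value per station
UNREACHABLE = 0xFFFF          # assumed sentinel for "no route found"


def write_times(f, times):
    """Write one two-byte record per station, in station-index order."""
    for t in times:
        f.write(RECORD.pack(UNREACHABLE if t is None else t))


def read_time(f, station_index):
    """Random-access lookup: seek straight to the station's record."""
    f.seek(station_index * RECORD.size)
    (value,) = RECORD.unpack(f.read(RECORD.size))
    return None if value == UNREACHABLE else value


buf = io.BytesIO()
write_times(buf, [0, 17, None, 95])  # four stations, one unreachable
```

At two bytes per record, 300,000 stations come to 600,000 bytes, which is where the 600k figure comes from. Because every record is the same size, the tile server never has to read the whole file, only the handful of offsets for stations on or near the tile.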
There are various other bits:
- cron jobs for sending out invites
- converting timetable data from ATCO-CIF to the binary format
- loading static layer data into the database
- precaching every tile for static datasets
- Squid, Apache and FastCGI all sit in front of the web applications
- for speed, we cache the mapping background tiles from Cloudmade
- when zoomed out, there is code to cull which stations are used to draw tiles
- of course, a bunch of test code
Thanks to everyone who helped make Mapumental. We couldn’t have done it without lots of clever people.
I realise the above is a sketchy overview, so please ask questions in the comments, and I’ll do my best to answer them.