mySociety would never have been able to make Mapumental in the way we did if it wasn’t for the help of San Francisco-based geovisualisation gurus Stamen. They came up with the brilliant idea of sliders instead of static contour lines, they built the Flash front end, and, crucially, they helped make sure all the contours had just the right degree of splodginess for a satisfyingly splodgy user experience.
We’ve been hinting for a while about a secret project that we’re working on, and today I’m pleased to be able to take the wraps off Mapumental. It’s currently in Private Beta but invites are starting to flow out.
Built with support from Channel 4’s 4IP programme, Mapumental is the culmination of an ambition mySociety has had for some time – to take the nation’s bus, train, tram, tube and boat timetables and turn them into a service that does vastly more than imagined by traditional journey planners.
In its first iteration it’s specially tuned to help you work out where else you might live if you want an easy commute to work.
Francis Irving, the genius who made it all work, will post soon on the immense technical challenge he overcame. My thanks go massively to him; to Stamen, for their lovely UI; and to Matthew, for being brilliant as always.
Words don’t really do Mapumental justice, so please just watch the video 🙂 Update: Now available here in HD too
Also new: We’ve just set up a TheyWorkForYou Patrons pledge to help support the growth and improvement of that site. I can neither confirm nor deny that pledgees might get invites more quickly than otherwise 😉
See also: the main travel-time maps report.
Our newly released travel time maps are currently shooting round the internet. It was great fun making them, and you might like to have a go too – there are plenty of public datasets you could overlay on the same base maps, using the same Flash app (source code). There are a few notes about how we made them on the page itself, and the associated real time page. For a far more interesting view of the development process, read Tom Carden from Stamen’s account.
The most interesting blog post I’ve seen to come from this is Whitehall staff have no life by Simon Dickson, who was inspired by the maps to think about the destruction of social capital caused by commuting. “Whitehall staff on all but the highest salaries can’t expect to live anywhere near their work, and hence can’t expect to have any kind of a social (capital) life.”
You may remember that back in 2006 mySociety published some maps showing how long it took to commute to places via public transport.
We’ve just made some more which have some lovely new features we reckon you’ll probably like a lot.
If you’d like to see more maps like this in your area, please ask your local transport authority to get in touch with us, or nudge these people 🙂
PS As always, Francis Irving remains a genius.
The next mySociety Disruptive Tech Talk is a week today at 7.30pm at the London Knowledge Lab on Emerald St.
This time we have Steve Coast, founder of Open Street Map. When Open Street Map started a few years ago, I thought it would never take off. Earlier this year I accidentally went to their conference in Manchester, and was blown away. There’s a whole community of active people, collaboratively building a vector map of not just the whole country, but the whole world. And it is very usable now – for example, the map of my home town of Cambridge is of extremely high quality.
If you’re interested in mapping, or in how to organise communities that disrupt with technology, then come along. But please sign up as the last event was full to capacity! It’s free.
So, we’ve been a bit quiet on this blog, but naturally busy. I just did my invoice and timesheet for last month, and remembered how bitty it has been. In one day I often do things to 3 websites, and that is just CVS commit messages – no doubt I handled emails for more. This makes it quite hard to summarise what has been happening, and also quite hard to measure how much time we spend maintaining each website.
We’ve recently made a London version of PledgeBank, which I’ll remind Tom to explain about on the main news blog. It is a PledgeBank “microsite”, with a special query for the front page and all pledges page that shows only pledges in Greater London, which is conveniently almost exactly a circle of radius 25km with centre at 51.5N, -0.1166667E. I worked that out by dividing the area (found on the Greater London Wikipedia page) by pi, taking the square root, and rounding up a bit.
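As a rough check of that arithmetic, here is a minimal Python sketch. The area figure of about 1,572 km² is an assumption (the commonly quoted figure for Greater London); it is not a number given in this post.

```python
import math

# Assumed figure: the commonly quoted area of Greater London, in km^2.
GREATER_LONDON_AREA_KM2 = 1572.0


def equivalent_radius_km(area_km2: float) -> float:
    """Radius of the circle that has the same area as the given region."""
    return math.sqrt(area_km2 / math.pi)


radius = equivalent_radius_km(GREATER_LONDON_AREA_KM2)
# Comes out a little over 22 km, which rounds up "a bit" to 25 km.
print(f"equivalent radius: {radius:.1f} km")
```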
Yesterday we launched a new call for proposals – head on over, and tell us your ideas for new civic websites. It is another WordPress modification, but this time to the very blog that you’re reading now. I made the form for submitting proposals anew. It creates a new WordPress low-privileged user by directly inserting into the database, and then calls the function wp_insert_post to create a post by them in a special category. The rest of the blogging software then trivially does comments, RSS, search, email alerts and archiving.
Meanwhile, Chris has written some monitoring software for our servers, to alert us to problems and potential problems. Perl modules do the tests, checking things like whether there is enough disk space and whether the web servers are up. I’ve been tweaking it a bit, for example adding a test to watch for long-running PostgreSQL queries which indicate a deadlock. We’ve got a problem in the PledgeBank SMS code which causes deadlocks sometimes, which we’re still debugging.
So, a silly post for today: Postcodeine. This is a British version of Ben Fry’s zipdecode, a “tool” for visualising the distribution of zipcodes in the United States. This is, as has been pointed out to me, wholly pointless, but it’s quite fun and writing it was an interesting exercise (it also taught me a little bit about AJAX, the web’s technology trend du jour). If you want the source code, it’s at the foot here; licence is the Affero GPL, as for all the other mySociety code.
(I should say, by the way, that I wrote this in my copious spare time. It’s copyright mySociety because I don’t have the right to use the postcode database myself.)
… or, “how near is ‘nearby’?”
On PledgeBank we offer search and local alert features which will tell users about pledges which have been set up near them, the idea being that if somebody’s organising a street party in the next street over, you might well want to hear about it, but if it’s somebody a thousand miles away, you probably don’t.
At the moment we do this by gathering location data from pledge creators (either using postcodes, or location names via Gaze), and comparing it to search / alert locations using a fixed distance threshold, presently 20km (or about 12 miles). This works moderately well, but leads to complaints from Londoners of the form “why have I been sent a pledge which is TEN MILES away from me?” The point is that, within London, people’s idea of how far away “nearby” things are is quite different from that of people who live in the countryside: Londoners mean one tube stop, or a few minutes’ walk, or whatever. If you live in the countryside, “nearby” might be the nearest village or even the nearest town.
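The fixed-threshold comparison can be sketched roughly as follows. This is an illustration in Python, not the PledgeBank code: the function names are made up, and the haversine formula on a spherical Earth is an assumption about how the distance might be computed.

```python
import math

EARTH_RADIUS_KM = 6371.0     # mean Earth radius; a spherical approximation
NEARBY_THRESHOLD_KM = 20.0   # the fixed threshold mentioned above


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_nearby(pledge, user, threshold_km=NEARBY_THRESHOLD_KM):
    """True if the pledge location lies within threshold_km of the user.

    Both arguments are (latitude, longitude) tuples.
    """
    return great_circle_km(*pledge, *user) <= threshold_km


# About 11 km apart, so this counts as "nearby" under a 20 km threshold,
# which is exactly the sort of result a Londoner might complain about.
print(is_nearby((51.5, -0.12), (51.6, -0.12)))
```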
So, ages ago we decided that the solution to this was to find some population density data and use it to produce an estimate for what is “nearby”, defined as, “the radius around a point which contains at least N people”. That should capture the difference between rural areas and small and large towns.
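One way to picture that definition is the sketch below. It is only an illustration under stated assumptions: the population data is given as a plain list of (latitude, longitude, population) tuples, each cell is treated as a point at its centre, and the function names and the 500 km cap are invented for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_radius_km(lat, lon, cells, n_people, max_radius_km=500.0):
    """Smallest radius around (lat, lon) whose cells hold at least n_people.

    cells is an iterable of (cell_lat, cell_lon, population) tuples.
    Returns max_radius_km if the target is not reached within that distance.
    """
    by_distance = sorted(
        (great_circle_km(lat, lon, clat, clon), pop)
        for clat, clon, pop in cells
    )
    total = 0.0
    for distance, pop in by_distance:
        if distance > max_radius_km:
            break
        total += pop
        if total >= n_people:
            return distance
    return max_radius_km
```

In a dense city the running total reaches N after only a few close cells, so the radius is small; in the countryside the loop has to reach much further out, which is the behaviour wanted here.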
(In fairness, the London issue could be solved by having the code understand north vs south of the river as a special case, and never showing North-Londoners pledges in South London. But that’s just nasty.)
Unfortunately the better solution requires finding population density data for the whole world, which is troublesome. There seem to be two widely-used datasets with global coverage: NASA SEDAC’s Gridded Population of the World, and Oak Ridge National Laboratory’s Landscan database. GPW is built from census data and information about the boundaries of each administrative unit for which the census data is recorded, and Landscan improves on this by using remote-sensing data such as the distribution of night-time lights, transport networks and so forth.
(Why, you might wonder, is Oak Ridge National Laboratory interested in such a thing? It is, apparently, “for estimating ambient populations at risk” from natural disasters and whatnot. That’s very worthy, but I can’t help but wonder whether the original motivation for this sort of work may have been a touch more sinister. But what do I know?)
Anyway, licence terms seem to mean that we can use the GPW data and we can’t use the Landscan data, which is a pity, since the GPW data is only really very good in its coverage of rich western countries which produce very detailed census returns on, e.g., a per-municipality basis. Where census returns are only available on the level of regions, the results are less good. Anyway, subject to that caveat, it seems to solve the problem. Here’s a map showing a selection of points, and the circles around them which contain about 200,000 people (that seems to be about the right value for N):
The API to access this will go into the Gaze interface, but it’s not live yet. I’ll document the RESTful API when it is.
One last note, which might be of use to people working with the GPW data in the future. GPW is a cell-based grid: each cell is a region lying between two lines of longitude and two lines of latitude, and within each cell three variables are defined: the population in the cell, the population density of the cell, and the land area of the cell. (This is one of those rare exceptions described in Alvy Ray Smith’s rant, A Pixel Is Not A Little Square….) But note that the land area is not the surface area of the cell, and the population density is not the population divided by the surface area of the cell!
This becomes important in the case of small islands; for instance (a case I hit debugging the code) the Scilly Isles. The quoted population density for the Scilly Isles is rather high: somewhere between 100 and 200 persons/km². But when integrating the population density to find the total population in an area, this is absolutely not the right value to use: the proper value there is the total population of a cell, divided by its total surface area. The reason is that when sampling from the grid to find the value of the integrand (the population density) you don’t know, a priori, whether the point you’re sampling at has landed on land or non-land, but the quoted population density assumes that you are always asking about the land. When integrating, the total population of each cell should be “smeared out” over the whole area of the cell. If you don’t do this then you will get very considerable overestimates of the population in regions which contain islands.
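To make the difference concrete, here is a minimal Python sketch. The cell size, population and land area are invented numbers for illustration, not values read from GPW, and the Earth is treated as a sphere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical approximation


def cell_surface_area_km2(lat_south, lat_north, lon_west, lon_east):
    """Surface area of a latitude/longitude cell on a spherical Earth."""
    dlon = math.radians(lon_east - lon_west)
    dsin = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * dlon * dsin


def smeared_density(population, lat_south, lat_north, lon_west, lon_east):
    """Population spread over the whole cell: the value to use when integrating."""
    area = cell_surface_area_km2(lat_south, lat_north, lon_west, lon_east)
    return population / area


# Invented example: a quarter-degree cell that is mostly sea.
population = 2000.0
land_area_km2 = 16.0
quoted = population / land_area_km2  # density over land only
smeared = smeared_density(population, 49.875, 50.125, -6.5, -6.25)
print(f"quoted: {quoted:.0f} persons/km^2, smeared: {smeared:.1f} persons/km^2")
```

Integrating the quoted figure over the full cell would count the same people many times over, which is where the overestimate for island regions comes from.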
A very quick post to announce the launch of a public interface to our Gaze web gazetteer service. The motivation behind Gaze is collecting location information from users without using maps (a clunky approach with poor accessibility and licensing problems) or postcodes (which do not have universal coverage and have privacy issues as well as licensing problems). Instead the idea is to use place names to identify locations, even in the presence of ambiguity, alternate names, etc. We do this by providing a search service over a large gazetteer (2.2 million places and 3 million names), and supplying additional contextual information to disambiguate common place names. The API is very simple, with one major function and two other supporting ones.
Anyway, without further ado, here is the API. Internally we use one based on RABX, but we’ve done a special “RESTful” API for everyone else. All requests should be HTTP GETs; all parameters must be in UTF-8; and all responses are in UTF-8 plain text or comma-separated values. All calls should be passed to the URL,
selecting a particular function by specifying the HTTP parameter f, for instance
Available functions are:
- IPv4 address of a host, in dotted-quad format
Guess the country of location of a host from its IP address. The result of this call will be an ISO country code, followed by a line feed; or, if it was not possible to determine a country, a line feed on its own.
- No parameters.
Return the list of countries for which the find_places call has a gazetteer available. The list is returned as a list of ISO country codes followed by line feeds.
- ISO country code of country in which to search for places
- state in which to search for places; presently this is only meaningful for country=US (United States), in which case it should be a conventional two-letter state code (AZ, CA, NY etc.); optional
- query term input by the user; must be at least two characters long
- largest number of results to return, from 1 to 100 inclusive; optional; default 10
- minimum match score of returned results, from 1 to 100 inclusive; optional; default 0
Returns a list of places in CSV format (as defined by this internet draft), with a one-line header, containing the following fields:
- name of the place described by this row
- blank, or the name of an administrative region in which this place lies (for instance, a county)
- blank, or a list of nearby places, separated by commas
- WGS-84 latitude of place in decimal degrees, north-positive
- WGS-84 longitude of place in decimal degrees, east-positive
- blank, or containing state code for US
- match score for this place, from 0 to 100 inclusive
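A client for the place search might look something like the following Python sketch. Several things here are assumptions rather than documented facts: the base URL is a placeholder, the parameter names query, maxresults and minscore are guesses (only f and country appear by name above), and the sample response, including its header names, is invented for illustration.

```python
import csv
import io
from urllib.parse import urlencode

# Placeholder only: substitute the real service URL.
BASE_URL = "https://gaze.example.org/rest"


def find_places_url(country, query, max_results=10, min_score=0):
    """Build the GET URL for a find_places call (parameter names partly assumed)."""
    if len(query) < 2:
        raise ValueError("query must be at least two characters long")
    if not 1 <= max_results <= 100:
        raise ValueError("max_results must be between 1 and 100")
    params = {
        "f": "find_places",
        "country": country,
        "query": query,             # assumed parameter name
        "maxresults": max_results,  # assumed parameter name
        "minscore": min_score,      # assumed parameter name
    }
    return BASE_URL + "?" + urlencode(params)  # urlencode emits UTF-8


def parse_places(csv_text):
    """Parse a CSV response with a one-line header into a list of dicts."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header line
    places = []
    for name, region, near, lat, lon, state, score in reader:
        places.append({
            "name": name,
            "in": region,
            "near": [p.strip() for p in near.split(",")] if near else [],
            "latitude": float(lat),
            "longitude": float(lon),
            "state": state,
            "score": int(score),
        })
    return places


# Invented sample response; the header names are assumptions.
SAMPLE_RESPONSE = (
    "Name,In,Near,Latitude,Longitude,State,Score\n"
    'Cambridge,Cambridgeshire,"Chesterton, Cherry Hinton",52.2,0.1167,,100\n'
)

print(find_places_url("GB", "Cambridge"))
print(parse_places(SAMPLE_RESPONSE))
```

Because the list of nearby places is itself comma-separated, that field has to be quoted in the CSV, which is why a proper CSV parser is a safer choice than splitting each line on commas.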
Enjoy! Questions and comments to email@example.com, please.