Much of what we do here at mySociety relies on Open Data, so naturally we support Open Data Day. In case you haven’t come across this event before, here’s the low-down:
Open Data Day is a gathering of citizens in cities around the world to write applications, liberate data, create visualizations and publish analyses using open public data to show support for and encourage the adoption open data policies by the world’s local, regional and national governments.
If you’re planning on being a part of Open Data Day, you may find some of mySociety’s feeds, tools and APIs useful. This post attempts to put them all in one place. (more…)
This month we released a new version of FixMyStreet. Amongst the new features, fixes, and thingamajigs were some small improvements added by two volunteers, Andrew and Andy.
Even though these are not core pieces of functionality — in fact, precisely because they are not — we want to draw your attention to why they were included, and why this is a Very Good Thing. And perhaps, if you’re a coder who wants to put something into an Open Source project but hasn’t quite found a way in, Andrew and Andy’s work will nudge you into becoming a contributor too.
One of the axioms of Open Source is that, because anyone can read the source code, in theory anyone can contribute to it. In practice, though, it’s not really as simple as that. Both ends of the “anyone can contribute” idea require effort:
- Before contributing to a project of any complexity (as we hope you want to do), there’s often a lot to learn, or figure out, before any work can even begin;
- Before accepting contributions to such a project (keen as we are to do so), there’s an overhead of testing, checking, and managing the incoming code.
The ugly real world
The basic issue here is that software is complex — no matter how well-written, tested and documented program code is, if the problem it’s solving is in the real world, it’s not going to be simple.
This is especially true of anything used by the public, because often you can only make things seem simple at the front (such as a clean web interface or “user journey”) by working hard behind the scenes with data structures and processes that handle the underlying complexity. It’s inevitably true of any projects which have been developed over time — programmers like to use the term “legacy code” to describe anything that wasn’t written then way they’d choose to write it now.
Often the problems that software is solving are not quite as obvious as they first appear. At mySociety we’ve got a wealth of experience and actual usage data that ultimately changes the way we build, and develop, our platforms. We understand the fields we work in well (technically, the nerds like to call these the “problem domain”), whether it’s governmental practice or civic user behaviour, and that’s often knowledge that’s not encapsulated anywhere in the program code.
Furthermore, any established platform must protect against the risk that new changes break old behaviour — something that regression testing is designed to catch. This is especially important on platforms like FixMyStreet or Alaveteli where the software is already running in multiple installations.
This is why we have a team of full-time, experienced, and (thanks to our rigorous recruitment process) talented programmers who can invest the time and effort to be familiar with all these things when they set to work coding.
But this builds up to an impediment: sensitivity to any of these issues is enough to make anyone think twice about simply forking our code and starting to hack on it for us.
How it sometimes works
In practice, then, how does anything get contributed? How come it doesn’t all get written by our own coders?
The answer is, of course, we do work with major contributors outside our own team (have a look at the activity on our github repos to see them) — but it always requires a period of support and on-line discussion both before and during the process. There’s also the business of testing, and sometimes politely pushing back on, pull requests (which is how code contributions are submitted). But the fact of the matter is that this is only possible for people who are willing to spend time familiarising themselves with the specific code, technologies, and practices that we’re using on that project. These tend to be hard-code devs, and — here’s the point — they’re always experienced Open-Sourcers: this will never be the first time they’ve worked on such a project.
Which is where the little features come in.
The joy of small
We noticed this problem — that contributing code to our projects is simply not easy for us or for contributors. Importantly, it’s not just us: it’s Open Source everywhere. But we can’t simply dismiss the opportunity for contribution. We want to encourage coders to do this, because we believe that Open Source is intrinsically a good thing.
We do two things to make it easier to contribute:
- We identify small features that a coder can approach without a full understanding of the code and the problem domain;
- We help people (like you!) get started by opening up a laptop at our weekly meetups.
The first of these seems obvious now: when we add issues (an idea for a new feature, or maybe a bugfix) to our github repos, if we think they’re candidates for manageable, isolated work, we tag them with the label: Suitable for volunteers (like this).
Often these turn out to be “nice-to-haves” that one of our full-time devs can’t be pulled off more pressing problems to add just now. (Case in point: Andrew added a date-picker to the FixMyStreet admin stats page, and three of our own staff had stumbled upon and applauded the difference it had made within a week of it going live).
It means it’s much easier for you to get involved, because often it’s a little, isolated piece of code. And it’s much more manageable for us, because the change you’ll be submitting is also isolated.
So if you’re looking for something to tackle, pick one of those issues, and let us know (just to check nobody else has baggsied it already). Fork the repo, cut the code, write the tests, submit a pull request!
But wait — if that last paragraph made you gulp, here’s the second thing we do: meetups. Of course, this is less helpful if you can’t make it to London on Wednesdays, but the concept is sound. Put simply, if there is a barrier to entry to diving in, and if one-on-one time with a dev, and some pizza, is what it takes to overcome that, it’s time well spent for you to come and see us.
Not 100% confident with git? Not sure when
db/schema.sql gets used or how we like to handle migrations? No problem: we’re happy to guide you.
If this has struck a chord with you — you’d love to be an Open Source contributor one day, and you think mySociety projects make the world a better place — perhaps you should take a poke in our repos, or come along to a meetup. Start small, but do start.
Oh, and Andrew and Andy — thanks guys
More about volunteering for mySociety
Photo by Matt Katzenberger (CC)
As you may know, TheyWorkForYou hasn’t displayed proceedings from the Scottish Parliament for a couple of years – but we’re glad to say that we’ve now fixed that. You can read debates from the main chamber from the Official Report and sign up to alerts from the Scottish Parliament here – just as you can for the UK Parliament and the Northern Ireland Assembly.
For those who are interested in the ‘whys’, in January of 2011, the Scottish Parliament changed the way that they published the Official Report on their website. This change broke our scraper and parser – that is, the pieces of software that fetch content and turn it into structured data.
mySociety is a small organisation with many priorities, and, because it wasn’t a simple fix, we weren’t able to allocate resources to it. So massive thanks are due to our developer Mark, who made the necessary changes to our code in his own free time.
You can help
There’s still more work to be done to get TheyWorkForYou’s data for Scotland to be as complete as it was before they changed their website, such as restoring written answers. If you think you have the expertise to help with that or any of the other issues for TheyWorkForYou Scotland, then we’d love to hear from you. And there’s still the Welsh Assembly to work on too!
Photo by Shelley Bernstein (CC)
Simple things are the most easily overlooked. Two examples: a magician taking a wand out of his pocket (see? so simple that maybe you’ve never thought about why it wasn’t on the table at the start), or the home page on www.fixmystreet.com.
The first thing FixMyStreet asks for is a location. That’s so simple most people don’t think about it; but it doesn’t need to be that way. In fact, a lot of services like this would begin with a login form (“who are you?”) or a problem form (“what’s the problem you want to report?”). Well, we do it this way because we’ve learned from years of experience, experiment and, yes, mistakes.
We start off by giving you, the user, an easy problem (“where are you?”) that doesn’t offer any barrier to entry. Obviously, we’re very generous as to how you can describe that location (although that’s a different topic for another blog post). The point is we’re not asking for accuracy, since as soon as we have the location we will show you a map, on which you can almost literally pinpoint the position of your problem (for example, a pothole). Pretty much everyone can get through that first stage — and this is important if we want people to use the service.
How important? Well, we know that when building a site like FixMyStreet, it’s easy to forget that nobody in the world really needs to report a pothole. They want to, certainly, but they don’t need to. If we make it hard for them, if we make it annoying, or difficult, or intrusive, then they’ll simply give up. Not only does that pothole not get reported, but those users probably won’t bother to try to use FixMyStreet ever again.
So, before you know it, by keeping it simple at the start, we’ve got your journey under way — you’re “in”, the site’s already helping you. It’s showing you a map (a pretty map, actually) of where your problem is. Of course we’ve made it as easy as possible for you to use that map. You see other problems, already reported so maybe you’ll notice that your pothole is already there and we won’t have wasted any of your time making you tell us about it. Meanwhile, behind the scenes, we now know which jurisdictions are responsible for the specific area, so the drop-down menu of categories you’re about to be invited to pick from will already be relevant for the council departments (for example) that your report will be going to.
And note that we still haven’t asked you who you are. We do need to know — we send your name and contact details to the council as part of your report — but you didn’t come to FixMyStreet to tell us who you are, you came first and foremost to report the problem. So we focus on the reporting, and when that is all done then, finally, we can do the identity checks.
Of course there’s a lot more to it than this, and it’s not just civic sites like ours that use such techniques (most modern e-commerce sites have realised the value of making it very easy to take your order before any other processing; many governmental websites have not). But we wanted to show you that if you want to build sites that people use, you should be as clever as a magician, and the secret to that is often keeping it simple — deceptively simple — on the outside.
This post was written by mySociety developer Dave Whiteland, and first published on our DIY mySociety blog.
Thanks to everyone who came to the inaugural mySociety Hack Night – and thanks too to our hosts, the Open Data Institute for such a great space to work in.
Topics ranged from community-building in post-conflict societies, to mountain rescue in Wales, via an extended front-end for WriteToThem which would put campaigns in context. It really showed what a lot of exciting ideas there are, just waiting for someone to launch into them.
We’ll be running these nights every Wednesday: we’re currently booking for the following dates, 6:00 -9:00 pm.
Places are restricted, so drop us a line on firstname.lastname@example.org if you’d like to be sure of getting in. All you need is a little coding experience and a laptop.
We’d also like to start a conversation in the comments below, so that like-minded folk can think about hacking together. If you’re looking for people to help you with an idea, or if you see something you like the look of, leave a note below and try to synchronise which nights you’ll be attending.
Photo by Being Focal (CC)
Today, we are using the phrase “Alaveteli upgrade” rather a lot – and not just because it’s such a great tongue-twister. It’s also a notable milestone for our open-source community.
Alaveteli is the software that underlies WhatDoTheyKnow, our Freedom of Information website. The code can also be deployed by people in other countries who wish to set up a similar site. If you’re a ‘front-end user’, someone who just uses WhatDoTheyKnow to file or read FOI requests, this upgrade will go unnoticed… assuming all goes well at our end, that is. But if you’re a developer who’d like to use the platform in your own country, it makes several things easier for you.
Alaveteli will now be using the Rails 3 series – the series we were previously relying on, 2, has become obsolete. One benefit is that we’re fully supported by the core Rails team for security patches. But, more significant to our aim of sharing our software with organisations around the world, it makes Alaveteli easier to use and easier to contribute to. It’s more straightforward to install, dependencies are up-to-date, code is clearer, and there’s good test coverage – all things that will really help developers get their sites up and running without a problem.
Rails cognoscenti will be aware that series 4.0 is imminent – and that we’ve only upgraded to 3.1 when 3.2 is available. We will be upgrading further in due course – it seemed sensible to progress in smaller steps. But meanwhile, we’re happy with this upgrade! The bulk of the work was done by Henare Degan and Matthew Landauer of the Open Australia Foundation, as volunteers – and we are immensely grateful to them. Thanks, guys.
Find the Alaveteli code here – or read our guide to getting started.
Image credit: Sashi Manek (cc)
This is the third of our recent series of Open311 blog posts: we started by explaining why we think Open311 is a good idea, and then we described in a non-techie way how Open311 works. In this post we’ll introduce our proposed extension to Open311, and show how we use it in FixMyStreet.
The crux of our suggested improvement is this: normal people want to know what has happened to their problem, and Open311 currently isn’t good enough at telling them whether or not it has been dealt with. To be more specific, our additions are all about reports’ status change, by which we mean something like this:
I just totally fixed it.
That’s robot-311 from the previous post, if you’ve dropped in here without reading the previous posts. Once again we’re blurring the distinction between client and user (the girl you’ll see below) a little, to make things simpler to follow.
Every month in the UK, thousands of problems are reported on www.fixmystreet.com and, moments later, sent on to the councils who will fix them. Here’s what happens with a problem report for something like a pothole or a flickering streetlight:
- You create the report on FixMyStreet.
- FixMyStreet sends that report to the right department at the right council.
- That body puts it into its own back-end system.
- Later, when the council fixes the problem, FixMyStreet is updated, and everyone knows it’s fixed.
On the face of it, you might think we need only care about 1 and 2. But really, FixMyStreet isn’t just about dispatching reports, it’s about helping to get things like potholes actually fixed. And neither citizens nor local governments benefit if work gets done but nobody finds out about it – which is part 4 on the list above.
What do we mean by “status change”?
The example at the top of the page shows the robot effectively changing a problem’s status to “fixed”.
Actually, statuses can be simple, such as either OPEN or CLOSED, or more detailed, such as “under investigation”, “crew has been dispatched”, “fixed”, and so on. But since we’re only concerned here with the status changing, that specific vocabulary deployed doesn’t really matter – it can be anything.
In situations where FixMyStreet is not integrated with council systems (i.e we just send email problem reports) FixMyStreet problems still frequently get marked as fixed, because anyone can change the status of a report just by visiting the page and clicking the button. Obviously, though, we prefer to have FixMyStreet directly connected to the local government back-end databases, so that news of a fixed report can be automatically bubbled from the back-office up into FixMyStreet and out onto the net.
And here’s where the problem lies: Open311 doesn’t quite support this business of getting problem updates from the back office out to the public. So first, we’ll show you how it can be done today, using Open311, and we’ll explain why this isn’t a good option. Then we’ll show our preferred solution, which we’ve proposed as an extension.
Looking at everything just to spot one change (bad)
One way to notice if any problems’ statuses have changed is to use Open311 to ask for every single service request, and see if any of them have a different status since the last time you checked.
Tell me all the service requests you’ve ever received
request 981276 the pothole on the corner by Carpenter Street is now CLOSED (I filled in the pothole)
request 988765 the pothole by bus stop on Nigut Road is now CLOSED (I filled in the pothole)
request 998610 gaping hole at the end of Sarlacc Road is now OPEN (the pothole fell through)
request 765533 where the street was cracked outside Taffey’s Snake Pit is now CLOSED (I filled in the pothole)
. . .
continues for thousands of requests
Um, OK. Now I’ll look at all these and see if any have changed since I last asked *sigh*
Obviously there are some problems with this. Even though Open311 lets you ask for quite specific service requests, you have to ask for all of them, because by definition you don’t know which ones might have changed. Remember, too, that problems can potentially change status more than once, so just because it’s been marked as CLOSED once doesn’t mean it won’t become OPEN again later. This exchange is very wasteful, very slow and ultimately (with enough reports) may become de facto impossible.
Asking for just the changes (good)
So here’s a better way of doing it. We’ve actually been doing this for some months, and now seems the time to share.
The client asks the server for just the updates on a regular basis, so any requests that have recently changed get updated on FixMyStreet automatically, usually just a few minutes later.
Have you changed the status of any of service requests today?
Yes, request 981276 was CLOSED at 3 o’clock (I filled in the pothole)
Or, more practically for keeping FixMyStreet up to date:
Have you changed the status of any of service requests in the last 15 minutes?
This is handled by our extension to Open311,
GET Service Request Updates. There’s also an optional equivalent call for putting updates into the server (
POST Service Request Update), which would apply if the client changed the status after the service request had been submitted.
Note that the server identifies the problem with its own reference (that is,
981276 is the council’s reference, not a FixMyStreet ID, for example). This is important because not all these requests necessarily came from this particular client. Remember that all service requests are available through the Open311
GET Service Requests call anyway (as shown above). So the server doesn’t send each service request back in its entirety: just its ID, the new status, when it changed, and a brief description.
In practice the client wouldn’t usually ask for “today”. In fact, we typically send a request asking for any updates in the last 15 minutes, and then at the end of the day ask for the whole day’s updates, just to check none were missed.
The technical bit
From a client’s point of view, this is simply an extra call like others in the Open311 API. So it’s just a request over HTTP(S) for XML (or JSON, if required).
We deliberately make the client poll the server for updates and pull them in, rather than expecting the server to push updates out. This frees the server from any obligation to track which clients (for there may be more than just one) care about which updates. The requests themselves are sent with unique IDs, allocated by the server, so the client can dismiss duplicates. It’s also robust in the event of connection failures, so if there are timeouts or retry logic, that’s for the clients to worry about, not the server. Basically, this is all to make it as light on the server as possible: the only real issue is that it must be able to provide a list of updates. This usually means adding a trigger to the database, so that when a problem’s status is updated a record of that update is automatically created. It’s the table of those “service request update” records that incoming requests are really querying.
We have published our own recommendation of this mechanism, which FixMyStreet already implements, alongside the FixMyStreet codebase.
Is that it?
Yup, that’s it.
This extension is in addition to the Open311 specification — it doesn’t break existing implementations in any way. Obviously this means FixMyStreet’s Open311 implementation is compatible with existing Open311 servers. But we hope that others working on Open311 systems will consider our extension so that clients are kept better informed of the status of the problems being fixed.
Why are statuses so important that it is worth extending the Open311 spec?
mySociety didn’t originally build FixMyStreet because we wanted to get potholes fixed. We built it because we wanted nervous, politically inexperienced people to know what it felt like to ask the government to do something, and to be successful at that. We wanted to give people the buzz of feeling like they have a bit of power in this world, even if the most tiny amount.
If the government fixes a problem and the citizen doesn’t find out it’s a double loss. The citizen becomes disillusioned and weakened, and the government doesn’t get the credit it is due. Everyone loses. We think that Open311 is a key mechanism for making large numbers of people feel that the government does respond to their needs. It just needs a bit of an upgrade to do it better. We hope very much that the wider community tests and endorses our extensions, and it can be folded in to the next official version of the Open311 standard.
In the previous blog post we explained why we think Open311 is a good idea. In this post we’ll explain what it actually does.
Open311 is very simple, but because it’s fundamentally a technical thing it’s usually explained from a technical point of view. So this post describes what Open311 does without the nerdy language (but with some nerdy references for good measure). At the end there’s a round-up of the terms so you can see how it fits in with the actual specification.
We’re using an unusual example here — a blue cat stuck up a tree — to show how applicable Open311 is to a wide range of problems. Or, to put it another way, this is not just about potholes.
So… someone has a problem they want to report (for this discussion, she’s using a service like FixMyStreet).
There’s one place where that report needs to be sent (in the UK, that’s your council). That administrative body (the council) almost certainly has a database full of problems which only their staff can access.
I have a problem :–(
I fix problems!
In this example, FixMyStreet is an Open311 client and the council is an Open311 server. The server is available over HTTP(S), so the client can access it, and the server itself connects to the council’s database. In reality it’s a little bit more complicated than that (for now we’ll ignore clients that implement only part of Open311, multiple servers, and decent security around these connections), but that is the gist of it.
Although it’s not technically correct to confuse the client with the user, or the server with the council, it makes things a lot easier to see it this way, so we’ll use those terms throughout.
To start things off, the client can ask the server: what services do you provide?
Until the client has asked the server what problems it can fix, it can’t sensibly request any of them.
What services do you offer?
POT: fix potholes
TELE: clean public teleports
PET: get pets down from trees
JET: renew jetpack licenses …
FixMyStreet can use the response it gets from such a service discovery to offer different categories to people reporting problems. We actually put them into the drop-down menu that appears on the report-a-problem page.
In the Open311 API, this is handled by
GET Service List. Each service has its own
service_code which the client must use when requesting it. Note that these services and their codes are decided by the server; they are not defined by the Open311 specification. This means that service discovery can easily fit around whatever services the council already offers. The list of services can (and does) vary widely from one council to the next.
Some services require specific information when they are requested. For example, it might be important to know how deep a pothole is, but it’s not relevant for a streetlight repair.
Tell me more about the PET service!
I can get pets down from trees, but when you request the service, you *must* tell me what kind of animal the pet is, OK?
In the Open311 API, this is handled by the
GET Service Definition method. It’s not necessary for a simple Open311 implementation. In fact, it only makes sense if the service discovery explicitly told the client to ask about the extra details, which the server does by adding
metadata="true" to its response for a given service.
Requesting a service
This is where it gets useful. The client can request a service: this really means they can report a problem to the server for the body to deal with. Some submissions can be automatically rejected:
My hoverboots are broken :–( I need BOOT service!
404: Bzzzt error! I don’t fix hoverboots (use service discovery to see what I *do* fix)
Hey! Blueblue is up a tree! I need PET service (for cats)!
400: error! You forgot to tell me where it is.
If the report is in good order, it will be accepted into the system. Open311 insists that every problem has a location. In practice this is usually the exact position, coordinates on planet Earth, of the pin that the reporter placed on the map in the client application (in this case FixMyStreet.com).
I need PET service (for cats)! Blueblue is stuck up the biggest tree in the park :–(
200: OK, got it… the unique ID for your request is now 981276
In the Open311 API, this is handled by
POST Service Request. You need an API key to do this, which simply means the server needs to know which client this is. Sometimes it makes sense for the server to have additional security such as IP address restriction, and login criteria that’s handled by the machines (not the user).
Listing known requests
The server doesn’t keep its reports secret: if asked, it will list them out. The client can ask for a specific report (using the ID that the server gave when the report was submitted, for example) or for a range of dates.
Did anyone ask you for help yesterday?
Yes, I got two requests:
request 981299: TELE dirty teleport at the cantina (I’m waiting for a new brush)
request 971723: POT pothole at the junction of Kirk and Solo (I filled it in)
In the Open311 API this is handled by
GET Service Request(s). The client can indicate which requests should be listed by specifying the required service request id, service code, start date, end date or status.
Does Open311 work?
Oh yes. On the Open311 website, you can see the growing list of places, organisations, and suppliers who are using it.
The technical bit
In a nutshell: Open311 responds to HTTP requests with XML data (and JSON, if it’s wanted). There’s no messing around with SOAP and failures are reported as the HTTP status code with details provided in the content body.
You can see the specification for Open311 (GeoReport v2). It doesn’t feature blue cats, but if you look at the XML examples you’ll be able to recognise the same interaction described here. And remember the specification is an open standard, which means anyone can (and, we think, should) implement it when connecting a client and server in order to request civic services.
In the next blog post we’ll look at how FixMyStreet uses Open311 to integrate with local council systems, and explain why we’re proposing, and utilising, some additions to the Open311 specification.
Illustrated especially for us by René Carbonell.
Summer may seem like a long time ago, but despite the cold outside, we’ve been looking back over our participation in Google’s Summer of Code project. It’s almost enough to warm us up!
This post is an attempt to record the process from our point of view. We hope it will be useful for other organisations considering participating next year, and for students who want to know more about how the scheme works.
What is Google Summer of Code?
It’s a programme sponsored by Google’s philanthropic arm, giving students the chance to experience real-life coding on open source software.
The scheme is open to students all over the world, who are then paired up with open source organisations like us. The students gain paid work experience and mentoring; the organisations gain willing workers and some fresh new perspectives; the world gains some more open source code to use or develop further.
Everyone’s a winner, basically.
2012 was our first year on the programme: once we had been accepted on the scheme, we were given two student slots – the maximum allowed for a first-time organisation.
Given mySociety’s wide suite of codebases, there were several projects that could have benefited. We listed all our ideas, and let people apply for the ones they found appealing.
Goodness, there were a lot of applicants! It was very heartening to discover that there is such an enthusiastic community of young coders all around the world – even if it did take us a long time to sift through them all and make our choices.
You might remember our post back in May, when we announced that we’d made our choices. We were delighted to get working with Dominik from Germany and Chetan from India.
As things turned out, our students ended up working on a project that wasn’t even on our original list: PopIt, our super-easy ‘people and positions’ software.
That’s because once we spoke to our chosen students, we realised they had the skills that could really help us forge ahead with this project – and once we discussed it with them, they were keen. So PopIt it was.
Germany and India are a bit of a commute away, but fortunately development work can be managed remotely. We know this particularly well at mySociety: our core team work from home and are scattered across the UK.
The only difference here was the 6+ hour time difference between us and India: it was important to be rigorous about checking in at times when Chetan would be awake!
We communicated via IRC (instant chat), email, and occasionally Skype, and it all worked well.
Edmund, the team member chosen to be mentor, broke the required tasks down into big pieces so that the students would have realistic work units of several days each.
What was achieved
PopIt is primarily a tool for helping people create and run parliamentary monitoring websites (like TheyWorkForYou) with minimal coding knowledge/effort, though we anticipate that it will have many other uses too.
Our students spent the first half of the summer learning and improving the PopIt codebase. Once they were confident in it, they created their own sites using PopIt as a datasource to test the API, and, hopefully, create a valuable reference resource for the community.
Dominik added a migration tool to PopIt, which lets you upload data as a CSV. This means that you can start a site with a database of names, positions and dates at its heart – within seconds.
His test site was a professors’ database (the code is here). Dom also wrote some helpful posts on the dev blog like this one.
Chetan created an image proxy that lets us serve images in a smart way that makes sense for APIs. His test site was for Indian representatives (here’s the code).
Neither site is being maintained now, which just confirms that it is harder to run a site than to start it. This is not a failing, though. The creation of these sites, along with Chetan and Dom’s feedback, helped us understand where improvements needed to be made. In the course of one summer, PopIt became much more mature.
Looking back on the Summer of Code
Edmund attended a follow-up ‘mentors’ summit’ at the Googleplex in California – he found it very helpful to compare notes with other organisations and find out what had worked best for them all, and he made some good contacts too.
Assuming we get the chance again, would we participate in 2013? Our experience was very positive, but as yet we are undecided, purely because of the fluid nature of our workflow: we don’t yet know whether time and resources will permit.
Obviously, we have enjoyed great benefits from the scheme, but that has depended on quite a bit of input from our side, and we need to be sure that we can ensure that happens again.
Edmund has compiled a list of advice, from the practical (ask students to treat the placement like a full-time job; test coding skills before acceptance) to the desirable (a weekly blog post from participants; make sure you over-estimate the time you’ll spend mentoring). If you’re thinking of participating next year, he’d be happy to pass on his tips for ensuring that you, and your assigned students, get the best out of the Google Summer of Code. Just drop him a line.
I am Duncan Parkes, a developer for mySociety, a non-profit full of web geeks. One of the things we try to do well here is to take complicated data and turn it into really usable tools – tools which are attractive to people who aren’t web (or data) geeks.
For some considerable time I’ve been working on Mapumental – a project that is about turning public transport timetable data into pretty, interactive maps featuring isochrones, shapes that show people where they can live if they want to have a commute of a particular time. You can play with the new version we just launched here. That particular map shows the commuting options to where the Queen lives. Slide the slider for full effect.
There are a couple of hard problems that need solving if you want to build a service with an interactive journey times overlay like this. You need to be able to calculate a *huge* number of journeys extremely quickly, and you need to be able to make custom map layers so that it all looks nice. But what I think might be most interesting for you is the way in which the contours get rendered on top of the maps.
It all started about three years ago, when the first version of the app – co-developed with the geniuses at Stamen – used Flash/Flex to draw contours on the maps, and to let people play with them. You can still play with a couple of versions of that technology from way back in 2007, that is, unless you’re using an iPad or iPhone, which of course don’t do Flash.
What was going on inside this Flash app was as follows. We needed to show the user any one of hundreds of different combinations of journey times (5 minutes, 12 minutes, 56 minutes, etc) depending on where they set the slider. Sending each one from the server as a tiled map overlay would be dead slow. Even Google – who have chosen to send new tiles each time – end up with a service which is surprisingly slow (try choosing a different time on this map).
With some help from Stamen, we decided that the way of making it possible to show many different contours very quickly was send the client just one set of tiles, where each tile contained all the data for a variety of journey times. What does that mean? Simple: each colour in the tile represented a different number of minutes travelling on the map. So a batch of pixels that are colour X, all show places that are 15 minutes from the centre of the map.
So, in this old Flash system, when you slide the slider along, the Flash app makes some of the coloured pixels opaque, and the others transparent. It was, in short, a form of colour cycling, familiar to lovers of 8 and 16 bit computer games.
However, from about 2010 onwards, the march of iOS spelt the end of Flash. And that meant that we couldn’t launch a shiny new site based on this technology, as lovely as it was. We had to work out some approach that would use modern web standards instead.
The Death of Flash Makes Life Difficult – for a while
How do we replicate the experience of dragging a slider and seeing the map change like in the original Mapumental demo, but without Flash? One of the things that made the original Mapumental nice to use was how smooth the image changes were when you dragged the slider. Speed really matters to create that sort of organic effect that makes the demo so mesmerising.
So as we started to tackle the question “How do we make this work in a post-Flash world?”. And the first thought was “Let’s do away with those map tiles, filled with all that journey time data!”. After all – why send any tiles to a modern browser, if it can just render nice shapes on the fly?
So we had a go. Several goes. At first we tried rendering SVG circles around each public transport stop – but that was too slow, particularly when zoomed out. Then we tried rendering circles in Canvas, and whilst that was OK in sparsely populated places it sucked in the cities, where people would actually want to use it.
Back to Colour Cycling – Using Web Standards
So, I had a bit of a look at the waterfall. It seems to work by holding in memory a structure which has all the pixels which change and all the colours they should change to and when. This works beautifully for the waterfall picture, but only a limited number of the pixels in that image actually change colour, and the image is quite small. For a full screen web browser with a big map in, this didn’t seem promising, although I’d love to see someone try.
Unfortunately, there is no way to change the palette of an image that you’ve put on the canvas. In fact, there’s no way to change the palette of an HTML img element: all you can do is assign it a new src attribute.
But this gets back to the original problem – we don’t want to download new mapping for every different position on the time slider. We definitely can’t afford to have the client downloading a new image source for every tile whenever the slider is moved, so we had to find a way to make that src at the client end and get that into the src attribute.
The Breakthrough – Data URIs and Base64 encoding
So we started trying data URIs. For those of you not familiar, these allow you to put a whole object into your HTML or CSS, encoded in Base64. They’re commonly used to prevent pages having to make extra downloads for things like tiny icons.
My new plan was that the client, having downloaded each palette-based image, would make a Base64 encoded version of it, which it could then use to build a version with the right palette and assign this as a data URI of the tile.
So in summary, what we built does this:
- The server calculates the journey times and renders them to palette-based tiles.
- It sends these to the client, encoded in Base64, and with the initial bits up to the palette and transparency chunks removed.
- At the client end, we have a pre-prepared array of 255 ‘starts’ of PNGs that we combine with the later parts of the ’tiles’ from the tile server to make data URIs.
- When you drag the slider it combines the appropriate ‘start’ of a PNG with the bulk of the tile that has been downloaded from the server, and assigns that to the src attribute of the tile.
And that’s how the nice overlays on Mapumental work. But as so often in coding, the really interesting devil is in the detail – read on if you’re interested.
Diving into Base64 and the PNG file format – The Gnarly Bits
So – why are there 255 of these ‘starts’ of these PNGs, and what do I mean by a ‘start’ anyway?
PNG files are divided up into an 8 byte signature (the same for every PNG file) and a number of chunks, where each chunk consists of 4 bytes to tell you its length, 4 bytes of its name, some data, and 4 bytes of cyclic redundancy check. In this case, what I call a ‘start’ of a PNG is the 8 byte signature, the 25 byte of the IHDR chunk, and the PLTE (palette) and tRNS (transparency) chunks. The PLTE chunk has 12 bytes of overhead and 3 bytes per colour, and the tRNS chunk has 12 bytes of overhead and 1 byte per colour.
Base64 encoding is a way of representing binary data in text so that it can be used in places where you would normally expect text – like URIs. Without going into too much detail, it turns groups of 3 bytes of binary gumpf into 4 bytes of normal ASCII text without control characters in it, which can then be put into a URI.
Why do we have 255 colours, rather than the maximum 256 which are available in a palette? Because we need the break between the end of the tRNS chunk and the start of the IDAT chunk in the PNG file to align with a break between groups of three bytes in the Base64 encoded image. We need the length of these starts to be a multiple of 3 bytes in the original PNG format, which translates into a multiple of 4 bytes in the Base64 encoded version, so we can cut and shut the images without corruption.
Which just goes to show that even though web GIS technologies may feel like they are approaching a zenith of high level abstraction, there’s still some really gnarly work to be done to get the best out of current browsers.