1. Publishing and analysing data: our workflow

    This is a more technical blog post, a companion to our recent blog about local climate data. Read on if you’re interested in the tools and approaches we’re using in the Climate team to analyse and publish data.

    How we’re handling common data analysis and data publishing tasks.

    Generally we do all our data analysis in Python and Jupyter notebooks. While we have some analysis using R, we have more Python developers and projects, so this makes it easier for analysis code to be shared and understood between analysis and production projects. 

    Following the same basic idea as the cookiecutter data science approach (and stealing some of its folder structure), namely that each small project should live in a separate repository, we have a standard repository template for data processing and analysis work.

    The template defines a folder structure, and standard config files for development in Docker and VS Code. A shared data_common library provides a base Docker image (for faster setup of new repos) and the common tools and utilities for dataset management that are shared between projects. This includes helpers for managing dataset releases, and for working with our charting theme. The use of Docker means that the development environment and the GitHub Actions environment can be kept in sync – and so processes can easily be shifted to scheduled tasks run as GitHub Actions.

    The advantage of this common library approach is that it is easy to feed improvements to the common tools back from each new project, while each project stays pegged to a specific commit of the common library: new projects get the benefit of advances, and old projects do not need constant updating to keep working.

    This process can run end-to-end in GitHub – the repository is created on GitHub, Codespaces can be used for development, automated testing and building happen with GitHub Actions, and the data is published through GitHub Pages. Using GitHub Actions especially means that testing and validation of the data can live on GitHub’s infrastructure, rather than requiring additional work on our servers for each small project.

    Dataset management

    One of the goals of this data management process is to make it easy to take a dataset we’ve built for our own purposes and make it accessible for re-use by others.

    The data_common library contains a dataset command line tool, which automates the creation of various config files, as well as the publishing and validation of our data.

    Rather than reinventing the wheel, we use the frictionless data standard as a way of describing the data. A repo holds one or more data packages, each of which is a collection of data resources (generally a CSV table). The dataset tool detects changes to the data resources and updates the config files; diffs between config files can then drive automated version changes.

    Screenshot of the CLI --help options for the dataset tool.
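
    As a rough illustration of how the frictionless approach works in practice (the descriptor path here is invented, not from our actual repos), the frictionless Python library can load a data package and validate each resource against its declared structure:

    ```python
    # A minimal sketch, assuming an invented path to a data package:
    # load the descriptor, then validate every resource against it.
    from frictionless import Package, validate

    package = Package("data/packages/local_authorities/datapackage.json")

    report = validate(package)
    print(report.valid)  # False if any resource fails structural validation
    for task in report.tasks:
        for error in task.errors:
            print(error.message)
    ```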

    Data integrity

    Leaning on the frictionless standard for basic validation that the structure is right, we use pytest to run additional tests on the data itself. This means we define a set of rules that the dataset should pass (eg ‘all cells in this column contain a value’), and if it doesn’t, the dataset will not validate and will fail to build. 

    This is especially important because we have datasets that are fed by automated processes, read from external Google Sheets, or accept input from other organisations. The local authority codes dataset has a number of tests to check that authorities haven’t been unexpectedly deleted, that the start and end dates make sense, and that only certain kinds of authority can be designated as the county council or combined authority overlapping a different authority. This means that when someone submits a change to the source dataset, we can have a certain amount of faith that the dataset is being improved, because the automated testing checks that nothing is obviously broken.
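
    By way of illustration, checks of that sort might look something like this in pytest (the CSV path and column names here are hypothetical, not our real schema):

    ```python
    # Illustrative pytest rules for a dataset.
    import pandas as pd
    import pytest

    @pytest.fixture
    def authorities():
        return pd.read_csv("data/local_authorities.csv")

    def test_every_authority_has_a_name(authorities):
        assert authorities["official-name"].notna().all()

    def test_end_dates_follow_start_dates(authorities):
        both = authorities.dropna(subset=["start-date", "end-date"])
        assert (pd.to_datetime(both["end-date"])
                >= pd.to_datetime(both["start-date"])).all()
    ```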

    The automated versioning approach means the defined structure of a resource is itself a form of automated testing. Generally following the semver rules for frictionless data (with the exception that adding a new column after the last column is not a major change), the dataset tool will try to determine whether a change from the previous version is MAJOR (breaking backward compatibility), MINOR (a new resource, row or column), or PATCH (correcting errors). Generally, we want to avoid major changes, and the automated action will throw an error if one happens. If a major change is required, it can be made manually. The fact that external users of the file can peg their usage to a particular major version means that changes can be made knowing nothing is immediately going to break (even if data may become more stale in the long run).
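
    A much-simplified sketch of the classification idea (not the real dataset tool) might compare the old and new column lists like this:

    ```python
    # Classify a schema change under semver-ish rules: appended columns
    # are MINOR; removals, renames or reorderings are MAJOR.
    def classify_change(old_columns: list, new_columns: list) -> str:
        if old_columns == new_columns:
            return "PATCH"  # structure unchanged; data corrections only
        if new_columns[: len(old_columns)] != old_columns:
            return "MAJOR"  # columns removed, renamed or reordered
        return "MINOR"  # only new columns appended after the last column

    print(classify_change(["id", "name"], ["id", "name", "region"]))  # MINOR
    ```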

    Screenshot of example pytest tests, showing ensuring an authority has been assigned a nation

    Data publishing and accessibility

    The frictionless standard allows an optional description for each data column. We make this required, so that every column must have a human-readable description before the dataset will validate successfully. Internally, this is useful for enforcing documentation (and for making sure you really understand what units a column is in), and it makes it much easier for external users to understand what is going on.
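
    The check itself is simple; something like this sketch (using the frictionless library, with an invented descriptor path) captures the idea:

    ```python
    # Sketch: refuse to validate if any field lacks a description.
    from frictionless import Package

    package = Package("datapackage.json")
    for resource in package.resources:
        for field in resource.schema.fields:
            assert field.description, (
                f"Column {resource.name}.{field.name} needs a description"
            )
    ```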

    Previously, we were uploading the CSVs to GitHub repositories and leaving it at that – but GitHub isn’t friendly to non-developers, and clicking a CSV file opens it in the browser rather than downloading it.

    To help make data more accessible, we now publish a small GitHub Pages site for each repo – GitHub Pages lets small static sites be built from the contents of a repository (the EveryPolitician project also used this approach). This means we can have fuller documentation of the data, better analytics on access, sign-posting to surveys, and better sign-posted links for downloading multiple versions of the data.

    Screenshot of data descriptions of the local authorities dataset

    The automated deployment also means we can very easily create Excel files that package together all the resources of a data package into a single file, including the metadata about the dataset, as well as information about how users can tell us how they’re using it.

    Publishing in an Excel format acknowledges a practical reality that lots of people work in Excel. CSVs don’t always load nicely in Excel, and since Excel files can contain multiple sheets, we can add a cover page that makes it easier to use and understand our data by packaging all the explanations inside the file. We still produce both CSVs and XLSX files – and can now do so with very little work.

    Screenshot of downloadable excel file showing different sheets and descriptions
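
    The mechanics are simple enough to sketch in a few lines, here using pandas (the file, sheet and column names are examples, not our real outputs):

    ```python
    # Rough sketch: one workbook with a cover sheet of column notes,
    # then a sheet of data per resource.
    import pandas as pd

    notes = pd.DataFrame({
        "column": ["local-authority-code", "official-name"],
        "description": ["Three letter authority code", "Full official name"],
    })
    resource = pd.read_csv("data/local_authorities.csv")

    with pd.ExcelWriter("local_authorities.xlsx") as writer:
        notes.to_excel(writer, sheet_name="notes", index=False)
        resource.to_excel(writer, sheet_name="data", index=False)
    ```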

    For developers who are interested in making automated use of the data, we also provide a small package that can be used in Python or as a CLI tool to fetch the data, with instructions on the download page on how to use it.

    Screenshot of the command line download instructions for a dataset

    At mySociety Towers, we’re fans of Datasette, a tool for exploring datasets. Simon Willison recently released Datasette Lite, a version that runs entirely in the browser. That means that just by publishing our data as a SQLite file, we can add a link so that people can explore a dataset without leaving the browser. You can even create shareable links for queries: for example, all current local authorities in Scotland, or local authorities in the most deprived quintile. This lets us do some very rapid prototyping of what a data service might look like, just by packaging up some of the data using our new approach.

    Screen shot of Datasette Lite showing a query of authorities in Scotland
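
    Getting from a CSV resource to something Datasette Lite can load is little more than this (a sketch with invented file names):

    ```python
    # Sketch: convert a CSV resource into a SQLite file that can be
    # published alongside the rest of the data.
    import sqlite3

    import pandas as pd

    df = pd.read_csv("data/local_authorities.csv")
    with sqlite3.connect("local_authorities.sqlite") as conn:
        df.to_sql("local_authorities", conn, if_exists="replace", index=False)
    ```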

    Data analysis

    Something in use in a few of our repos is the ability to automatically deploy analysis of the dataset when it is updated. 

    Analysis of the dataset can be designed in a Jupyter notebook (including tables and charts) – and this can be re-run and published on the same GitHub Pages deploy as the data itself. For instance, the UK Composite Rural Urban Classification produces this analysis. For the moment, this just replaces the previous automatic README creation – but in principle it makes it easy for us to create simple, self-updating public charts and analysis of whatever we like.
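
    One way to handle the re-run step (a sketch of the general technique, not necessarily our exact pipeline) is nbconvert’s Python API:

    ```python
    # Sketch: execute a notebook top to bottom and export it as HTML,
    # ready to publish via GitHub Pages. File names are examples.
    import nbformat
    from nbconvert import HTMLExporter
    from nbconvert.preprocessors import ExecutePreprocessor

    nb = nbformat.read("analysis.ipynb", as_version=4)
    ExecutePreprocessor(timeout=600).preprocess(nb, {"metadata": {"path": "."}})

    html, _ = HTMLExporter().from_notebook_node(nb)
    with open("docs/analysis.html", "w") as f:
        f.write(html)
    ```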

    Bringing it all back together and keeping people up to date with changes

    The one downside of all these datasets living in different repositories is that they are harder to discover. To help with this, we add all data packages to our data.mysociety.org catalogue (itself a Jekyll site that updates via GitHub Actions) and have started a lightweight data announcement email list. If you have got this far, and want to see more of our data in future – sign up!

    Image: Sigmund

  2. Seeing spots with FixMyStreet Pro

    If you visit FixMyStreet and suddenly start seeing spots, don’t rush to your optician: it’s just another feature to help you, and the council, when you make a report.

    In our last two blog posts we announced Buckinghamshire and Bath & NE Somerset councils’ adoption of FixMyStreet Pro, and looked at how this integrated with existing council software. It’s the latter which has brought on this sudden rash.

    At the moment, you’ll only see such dots in areas where the council has adopted FixMyStreet Pro, and gone for the ‘asset locations’ option: take a look at the Bath & NE Somerset installation to see them in action.

    What is an asset?

    mySociety developer Struan explains all.

    Councils refer to ‘assets’; in layman’s language these are things like roads, streetlights, grit bins, dog poo bins and trees. These assets are normally stored in an asset management system that tracks problems, and once hooked up, FixMyStreet Pro can deposit users’ reports directly into that system.

    Most asset management systems will have an entry for each asset and probably some location data for them too. This means that we can plot them on a map, and we can also include details about the asset.

    When you make a report, for example of a broken streetlight, you’ll be able to quickly and easily specify that precise light on the map, making things faster for you. And there’s no need for the average citizen to ever know this, but we can then include the council’s internal ID for the streetlight in the report, which also speeds things up for the council.

    Map layers

    So, how do we get these assets on to the map? Here’s the technical part:

    The council will either have a map server with a set of asset layers on it that we can use, or they’ll provide us with files containing details of the assets and we’ll host them on our own map server.

    The map server then lets you ask for all the streetlights in an area, and sends back some XML with a location for each streetlight and any associated data, such as the lamppost number. Each collection of objects is called a layer, mostly because that’s how mapping software uses them: there is a base layer for the map itself, and any other features are drawn on top of it in further layers.
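
    Such a query might look roughly like this (a sketch assuming a WFS-style map server; the URL, layer name and bounding box are all made up):

    ```python
    # Hypothetical query for all streetlights within the current map view.
    import requests

    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typename": "streetlights",         # the asset layer
        "bbox": "-2.40,51.36,-2.33,51.40",  # area currently on screen
    }
    response = requests.get("https://maps.example.org/wfs", params=params)
    print(response.text)  # XML: one feature per light, with its lamppost number
    ```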

    Will these dots clutter up the map for users who are trying to make a report about something else?

    Not at all.

    With a bit of configuration in FixMyStreet, we associate report categories with asset layers so we only show the assets on the map when the relevant category is selected.

    We can also snap problem reports to a nearby asset, which is handy for things like streetlights: it doesn’t make sense to send a report about a broken streetlight with no associated light.
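
    The snapping logic is conceptually simple; in miniature (purely illustrative, not FixMyStreet’s actual code) it amounts to picking the nearest asset within a threshold:

    ```python
    # Snap a clicked point to the closest asset, if one is close enough.
    import math

    def snap_to_asset(click_x, click_y, assets, max_distance=25.0):
        """Return the closest asset within max_distance units, else None."""
        def dist(asset):
            return math.hypot(asset["x"] - click_x, asset["y"] - click_y)
        nearest = min(assets, key=dist, default=None)
        return nearest if nearest and dist(nearest) <= max_distance else None

    lights = [{"id": "SL001", "x": 10, "y": 5}, {"id": "SL002", "x": 80, "y": 40}]
    print(snap_to_asset(12, 6, lights))  # nearest light: SL001
    ```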

    Watch this space

    And what’s coming up?

    We’re working to add data from roadworks.org, so that when a user clicks on a road we’ll be able to tell them if roadworks are happening in the near future, which might have a bearing on whether they want to report the problem — for example there’s no point in reporting a pothole if the whole road is due to be resurfaced the next week.

    Then we’ll also be looking at roads overseen by TfL. The issue with these is that while they are always within a council area, the council doesn’t have the responsibility of maintaining them, so we want to change where the report goes rather than just adding in more data. There’s also the added complication of things like “what if the issue is being reported on a council-maintained bridge that goes over a TfL road?”.

    There’s always something to keep the FixMyStreet developers busy… we’ll make sure we keep you updated as these new innovations are added.

    From a council and interested in knowing more? Visit the FixMyStreet Pro website.

  3. The latest from Alaveteli Professional: prototyping, testing, reducing risk

    Last time we updated you about Alaveteli Professional, the Freedom of Information toolset for journalists that we’re building, we were just coming out of our discovery phase.

    Since then, we’ve made strides through the alpha and early beta parts of our development process. In alpha, the idea is to build dummy versions of the tool that work in the minimum way possible — no bells and whistles — to test concepts and our assumptions. Having thought hard about the potential problems of Alaveteli Professional, now is the time for us to try the approaches that we believe will solve them, by making prototypes of how the tool might work and testing them with a very small group of users.

    In the early stages of beta, our priority has been to get to the point where a Freedom of Information request can go through all its various processes, from composition to response, with the features that a journalist user would need. Once that’s in place, it allows us to add other features on top and see how they would integrate.

    This pattern — discovery, alpha, beta, release — is a well-tested method by which to produce a final product that works as it should, while avoiding costly mistakes.

    Risk management

    Alpha and beta testing, perhaps unexcitingly, are all about the reduction of risk: in the words of the startup mantra, it’s good to ‘fail fast’ — or rather, it’s better to know early on if something doesn’t work than to spend time and money on something that doesn’t fit the bill.

    So, for Alaveteli Professional, what are the risks that have been keeping us awake at night?

    We think the biggest priority is to ensure that there’s actually added value for journalists in using a service like this. Clearly, the Freedom of Information process is already available to all, whether via our own site WhatDoTheyKnow, or directly.

    We need to be able to demonstrate tangible benefits: that Alaveteli Professional can save journalists time; help them be more efficient in managing their requests; maybe help them get information that otherwise wouldn’t be released; and give them access to rich data they wouldn’t otherwise be able to access.

    For all we said about failing fast, the alpha phase also meant committing to some fairly big technical decisions that, ideally, we won’t want to reverse.

    Decisions like, do we build the service into the existing Alaveteli codebase, or go for a new standalone one (we went for the former)? From the user’s point of view, should Alaveteli Professional look like a totally different site, or like a registration-only part of WhatDoTheyKnow (we chose the latter)?

    And onto beta

    As we move from alpha to beta, we’re finding out what happens when real users make real requests through the service, and making adjustments based on their feedback.

    What do they think of the way we’ve implemented the ability to embargo requests – does it make sense to them? Do they trust us to keep embargoed requests private? Are they able to navigate between different interfaces in a way that seems intuitive? mySociety designer Martin has been figuring out how to take the cognitive load off the user and give them just the information they need, when they need it.

    We’re also returning to prototyping mode to work out how to implement new features, like the ability to send round robin requests to multiple authorities, in an effective and responsible way. The other half of our design team, Zarino, has been showing us that a slideshow in presentation mode can be an effective tool for demonstrating how users might interact with an interface.

    As we continue to round out the feature set in the UK, we’re also cooking up plans in the Czech Republic, so that later in the year we can present the tools to a new audience of journalists there and, again, use their feedback to make the tools more flexible so that they can be used in different jurisdictions.

    As you can see, there’s lots going on, and we’re all really excited to be finally getting some real life users in front of the tools that we’ve been thinking about, and working on, for so many weeks. Don’t forget to sign up to the mailing list if you’d like to keep up with Alaveteli Professional as it develops.


    Image: Jeff Eaton (CC BY-SA)

  4. Forward into alpha: what we’ve learned about Alaveteli Professional

    As we recently mentioned, one of mySociety’s current big projects is Alaveteli Professional, a Freedom of Information toolset for journalists.

    It’s something we wanted to build, and something we believed there was a need for: but wanting and believing do not make a sound business case, and that’s why we spent the first few weeks of the project in a ‘discovery’ phase.

    Our plan was to find out as much as we could about journalists, our prospective users — and particularly just how they go about using FOI in their work. Ultimately, though, we were seeking to understand whether journalists really would want, or need, the product as we were imagining it.

    So we went and talked to people at both ends of the FOI process: on the one hand, journalists who make requests, and on the other, the information officers who respond to them.

    Since we’re planning on making Alaveteli Professional available to partners around the world, it also made sense to conduct similar interviews outside the UK. Thanks to links with our Czech partner, who run Informace Pro Všechny on Alaveteli, that was a simple matter. A recent event at the Times building in London also allowed us to present and discuss our findings, and listen to a couple of interesting expert presentations: Matt Burgess of BuzzFeed talked about some brilliant uses of FOI to expose criminal landlords, and listed FOI officers’ biggest complaints about journalists. Josh Boswell of the Sunday Times was equally insightful as he ran through the ways that he uses FOI when developing stories.

    These conversations have all helped.

    The life of an investigative journalist is never simple

    Alaveteli Professional process diagram drawn by Mike Thompson

    The insights our interviewees gave us were turned by Mike Thompson (formerly of mySociety, and brought back in for this phase) into a simple process model showing how journalists work when they’re pursuing an investigation using FOI.

    After conceiving of a story that requires input from one or more FOI requests, every journalist will go through three broad phases: research; request and response; and the final data analysis and writing. More complicated cases can also involve refused requests and the appeals process.

    For a busy working journalist, there are challenges at every step. Each of these adds time and complexity to the process of writing a story, which is anathema to the normal daily news cycle. FOI-based stories can be slow, and their timing unpredictable — editors do not particularly like being told that you’re working on a story but can’t say when it will be ready, or how much value it will have.

    During the research phase, diligent journalists will make a time-consuming trawl through resources like authorities’ own disclosure logs and our own site WhatDoTheyKnow (or its equivalents in other countries), to see if the data they need has already been released.

    Where a ‘round robin’ request is planned, asking multiple authorities — sometimes hundreds — for information, further research is needed to ensure that only relevant bodies are included. In our two-tiered council system, where different levels of authority deal with different responsibilities, and not always according to a consistent pattern, that can be a real challenge.

    Wording a request also takes some expertise: get that wrong and the authorities will be coming back for clarification, which adds even more time to the process.

    Once the request has been made, it’s hard to keep on top of correspondence, especially for a large round robin request. Imagine sending a request to every council in the country, as might well be done for a UK-wide story, and then dealing with each body’s acknowledgements, requests for clarification and refusals.

    When the responses are in, journalists often find that interpretation is a challenge. Different authorities might store data or measure metrics differently from one another; and pulling out a meaningful story means having the insight to, for example, adjust figures to account for the fact that different authorities are different sizes and cater for differently-dispersed populations.

    Sadly, it’s often at this stage that journalists realise that they’ve asked the wrong question to start with, or wish that they’d included an additional dimension to the data they’ve requested.

    What journalists need

    As we talked through all these difficulties with journalists, we gained a pretty good understanding of their needs. Some of these had been on our list from the start, and others were a surprise, showing the value of this kind of exploration before you sit down to write a single line of code.

    Here’s what our final list of the most desirable features looks like:

    An embargo: We already knew, anecdotally, that journalists tend not to use WhatDoTheyKnow to make requests, because of its public nature. It was slightly sobering to have this confirmed via first-person accounts from journalists who had had their stories ‘stolen’… and those who admitted to having appropriated stories themselves! Every journalist we spoke to agreed that any FOI tool for their profession would need to include a way of keeping requests hidden until after publication of their story.

    However, this adds a slight dilemma. Using Alaveteli Professional and going through the embargo-setting process introduces an extra hurdle into the journalist’s process, when our aim is, of course, to make the FOI procedure quicker and smoother. Can we ensure that everything else is so beneficial that this one additional task is worthwhile for the user?

    Talking to journalists, we discovered that almost all are keen to share their data once their story has gone live. Not only does it give concrete corroboration of the piece, but it was felt that an active profile on an Alaveteli site, bursting with successful investigations, could add to a journalist’s reputation — a very important consideration in the industry.

    Request management tools: Any service that could put order into the myriad responses that can quickly descend into chaos would be welcomed by journalists, who typically have several FOI requests on the go at any one time.

    Alaveteli Professional’s dashboard interface would allow for a snapshot view of request statuses. Related requests could be bundled together, and there would be the ability to quickly tag and classify new correspondence.

    Round-robin tools: Rather than send a notification every time a body responds (often with no more than an acknowledgement), the system could hold back, alerting you only when a request appears to need attention, or send you status updates for the entire project at predefined intervals.

    Refusal advice: Many journalists abandon a request once it’s been refused, whether from a lack of time or a lack of knowledge about the appeals process. WhatDoTheyKnow Professional would be able to offer in-context advice on refusals, helping journalists take the next step.

    Insight tools: Can we save journalists’ time in the research phase, by giving an easy representation of what sort of information is already available on Alaveteli sites, and by breaking down what kind of information each authority holds? That could help with terminology, too: if a request refers to data in the same language that is used internally within the council, then their understanding of the request and their response is likely to be quicker and easier.

    Onwards to Alpha

    We’re currently working on the next part of the build — the alpha phase.

    In this, we’re building quick, minimally-functional prototypes that will clearly show how Alaveteli Professional will work, but without investing time into a fully-refined product. After all, what we discover may mean that we change our plans, and it’s better not to have gone too far down the line at that point.

    If you are a journalist and you would like to get involved with testing during this stage and the next — beta — then please do get in touch at alaveteli-professional@mysociety.org.


    Image: Goodwines (CC by-nc-nd/2.0)

  5. Multi-selectin’ fun on FixMyStreet

    You might already be enjoying one of the usability improvements that FixMyStreet version 2.0 has brought, though it’s possible that you haven’t given it much thought.

    In these days of eBay and department store shopping, we’re all quite used to refining results through the use of multiple checkboxes.

    But for FixMyStreet, we hadn’t given much thought to letting you filter reports by more than one dimension, until Oxfordshire County Council suggested that it would be a useful feature.

    For quite some time, you’d been able to filter by category and status (“Show me all pothole reports” or “Show me all ‘unfixed’ reports”), but this new functionality is more flexible.

    You can now select multiple categories and multiple statuses simultaneously (“show me all pothole and graffiti reports that are unfixed or in progress”) — and all through the power of tickboxes.

    If you’re a non-technical person, that’s all you need to know: just enjoy the additional flexibility next time you visit FixMyStreet. But if you are a coder, you might like to read more about how we achieved this feature: for you, Matthew has written about it over on the FixMyStreet Platform blog.

  6. Better-looking emails for FixMyStreet

    If you’ve used FixMyStreet recently — either to make a report, or as a member of a council who receives the reports — you might have noticed that the site’s automated emails are looking a lot more swish.

    Where previously those emails were plain text, we’ve now upgraded to HTML, with all the design possibilities that this implies.

    It’s all part of the improvements ushered in by FixMyStreet Version 2.0, which we listed in full in our recent blog post. If you’d like a little more technical detail about some of the thought and solutions that went into this switch to HTML, Matthew has obliged with a blog post over on FixMyStreet.org.

  7. A new Sass workflow for Alaveteli theming

    We recently made some changes to our Freedom of Information software, Alaveteli, that make it twice as easy for partners to change the look and feel of their site. This post goes quite in-depth, so it’s probably best suited to designers, developers and other technically-minded readers.


    This year we embarked on a ground-up redesign of Alaveteli’s theming capabilities, to make customising Alaveteli easier and faster. We used a mixture of techniques to achieve this, and since releasing the update we’ve been happy to see a decrease in both the complexity of our themes and the time we spend making them.

    Alaveteli, our Freedom of Information platform, is one of our most-deployed software packages, and each deployment requires some customisation, from simple branding changes to complex functionality supporting a country’s FOI process. This is either done internally by our own developers, or our partners take it on themselves.

    An Alaveteli deployment consists of two things: the core, the guts of Alaveteli with no customisations; and the theme, the aesthetic and functional customisations made to each site. We ship Alaveteli with a fully-featured base theme that gives partners a good starting point to work from, but they can also start their own theme from scratch.

    After talking to our partners and the Alaveteli team we felt that the process of theming was too complicated and involved. It required a lot of knowledge of Alaveteli’s inner workings, was coded in a way that made it hard to understand, and almost always required the intervention of a front-end expert, even to make trivial changes.

    Our vision for how Alaveteli’s theming should work was:

    • Intuitive: Front-end developers and designers should be able to follow and understand the templates without needing expert knowledge of Alaveteli’s inner workings.
    • Flexible: Overriding Alaveteli’s core code should be easy, so a theme should use low-specificity, componentised code to enable this.
    • Supportive: Whether a partner decides only to change a few colours or to embark on a large-scale redesign of Alaveteli, the theming process should never get in the way.

    How we did it

    Using the tools we have

    Alaveteli is built on Ruby on Rails, so our first goal was to make better use of Rails’ template inheritance.

    Firstly, we split all of the Sass files into logical components.

    Each component has two associated Sass files, a layout.scss and a style.scss. layout.scss describes only the layout of a component, e.g. its physical dimensions and position on a page. style.scss describes the styles, for example typefaces, colours, borders and other presentational properties.

    Should a partner wish to totally redesign or replace a component, they can replace both files with their own. Alternatively they can just choose to replace one part of the component, for example keeping the layout, but replacing the styles.

    Ensuring the lowest possible specificity

    When customising Alaveteli we want to be sure that a partner never has to fight with the base theme to achieve their goals. The most important part of this is to ensure CSS styles are easy to override.

    Firstly we removed all ID selectors from Alaveteli’s stylesheets and replaced them with class selectors, giving us a good jumping-off point, as class selectors have much lower specificity than IDs. We then rewrote and refactored Alaveteli’s CSS using the Block Element Modifier (BEM) methodology.

    BEM helped us in two ways. First, it’s written in a way that makes CSS self-describing: the code snippet .header__navigation__list tells us that this list is a child of the navigation element and a grandchild of the header element.

    Second, using a single class to define the styles for an element means BEM gives our styles the lowest possible specificity, so partners can override them with little resistance.

    Using a settings partial

    All of our theme’s styles use the variables defined in a settings partial, _settings.scss, when referencing colours, typefaces, spacing and even logo files, so the bulk of Alaveteli’s customisation can be done in that one file.

    This allows partners to make simple changes which have a huge impact on the look and feel of the site. We’ve found that the large majority of our theme customisation can be done by just changing the variables in this file and it has sped up the theming process markedly.

    Impact

    We’ve quietly rolled out these updates to Alaveteli over the past 12 months and the impact of the changes has been encouraging.

    Internally, we’ve found our theming projects are faster and more efficient. We conservatively estimate that it takes half the time to make a new theme than it used to.

    Thanks to the self-documenting nature of BEM and the styles encapsulated in our settings partial, Alaveteli theming requires a lot less understanding of the platform’s inner workings. Our partners are able to be more self-sufficient and rarely require our intervention to assist with simple customisations.

    Maintenance and support of our themes is greatly simplified: as the structure is consistent across projects, we’re able to roll out further changes and fixes in a more predictable way.

    Overall, we’re really happy with how the changes worked out, and we’ll be carrying what we’ve learnt from this project into others too, so look out for changes like these across our work in the future.

  8. Unpicking code, for the good of the project

    Last year, when we were helping to develop YourNextMP, the candidate-crowdsourcing platform for the General Election, we made what seemed like an obvious decision.

    We decided to use PopIt as the site’s datastore — the place from which it could draw information about representatives: their names, positions, et cetera. We’d been developing PopIt as a solution for parliamentary monitoring sites, but we reckoned it would also be a good fit for YourNextMP.

    That turned out to be the wrong choice.

    YourNextMP was up and running in time for the election, but at the cost of many hours of intensive development as we tried to make PopIt do what was needed for the site.

    Once you’ve got an established site in production, changing the database it uses isn’t something you do lightly. But on returning to the codebase to develop it for international reuse, we had to admit that, in the words of mySociety developer Mark Longair, PopIt was “actually causing more problems than it was solving”. It was time to unpick the code and take a different approach.

    Mark explains just what it took to decide to change course in this way, over on his own blog.

    The post contains quite a bit of technical detail, but it’s also an interesting read for anyone who’s interested in when, and why, it’s sometimes best to question the decisions you’ve made.


    Image: Michelle (CC)

  9. Photo upload and progressive enhancement for FixMyStreet

    FixMyStreet has been around for nearly nine years, letting people report things and optionally include a photo; the upshot is that we currently have a 143GB collection of photographs of potholes, graffiti, dog poo, and much more. 🙂

    For almost all that time, attaching a photo has been through HTML’s standard file input form; it works, but that’s about all you can say for it – it’s quite ugly and unfriendly.

    We have always wanted to improve this situation – we have a ticket in our ticketing system, Display thumbnail of photo before submitting it, that says it dates from 2012, and it was probably in our previous system even before that – but it never quite made it above other priorities, or when it was looked at, browser support just made it too tricky to consider.

    Here’s a short animation of FixMyStreet’s new photo upload, which also allows you to upload multiple photos:

    For the user, the only difference from the old interface is that the photo field has been moved higher up the form, so that photos can be uploading while you are filling out the rest of the form.

    Personally, I think this benefit is the largest one, above the ability to add multiple photos at once, or the preview function. Some of our users are on slow connections – looking at the logs I see some uploads taking nearly a minute – so being able to put that process into the background hopefully speeds up the submission and makes the whole thing much nicer to use.

    Progressive enhancement

    When creating a new report, it can sometimes happen that you fill in the form, include a photo, and submit, only for the server to reject your report for some reason not caught client-side. When that happens, the form needs to be shown again, with everything the user has already entered prefilled.

    There are various reasons why this might happen; perhaps your browser doesn’t support the HTML5 required attribute (thanks Safari, though actually we do work around that); perhaps you’ve provided an incorrect password.

    However, browsers don’t remember file inputs, and as we’ve seen, photo upload can take some time. From FixMyStreet’s beginnings, we recognised that re-uploading is a pain, so we’ve always had a mechanism whereby an uploaded photo would be stored server side, even if the form had errors, and only an ID for the photo was passed back to the browser so that the user could hopefully resubmit much more quickly.

    This also helped with reports coming in from external sources like mobile phone apps or Flickr, which might come with a photo already attached but still need other information, such as location.

    Back in 2011, I wrote about how FixMyStreet maps are progressively enhanced, starting with a base of HTML image maps and layering on the normal slippy map experience on top. This has always been the way I have worked, and adding a snazzy JavaScript photo upload was no different.

    mySociety designer Zarino used dropzonejs to supply the “pop”™, and it works in a nicely easy-to-progressively-enhance way, hiding existing file input(s) and providing fallbacks. And with the behaviour the site has had since 2007, adding the server side element of this new photo upload was actually very straightforward: receive a photo, and return its ID for a snippet of JavaScript to insert into the hidden photo ID form field that has always been there in case of form error. There was no need to worry about how to match up the out-of-band photos with the main form submission; it’s all already taken care of. If the JavaScript doesn’t or can’t work for whatever reason, the old behaviour is still there, using the same mechanisms.
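
    In outline, the server-side piece looks something like this (FixMyStreet itself is written in Perl; this Python sketch is purely illustrative of the mechanism):

    ```python
    # Hypothetical endpoint: stash the photo out of band and return an ID
    # that JavaScript drops into the form's existing hidden field.
    import uuid

    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/photo/upload", methods=["POST"])
    def upload_photo():
        photo = request.files["photo"]
        photo_id = uuid.uuid4().hex
        photo.save(f"uploads/{photo_id}.jpg")
        # On form error, the browser resubmits this ID instead of the
        # photo itself, so the user never has to re-upload.
        return {"id": photo_id}
    ```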

    Of course there were edge cases and things to tidy up along the way, but if the form hadn’t taken into account the user experience of error edge cases from the start, or worse, had assumed all client checks were enough, then nine years down the line my job would have been a lot harder.

    Anyway, long story short, adding photos to your FixMyStreet reports is now a smoother process, and you should try it out.

  10. Now you can reverse hasty decisions—on Gender Balance, at least

    As players were quick to notice, decisions made on our politician-sorting game Gender Balance were final. Thanks to volunteer coder Andy Lulham, that’s now been rectified with an ‘undo’ button.

    Gender Balance is our answer to the fact that there’s no one source of gender information across the world’s legislatures—read more about its launch here. It serves up a series of politicians’ names and images, and asks you to identify the gender of each. Your responses, along with those of other players, help compile a set of open data that will be available to all.

    Many early players told us, however, that it’s all too easy to accidentally click the wrong button. (The reasons for this may be various, but we can’t help thinking that it’s often because there are so many males in a row that the next female comes as a bit of a surprise…)

    In fact, this shouldn’t matter too much, because every legislature is served up to multiple players, and over time any anomalies will be ironed out of the data. It’s still an upset for the user, though, and in the site’s first month of existence, an undo button has been the most-requested feature.

    Screenshot of the Gender Balance undo button

    Thanks to the wonders of open source, anyone can take the code and make modifications or improvements, and that’s just what Andy did in this case. He submitted this pull request (if you look at that, you can see the discussion that followed with our own developers and our designer Zarino). We’ve merged his contribution back into the main code so all players will now have the luxury of being able to reverse a hasty decision. Thanks, Andy!


    Photo credit: Head in Hands CC BY-NC 2.0 by Alex Proimos