1. Machine learning can make climate action plans more explorable

    We’ve used machine learning to make practical improvements in the search on CAPE – our local government climate information portal.

    The site contains hundreds of documents and climate action plans from different councils, and they’re all searchable.

    One aim of this project is to make it easier for everyone to find the climate information they need: so councils, for example, can learn from each other’s work; and people can easily pull together a picture of what is planned across the country.

    The problem is that these documents often use different terms for the same basic ideas – meaning that using the search function effectively requires an expert understanding of which combinations of keywords to try.

    Using machine learning, we’ve now made it so the search will automatically include related terms. We’ve also improved the accessibility of individual documents by highlighting which key concepts are discussed in the document.

    Screenshot: CAPE search results for "flooding", including 41 related terms such as drainage, river and flooding.

    Screenshot: CAPE action plan listing, showing a climate emergency action plan and some of the 65 topics extracted from the document, e.g. air quality, biodiversity, carbon budgets.

    How machine learning helps

    We’re already using machine learning techniques as part of our work clustering similar councils based on emissions profile, but we hadn’t previously looked at how machine learning approaches could be applied to big databases of text like CAPE.

    As part of our funding from Quadrature Climate Foundation, we were supported to take part in the Faculty Fellowship – where people transitioning from academic to industrial data science jobs are partnered with organisations looking to explore how machine learning can benefit their work.

    Louis Davidson joined us for six weeks as part of this programme. After a bit of exploration of the data, we decided on a project looking at this problem of improving the search, as there was a clear way a machine learning solution could be applied: using a language model to identify key concepts that were present across all the documents. You can watch Louis’ end of project presentation on YouTube.

    Moving from similar words to similar concepts

    Louis took the documents we had and used a language model (in this case, BERT) to produce ‘embeddings’ for all the phrases they contained.

    When language models are trained on large amounts of text, this changes the internal shape of the model so that pieces of text with similar meanings end up ‘closer’ to each other inside the model. An ‘embedding’ is a series of numbers that represents this location. By looking at the distance between embeddings, we can identify groups of terms with similar meanings. While a more basic text similarity approach would say that ‘bat’ and ‘bag’ are very similar, a model that sorts based on meaning would identify that ‘bat’ and ‘owl’ are more similar.
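
    To make that concrete, here’s a minimal sketch of comparing phrase embeddings in Python. The post only says a BERT-style model was used, so the specific library (sentence-transformers) and model name here are assumptions, not CAPE’s actual code.

    # Minimal sketch: embed a handful of phrases and compare them pairwise.
    # The library and model choice are assumptions for illustration only.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    phrases = ["flooding", "drainage", "river", "bat", "bag", "owl"]
    embeddings = model.encode(phrases)  # one vector per phrase

    # Cosine similarity between each pair: higher means the model
    # considers the meanings closer together.
    similarities = util.cos_sim(embeddings, embeddings)
    for i, a in enumerate(phrases):
        for j, b in enumerate(phrases):
            if i < j:
                print(f"{a} / {b}: {float(similarities[i][j]):.2f}")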

    This means that without needing to re-train the model (because you’re not really concerned with what the model was originally trained to do), you can explore the similarities between concepts.

    Some approaches store these embeddings in a “vector database” that can be searched directly – but we’ve gone for a simpler approach that doesn’t require a big change to how CAPE already works.

    Using the documents we have, we automatically identified common concepts that are found across a range of documents (and manually selected a group of them) – along with the original groups of words that relate to each concept.
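
    One way of grouping phrases into shared concepts is to cluster their embeddings and then review and label the clusters by hand. The sketch below, using scikit-learn’s agglomerative clustering, is only an illustration – the post doesn’t say which clustering method or thresholds were actually used.

    # Illustrative sketch: cluster phrase embeddings so that related phrases
    # can be reviewed manually and given a concept label. The clustering
    # method and threshold are assumptions, not necessarily what CAPE uses.
    from sentence_transformers import SentenceTransformer
    from sklearn.cluster import AgglomerativeClustering

    phrases = ["flooding", "flood risk", "drainage", "surface water",
               "air quality", "air pollution", "clean air"]

    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(phrases, normalize_embeddings=True)

    clustering = AgglomerativeClustering(
        n_clusters=None, distance_threshold=0.5,
        metric="cosine", linkage="average",
    ).fit(embeddings)

    # Collect the phrases in each cluster for manual review and naming.
    concepts = {}
    for phrase, label in zip(phrases, clustering.labels_):
        concepts.setdefault(int(label), []).append(phrase)
    print(concepts)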

    When a search is made, we now consult this list of related phrases and search for them at the same time. This gives us a practical way of improving our existing processes without adding new technical requirements when adding new documents or searching the database.
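
    In practice that can be as simple as a lookup table that expands a query before it is passed to the existing search. The mapping and function below are hypothetical, just to show the shape of the idea:

    # Hypothetical sketch of query expansion against a precomputed list of
    # related terms; the data and function are illustrative, not CAPE's code.
    RELATED_TERMS = {
        "flooding": ["flood risk", "drainage", "surface water", "river"],
        "air quality": ["air pollution", "clean air", "particulates"],
    }

    def expand_query(query):
        """Return the original query plus any precomputed related terms."""
        return [query] + RELATED_TERMS.get(query.lower(), [])

    # The expanded terms can be handed to the existing full-text search
    # (for example as an OR query) without changing how documents are indexed.
    print(expand_query("Flooding"))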

    Because we now have this list of common concepts, we are also pre-searching for them to provide, for each document, links to where each concept is discussed. This makes the contents of individual documents more visible, and makes it easier to quickly jump to the parts that are relevant to your interests.
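
    That pre-searching step can be pictured as scanning each document once and recording where each concept’s phrases appear. The function below is a simplified, hypothetical illustration of the idea rather than CAPE’s actual code:

    # Simplified illustration of pre-searching: record, for each document,
    # which concepts appear and on which pages, so the listing page can link
    # straight to them. The data structures here are assumptions.
    def index_concepts(pages, concepts):
        """Map each concept to the page numbers where any of its phrases occur."""
        found = {}
        for page_number, text in enumerate(pages, start=1):
            lowered = text.lower()
            for concept, phrases in concepts.items():
                if any(phrase in lowered for phrase in phrases):
                    found.setdefault(concept, []).append(page_number)
        return found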

    Potential of machine learning for mySociety

    Our other websites, like TheyWorkForYou and WhatDoTheyKnow, similarly have a large amount of text that this kind of semantic search could make more accessible – and we can already see how that might be useful to people relying on data around climate and the environment. WhatDoTheyKnow in particular has huge amounts of environmental information fragmented across replies to hundreds of different authorities.

    Generative AI and machine learning have huge potential to help us make the information we hold more accessible. At the same time, we need to understand how to incorporate new techniques into our services in a way that is sustainable over time.

    Through experiments like this with CAPE, we are learning how to think about machine learning, which of our problems it can be applied to, and what new skills we need to work with it. Thanks to Louis for his work on this project, and to his Faculty advisors for their support.

    Image: Ravaly on Unsplash.

  2. Climate monthnotes: May 2023

    As we barrel into Summer at full speed, here’s a summary of what mySociety’s climate team got up to in May.

    If you’re interested in working with us on any of this, or you want to use any of our data (or ask us to collect some data for you) then get in touch!

    Neighbourhood Warmth: alpha testing a vision of community-powered retrofit

    As Siôn blogged a few days ago, Neighbourhood Warmth has been, and will continue to be, a major focus for us over May–July this year.

    Last month, we grappled with some thorny design questions (how do we test appetite for community-led retrofit? how could a service support both climate activists and neighbours who just need lower energy bills?) and started building a working alpha, which we’ll be testing out in online workshops with a handful of pilot communities around the UK this June/July.

    We also had a number of really encouraging calls with other organisations working in this space – all of us keen on finding some way to square the circle of solving the UK’s massive domestic decarbonisation challenge. If you’re interested, you can read much more in Siôn’s separate monthnotes for this project.

    CAPE: making sense of messy data around local authorities’ climate plans

    From our newest climate tool (Neighbourhood Warmth) to our longest running – CAPE. This May we progressed two big improvements to CAPE, which we’re hoping to deploy and test out in June/July.

    The first uses AI / machine learning to extract clusters of related topics from our database of every local authority climate action plan in the UK, so you can more easily find other plans which mention topics close to your heart. We’re hoping these auto-extracted topics will also make it easier to quickly see what’s inside a document, without reading it from start to finish.

    The second change is a big re-think of how we help local authorities find their “climate twins”, or other councils likely to face similar climate challenges. We’re in the early stages of this mini-project, but I’m excited that we might be able to come up with something that really brings together all of the various datapoints CAPE holds on each council, in a way that you just can’t get anywhere else. More on this, hopefully, in our June or July monthnotes!

    Council Climate Action Scorecards: crowdsourcing and verifying council actions on climate

    May saw the end of the “Right of Reply” period for councils to contribute their feedback on Climate Emergency UK’s volunteer assessors’ analysis of their climate actions. The whole marking and feedback process has been handled through a webapp custom-built by mySociety, and it’s encouraging to see that over 80% of local authorities in the UK logged into the site to check their score, and around 70% provided feedback on their provisional marks!

    We’re really proud of how this year’s Council Climate Action Scorecards are shaping up, and can’t wait to start sharing them in the Autumn. Our partners, Climate Emergency UK, have put a huge effort into making these as fair and up-to-date a representation of actual local authority action on climate change as possible. Now they enter their final “Audit” phase, consolidating councils’ feedback against the volunteers’ first marks, after which we’ll be able to calculate each council’s final score.

    Local Intelligence Hub: a treasure-trove of constituency-level climate data

    The Local Intelligence Hub – the face of our collaboration with The Climate Coalition – soft launched to Climate Coalition members at the end of April. But just because the site is now in the hands of members doesn’t mean work stops! Alexander has been continuing to collect and import new datasets around fuel poverty, the cost of living, and child poverty – as well as improving the reliability of advanced features like shading constituencies on the map. Meanwhile, our other Alex has been grappling with some Google Analytics-related challenges (tracking Custom Events with cookie-less GA4 – one for the geeks!) which I’m sure he’ll blog about in due course.

    If you’re part of an organisation in The Climate Coalition, you can request a free account on the Local Intelligence Hub, and try out the tools and datasets for yourself. For everyone else, we’re still hoping to launch a public version of the tool later this year.

    Header image: Krista

  3. Training an AI to generate FixMyStreet reports

    Artificial intelligence and machine learning seem to be everywhere at the moment – every day there’s a new story about the latest smart assistant, self-driving car or the impending takeover of the world by robots. With FixMyStreet having recently reached one million reports, I started wondering what kind of fun things could be done with that dataset.

    Inspired by a recent post that generated UK place names using a neural network, I thought I’d dip my toes in the deep learning sea and apply the same technique to FixMyStreet reports. Predictably enough the results are a bit weird.

    I took the titles from all the public reports on fixmystreet.com as the training data, and left the training process to run overnight. The number crunching was pretty slow and the calculations had barely reached 5% by the morning. I suspect the training set was a bit too large, at over 1M entries, but the end result still gives enough to work with.
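
    For a sense of what’s going on under the hood, here’s a minimal sketch of a character-level model of the kind used for this sort of generation, written in PyTorch. The original post doesn’t say which implementation was used, so the library, model size and training details below are all assumptions.

    # Minimal sketch of a character-level LSTM trained on report titles.
    # Everything here (library, sizes, optimiser) is an illustrative assumption.
    import torch
    import torch.nn as nn

    titles = ["Pothole in road\n", "Street light not working\n"]  # real data: over 1M titles
    text = "".join(titles)
    chars = sorted(set(text))
    char_to_idx = {c: i for i, c in enumerate(chars)}

    class CharLSTM(nn.Module):
        def __init__(self, vocab_size, hidden=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden)
            self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
            self.out = nn.Linear(hidden, vocab_size)

        def forward(self, x, state=None):
            output, state = self.lstm(self.embed(x), state)
            return self.out(output), state

    model = CharLSTM(len(chars))
    optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    encoded = torch.tensor([char_to_idx[c] for c in text])
    seq_len = 32
    for iteration in range(1000):  # the real run covered tens of thousands of iterations
        start = torch.randint(0, len(encoded) - seq_len - 1, (1,)).item()
        inputs = encoded[start:start + seq_len].unsqueeze(0)
        targets = encoded[start + 1:start + seq_len + 1].unsqueeze(0)
        logits, _ = model(inputs)
        loss = loss_fn(logits.view(-1, len(chars)), targets.view(-1))
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()
        # A checkpoint would normally be saved every few thousand iterations, e.g.
        # torch.save(model.state_dict(), f"checkpoint_{iteration}.pt")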

    The training process produces checkpoints along the way, which you can use to see how the learning is progressing. After 1000 iterations the model was starting to be aware that it should use words, but didn’t really know how to spell them:

    Mertricolbes
    Ice does thrown campryings
    Sunky riking proper, badger verwappefing cars off uping is!
    Finst Knmp
    Lyghimes Jn fence
    Moadle bridge is one descemjop
    

    After 15000 iterations it’s starting to get the hang of real words, though still struggling to form coherent sentences.

    Untaxed cacistance.
    Broken Surface in ARRUIGARDUR. Widdy movering
    Cracked already nail some house height avenue.
    Light not worky
    I large pot hole
    Dumped shood road nod at street.
    Grim
    Dog man
    Ongorently obstructing sofas. This birgs.
    Serious Dirches
    

    After 68000 iterations there seems to be enough confusion in the training data that things start to go south again with the default parameters:

    Urgely councille at jnc swept arobley men.
    They whention to public bend to street? For traffic light not working
    

    Tweaking the ‘temperature’ of the sampling process produces increasingly sensible results:

    Large crumbling on pavement
    Potholes all overgrown for deep pothole
    Very van causing the road
    Very deep potholes on pavement
    Weeds on the pavement
    Several potholes in the road
    Rubbish Dumped on the road markings
    Potholes on three away surface blocking my peride garden of the pavement
    Potholes and rubbish bags on pavement
    Poor road sign damaged
    Poor street lights not working
    Dog mess in can on road bollard on pavement
    A large potholes and street light post in middle of road
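
    The ‘temperature’ mentioned above simply rescales the model’s output distribution before each character is sampled: low values favour the most likely next character, while higher values flatten the distribution and produce stranger output. Here’s a short sketch of that step (the function is illustrative, not the code actually used):

    # Illustrative temperature sampling: divide the logits by the temperature
    # before the softmax. Values below 1.0 make the output more conservative;
    # values above 1.0 make it more surprising (and more garbled).
    import torch

    def sample_next_char(logits, temperature=0.8):
        """Pick the next character index from the model's raw output logits."""
        probabilities = torch.softmax(logits / temperature, dim=-1)
        return torch.multinomial(probabilities, num_samples=1).item()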
    

    It also produces plenty of variations on the most popular titles:

    Pot hole
    Pot hole on pavement
    Pot holes and pavement around
    Pot holes needings to path
    Pothole
    Pothole dark
    Pothole in road
    Pothole/Damaged to to weeks
    Potholes
    Potholes all overgrown for deep pothole
    Potholes in Cavation Close
    Potholes in lamp post Out
    Potholes in right stop lines sign
    Potholes on Knothendabout
    Street Light
    Street Lighting
    Street light
    Street light fence the entranch to Parver close
    Street light not working
    Street light not working develter
    Street light out opposite 82/00 Tood
    Street lights
    Street lights not working in manham wall post
    Street lights on path
    Street lights out
    

    It also seems to do quite well at making up road names that don’t exist in any of the original reports (or in reality):

    Street Light Out - 605 Ridington Road
    Signs left on qualing Road, Leave SE2234
    4 Phiphest Park Road Hasnyleys Rd
    Apton flytipping on Willour Lane
    The road U6!
    

    Here are a few of my favourites for their sheer absurdity:

    Huge pothole signs
    Lack of rubbish
    Wheelie car
    Keep Potholes
    Mattress left on cars
    Ant flat in the middle of road
    Flytipping goon!
    Pothole on the trees
    Abandoned rubbish in lane approaching badger toward Way ockgatton trees
    Overgrown bush Is broken - life of the road.
    Poo car
    Road missing
    Missing dog fouling - under traffic lights
    

    Aside from perhaps generating realistic-looking reports for demo/development sites, I don’t know if this has any practical application for FixMyStreet, but it was fun to see what kind of thing is possible with not much work.




    Image: Scott Lynch (CC by/2.0)