1. Leaky Pipes: What’s wrong with donations data

    As part of our WhoFundsThem work we want to make better information available about money in politics. 

    Last year we released a report Beyond Transparency – looking at the UK Parliament’s register of financial interests, and wider arguments about how we fund politics. 

    Today we’re releasing a follow-up report: Leaky Pipes (read online or download as a PDF). This covers what we’ve learned (and what we think could be better) about the systems for reporting election donations. You can also re-watch the launch event on YouTube.

    This report started because we were a bit confused about the different ways data could be declared and reported. And to be honest, we’re still a bit confused – but we have more diagrams to explain why.

    In this report we explore the multiple routes for declarations, the different thresholds for disclosure, and uneven public access. Together these make cross-checking difficult and leave gaps where information can vanish depending on how a donation flows (direct to candidate vs via party), how large it is, and whether the candidate wins.

    The result is that candidates and agents face complex reporting requirements, electoral administrators hold paper-heavy returns that are hard to inspect, and the public (and sometimes regulators) struggle to build a consistent picture of who is funding whom.

    From this, we’ve made recommendations on making reporting easier to do correctly, faster to publish, and simpler to scrutinise:

    • Move to a “report once” process that informs multiple systems
    • Harmonise public disclosure at £1,000
    • Create a comprehensive public database above that threshold
    • Create a safe private database below the threshold for research and evaluation purposes

    Building on this, we suggest three practical avenues for follow-up work that would strengthen the case for reform and help design better systems:

    • User research and prototyping to map how a “report once” service would work for candidates, agents, administrators, Parliament, and the Electoral Commission. 
    • Sampling local authority returns to demonstrate the scale and type of inconsistencies between routes.
    • Exploring a data-sharing agreement for controlled research access to the Electoral Commission’s small-donor/return data.

    The report can be read online or downloaded as a PDF.

    Header image: Photo by Meg on Unsplash

  2. New research report: Supporting good communication

    With WriteToThem.com we want to run a service that helps people write the right message to the right place. That means helping users express themselves effectively and keeping the service a constructive channel between constituents and representatives by deterring abusive messages.

    Abuse and intimidation aimed at elected representatives does not just harm the person receiving it. It corrodes the openness and trust that democratic culture needs, and it can deter people (especially those from under-represented groups) from taking part in public life at all. 

    We think we’re in a good position to play a constructive role in this area. One problem that has been raised is frustration at bouncing around layers of government, where a key benefit of WriteToThem is getting people to the right layer first. But we need to go further than that to understand how we can discourage abusive messages – both to directly implement approaches, and to trial patterns that could be implemented by a wider range of parliaments and local authorities.

    We’ve been exploring what a “toxicity” risk score would look like in our infrastructure and have released a report of our findings so far. We trialled a range of options — from baseline keyword matching, to Google’s Perspective API, to running lightweight models locally (IBM Granite Guardian), and then to LLM-based grading as a second pass for tricky cases like implicit threats or messages quoting abuse from third parties.
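    As a flavour of the simplest tier, a baseline keyword matcher can be sketched in a few lines (the term list and scoring below are illustrative placeholders, not what the report actually used):

```python
# Illustrative sketch of the baseline tier only: naive keyword matching.
# The term list and scoring are placeholders, not the real service's lists.
ABUSIVE_TERMS = {"idiot", "scum", "traitor"}

def keyword_risk(message: str) -> float:
    """Crude risk score: count of abusive terms found, scaled and capped at 1.0."""
    words = set(message.lower().split())
    hits = words & ABUSIVE_TERMS
    return min(1.0, len(hits) / 2)
```

    A matcher like this cannot tell a quoted insult from a direct one, or spot an implicit threat – which is exactly why the later tiers (Perspective API, local models, LLM second passes) are worth trialling.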

    But having a risk score is less important than how it is used. We’ve mapped out a few different approaches beyond a manual moderation approach – such as soft “nudge” prompts (encouraging people to reconsider wording before sending), cool-down delays for higher-risk messages (without removing someone’s ability to contact their representative), and informative flags for recipients (for example, passing along a risk score or relevant metadata on a message).

    Having mapped out some technical possibilities, our next step is to talk to more people about which approaches make sense – which we’ll be doing as part of our wider Welsh Government-funded democratic engagement work to improve WriteToThem.

    For more details on the approaches tested, potential issues with different methods of implementation, and unanswered questions, you can read the report online.

    Image: Pawel Czerwinski

  3. New report: WriteToThem Insights

    Understanding more about constituent communication

    We’ve released a new report exploring insights from WriteToThem about the content of constituent communication – you can read the whole report online or a summary below. 

    WriteToThem.com is a long-running mySociety service that enables people across the UK to contact their elected representatives by entering their postcode and sending a message through the site.

    This service provides a unique opportunity to understand the flow of communication between many constituents and many representatives. Our WriteToThem Insights report uses surveys to understand more about what people are writing about. 

    While previous work identified patterns in response rates and deprivation gradients, this experiment focuses on understanding what people are writing about, distinguishing between casework (individual problem-solving) and campaigning (policy-oriented advocacy).

    A new survey and data-processing pipeline were developed to categorise and anonymise message summaries, applying machine learning and large language model techniques to cluster and label topics. Analysis of 5,400 messages from Q3 2025 found:

    • Casework and campaigning form two distinct types of communication, with casework more common for councillors and campaigning dominant for MPs.
    • The deprivation gradients of these two types differ sharply: campaigning is concentrated in less deprived areas, while casework is more evenly distributed, though likely still underrepresents the most deprived groups.
    • First-time users are more likely to send casework messages and to receive responses.
    • Top themes in casework include housing, local services, health, and anti-social behaviour; in campaigning, issues such as Gaza, climate policy, and digital ID predominate.

    This data has limits: it covers only a portion of total correspondence, and we have little information about whether the sample is representative enough to generalise to messages in general. That said, we think there are strong uses both for improving WriteToThem itself and for informing broader understanding of constituent communication.

    We want to build on this work: refining the analysis process and exploring opportunities to collaborate. We see particular value in digging more into casework data as something that could inform more systematic approaches in this area, helping representatives across the country join up information and improve collective scrutiny of government services.

    The full report can be read here.

    Image: Christopher Burns

  4. Mayoral scrutiny: building an ecosystem of accountability

    Mayors and combined authorities are the future of devolution in England, but the ways in which citizens can understand, scrutinise, or influence them remain unclear.

    Our latest report, Mayoral scrutiny: supporting an ecosystem of accountability organisations, argues that devolution will not deliver on its promises unless we also invest in new forms of civic and democratic oversight. It is not enough to create powerful new Mayors; we need to create the ecosystem that holds them (and the wider web of regional institutions) to account.

    Why scrutiny matters

    Combined authorities are designed to bring councils together to plan and deliver across a region. But unlike the London model, they do not have an elected assembly meant to hold the mayoral executive to account.

    Existing models, such as council scrutiny committees or parliamentary hearings, can only go so far. Combined authorities need scrutiny that reflects the full complexity of their networks and partnerships.

    A scrutiny and civic development fund

    We highlight two complementary approaches already being explored:

    • Local Public Accounts Committees (LPACs): technocratic bodies that examine how public services work together across a region, looking not only at the Mayor’s decisions but at value for money and collaboration across agencies.
    • Democratic journalism funds: public-interest media funds guided by citizens’ assemblies, ensuring independent, locally relevant journalism that supports democratic life.

    We propose bringing these ideas together in a new Scrutiny and civic development fund: a local grantmaking body with priorities set by a citizens’ assembly. The fund would support a mix of civic institutions — from expert-led scrutiny committees to independent journalism — that together strengthen public accountability and regional identity. Approaches along these lines would help ensure that devolution does not just move power geographically, but makes it genuinely more responsive to the people it serves.

    Supporting existing scrutiny

    This report also explores ways we could apply our existing tools and approaches to sustain and connect the accountability ecosystem that already exists. Through tools like MapIt, TheyWorkForYou, and WhatDoTheyKnow, we can build a civic democratic stack to support journalists and civic technologists to understand and monitor combined authorities.

    We’ll also continue to explore how civic tech can make these new layers of governance more transparent, and how data and digital infrastructure can support the work of local scrutiny.

    Read the full report

    The report explores the history of scrutiny in English devolution, how these proposals could work in practice, and sets out the steps to strengthen the civic fabric around mayors and combined authorities. You can read it here. 

    Header image: Photo by Omar Flores on Unsplash

  5. Running open LLM models

    Most discussion and usage of LLMs is focused on high profile closed models such as OpenAI’s ChatGPT family, and Google’s Gemini – which are widely available and integrated into a range of existing products and services. 

    Because these are closed models, access and hosting of the models is controlled by the companies that create them. This presents a dilemma for civic tech organisations that believe in open source: important parts of their processes can disappear into black boxes beyond their control. These may work well and be affordable today, but they create new risks – specific models might become unavailable, pricing might change, and relying on them means lock-in to specific providers.

    Open LLM models provide an alternative approach. In a familiar issue from open source licensing, there are different ways in which a model can be ‘open’. Open weights models have the final structure of the model released, and can be run on your own hardware (Meta’s Llama model is an example of this). Fully open models also have the underlying (openly licensed) training data released, as well as the recipes and evaluation systems used in their training. AI2’s OLMo family of models and the Swiss AI institute’s recent Apertus model are examples of these. Somewhere in between are approaches like IBM’s Granite models, where the model is released as open weights and the training data was licensed for training (addressing copyright issues), but is not publicly accessible.

    What are weights? Basically a model can be understood as a big network of connections – where the ‘weights’ are how strong (and influential) a connection is. What’s happening in the training process is a refinement of these weights as a result of being exposed to the training data. The weights at the end of the process are the trained model, and can be shared and used by others. But if you also have the training data and process, you can recreate the model step-by-step, with a clear audit trail of what’s in it.
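    As a toy illustration of that process (at nothing like a real LLM’s scale), here is a one-weight “model” trained on three data points. Releasing the final `w` is releasing open weights; also releasing the data and the loop is what makes a model fully open:

```python
# A one-connection "network": training repeatedly nudges its single weight w.
w = 0.0
training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # inputs -> targets (y = 2x)

for _ in range(200):                # repeated exposure to the training data
    for x, y in training_data:
        error = w * x - y
        w -= 0.01 * error * x       # small shift in the direction that reduces error

# w is now very close to 2.0: that final number *is* the trained model,
# and is what gets shared in an "open weights" release.
```

    With the data and loop in hand, anyone can re-run this and get the same weight back – the audit trail mentioned above.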

    Any kind of open weights model is practically appealing because it unlocks new ways to work with private data without sharing it with third parties, and creates more flexibility around infrastructure. For instance, we currently use a fine-tuned version of Llama to help flag immigration correspondence in WhatDoTheyKnow.

    Fully open models are ethically appealing because they avoid the issues of models that have been trained on copyrighted data. Their existence is a challenge to an AI policy debate where countries must trade-off the rights of creators against the benefits of AI as sold by a handful of companies.  They fit well with our open source ethos – and understanding more about how to use them practically helps give us options to improve our own services, and contribute to wider arguments about responsible use of AI.

    This blog post is a write-up of several practical experiments using the 7b parameter variant of OLMo-2, both locally on a laptop GPU and remotely using HuggingFace’s inference endpoints.

    Using OLMo-2 locally

    Our purpose in running something locally is to be able to process sensitive information that should not leave our infrastructure. In this case, using OLMo-2 to create human-readable representations of clusters from WriteToThem survey responses. While users are asked not to include personal information in this survey, enough do that we need to treat the basic dataset as having personal information that should not be shared.

    We used llama-cpp (and the associated Python bindings) to run the local model. An alternative local approach is to use ollama to run a local server. The reason for using llama-cpp in this case is that ollama doesn’t always seem to pick up that less well known models can use ‘tools’ correctly (which is required for structured data output). Another benefit of running it in process rather than as a separate server is that the script can turn the resource-intensive part on and off (although there’s a corresponding start-up time) rather than needing a separate server process to run.

    Setting up the libraries

    Installing llama-cpp in a way that can use the GPU is not straightforward. This set of instructions for Windows 11/Nvidia GPU mostly worked for me. I additionally needed to add an extra DLL directory before importing from llama_cpp, because there’s a DLL folder that the library wasn’t yet referencing.
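    For reference, the workaround looks something like this (the path is hypothetical – yours will depend on where pip put the package):

```python
import os
import sys

# Hypothetical location of the bundled CUDA DLLs - adjust for your install.
DLL_DIR = r"C:\path\to\site-packages\llama_cpp\lib"

if sys.platform == "win32" and os.path.isdir(DLL_DIR):
    os.add_dll_directory(DLL_DIR)  # register the folder before importing

# from llama_cpp import Llama  # safe to import once the DLLs are findable
```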

    Big picture, WheelNext is a project trying to make it easier to install the correct version of the library across different OS/GPU combinations. In the meantime, setting up a local machine is a bit fiddly.

    Downloading model information

    Llama-cpp uses GGUF files – which have all the weights in a single file. There are libraries to convert from the transformers format – but GGUF versions are often made available by model publishers on HuggingFace.

    Downloading the model can be done using the huggingface_hub command line tool (here using uv).

    uvx --from huggingface-hub hf download allenai/OLMo-2-1124-7B-Instruct-GGUF olmo-2-1124-7B-instruct-Q4_0.gguf --local-dir models

    This is pulling down a quantised version – which has the same number of parameters, but with the values of the weights significantly rounded down. The decrease in quality tends to be much smaller than the corresponding decrease in file/memory size (why? Broadly, high precision is useful during training, which adjusts weights in small shifts, but once you have something working the general structure is good enough) – and this fits it just inside the ability of my laptop’s GPU.
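    A toy sketch of what quantisation does to individual weights (real schemes like Q4_0 are blockwise and cleverer, but the principle – fewer distinct values, bounded error – is the same):

```python
# Snap each weight to the nearest of 16 evenly spaced levels (a "4-bit" grid).
def quantise(w, levels=16, lo=-1.0, hi=1.0):
    step = (hi - lo) / (levels - 1)
    return lo + round((w - lo) / step) * step

weights = [0.137, -0.842, 0.559, -0.203]
quantised = [quantise(w) for w in weights]
# Every quantised value is within half a grid step of the original weight,
# but nearby weights collapse to the same level - that's the storage saving.
```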

    This download can also just be done in code:

    from functools import lru_cache

    from llama_cpp import Llama

    @lru_cache
    def get_llm():
        return Llama.from_pretrained(
            repo_id="allenai/OLMo-2-1124-7B-Instruct-GGUF",
            filename="olmo-2-1124-7B-instruct-Q4_0.gguf",
        )

    Structured data output

    To get structured data out of the model, Pydantic AI can be used with Outlines to query the llama-cpp model.

    This:

    • makes it easier to define Pydantic data structures that should be returned.
    • makes it easier to swap between local/remote models by swapping the model passed to the agent, but otherwise using a common API.
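    The shape of this is roughly as follows – a sketch with an illustrative schema and a hard-coded model response, with the actual agent-to-model wiring elided:

```python
from pydantic import BaseModel

class ClusterSummary(BaseModel):
    """Illustrative output schema - not the real one used in the project."""
    title: str
    keywords: list[str]

# An agent configured with this as its output type is constrained (via
# Outlines' grammar-guided decoding) to emit JSON matching the schema,
# so the text the model produces parses straight into a typed object:
raw = '{"title": "Housing repairs", "keywords": ["damp", "repairs"]}'
summary = ClusterSummary.model_validate_json(raw)
```

    Because the agent only sees the schema and a model object, swapping the local llama-cpp model for a hosted one leaves everything else unchanged.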

    Hosted OLMo-2 model

    An advantage of any open weights model is being able to run it on a range of infrastructure (and being able to change the infrastructure later). 

    In this case, I had a use case where we wanted to do transformations on already public data (the appropriateness of linking to a specific Wikipedia page from a specific sentence in a parliamentary debate)  – and so there was no privacy/security issue for the purposes of the experiment. We are doing further exploration about how we can make this kind of use compliant with our wider legal and privacy commitments. 

    Because OLMo-2 is not a commonly used model, there isn’t an inference service that offers it directly as an option (which would be most efficient – as you’re being charged for tokens while the underlying infrastructure is shared between many users). Instead, you need to create a private server that can manage the model. 

    Creating an endpoint

    Hugging Face Inference Endpoints is the approach I used here – that lets you provision an endpoint connected to a specific model. I’m using the same model as I used locally.

    Depending on the properties of the model, a minimum GPU requirement will be suggested. This model came out at about $0.80 an hour; running the 13b parameter version was about $2 an hour. There are options to run on AWS, Azure and Google Cloud in different regions (processing data in the EU/UK is a requirement for us, which limits some of the GPU options).

    The scale-to-zero time is adjustable down to about 15 minutes, and it takes a few minutes to load up again after that. In principle, if the access token is scoped correctly, the huggingface_hub library can handle pausing and unpausing the endpoint (or even creating one programmatically), if more control is wanted.

    Structured data output

    This endpoint works well using some of the example HuggingFace connections for PydanticAI. Something I had to adjust was adding an adapter to flatten complex JSON schemas (e.g. anything with multiple model types, enums, etc), replacing ‘$defs’ references with a normal inline structure, because the Hugging Face text-generation-inference interface can’t handle them.
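    A minimal version of that flattening step might look like this (a sketch – `inline_defs` is a name used for illustration, and it assumes no recursive definitions):

```python
def inline_defs(schema: dict) -> dict:
    """Replace {"$ref": "#/$defs/Name"} nodes with the referenced definition."""
    defs = schema.get("$defs", {})

    def resolve(node):
        if isinstance(node, dict):
            ref = node.get("$ref", "")
            if ref.startswith("#/$defs/"):
                return resolve(defs[ref.rsplit("/", 1)[-1]])
            # Rebuild the dict, dropping the now-unneeded $defs block.
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(v) for v in node]
        return node

    return resolve(schema)
```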

    I have an example of creating a model that Pydantic AI will accept here – the missing config bits are a token associated with the account and the url of the endpoint created. 

    So in principle this means we can have an endpoint that gives us access to a GPU based model for an hour a day at a reasonable price – while we could at a later point swap out to use a local model without adjusting the general logic of the application. This is well suited to our current anticipated uses in batched backend processes, but would be less efficient if it needed to be responsive around the clock.

    Reflecting on the results

    Compared to previous projects using the OpenAI API, a key thing to note is that it is slower and more fiddly on the infrastructure at hand. I was only using the 7b parameter model, while the 32b parameter model is the one that evaluates closer to GPT-4o mini. As such, prompts needed to be a bit more detailed about what was required. Similarly, a combination of the hardware and not being able to run queries in parallel over a wider infrastructure means the process takes longer.

    But this is also like comparing cake to a well balanced meal – the benefits of an open model are not just philosophical but practical. With a bit more work on the prompt you can get useful results on a laptop with no dependency on third-party services. That brings into scope a range of use cases that OpenAI is not suitable for. 

    Even where, such as in the Wikipedia example, there are no privacy issues in using OpenAI, making it easy to swap in an open model makes it much easier to evaluate the effect of using an open model. It will now be relatively straightforward to quickly substitute OLMo-2 into PydanticAI flows using other models and get a baseline feeling for effectiveness. Even where you might choose to use a closed model in a specific instance, it is very useful to work in such a way that you are not locked in to that model and could switch away in future.

    Similarly, having a working process for a non-mainstream model like OLMo-2 makes it easier to explore other models like Apertus. As this has been trained on a wider range of non-English languages, it could provide a more dependable component in LLM integration with the core Alaveteli software – which powers Freedom of Information platforms across a range of languages.

    Understanding open models as a practical approach helps us contribute more widely to policy conversations around AI – and to the question of which trade-offs and impacts are inherent to the nature of the technology, and which are a consequence of how models are currently controlled and produced.

    Open models are always likely to lag slightly behind the frontier models, but they are already incredibly useful technologies compared to what was possible a few years ago. We want to understand more about how we can practically make use of these models – and help make sure the future of LLMs is shaped by ethical considerations about their training and use, rather than accepting them on the terms of the dominant tech giants.

    Header image: Photo by Zhang Zi Han on Unsplash

  6. Using LLM tools to build APPG scrapers

    Recently we wrote about why we’re now listing APPGs in TheyWorkForYou. This blog post goes into more detail about the technical process we use to gather who is a member of an APPG.

    We have two methods of getting the memberships of APPGs. The first is finding if it’s already published on their website. The second is using Parliament’s rules to ask the APPG contact for the list. So we need to: a) find all the APPG websites; b) see if they publish members lists; c) if not, ask for the list; and d) get those lists into a consistent format.

    Data that is fragmented and not in the format we want is a fairly common civic tech problem. The solution is to write a ‘scraper’ that reads the content of a website and has a process for converting it to a more structured format. 

    This works well when dealing with only a few sources (e.g. the memberships of the UK’s parliaments only needs a few different scrapers), or where a common format is being used (e.g. many local government websites use similar providers). In the case of APPGs, there is no common template being used. We just have a set of a few hundred websites that may (or may not) contain a list of names. 

    Rather than a traditional scraper, we have built an agentic AI/LLM approach that can more flexibly extract memberships from websites. The end result is a tool with a careful sequencing of manual and automated steps, injecting human review in structured ways. Rather than an “AI makes mistakes” disclaimer, we built a structured process to check elements efficiently one group at a time, locking off errors before proceeding to the next stage. This was also an experiment in using LLMs to write scraper tools, as well as some of the tools needed for the manual review steps.

    Practically, this was an effective way of getting the information we needed that turned a very hard problem into one that we can dependably run regularly. It also suggests more generally useful ways of approaching fragmented data problems (more on this at the end of the post). 

    Building agentic approaches

    An ‘agent’ is often poorly defined, but broadly it’s a language model interface that is given tools (specific functions), a task, and an output data structure, and it loops between these until it produces a result.
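    Stripped of any particular framework, that loop can be sketched like this (the `model` here is a stand-in function, not a real LLM):

```python
def run_agent(model, tools, task, max_steps=10):
    """Minimal agent loop: the model either calls a tool or returns a result."""
    history = [task]
    for _ in range(max_steps):
        action = model(history)                   # model decides the next step
        if action["type"] == "result":
            return action["value"]                # structured output: done
        observation = tools[action["tool"]](action["input"])
        history.append(observation)               # feed the tool result back in
    raise RuntimeError("agent exceeded max_steps")

# A fake "model" that fetches a page, then reports what it found:
def fake_model(history):
    if len(history) == 1:
        return {"type": "tool", "tool": "fetch", "input": "members-page"}
    return {"type": "result", "value": history[-1]}

tools = {"fetch": lambda url: f"contents of {url}"}
```

    Here `run_agent(fake_model, tools, "find the members")` fetches once, then returns what it fetched – a real agent does the same, with an LLM deciding each action.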

    To build agentic functions, we used the PydanticAI framework, which acts as a connector between the prompt, input data, the data structure of the output data, functions the agent has access to, and any bespoke validation of the results. The end result is a function that accepts structured input, and returns structured output, relatively painlessly. 

    Although this example uses OpenAI’s GPT models, in other experiments we have used the PydanticAI approach to connect to open source models (the framework is designed to be model-agnostic). In principle this means that this project could in future switch the underlying provider used.

    Process

    Step 1: Writing a scraper 

    The first thing we needed to do was to get the official data from Parliament’s APPG register into a more structured form. 

    You can see an example of this page for the Africa APPG. This is a good task for a traditional scraper, but writing one by hand would have been fiddly. Using ChatGPT, we gave it an extract of the HTML, and asked for a Pydantic data structure and a script to convert the data. This worked pretty well, with some tweaking of the format over time. When errors emerged in different APPGs, passing the error and an understanding of what should have happened back to the Copilot agent (using a Claude model) led to working fixes. In using the coding agent, the key decision was which parts of the project to be opinionated about – this has mostly meant being very explicit about data structures (and validation to ensure they’re correct), and more relaxed about the pipes that connect things up.

    Step 2: Adding categories to APPGs

    From the official data, we only know if an APPG is a country or subject area group. We want to make it a bit more explorable by breaking this down into categories.

    In the spirit of experimenting with LLMs, we copied all the subject area APPGs’ names and purpose statements into one of OpenAI’s reasoning models and asked for 10–20 sub-categories. It came back with 20, and they looked reasonable.

    We then created a small functionless agent interface, giving it the title and purpose of a specific APPG, and having it return a list of potential categories (preferring one, but allowing all that seem relevant).

    Spot-checking these, they seem reasonable, and for the purpose of breaking down the big list a bit, this is a good step up. It means we can quickly see the APPGs that are likely to be relevant to environmental matters.

    Step 3: Finding missing websites

    Some APPGs list their external website – some do not. Here we use AI tools as part of the workflow, to find those missing sites (which may not exist). 

    We created an agent function with access to a web search tool (Tavily), a function to check if a URL is valid, and a prompt to help identify the correct site. This creates a loop to search for and identify a good candidate for the website.

    At this point, there is a manual check that prompts the user to review each site one-by-one before confirming it as a valid site. 45/74 sites identified in the first wave were valid. Invalid websites were news articles, APPGs in other parliaments, or sites for previous iterations of that APPG.

    This is not comprehensive and we and our volunteers found some more manually after the fact – but it is an interesting trial in finding data starting only with a search engine. 

    Step 4: Find published members

    The final step is to get a list of members (if published) off these websites. We need a really flexible approach for this. Names might be in a structured list, but they can also be in one paragraph. They might be on a members page, the home page, or spread over three pages. There is no consistency to fall back on. 

    Here, we created an agent with a function that can fetch a web page and convert it to markdown. Using this recursively, the prompt instructs the agent to find the most relevant page (in some cases pages) that could contain membership information, and return a data structure of the members (MPs, Lords, Other). This returned over 5,000 names in the data format provided. 

    The big risk at this point is that, having been asked for a list of MPs, it makes some up. The validation we use is to check that each name in the list is present within the HTML content of the page it was extracted from. If there’s an error, it runs again, and will give up rather than use an incorrect list. There is some possibility of misinterpretation – but this prevents outright fabrication. Errors flagged here tended to be cases where the LLM had fixed formatting, meaning the text no longer matched exactly against the page.
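    The check itself is simple – something along these lines (the function name is illustrative, not our actual implementation):

```python
def unverified_names(names, page_html):
    """Return any extracted names that don't literally appear in the source page."""
    haystack = " ".join(page_html.lower().split())  # normalise whitespace
    return [n for n in names if " ".join(n.lower().split()) not in haystack]
```

    Because it is a literal substring check, any reformatting by the LLM (e.g. ‘Smith, John’ becoming ‘John Smith’) also shows up as a failure, which matches the false positives described above.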

    The key problem here is one that a human would have too: some APPG lists are out of date. Here I added an extra flag detecting a list containing people who had left Parliament, which then triggered a manual review. The process was also sometimes picking up lists that were not membership lists – we made some adjustments to the prompt after it picked up attendees at an AGM, which is not wrong, but incomplete.

    Step 5. Manual data

    As our main blog post talks about, we then needed to contact APPGs directly for lists that were not published. This presented a new problem: what we got back was a combination of spreadsheets and emails with different levels of detail – some including party details in other columns, some not. 

    Our solution was to have a Google Doc that just has each list formatted under a heading with the APPG title – we could just copy and paste information into this. 

    This file is then downloaded as markdown and converted into a list of names. There are a few tweaks to clean up leading numbers and identify the name component of each line. Again, this step was substantially written via prompt – giving the LLM examples of the problem data, from which it created regular expressions to clean the data into the basic list of names we needed.
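    The kind of regular expressions involved are straightforward – an illustrative sketch, not the actual generated patterns:

```python
import re

def clean_line(line: str) -> str:
    """Reduce a pasted list line to just the name component."""
    line = re.sub(r"^\s*\d+[.)]?\s*", "", line)   # strip leading numbering
    line = re.sub(r"\s*\([^)]*\)\s*$", "", line)  # drop a trailing "(Party)"
    return line.strip()
```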

    Step 6: Tidy members information

    What we want to do next is get from a list of names to a list of TheyWorkForYou unique IDs. 

    We have a library that helps reconcile names to IDs, but a challenge here is that there is a huge range of spelling mistakes (sometimes to an extent where you could not actually work out which MP was meant).

    What we needed was a quick tool to compare the input name against our list of known names and suggest near matches. Here we again turned to the coding agent, posing the problem, providing some snippets to interact with our existing library, and letting it craft a command line interface. 

    This fairly quickly gave a good interface for reviewing spelling problems (later refined to auto-match below a certain threshold). This helper tool is not especially complicated, but as something with a clear input and output, isolated from the rest of the flow, it was a good candidate for testing Copilot’s ability to create the function. In choosing what to spend time on, this would not otherwise have been a priority – but it brought a useful feature into scope.
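    The core of such a tool can be surprisingly small – Python’s standard library will suggest near matches on its own (a sketch with an illustrative name list, not our actual implementation):

```python
import difflib

KNOWN_NAMES = ["Keir Starmer", "Angela Rayner", "Rachel Reeves"]  # illustrative

def suggest_matches(name: str, known=KNOWN_NAMES, cutoff=0.8):
    """Offer up to three close matches for a possibly misspelled name."""
    return difflib.get_close_matches(name, known, n=3, cutoff=cutoff)
```

    A reviewer confirms each suggestion; raising the cutoff is what allows very close matches to be accepted automatically.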

    Result

    The end result of this process is fairly effective – a series of steps we can repeat every six weeks when a new APPG register is released, to check for websites for new APPGs, or to recheck previously scanned pages.

    The efficient sequencing of steps means that manual review happens on similar tasks in sequence, rather than checking each APPG through all steps. 

    In general, I’m pretty happy with the results: it made possible a project that would otherwise have required a big (and fairly boring for participants) crowdsourcing effort. 

    One of the problems we have to deal with a lot is fragmented public data, where relevant data is scattered all over the place and takes a lot of work to bring back together. Here we found AI tools that were useful both in discovery of a component of the data, and in reconciling it to a common standard. 

    The “AI scrapes then verifies content is present” approach worked well here but would struggle with more complex problems. For instance, if we really needed to be sure we were extracting a correct party label alongside a name, knowing that ‘Labour’ was present on the page wouldn’t be as helpful. 

    Building on this, the AI-written scraper code worked pretty well. If properly sandboxed (pydantic-ai has support for running python in a sandbox using pyodide), transformation code could be written to convert data between different sets of headers without running the data itself through an LLM to convert it. This potentially helps with some of the fragmented data problems of reconciling compatible but different schemas. LLM-involved approaches have a real potential to create new datasets through easier discovery and joining of data.
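For instance, once a model has proposed a mapping between two sets of headers, applying it is plain deterministic code that never touches the model again (the field names and mapping below are made up for illustration):

```python
def apply_mapping(rows: list[dict], mapping: dict[str, str]) -> list[dict]:
    """Rename each row's columns per an LLM-proposed source->target
    mapping, dropping columns absent from the target schema.
    The data itself never passes through the model."""
    return [
        {target: row[source] for source, target in mapping.items() if source in row}
        for row in rows
    ]

# hypothetical mapping proposed by a model after seeing only the headers
mapping = {"Member name": "name", "Role in group": "role"}
rows = [{"Member name": "Jane Smith", "Role in group": "Chair", "Notes": "n/a"}]
clean = apply_mapping(rows, mapping)
```

This separation matters: the model does the one-off schema reasoning, while the (sandboxable) transformation code handles the actual records.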

    This is a way we can use new technology to make a dataset possible, but also it would be much easier if Parliament gathered and published this in the first place. The equivalent Cross Party Groups in the Scottish Parliament just make a downloadable file of all memberships in their open data portal. We need to think about how new technological approaches are not just propping up bad transparency – but part of encouraging better transparency all the way upstream. 

    Header image: Photo by Susan Holt Simpson on Unsplash

  7. TheyWorkForYou Update: A richer view of Parliament

    We want to improve the quality of UK democracy by making more and better information about Parliament available to everyone. 

    In previous updates to TheyWorkForYou, we’ve expanded the range of official sources the service pulls on: extending to cover all the UK’s parliaments, and recently bringing together all the registers of interest in one place. 

    This update is about adding pipelines to bring in data beyond Parliament, providing richer insights into your representatives. 

    What we’ve added:

    • Committee and APPG memberships
    • Signatures (EDMs and Open Letters)
    • Vote annotations
    • Adding context to parliamentary debates
    • Improved email alerts for political monitoring
    • Navigation improvements to MPs’ profiles

    You can also watch our launch webinar to learn more about how these changes fit together.

    And as ever, if you value the work we do and want to help us go further – please consider making a donation to support our work. 

    Committees and APPGs

    An important part of how Parliament works is through the formal committee system and the informal APPGs. We wanted to improve the information we display on both of these kinds of groups. 

    For Committees: we’ve pulled more data from Parliament to give extra information about the committees MPs are a part of, and to try to explain more about Parliament as part of the MP profile. 

    For APPGs: there has not previously been a good central database of APPG members. We’ve set out to create this. We used a new LLM-assisted scraper to get lists of memberships off dozens of individual websites. For those without a website, we asked each APPG individually for a membership list to add to the collection. This database isn’t complete yet, but is now the best available source on APPG memberships.

    Read more about the APPG changes

    Signatures

    Early Day Motions are effectively an internal petition system available to MPs, where they can signal support for different issues. Including recent EDMs helps indicate which issues MPs see as important. 

    But we also wanted to go beyond these motions to look at the growing trend for MPs to share joint open letters on social media instead. We have started to transcribe and store these open letters, so we can make the content more accessible, and show on MPs’ profiles the issues that concern them. 

    We have separated out ‘motions to annul’ from other EDMs. These object to negative statutory instruments (which become law unless there is a vote against them), and felt worth highlighting above other proposed motions because they represent scrutiny of secondary legislation. These motions are technically called ‘prayers’ in the UK Parliament, but we use the term used in the Senedd and Scottish Parliament because it’s clearer. 

    Read more about the EDM / open letter changes

    Improved political monitoring

    We originally created TheyWorkForYou’s email alerts to make it easier to track what your representatives have been saying in Parliament. But as well as following individual representatives, alerts can also be for phrases, and these have proven to be a vital tool that help civil society monitor what is happening in the UK’s parliaments.

    To lean into this use, we’ve completely redesigned how you can create and manage complex keyword alerts, making it easier to group multiple terms, see results on the page, and manage a number of alerts across different topics.

    With this, we want to make TheyWorkForYou a more powerful free tool for political monitoring, and help NGOs and grassroots organisations that cannot afford paid political monitoring avoid being disadvantaged compared to those who can. We don’t think money should get you better access, and want to build tools to level the playing field. 

    Read more about the changes to email alerts

    Vote annotations

    Building on the release of our new site TheyWorkForYou Votes, we have made it easier to reach the new information we hold on voting. For recent votes in an MP’s profile, we now link to our new richer analysis, and if MPs spoke in the section before the vote, we’ll also link to those speeches. 

    We’re also starting to make some of the extra information we store in TheyWorkForYou Votes visible in MPs’ profiles and voting summaries, such as vote annotations and information about party instructions (whipping). TheyWorkForYou’s publication of voting records has led to more public justifications from representatives about how they vote, and we want to try and get that information back into the site. 

    Currently we don’t have many examples of this while we test the system, but we will be picking a few specific votes to add more information and links to. 

    Understanding parliamentary debates

    We want to make it easier for everyone to understand Parliament, and one way we can do that is by adding context to debates beyond the official transcripts. 

    We’ve gone back to features that have been around for twenty years and made improvements. We’ve overhauled our approach to linking words and phrases to Wikipedia to ensure there are fewer false positives. 

    We’ve also revived aspects of the debate annotation system and glossary systems to give us the ability to add notes to high profile debates —and will be making more use of that over the next few months. 

    A new coat of paint

    To hold all this new information, we’ve redesigned our MP profile pages to make it easier to find different sections, and so they work better on mobile. 

    We’ve added more explanatory text to different sections, and improved the display of registers of interest to make it easier to see only the new entries (also: see all the wider data we hold on registers of interests). 

    Coming up

    In the coming months we’ll be releasing some more work as part of our efforts to understand and improve how component parts of UK democracy are working in practice. 

    We’ve been running a new survey on WriteToThem to understand more about what people are writing to their representatives about, and we’re going to release a report talking about the patterns we’ve learned from that, and how it’s affecting our thinking. 

    As part of WhoFundsThem work, we’re continuing to dig into money in politics, and have two releases coming up. One is a report about the systems of tracking election donations, and the other is our research into MPs asking parliamentary questions about areas they have a financial interest in. 

    That’s it for now – and remember, if you want to help us go further, please consider making a donation to support our work!

    Header image: House of Commons

  8. Improving TheyWorkForYou email alerts

    You can subscribe to TheyWorkForYou’s alerts to receive email updates on representatives’ parliamentary speeches and questions — but they’re also widely used by civil society as a parliamentary monitoring tool, letting organisations know when their topics of interest have been mentioned in debates or votes. Our alerts help the flow of information from Parliament through government and wider civil society. 

    Something we’ve wanted to do for a while is make it easier to create these keyword alerts. To cover all variants of a concept or topic, you previously needed to create a search using operators (‘cars’ + ‘vehicles’), and as a result very few people did this. 

    We’ve made a new interface for email alerts. This:

    • makes it easier to create more complicated alerts
    • can sometimes suggest useful phrases to include
    • lets you see recent hits on alerts on the website as well as your inbox.

    Creating more complicated alerts

    Previously, you would need to do a search for (“electric vehicle” OR “electric car”) and then convert this into an alert. You can still do it that way if you want, but we have a new interface to make it easier to make more complete queries. 

    From the alerts page, you can create a new alert, and add a list of phrases.

    When you’ve made an alert, it will give you the option to see the results and any recent matches to check it’s picking up what you want, and if not, go back and adjust the terms used. 

    TheyWorkForYou Screenshot of listing multiple terms

    Suggesting useful phrases

    Sometimes you might not know the term that is commonly used in Parliament. We’ve done some data-crunching to try and help out here. 

    Using vector search, we’ve created a list of related terms based on common previous searches and matches in the transcripts. For instance, below a search for ‘electric vehicle’ you’ll see suggestions including ‘electric car’ and ‘ev’: other terms for the same concept that have been used in Parliament in the past.  

    This is not comprehensive and is initially focused on the most common terms — but is part of our approach to incorporating benefits from machine learning tools in a sustainable way into our services.
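Under the hood, this kind of suggestion boils down to nearest-neighbour search over term embeddings. A toy sketch with hand-made two-dimensional vectors (real embeddings come from a model and have hundreds of dimensions; the terms and numbers are illustrative):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def related_terms(query: list[float], term_vectors: dict[str, list[float]],
                  top_n: int = 3) -> list[str]:
    """Rank known terms by cosine similarity to the query embedding."""
    ranked = sorted(term_vectors,
                    key=lambda t: cosine(query, term_vectors[t]),
                    reverse=True)
    return ranked[:top_n]

# toy vectors standing in for real embeddings
vectors = {"electric car": [1.0, 0.0], "ev": [0.9, 0.1], "fishing": [0.0, 1.0]}
```

Precomputing vectors for common search terms keeps the lookup cheap at alert-creation time.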

     TheyWorkForYou screenshot of suggested terms for an alert 

    Viewing and managing alerts

    Another thing we have done is make it easier to manage a greater range of alerts — and see recent mentions in the browser rather than purely in emails. 

    On the alerts management page, you can now see at a glance any hits from the previous week, and expand each alert to see the last mention and get a link to the latest results. 

    TheyWorkForYou Screenshot of management page - showing different alerts with count previews

    Stay up to date

    We’re always working to improve our services. Sign up to our newsletter and make sure you’ve checked ‘Democracy and Parliaments’ to hear more.


    Header photo by Patrícia Nicoloso on Unsplash

  9. TheyWorkForYou voting summaries update: October 2025 

    This update to TheyWorkForYou voting summaries brings us up to date as of the end of September 2025 (covering Q2+Q3 2025). 

    To learn more about our process for updating MPs’ voting summaries, please read our previous blog post.  We have also recently released TheyWorkForYou Votes which, as well as providing open data for anyone to use in their own online parliamentary projects, is now powering TheyWorkForYou’s voting summaries. 

    This update adds 21 votes and 3 historical votes to expand new and revised policies. We have also started to bring more information we’re gathering in TheyWorkForYou Votes (vote annotations and whip reports) into the voting summary pages of TheyWorkForYou. 

    Previous draft policies have been put live for:

    • Border Security Bill
    • Planning and Infrastructure Bill

    Votes have been added to existing policies for:

    New policies have been created for:

    • Increasing local council power over bus services
    • Preventing sentencing guidelines requiring offender background reports based on race, religion, culture, or similar traits.
    • Creating a new regulator for English Football
    • Proscribing Palestine Action, Maniacs Murder Cult, and Russian Imperial Movement as terrorist groups

    Draft policies have been created for:

    • English Devolution and Community Empowerment Bill
    • Sentencing Bill

    These will be added in the next update after the third stage (approval) vote. 

    If your MP voted in any of the divisions feeding into these policies, you’ll see them on their TheyWorkForYou page in the ‘Voting summaries’ tab.

    Notes

    Annotations and free votes

    One of the things we want to do with TheyWorkForYou Votes is gather public statements MPs make about their votes and make this accessible through TheyWorkForYou. 

    We’ve completed work to flag when we’ve gathered some statements associated with a policy line, and are testing this with a few statements on the Assisted Dying Bill third reading (the annotations column in the table at the bottom). These are flagged on the summary page of the MP in question. Our next step here will be to crowdsource more statements that were made around this specific vote. 

    We are also starting to experiment with recording some votes as free votes and flagging these in the summary page. This is step one towards gathering and displaying whipping information. Currently we have only included a few votes from the current Parliament to refine the approach. 

    Bus powers

    When adding new policies we check whether there were any obvious votes in the last decade that should also be included. 

    For a new policy around ‘increasing local council power over bus services’, we have added a retrospective scoring agreement for the 2017 Bus Services Act (which was passed without a vote, with explicit cross-party approval in the debate). 

    Minimum detention requirements

    Here we have adjusted the description of the policy to:

    voted for/against reducing (for some kinds of offenders) the minimum detention requirement before release to *reduce pressure on prison capacity*[last bit added]

    The original was framed more generally, in a way that could have worked as an all-time policy description; the new version includes the justification used across the votes currently covered by the change. 

    Palestine Action proscription

    We’ve created a new one-vote policy line for voting for/against ‘proscribing Palestine Action, Maniacs Murder Cult, and Russian Imperial Movement as terrorist groups’.

    This vote passed our noteworthiness criterion for a single-vote policy not only because of its continuing impact (leading to hundreds of arrests for supporting the now proscribed group), but also because of the initial circumstances of the vote.

    This was the first vote to be held on proscription of a group under the Terrorism Act, with all previous examples having been taken by unanimous agreement. While there were few votes against (which will show as a difference from the party for Labour MPs), there were also a large number of absences and some conscious abstentions from Liberal Democrat MPs (who voted both for and against, which we convert to an ‘abstain’). 

    As part of our 2024 scoring changes, absences and abstentions are treated differently. MPs who abstained are recorded as voting, and will have a line for this policy. MPs who were absent will not be given a policy line for this policy (and this isn’t shown as a significant difference from the party).
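The distinction can be expressed as a small rule. A sketch of how the described scoring treats ballots (an illustration of the rule as stated above, not our actual code):

```python
def policy_entry(ballots: list[str]):
    """Collapse an MP's ballots in a division into a policy-line entry.

    ballots: the 'aye'/'no' votes the MP cast. Voting in both lobbies
    is the Commons convention for a conscious abstention. An empty
    list means the MP was absent.
    """
    if not ballots:
        return None            # absent: no policy line, no party-difference flag
    if "aye" in ballots and "no" in ballots:
        return "abstain"       # both lobbies = recorded abstention
    return "for" if ballots[0] == "aye" else "against"
```

The key design point is that absence and abstention diverge: abstainers get a recorded position, absentees get nothing at all.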

    LGBT+ Rights

    As part of this update we’ve renamed the ‘Gay rights’ policy to ‘LGBT+ rights’. In substance this change better describes votes already included in the policy, as relevant votes have generally covered multiple groups (and this fulfils our uniqueness and cohesion criteria better than a new policy line). This wider framing provides a sharper lens on already included votes. For instance, in an already included 2024 vote on a conversion therapy ban, while the kind of conversion therapy being discussed covered multiple LGBT groups, in practice the opposition to a ban in the debate followed from opposition to trans conversion therapy specifically.

    This shift lets us capture votes that represent attempts to restrict or expand the rights and status of trans people independently of other groups. In this specific case, that was an amendment to the Data (Use and Access) Bill around the definition of sex data, requiring sex at birth to be recorded in official contexts (far beyond settings where it is practically relevant).

    As part of this, we have reviewed whether any previous votes should be added under the expanded definition, finding two relevant decisions that would have been appropriate even under the original definition. The approval of 2019 guidance around inclusive relationship and sex education has been added as a scoring vote, and the inclusion in the census of separate questions around sexual orientation and gender identity has been added as an informative non-scoring agreement. 

    Launch event

    This Thursday we’ll be hosting a webinar to talk through a range of recent changes to the MP profile pages and email alerts. We’ll also share more information on our mailing list over the next few weeks. Sign up here and make sure you have ‘Democracy and Parliaments’ checked as an interest if you’d like to receive these emails.

  10. New to TheyWorkForYou: Signatures

    One thing we want to take more advantage of with TheyWorkForYou is the fact that we’re not an official website — and so can pull on multiple official and unofficial sources of information to present a richer picture of how our democracy works. 

    Our trajectory with voting summaries has been to focus on votes that are substantive. This means they’re generally on issues whipped by parties, and there are few differences between the voting records of MPs in the same party.

    But we’d also like to make it easier for everyone to understand what differentiates MPs: the signals they give about their values and interests, and where they fall on internal arguments about policy direction. 

    As such, all MPs now have a Signatures tab on their TheyWorkForYou page, which tracks Early Day Motions (EDMs), open letters, and Motions to Annul signed by the MP. 

    EDMs

    One form of information we want to make more use of is Early Day Motions (EDMs). These are technically ‘proposed motions’ that may be elevated to a full debate. In practice this rarely happens and they work as an internal parliamentary petition service, where MPs can propose motions and co-sign ones proposed by others. They are still useful in reflecting the interests of different MPs even if EDMs rarely lead to substantive change in themselves. 

    To provide better access to this information, we’ve added EDMs to TheyWorkForYou Votes as ‘Signatures’. Here TheyWorkForYou Votes is working as a general data backend that will help power features in our own services, and makes it easier to access the data for bulk analysis. This then feeds into individual MP profiles. 

    With this, we are catching up to what Parliament displays on their MP profiles (EDMs), but also building the framework to expand to the UK’s other Parliaments and to capture extra-parliamentary statements like open letters that serve a similar function. 

    Open letters

    Over the last few years, we’ve noticed more open letters being shared on social media, where screenshots of a list of names on official parliamentary paper serve to signal publicly that a grouping exists in a political argument. 

    A recent example of that is the big open letter for UK recognition of a Palestinian State. This was initially posted on X as images, and we’ve transcribed it and made the list of MPs searchable. 

    There are a few reasons why MPs might prefer to use these kinds of open letters rather than submitting an EDM. Social media reach means that MPs can make a full public statement without the parliamentary publishing process. A letter can be published in full without the word count restriction of a letter to a newspaper, so can pick up more names.

    Similarly, open letters are free from the format restrictions and word count of EDMs (a single sentence of less than 250 words). This can be important as many letters represent a group of government MPs trying to change the government position. Being able to write more is important in referencing previous government actions, anchoring the change in agreed principles and so on, while still being a critical signal. 

    This fits with a general change in usage of EDMs. While the number of EDMs proposed per year has remained roughly the same, overall signatures have dropped by almost half since 2015 (33k to 15k), and far fewer motions attract a large number of signatures. The average number of signatures per EDM has dropped from 27 to 12. Some of this activity has moved to the new social open letter format. 

    There are also some disadvantages to open letters. Publishing via screenshots means it’s not very accessible or searchable — a problem if one reason for signing is to signal to constituents. If an open letter is important, people want to sign after the fact. EDMs have a mechanism for that, while for open letters you might get “here’s another page of names in another tweet” or social media posts saying “I support this too” — but not in the same place as the original. 

    For our purposes, it also means there’s collection work to be done finding the letters in the first place, and transcribing the images into text. We’ve got some good technical processes for the latter, and we’ve opened a form here where people can tell us about them. But it’s more work than just plugging into Parliament’s feed, which is what we do for data elsewhere on TheyWorkForYou. 

    Looking at open letters is a shift towards including more extra-parliamentary activity — but reflects the need for parliamentary monitoring sites to react to changes in how parliaments and representatives behave, and think creatively about how to make use of new sources of information. 

    Motions to Annul

    Motions to annul are technically a form of EDM, but we’ve separated them out because we see them as something worth highlighting in their own right. 

    To take a few steps back, when Parliament passes laws (primary legislation), it fairly commonly gives the government authority to make additional orders/regulations (secondary legislation) that fill in specific details in laws without the full parliamentary process. 

    Secondary legislation still needs to be approved by Parliament – and this happens in two ways depending on how the law was written. Either the regulations need to be approved in a vote to become law (positive procedure), or they need to not be voted against within 40 days (negative procedure). 

    Most secondary legislation (around 75%) is passed through the negative procedure, and in practice the power to object is used very rarely (the last successful Commons objection was in 1979).

    The mechanism is to make a Motion to Annul (for historical reasons called a ‘prayer’) through the EDM process. There is no threshold at which this is promoted to a vote, and the government controls the Commons agenda. A vote is more likely if the motion is tabled by the Leader of the Opposition, or as the number of signatures goes up.

    Even if rarely successful, these motions represent engagement with the legislative scrutiny process, which we felt was worth highlighting – so we separate them out from other EDMs in the signatures page.

    Come to our event

    Join us on Thursday 23 October for a webinar on our new features, plans for the site, and our vision of a more open Parliament. 