1. Beneficial ownership tools and analysis

    Header image: Photo by Susan Holt Simpson on Unsplash

    mySociety and SpendNetwork have been working on a project for the UK Government Digital Service (GDS) Global Digital Marketplace Programme and the Prosperity Fund Global Anti-Corruption programme, led by the Foreign & Commonwealth Office (FCO), around beneficial ownership in public procurement. This is one of a series of posts about that work.

    As part of this project we reviewed the open source tools that are available for working with beneficial ownership data. There is a tooling ecosystem around the Beneficial Ownership Data Standard (BODS), but it is not yet as well-developed as that around the equivalent OCDS standard for contracting information.

    There are some open source tools and analyses developed by civil society that aim to support users in understanding the relationships between companies and individuals, and related tools in the commercial sector for supporting anti-money laundering processes.

    Across all tools, Python is a reasonably well established language choice (with some civil society tools developed in Ruby) and network or graph visualisation components such as neo4j are common. We will discuss this further in the section on beneficial ownership analysis.

    OpenOwnership Register

    OpenOwnership is an organisation with the goal of making beneficial ownership data more widely available through technical development, partnerships and research. They are the key developers of the BODS data standard and host a global open registry of beneficial ownership data.

    The goal of the OpenOwnership Register is to create an “open global beneficial ownership register” that is useful across different jurisdictions and industries. This is an open source digital service which can:

    • Incorporate data from existing open registers published by countries
    • Allow cross-jurisdiction searches through a single interface/dataset
    • Become more useful as more open registers are published

    This works in tandem with the promotion of the BODS format. Releases made in BODS are easier to incorporate into the register, and being able to make use of and contribute to a central register is an incentive to publish in a compatible format.

    The register currently contains data from every open, countrywide beneficial ownership register (UK’s Persons of Significant Control Register, Slovakia’s Public Sector Partners Register, Ukraine’s Consolidated State Registry, and the Danish Central Business Register) and the data from the EITI’s 2013-15 pilots.

    While there is additional deduplication applied to the source data (merging people with identical names, addresses and dates of birth, and companies with matching identifiers), the limitations of the source data still apply and the size of the register means that many similar entities are unreconciled.
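
    As a rough illustration of the kind of exact-match merging described above, here is a minimal Python sketch. It is not the register's actual code: the record layout, the field names (id, name, address, birth_date) and the normalise helper are all assumptions made for the example.

```python
from collections import defaultdict


def normalise(value):
    """Lower-case and collapse whitespace so trivially different strings compare equal."""
    return " ".join(value.lower().split())


def merge_people(records):
    """Group person records that share the same name, address and date of birth."""
    groups = defaultdict(list)
    for record in records:
        key = (
            normalise(record["name"]),
            normalise(record["address"]),
            record["birth_date"],
        )
        groups[key].append(record)
    # One merged entry per key, remembering which source rows it came from.
    return [
        {
            "name": rows[0]["name"],
            "birth_date": rows[0]["birth_date"],
            "source_ids": [row["id"] for row in rows],
        }
        for rows in groups.values()
    ]


# Hypothetical sample rows: the first two differ only in case and spacing.
people = [
    {"id": "uk-1", "name": "Jane  Doe", "address": "1 High St, Leeds", "birth_date": "1970-05"},
    {"id": "uk-2", "name": "jane doe", "address": "1 High St,  Leeds", "birth_date": "1970-05"},
    {"id": "dk-1", "name": "Jane Doe", "address": "2 Low Rd, Aarhus", "birth_date": "1970-05"},
]
merged = merge_people(people)
```

    The third record stays separate because its address differs, even though it may well describe the same person. That is the limitation noted above: exact matching on imperfect source data leaves many similar entities unreconciled.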

    BODS collection and processing tools

    OpenOwnership have produced guidance on collecting BODS-compliant data using paper forms. They have also commissioned Open Data Services (ODSC) to convert the Excel data collection spreadsheets used in the Extractive Industries Transparency Initiative (EITI) so that the data they collect will be compatible with BODS 0.2.

    The BODS data review tool is available as an online service – as with the OCDS data review tool, it is based on the CoVE platform (Convert, Validate and Explore). Both tools check that your data complies with the relevant schema, allow you to inspect key contents of your data to check data quality, and give you access to the data in different formats (spreadsheet and JSON) to support further review. The tool is built by Open Data Services, and hosted by OpenOwnership.

    CoVE itself uses a generic flatten tool to transform standards-compliant data in JSON into spreadsheets and vice versa. This is a key piece of utility software, as it means that people working with ownership disclosure data can work in a familiar spreadsheet program. Once flattened, sheets of a spreadsheet are used to represent each of the main elements of the standard (people, entities, and control statements), as well as associated data like addresses, annotations and identifiers. This data can then be transformed into the JSON data interchange format, which has a large tooling ecosystem around it.
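
    To make the idea of flattening concrete, the sketch below splits nested JSON-style records into a main sheet plus one sheet per nested list, linked by an identifier. It is a simplified stand-in written for this post, not the flatten tool's own code or API, and the sample record is an illustrative, cut-down person statement rather than a complete BODS document.

```python
def flatten(statements, id_field="statementID"):
    """Split nested records into a main sheet plus one sheet per nested list.

    Assumes every nested list contains dictionaries.
    """
    sheets = {"main": []}
    for statement in statements:
        row = {}
        for key, value in statement.items():
            if isinstance(value, list):
                # Nested lists become their own sheet, linked back by the parent id.
                for item in value:
                    child = {id_field: statement[id_field], **item}
                    sheets.setdefault(key, []).append(child)
            elif isinstance(value, dict):
                # Nested objects become "parent/child" columns on the same row.
                for sub_key, sub_value in value.items():
                    row[f"{key}/{sub_key}"] = sub_value
            else:
                row[key] = value
        sheets["main"].append(row)
    return sheets


# Illustrative, simplified record (not a complete or validated statement).
statements = [
    {
        "statementID": "p-001",
        "statementType": "personStatement",
        "birthDate": "1970-05",
        "addresses": [{"type": "service", "address": "1 High St, Leeds"}],
        "identifiers": [{"scheme": "EXAMPLE-ID", "id": "123"}],
    }
]
sheets = flatten(statements)
```

    Each key of sheets corresponds to one tab of a spreadsheet, and the shared statementID column is what would allow the rows to be reassembled into JSON later.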

    The BODS mapping template enables field-level mapping between source data systems and version 0.1 of the Beneficial Ownership Data Standard. It supports the processes of:

    • identifying source systems that hold beneficial ownership information
    • itemising the fields that those systems define
    • itemising the codes and codelists associated with those fields
    • mapping the source system fields, codes and codelists to the beneficial ownership data standard
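
    The steps above can be sketched in a few lines of Python. Everything here is hypothetical: the source column names, the target field paths and the codes are invented for illustration and are not taken from the mapping template or from the BODS schema.

```python
# Hypothetical source-system columns mapped to illustrative target field paths.
FIELD_MAP = {
    "owner_full_name": "person/fullName",
    "owner_dob": "person/birthDate",
    "ctrl_type": "interest/type",
}

# Hypothetical source codes mapped to an illustrative target codelist.
CODE_MAP = {
    "ctrl_type": {"SH": "shareholding", "VR": "voting-rights"},
}


def map_row(source_row):
    """Return (mapped fields, source columns that have no mapping yet)."""
    mapped, unmapped = {}, []
    for column, value in source_row.items():
        target = FIELD_MAP.get(column)
        if target is None:
            unmapped.append(column)
            continue
        codes = CODE_MAP.get(column)
        # Codes without a mapping are passed through unchanged in this sketch.
        mapped[target] = codes.get(value, value) if codes else value
    return mapped, unmapped


mapped, unmapped = map_row(
    {"owner_full_name": "Jane Doe", "ctrl_type": "SH", "fax": "n/a"}
)
```

    Returning the unmapped columns alongside the result mirrors the itemising steps: it shows which source fields still need a decision before the mapping is complete.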

    This kind of mapping support (from simple, widely used formats and interfaces into machine readable forms, and from existing systems into data standards for interchange or publication) is a key enabler of the adoption of data standards and of a rich tool ecosystem.

    Beneficial ownership analysis tools

    In addition to the tools developed specifically around BODS, there is a set of open source  tools developed by civil society that analyse information on the ownership of companies, sometimes in conjunction with information about public contracting. Malaysian civic tech organisation Sinar Project have developed the Telus prototype, combining information from Malaysia about procurement, beneficial ownership, and politically exposed people. They are also working on Politikus in Kenya, which will combine those types of data with information about infrastructure projects.

    Two different civil society tools originate in Mexico: Sinapsis, produced by journalism organisation Animal Político, and TowerBuilder, created by transparency and accountability NGO PODER. The goal of Sinapsis is the examination of ‘coincidences’ in a set of companies or organisations, where addresses, people, ID numbers, notaries or phone numbers may connect seemingly disconnected companies. TowerBuilder is a reusable toolkit for generating websites with data visualisations that mix open contracting and beneficial ownership data.

    These tools are generalisations of approaches originally used in one-off investigations into reusable services that can be fed new datasets. Sinapsis originated in Animal Político’s  ‘estafa maestra’ investigation, and TowerBuilder in PODER’s Torre de Control project. In the UK, the two analyses performed by Global Witness of the Persons of Significant Control register (The Companies We Keep in 2018, and Getting the UK’s House in Order in 2019) have been made available as Jupyter Notebooks – an open-source web application that allows you to create and share documents that contain live code, equations, visualisations and narrative text. This represents a space between truly one-off analyses and frameworks or services designed for reuse. The analyses are fully documented via the notebooks and are sharable and repeatable with the same data, but not generalised to other data sources.

    The OpenTender portal run in Indonesia by Indonesia Corruption Watch and the international Aleph dashboard produced by the Organised Crime and Corruption Reporting Project (OCCRP) also touch on beneficial ownership information.

    Whilst this data is not explicitly used in OpenTender.net, some of its red flag risk analyses try to reveal the same connections that beneficial ownership data can reveal. For example, companies being registered at the same address suggests that their beneficial owners may be the same, and that cartels may be in operation.
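
    A shared-address check of this kind can be sketched as a simple grouping exercise. The code below is an illustration only and does not reflect how OpenTender or Sinapsis are implemented; the company records and field names are made up.

```python
from collections import defaultdict


def shared_values(companies, field):
    """Map each value of `field` to the companies sharing it (two or more only)."""
    index = defaultdict(set)
    for company in companies:
        value = company.get(field)
        if value:
            # Normalise case and spacing so near-identical strings group together.
            index[" ".join(value.lower().split())].add(company["name"])
    return {value: names for value, names in index.items() if len(names) > 1}


# Hypothetical company records.
companies = [
    {"name": "Alpha Ltd", "address": "Unit 4, Dock Road", "phone": "555-0101"},
    {"name": "Beta Ltd", "address": "unit 4,  dock road", "phone": "555-0199"},
    {"name": "Gamma Ltd", "address": "9 Mill Lane", "phone": "555-0101"},
]
address_flags = shared_values(companies, "address")
phone_flags = shared_values(companies, "phone")
```

    The same function can be pointed at any field (address, phone number, notary), which is the sense in which several kinds of ‘coincidence’ can be examined with one approach. A flag is only a prompt for further investigation, since shared serviced offices are common and legitimate.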

    Aleph is a document storage and search platform designed to facilitate cross-border investigation of white-collar crime. It includes some beneficial ownership datasets, and parts of the toolchain can also be used to address issues in tools more focused on beneficial ownership, such as name matching, so it may be a source of useful open source components.

    A significant amount of the effort in producing these tools and analyses has been in pre-processing data to turn it into standard forms that can be easily combined and analysed. Reliably matching companies and individuals across different data sources is a recurring and significant technical problem.

    The use of BODS is not yet widespread: as civic tech early adopters, the Sinar Project uses it across their tools, but it is not used in Sinapsis, Aleph or TowerBuilder, although the latter does use OCDS. Where BODS is not in use, CSV files with various different schemas store beneficial ownership information.

    See all posts in this series.

  2. Screening for conflicts of interests in ownership data

    Header image: Photo by Rob Curran on Unsplash

    A key corruption risk in public procurement is that officials or politicians successfully direct contracts to companies that they control or benefit from.

    Understanding who the beneficial owners of these companies are is one half of preventing this; the other is knowing more about the people who shouldn’t benefit, such as politically exposed persons (PEPs) or those involved in the procurement process.

    The United Nations Convention against Corruption (UNCAC) defines politically exposed people as “individuals who are, or have been, entrusted with prominent public functions and their family members and close associates”. This is a flexible definition, varying by country as to which roles should be included and how far their associations should be seen as connected. That said, typically the term will be understood as limited to senior roles, while procurement processes might actually suffer from conflicts of interest from less senior procurement officials (POs) who are more directly involved.

    Solving this problem is hard. Data sources exist to help with the issue but are not complete in themselves. A good general principle is to design a process that makes it more likely that conflicts of interest will be detected, using tools and datasets to increase scrutiny without relying on them as an all-encompassing solution.

    The problem

    In an ideal world, analysts would simply match a list of beneficial owners against a list of politically exposed persons. Any overlap would indicate that a PEP is benefiting from a government contract. Unfortunately, each step in this process is far from simple.

    Previous blog posts have detailed the problems in creating a list of beneficial owners, and politically exposed persons represent a similar challenge. The UNCAC definition is a good definition for investigators, but to create disclosure requirements it needs to be translated into a concrete local understanding of which roles are covered.

    Where there is a clear definition, or even a list of roles that it covers (as may already exist for tracking asset disclosures), this requires a system of tracking and updating changes in those roles. Up-to-date lists should have a mechanism for adding new PEPs with reasonable speed after they take office, but also need to act as an archive of information about former office holders for pursuing retrospective investigations.

    The more comprehensive the dataset (for instance, covering multiple countries, or sub-national significant figures), the higher the costs of maintenance and the greater the risk the list will fall out of date. Procurement officials (POs) are unlikely to be tracked by existing approaches to identifying PEPs in a country and will need new approaches. In South Africa, civil servants are prohibited by law from being a  beneficiary of the procurement process, creating a very large list of people to exclude.

    The other side of the problem is that where an up-to-date and comprehensive list of excluded persons exists, you have to be able to match it against your list of owners. This runs into the problem of data matching. Name matching is error prone, and while information on office holders is often public (and so a list can be maintained without special privileges), these public lists are less likely to include the unique IDs essential to easy matching of individuals.

    As the Financial Action Task Force (FATF) put it:

    Inconsistent transliterations and spellings of names affect the ability of financial institutions and DNFBPs to match names in general. Scrubbing customer databases for matches against commercial databases may result in many false positives if such databases contain insufficient or inadequate identifier information. This increases the risk of missing true matches and requires additional resources to separate false positives from true matches.

    However, while the problem is hard, partial and incomplete solutions have value.

    PEP databases and matching tools

    While FATF says that the use of databases is not sufficient to comply with their requirements, such databases are still a useful tool that can speed up work. Commercial databases exist, often aimed at assisting regulatory compliance in banks, such as SmartSearch, Accuity and the BAE Systems Watch List Management system. There is also a variety of open data sources available, with OCCRP gathering a set of datasets on individual sanctions together as a dataset in Aleph. Some of these have approaches to name matching built in. For instance, ComplyAdvantage has a PEP database with a fuzzy matching search that can be accessed through an API.

    There is a wide selection of open source tools available to help with name reconciliation, such as Elasticsearch, OpenRefine, and Dedupe.io (a service built around a free Python library). When people have entries in multiple national databases, different transliterations of their names can be recorded. OCCRP has developed a list of ‘synonames’ (soundalike names) that helps address this, but reconciling individuals based on name remains a difficult problem.
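
    To show why name reconciliation needs more than an exact comparison, here is a small sketch using only the Python standard library. It is a stand-in for the dedicated tools named above, not a substitute for them: difflib gives a crude character-level similarity, and the normalisation steps are assumptions chosen for the example.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise_name(name):
    """Strip accents, case and punctuation, and sort tokens so word order is ignored."""
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in unaccented.lower())
    return " ".join(sorted(cleaned.split()))


def name_similarity(first, second):
    """Return a score between 0 and 1; higher means more alike after normalisation."""
    return SequenceMatcher(None, normalise_name(first), normalise_name(second)).ratio()


same_person = name_similarity("José Martínez", "Martinez, Jose")
different_people = name_similarity("Jane Doe", "John Smith")
```

    Accent stripping and token sorting deal with some spelling and ordering differences, but not with genuinely different transliterations of the same name, which is where a resource such as a list of soundalike names becomes useful.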

    These databases will not cover procurement officers, and require additional data creation and maintenance work in a country. However, as these people are state employees, there is the prospect of tying into existing HR or payroll systems to automate generating the list, and also having access to more sensitive personal identifiers such as identity or tax numbers.

    Where the intention is to release the list publicly (such as Mexico’s planned SESNA datasets of public servants involved in procurement, and those who are sanctioned), the identity fragment approach could be used to aid reconciliation with other datasets without releasing this personal information.

    Where unique IDs can be established for both sides of the process, this makes lookups far more efficient. Where they can’t, the process should be designed to be more likely to produce false positives, which can then be further investigated, than false negatives. This also raises the importance of how the overall system is designed. While automated screenings can be built into tools for procurers deciding between contracts, enhanced scrutiny of contract winners is less time consuming than screening all those who sign up to a supplier portal.
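
    One way to express this design in code is to try an exact identifier lookup first and fall back to a deliberately loose name comparison whose results go to a human reviewer. The sketch below assumes a hypothetical national_id field and an arbitrary threshold of 0.8; both would need to be chosen for the local context.

```python
from difflib import SequenceMatcher

REVIEW_THRESHOLD = 0.8  # assumed value; kept low on purpose to favour false positives


def screen_owner(owner, peps):
    """Return a (status, records) pair where status is "match", "review" or "clear"."""
    # Exact lookup when both sides carry a shared unique identifier.
    if owner.get("national_id"):
        by_id = [p for p in peps if p.get("national_id") == owner["national_id"]]
        if by_id:
            return "match", by_id
    # Otherwise fall back to a loose name comparison and hand results to a human.
    candidates = [
        p
        for p in peps
        if SequenceMatcher(None, owner["name"].lower(), p["name"].lower()).ratio()
        >= REVIEW_THRESHOLD
    ]
    return ("review", candidates) if candidates else ("clear", [])


# Hypothetical PEP list: only one entry carries an identifier.
peps = [
    {"name": "Jane Doe", "national_id": "A1"},
    {"name": "Jon Smyth"},
]
id_result = screen_owner({"name": "J. Doe", "national_id": "A1"}, peps)
name_result = screen_owner({"name": "John Smith"}, peps)
clear_result = screen_owner({"name": "Alice Brown"}, peps)
```

    The "review" status is the important part of the design: a near match on name alone is treated as a lead to investigate rather than as a finding.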

    Representing the data

    While the data standard that has most use in beneficial ownership is the Beneficial Ownership Data Standard (BODS), this is not the most appropriate format for PEP data.

    Currently where this data exists it is in a variety of CSV or JSON based formats. The ideal scenario is that PEP information is published in a common standard, so that multiple data sources can be easily combined in an analysis tool.

    A good candidate for this is the Popolo data standard. This is a standard designed to hold information about elected politicians and legislatures, which makes it useful for holding lists of PEPs. It can store information on when particular people hold particular offices, allowing it to act as a repository of older information for comparisons several years after the fact, as well as having the ability to store multiple names and identifiers that might aid reconciliation.
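
    As a sketch of how this time-based lookup might work, the example below uses simplified records with Popolo-style property names (person, post and membership, with start_date and end_date). The data is invented, and the records are far from complete; the Popolo specification should be checked for the full set of properties.

```python
from datetime import date

# Simplified, invented records using Popolo-style property names.
persons = [{"id": "p1", "name": "Jane Doe", "other_names": [{"name": "J. A. Doe"}]}]
posts = [{"id": "post-finance", "label": "Minister of Finance"}]
memberships = [
    {
        "person_id": "p1",
        "post_id": "post-finance",
        "start_date": "2015-06-01",
        "end_date": "2018-03-31",
    },
]


def holders_on(post_id, day):
    """Return ids of people holding `post_id` on `day`.

    Dates are full ISO strings (YYYY-MM-DD), so plain string comparison works;
    a missing end_date is read as "still in office".
    """
    iso = day.isoformat()
    return [
        m["person_id"]
        for m in memberships
        if m["post_id"] == post_id
        and m.get("start_date", "0000-01-01") <= iso <= m.get("end_date", "9999-12-31")
    ]


in_office = holders_on("post-finance", date(2016, 1, 1))
after_office = holders_on("post-finance", date(2019, 1, 1))
```

    Because the membership keeps its dates after the person leaves office, the same data supports both current screening and retrospective checks against contracts awarded years earlier.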

    mySociety’s (currently paused) EveryPolitician project uses this standard, which makes it useful as a source of global PEP information (it is used in, for instance, Global Witness’s investigation of the UK Persons of Significant Control dataset). The standard was also used by the Sinar Project’s Telus tool in Malaysia as a repository of PEP information. FATF recommend that countries should compile a list of domestic positions/functions that are considered prominent public functions to aid determinations of whether a particular person holds a PEP-qualifying role. This could also similarly be released in Popolo format, using just the Post structure.

    Alternatively, where the process is less of a lookup between two lists, and more an investigation of individuals who are beneficial owners, BODS has an optional field recording whether, and if so why, someone qualifies as a politically exposed person. This could be collected as part of a verification process, with information reviewed for relevance by decision makers.

  3. Collecting and making use of beneficial ownership data

    Header image: Photo by Markus Winkler on Unsplash

    There are three steps to working with beneficial ownership data: collection, verification and analysis. These three areas interact – how and when data is collected affects how viable different methods of verification are, and both of these in turn affect what forms of analysis are possible.

    While collection of beneficial ownership data does not have to be part of the procurement process (for example, if there is already a national register), requirements for bidding or winning public contracts are good pressure points to require disclosure. The following diagram shows a bird’s eye view of how ownership data (green lines) might be collected by different government agencies as part of the corporate lifecycle and the government contracting process.

    Where ownership data fits in the company lifecycle and contracting process

    In an ideal world, beneficial ownership information would be available and accurate at all of these points. But realistically, choices must be made over when and where to introduce beneficial ownership data collection and how to resource verification. Such choices will have an effect on the scale and timeliness of the data collected, however, as we can explore with the following diagram:

    Potential ownership collection points

    For instance, if a goal is to check for bidding cartels in the process of judging the bid, this information can be collected at any point: when companies are formed; when they register as a supplier; or when they make a bid. If companies submit multiple bids, it reduces duplication if information is collected sooner. However, this also increases the potential time between submission and analysis, and so requires an update process to avoid information becoming out of date.

    Collecting at different points also makes a difference to the size of the database. Each successive capture point is collecting a smaller sample of organisations. This affects analysis in two ways: scope and accuracy. The more companies covered in the dataset, the more forms of analysis become possible. If you have only collected information on bid winners, you cannot investigate bidding cartels. If your collection only includes the beneficial ownership of registered suppliers, you cannot identify the ‘sibling’ entities (which are not in the direct ownership chain of a company, but are owned by the same owners)  in a corporate structure that might contain hidden debts.

    While collecting data about more companies does not inherently make data more inaccurate, there is an indirect effect in that collecting more information raises the overall cost of verification.  Collecting information about all companies creates a much larger dataset to verify than just those who win contracts. If verification resources are not increased accordingly, the dataset will have a much larger scope, but be less accurate, and so the resulting analysis may be less useful.

    This inaccuracy may have a greater effect on fraud/anti-corruption analysis than its overall incidence in the dataset would suggest. This is because while some errors will be accidental, some will have been deliberately introduced to disguise ownership. Analysis that is only possible with large amounts of data may not be possible in practice if the database is not supported by a strong verification system.

    To provide two different models, the UK’s Persons of Significant Control (PSC) register requires declarations as part of a company’s annual statement. With a few exceptions, it includes all companies registered in the UK. This data is submitted by the company without verification, as examining suspicious statements is a resource intensive problem across the entire jurisdiction (see the recent Global Witness report for a description of the verification problems in the UK PSC register).

    The Slovakian Register of public sector partners (RPVS) requires beneficial ownership information be submitted only before a high value contract is awarded and so has a much narrower scope. However, there is a much stronger verification process, with third parties (generally legal offices) submitting the information and the process they used to reach it, with an in-country individual held legally responsible for the accuracy of the data. Methods and useful concepts in the verification process are explored in more detail in an OpenOwnership briefing.

    Depending on the specifics, smaller databases (with verification) may lead to more basic, but more accurate, analysis. The calculation made in Slovakia, for instance, is that there is less need for using data as part of the procurement process if the post-award checks are very good, because raising the chances of being caught raises the costs of cheating. On the other hand, this misses out on the prospect of identifying cartels. Disclosures may be accurate while the procurement process is still distorted.

    Each expansion in the number of companies included does not expand the process in the same ways. While smaller registers may be cheaper to verify, new forms of analysis may open up with small increases in size. For instance, if the overall number of companies participating in bids is not much larger than those winning bids, the additional compliance costs may be negligible and allow the possibility of cartel analysis.

  4. What is beneficial ownership?

    Header image: Omar Flores on Unsplash

    The idea of beneficial ownership is meant to address the problem that the official directors and board of a company may be different from the true owner or controller.

    Without knowing the true owners of a business, you cannot understand who benefits from or controls its activities. In a procurement context, without beneficial ownership information about suppliers, it can be difficult to detect organised corruption or conflicts of interest.  Greater knowledge of ownership and control can give greater insight into supply chains and product quality. In the case of government contracts, collecting and using beneficial ownership data can have a very real impact on ensuring state funding is directed towards legitimate, high quality services and infrastructure for citizens.

    Someone need not even be an ‘owner’, in the sense of holding a significant proportion of shares, to have ‘control’ over a company. They might own no shares but still exercise control through a right to appoint board members. In most cases they would still be considered beneficial owners of the company.

    Where this becomes interesting is when companies are owned not just by ‘natural’ (real) people, but also by other companies. For some companies, this can result in long chains of ownership, with many levels of companies owning other companies. But sooner or later, all ownership chains must terminate in real people, not corporate entities – those people are the beneficial owners.
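
    Following a chain of ownership until it reaches real people is a graph traversal. The sketch below is a minimal illustration with an invented ownership structure; the owners_of mapping and the set of people are assumptions for the example, and real data would also need to handle owners that are unknown or undisclosed.

```python
def ultimate_owners(entity, owners_of, people, _seen=None):
    """Follow ownership links upwards until only natural persons remain."""
    seen = _seen if _seen is not None else set()
    if entity in seen:
        # Guard against circular ownership (A owns B, B owns A).
        return set()
    seen.add(entity)
    found = set()
    for owner in owners_of.get(entity, []):
        if owner in people:
            found.add(owner)
        else:
            found |= ultimate_owners(owner, owners_of, people, seen)
    return found


# Invented structure: Supplier Ltd is owned via two layers of holding companies.
owners_of = {
    "Supplier Ltd": ["HoldCo A", "Ana"],
    "HoldCo A": ["HoldCo B"],
    "HoldCo B": ["Ben", "Ana"],
}
people = {"Ana", "Ben"}
beneficial_owners = ultimate_owners("Supplier Ltd", owners_of, people)
```

    The result lists only the people at the ends of the chains, however many intermediate companies sit between them and the supplier.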

    Tools for visualising beneficial ownership structures are still quite varied, but most attempt to represent ownership as a network, with companies and people as nodes:

    Diagram showing connections between companies and their eventual beneficial owners

    An alternative approach is to think about only the ultimate owners in a chain. This can be particularly useful when you need to make quick decisions about who owns or benefits from a given company, regardless of how many ‘steps’ they are removed from the company itself:

    Diagram showing the same network, but with ownership information displayed separately

    Perfect vs practical definitions

    A broad definition of beneficial ownership (such as ‘deriving significant benefit from or having control over a company’) is useful for an investigator trying to understand whether specific individuals can be said to be beneficial owners of an organisation. It is less useful when an organisation is being asked to declare who their beneficial owners are. This requires concrete disclosure requirements that may approach, but are likely to fall short of, a broad definition. For instance, it might be decided that stockholders who have more than 25% of voting rights qualify for disclosure. In the terminology used by the World Bank/StAR Puppet Masters report, this is a “formal” rather than “substantive” approach to understanding the beneficial ownership of companies.
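
    A formal threshold like this can be applied mechanically. The sketch below assumes one common simplification, that an indirect stake is the product of the shares along each ownership path, and uses invented figures. As the rest of this post stresses, a calculation like this says nothing about control exercised by other means.

```python
THRESHOLD = 0.25  # the example figure from the text; real thresholds vary by jurisdiction


def effective_stakes(entity, stakes, people, weight=1.0, totals=None, path=()):
    """Sum each person's stake by multiplying shares along every ownership path."""
    totals = {} if totals is None else totals
    if entity in path:
        # Skip circular holdings rather than looping forever.
        return totals
    for owner, share in stakes.get(entity, {}).items():
        if owner in people:
            totals[owner] = totals.get(owner, 0.0) + weight * share
        else:
            effective_stakes(owner, stakes, people, weight * share, totals, path + (entity,))
    return totals


# Invented figures: Ben and Cara hold their stakes through HoldCo.
stakes = {
    "Supplier Ltd": {"HoldCo": 0.6, "Ana": 0.4},
    "HoldCo": {"Ben": 0.7, "Cara": 0.3},
}
people = {"Ana", "Ben", "Cara"}
totals = effective_stakes("Supplier Ltd", stakes, people)
declarable = sorted(person for person, stake in totals.items() if stake > THRESHOLD)
```

    Here Cara's indirect stake works out at roughly 18%, so she falls below the threshold and would not be declared, even though she might still benefit or exert influence. That gap is the difference between a formal and a substantive definition.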

    When talking about beneficial ownership, it is important to keep in mind this distinction between the concept of a beneficial owner and the inherently imperfect ways of identifying them. Better management of procurement risks means knowing more about who benefits from a company receiving a contract. But on the other side, the people hoping to subvert the process will want to maintain secret ownership ties in order to control or benefit from the company.

    Closing the gaps in knowledge with additional beneficial ownership disclosure addresses the current state of evasion, but not how dishonest actors will react to new requirements. Introducing new requirements will address some amount of fraud and corruption, but also creates a strong incentive to find new ways to conceal conflicts of interest. This arms race dynamic means there is no one ‘good’ formal definition of beneficial ownership, but a number of different criteria that need to react to the practices of concealment in evidence in a country at a particular time.

    As such, the best way to think about the long term impact of beneficial ownership on public procurement is not as a silver bullet, but as a tightening net. Future escalations may involve changed definitions, or improving the means by which information is validated. Underlying tools and standards need to be flexible to a range of national contexts, as well as a potential for change over time.

    Beneficial ownership is part of a solution to several different problems

    Several different frameworks promoted by inter-governmental bodies or international transparency/anti-corruption groups push towards more collection of beneficial ownership information.

    The Extractive Industries Transparency Initiative (EITI) required as part of their 2016 standard that all participating countries mandate the disclosure of beneficial owners within extractive industries (oil, coal, gas, mineral extraction), and recommend publication in public registers or through the country EITI report.

    The Financial Action Task Force (FATF) 2012 recommendations stress the importance for financial institutions of identifying the beneficial owner as part of customer due diligence when establishing a new business relationship, and of applying enhanced due diligence if a beneficial owner is also a politically exposed person (PEP). While not calling for an open register, they do recommend that ‘competent authorities’ have timely access to accurate beneficial ownership information.

    The Open Government Partnership (OGP) supports the Beneficial Ownership Leadership Group with the aims of strengthening disclosure requirements and verification processes,  supporting a common data standard and allowing public access to enable citizen monitoring. Over 40 countries (including Mexico, South Africa and Indonesia) have incorporated commitments related to beneficial ownership transparency in their OGP plan.

    On the practical side of how international ownership data should be processed and stored, OpenOwnership is an organisation with the goal of making beneficial ownership data more widely available through technical development, partnerships and research. They are the key developers of the BODS data standard and host a global open registry of beneficial ownership data.

    More directly related to public procurement, as part of their COVID-19 response, the International Monetary Fund (IMF) has asked countries requesting emergency assistance to make commitments to publish information on the contracts with and the beneficial owners of companies benefiting from the emergency funds.

    Ownership in public procurement

    The problem beneficial ownership data can address in public procurement is corruption or subversion of the procurement process, but it also has a bearing on procurement efficiencies, risk profiling and enactment of preferential procurement policies.

    Making beneficial ownership data available to procurement officers helps them discriminate between bidders for work in a current procurement process. For instance, a problem described by several interviewees in our research on this area is bidding cartels. This is where multiple bidders (who are in reality controlled by the same owner) coordinate to drive up the price and raise the chances of winning. Knowing more information about the ownership of the companies in this bidding cartel would make it easier to detect.
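
    With ownership data to hand, a first-pass check for this pattern is to look for pairs of bidders on the same tender that share a beneficial owner. The sketch below uses invented tenders, bidders and owners, and assumes the owners have already been resolved to consistent identities, which in practice is the hard part.

```python
from collections import defaultdict
from itertools import combinations


def shared_owner_bids(bids, owners):
    """Return (tender, bidder_a, bidder_b, common owners) for bidders sharing an owner."""
    by_tender = defaultdict(set)
    for tender, bidder in bids:
        by_tender[tender].add(bidder)
    flags = []
    for tender, bidders in sorted(by_tender.items()):
        # Compare every pair of bidders on the same tender.
        for first, second in combinations(sorted(bidders), 2):
            common = owners.get(first, set()) & owners.get(second, set())
            if common:
                flags.append((tender, first, second, common))
    return flags


# Invented bids (tender id, bidder) and beneficial owners per bidder.
bids = [
    ("T-1", "Alpha Ltd"),
    ("T-1", "Beta Ltd"),
    ("T-1", "Gamma Ltd"),
    ("T-2", "Alpha Ltd"),
]
owners = {
    "Alpha Ltd": {"Ana"},
    "Beta Ltd": {"Ana", "Ben"},
    "Gamma Ltd": {"Cara"},
}
flags = shared_owner_bids(bids, owners)
```

    Note that this check only works if ownership data has been collected for all bidders, not just winners, which ties back to the question of when in the process the data is gathered.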

    Better visibility of who is benefiting from public procurement contracts can be beneficial even when companies are behaving perfectly within rules. Entirely legitimately, a set of apparently independent companies may have won many bids. However, in reality these are part of a broader group with a set of common owners. Beneficial ownership data can make it easier to understand the connections between these companies (either because chains of corporate ownership have been revealed, or the final owners directly revealed). This can allow identification of where procurement contracts are ultimately flowing. Where beneficial ownership data is broadly available for organisations, this also allows identification of other businesses in which owners have an interest. This can be used to risk profile broader corporate structures.

    Explicitly collecting the data required to catch violations of existing rules can also create a chilling effect, by making potential bad actors aware of the scrutiny that may be given to the information, especially if combined with more effective enforcement. A government official (elected or otherwise) with power over the procurement process may have significant involvement in a company bidding for a contract, but this fact would be undeclared and invisible on official paperwork. Greater visibility of the beneficial owners of these companies leaves fewer places to hide, and raises the risk of detection and costs of attempting to subvert the process.

    See all posts in this series.

     

  5. Improving access to information in Europe: everyone’s a lottery winner

    We’re delighted to announce that we’ve received funding from the Swedish Postcode Foundation that will help us extend our work on Freedom of Information in Europe.

    The Foundation uses proceeds from the country’s lottery sales to help fund projects that support democracy and freedom of speech, as one of three areas where they believe they can help bring about long term positive change to the world.

    The connection is particularly apt, as it was in Sweden that the world’s first FOI law was passed in 1766. From that beginning grew a worldwide good: since then, access to information has been recognised as a fundamental right by the European Court of Human Rights, and has been adopted in countries around the globe.

    Matched up

    In May 2019 we received funding from Adessium Foundation for a three-year project to increase access to online FOI tools across Europe. The ultimate aim is to enable journalists, campaigners and citizens in Europe to make greater and more effective use of their right to access information; and in particular to generate public interest stories and campaigns that will hold power to account.

    Now this new match funding will allow us to dig further and build better within the main elements of the project, which are:

    • To help partners to launch new FOI sites in the Netherlands, France (already completed) and another jurisdiction (coming soon).
    • To upgrade existing sites to include the Alaveteli Pro functionality: AskTheEU already has this and five others will gain it shortly. By 2022 there’ll be 13 Alaveteli sites in Europe, 10 of which will have Pro.
    • To improve the Alaveteli Pro software with new features that’ll make it a more powerful tool for investigations and campaigns (so far we’ve worked on exporting data from batch requests and enabling users to add links to news stories).
    • To support journalist and campaigning organisations to use Alaveteli tools as part of their investigations (such as Privacy International’s use of FOI in their investigation into surveillance technologies used by police in the UK).
    • To monitor government compliance with FOI, especially in the wake of the coronavirus pandemic.

    Get involved

    Now we can spread the goodness even further, so we’re planning to run some online training/learning activities around using Alaveteli tools as part of an investigation or campaign. If your work would benefit from this, and you live in an EU country with an Alaveteli Pro site, do get in touch.

    We’re also keen to partner with membership-based news or campaign organisations to run more pilot projects using our new Projects feature. If you have a project that could benefit from contributors helping to extract and analyse data from FOI responses, let us know.

    And finally: we’ll soon be starting to gather data about FOI compliance in different EU countries. If this is something that could benefit your work, register your interest and we’ll keep you posted.

    Image: Jonathan Brinkhorst 

  6. Introducing WhatDoTheyKnow Projects

    With the aim of making large scale Freedom of Information investigations easier for community newsrooms and campaigning organisations, we’ve spent the first half of 2020 developing collaboration tools for WhatDoTheyKnow to speed up and bring others into the FOI management process.

    In an initial pilot, 17 contributors saved a journalist 6.5 hours by taking on half of the work of managing responses to requests.

    We’re actively looking to partner with membership-driven news organisations or impactful campaign groups to run further pilot projects to help refine the features. If that’s you, please get in touch.

    FOI can be hard without dedicated tools

    We know FOI can be hard work, especially when you make large batch requests that return a huge amount of data.

    While our Pro tools make life easier, much of the work simply involves triaging whether you got a response or just an automated acknowledgement, and whether the authority actually released the information you requested.

    After that, you then need to sift through various different formats of data, different understandings of the questions, and follow up with clarifications.

    All this comes before you can start analysing the data to build up a narrative for a story.

    A compelling membership proposition

    News organisations are increasingly looking for sustainability by offering memberships – where you pay a monthly fee to support the organisation – instead of relying on advertising revenue to support themselves.

    Memberships are still a relatively unproven and unexplored area, and organisations are still in the process of discovery over what makes someone want to pay for their news output. Is it just being able to read the stories, or do people want more involvement?

    There’s evidence to suggest that members do want to get more involved.

    Crowdsourcing some of the work of the FOI process from the membership presents an opportunity to help take some of the load off journalists, while also bringing members into the reporting process so that they value the final output more.

    Many hands make light work

    With this new functionality, once you’ve made your requests – either individually or as part of a batch – they can be added to a Project. Contributors can then be invited to the project where they are briefed on what the project is about and the tasks they can help with.

    Screenshot of Project Homepage

    Helping to classify responses

    When you’re making FOI requests, each response to each request needs to be read to establish whether the authority has provided the information asked for – a process that is difficult to automate, given the huge variety of language that can be deployed by authorities. With large batch requests this can be a time-consuming process.

    Projects creates a pool of responses that need classifying that contributors can work through to take some of the onus off the project owner.

    2up of Project Classify page

    Contributors read the original FOI request and latest response, and then classify its current status appropriately. This doesn’t take much specialist understanding of FOI, so it’s a really easy way to get lots of people to help out.

    Helping to extract data

    In larger FOI investigations requesters are usually looking to build up a dataset so that they can compare responses from different authorities.

    This usually involves lots of spreadsheets, copy & paste, and hours of hard work.

    Projects provides dedicated tools to help build this dataset by creating a pool of requests that contributors can extract data points from using structured forms.

    Allowing contributors to help build up a dataset that will be used for real-life reporting and research helps them feel more directly involved and connected to the organisation, hopefully adding value to the membership proposition.

    Screenshot of Project Extract page

    Project owners are then able to download the crowdsourced dataset to investigate, using their analysis tools of choice.

    Screenshot of downloaded Dataset

    What we learned from our pilot

    In our pilot project contributors took on 50% of the classification tasks, accounting for 57% of the 14.8 hours overall spent classifying, saving the journalist around 6.5 hours of the administrative work required before she could start reviewing the data releases. This is a clear indication that crowdsourcing key parts of the FOI investigation process can save a significant amount of time.

    The journalist we worked with was enthusiastic about using the Projects interface again in the future, even if she wouldn’t be inviting external contributors. She said it would be ideal for collaborating with interns to help sift through classifications and responses.

    With an 82% conversion rate from joining to taking action, and nearly 40% of contributors returning for more than one session, there’s clearly an appetite from contributors to get involved and help out. The contributors we interviewed understood that by helping with menial tasks, they were allowing the journalist more time to focus on work which required specialist expertise.

    A potential for global benefit

    Through the Nesta Future News Fund we worked with openDemocracy to design and develop WhatDoTheyKnow Projects to support this collaboration, and ran a pilot collaborative project made up of a batch of over 800 FOI requests.

    Projects is of course built into Alaveteli – the platform that powers WhatDoTheyKnow and many other FOI sites around the world, so it’s not just going to be of use in the UK, but for every jurisdiction where an Alaveteli site is utilising the Pro add-on.

    Image: Duy Pham

  7. FOI and COVID-19: an update

    While the UK begins the process of trying to return to some kind of normality after lockdown, full access to information must also be restored.

    Back in April, we put out a blog post examining the state of Freedom of Information during the covid-19 crisis, looking at the UK and more broadly across the world. State-sanctioned delays were seen almost universally.

    While we understood the difficulties faced by authorities redeploying staff members to the frontline, we said then that the right to information was perhaps more vital than ever. In times of national crisis, transparency is crucial both for retaining trust in our leaders and for keeping check on their activities.

    WhatDoTheyKnow users have been asking pertinent questions about the pandemic, from requests for data on the number of cases in prisons and care homes, to the basis on which decisions about the national response strategy have been made. Potential students want to know about universities’ plans for the coming year; citizens are asking about measures put in place by their councils to encourage social distancing. And meanwhile, of course, requests for non-coronavirus-related topics are equally pressing: who’s keeping an eye on Brexit, or making sure the climate crisis doesn’t slip off the agenda, for example?

    The state of play

    We’ve been linking to that initial post from the top of WhatDoTheyKnow, so that people making requests could get some background to the delays they might be experiencing.

    But since then the global situation has moved on, and so have some aspects of FOI provision. At the time of writing:

    • The Information Commissioner’s Office (ICO) is still stating that they “will not be penalising public authorities for prioritising other areas or adapting their usual approach during this extraordinary period.” Therefore, UK public authorities may still delay their requests without penalty. Read more on the ICO website.
    • The Scottish Information Commissioner had previously overseen a change that permitted [see below for clarification] a longer period in which authorities might respond to requests, but on 27 May a reversal came into effect and the period returned to its standard 20-day deadline. However, there is still an acknowledgement that the pandemic, and indeed their own previous relaxation of the required timescales, may have a knock-on effect on requests made before that date. See full details here.

    This does raise the question as to when the ICO foresees a return to business as usual. Of course, each authority will have its own experiences and challenges, with varying reasons for maintaining or removing an expectation of delayed responses. But they are guided by the regulator, and while the ICO continues to excuse lengthened response times, authorities may not hurry to do anything different.

    UPDATE: A representative from the Scottish Commissioner’s Office contacted us with the following clarification:

    The changes in timescales under the FOI (Scotland) Act came about because the Scottish Parliament passed emergency legislation to change the timescales – they were not introduced by the Commissioner. Our position prior to the change in the law was set out in a statement we issued, and our comments, including concerns raised, on the legislation when it was introduced can be read here.

    We’ve also sought to emphasise the importance of the duty to respond promptly, even during the period when the deadlines were extended, as set out in our guidance for requesting information during the pandemic. We think it’s important that requesters know their rights, and the right to a prompt response (not just one within 20/60 working days) is something that has remained consistent for FOI users throughout the pandemic.

    Time to restore oversight

    It’s unquestionably a time of great uncertainty for us all, with many returning to some semblance of normality while still unsure whether the much anticipated second peak is on the horizon. But given a national policy of this staged return, should the ICO not, like its Scottish counterpart, be encouraging authorities to do the same?

    One compelling reason is hinted at by the Scottish Commissioner’s own caveat: that the longer the deadlines are allowed to extend, the more of a backlog will build up, causing further delays down the line.

    We’d encourage authorities everywhere to re-examine any laxity they may have introduced at the start of lockdown, and to continue to do so regularly: is it still genuinely necessary now that staff may have been moved back from the covid-19 frontline?

    And we’d urge them to treat the need for a timely, efficient FOI service as one of the top priorities during this uncertain period.

    Image: Andrea Piacquadio

  8. Who’s checking your Facebook profile?

    If you were putting in a claim for benefits, challenging an accusation in court or phoning in sick to your employer, would you expect your local authority to be checking your social media presence?

    How do you think a stranger might assess you as a parent, were they to skim over any public posts on your Facebook page? If you’ve been on a protest recently, would you be comfortable knowing that your local council was combing through any photos you’ve shared?

    A Freedom of Information investigation by Privacy International, using WhatDoTheyKnow Pro, has discovered that a significant number of local authorities — 62.5% of those responding to their FOI requests — habitually monitor citizens’ Facebook or other social media profiles to gather intelligence.

    What’s more, the majority have no policy in place or measurement of how often and to what extent these investigations occur.

    If this concerns you, the first thing you should do is check that your social media privacy controls are up to date. Then you might like to go and read Privacy International’s full report, as well as checking how (or whether) your own local authority has responded to their requests for information.

    And finally, you can join Privacy International’s call for stronger guidelines from the Investigatory Powers Commissioner.

    Just… maybe think twice about putting it in a public Facebook post?

    We’re only joking, of course. Or half joking.

    Issues like this need to be shared far and wide. But as Privacy International point out, there are already sobering instances from abroad of threats to those following anti-government accounts. With so many completely unexpected changes to the status quo recently, can we say for certain that it could never happen here?

    Image: John Schnobrich

     

  9. Learn everything you need to know about FOI, online

    Investigative journalism platform The Ferret has just launched an online training course on using Freedom of Information — and all trainees get a free subscription to our WhatDoTheyKnow Pro service for professional users of FOI.

    Based in Edinburgh, the Ferret is a community journalism initiative that describes itself as ‘for Scotland and beyond’. Since 2012 its members’ investigations have rooted out the truth around local, national and international issues including coronavirus, Brexit, dark money —  and much more. They’re a co-operative, so supporters become part-owners. If they want to, they can also access the resources and training to pursue their own stories.

    And now, the Ferret’s online Freedom of Information course shares everything the founders know about the use of FOI for tracking down facts. This resource would be useful for anyone wanting to know the ins and outs of the act and how to use it, not just for journalism but potentially for campaigning or research purposes too. And it’s not just restricted to the use of FOI in Scotland: you’ll learn everything you need to know to use FOI across the UK… and beyond.

    The course costs £30, but six months’ WhatDoTheyKnow Pro usage is bundled in. Since that’s worth £60 on its own, you’re ahead before you even begin.

    We’re big fans of the Ferret at mySociety, and we have every confidence that this course will be a springboard for a new generation of great investigative journalists. If you think you might like to be one of them, then why not give it a try? More details here, and in this Twitter thread.

    Image: ConvertKit

  10. WhatDoTheyKnow in Wikidata

    We were glad to see this recent tweet from Andy Mabbett:

    Andy has imported the IDs of every authority listed on our FOI site WhatDoTheyKnow into Mix’n’match, a tool for helping to link a dataset with existing Wikidata entities. Once a match has been made, the URL of the body’s WhatDoTheyKnow page is available as one of its identifiers (specifically, P8167).

    This means that anyone running a project that utilises Wikidata will have the option to include WhatDoTheyKnow data in their site or app.
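
    As a sketch of what that could look like in practice, the snippet below builds a query for the public Wikidata Query Service that lists items carrying the P8167 identifier. Only the property number comes from this post; the function names, variable names and result limit are illustrative assumptions, and the code only constructs the request URL rather than sending it.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
# The WhatDoTheyKnow identifier property mentioned above.
WDTK_PROPERTY = "P8167"


def build_query(limit=10):
    """Return a SPARQL query for items that have a WhatDoTheyKnow ID."""
    return (
        "SELECT ?item ?itemLabel ?wdtkId WHERE {\n"
        f"  ?item wdt:{WDTK_PROPERTY} ?wdtkId .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(limit=10):
    """Return a GET URL asking the endpoint for JSON results."""
    params = {"query": build_query(limit), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


print(build_query(5))
print(build_request_url(5))
```

    The resulting URL can be fetched with any HTTP client; a project would then read the matched identifiers from the JSON bindings in the response.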

    Andy says, “Wikidata acts as a hub for all sorts of databases and identifier systems. For example, it can be the only way of linking (programmatically, in the linked data sense) an MP’s official parliamentary record to their IMDb entry. I do a lot of work making that happen. As a regular and satisfied user of WhatDoTheyKnow, it appealed to me to add that site’s 24.5K listings of UK public bodies to the mix.”

    The best-known site relying on Wikidata is of course Wikipedia, so in theory it would now be feasible, say, to include a template that automatically pulled the relevant WhatDoTheyKnow link into Wikipedia articles about authorities, or to build a browser extension that provided those links when the user visited such articles.

    It would also be possible for us to pull information back the other way, so for example we might consider importing the first paragraph of a Wikipedia page for a body and using it within the introduction, as a way of providing context.

    The matching of WhatDoTheyKnow authorities confirms which Wikidata URI (Uniform Resource Identifier) relates to each, meaning that these can now be used in “sameAs” metadata headers, schema.org markup, etc. We think this might have a beneficial effect on the way search engines treat our pages in the future, and we’ll be keeping an eye on whether that proves true.
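
    For illustration only, this is roughly what schema.org markup with a “sameAs” link to Wikidata might look like when generated as JSON-LD. The body name, page URL and Wikidata ID below are placeholders, and the choice of the GovernmentOrganization type is our assumption rather than a description of what WhatDoTheyKnow actually outputs.

```python
import json


def body_jsonld(name, page_url, wikidata_id):
    """Build a schema.org JSON-LD description of a public body."""
    return {
        "@context": "https://schema.org",
        "@type": "GovernmentOrganization",  # assumed type for illustration
        "name": name,
        "url": page_url,
        # sameAs points at the matched Wikidata entity URI.
        "sameAs": [f"http://www.wikidata.org/entity/{wikidata_id}"],
    }


# Placeholder values: not a real body or a real Wikidata item.
markup = body_jsonld(
    name="Example Council",
    page_url="https://www.whatdotheyknow.com/body/example_council",
    wikidata_id="Q00000",
)
print(json.dumps(markup, indent=2))
```

    The printed JSON would typically be embedded in a page inside a script element of type application/ld+json.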

    Additionally, this works as a nice proof of concept that we can potentially recommend to other Alaveteli sites around the world, given that the Wikidata project is, of course, international.

    But first, the bodies need to be checked with the Mix’n’match tool. At the time of writing, 1,302 bodies have been resolved, and can be seen here. Anyone is welcome to help by confirming more matches: just log in with a Wikimedia account.

    Thanks to Andy for this initiative — it’s great to see the potential of our data being widened in one fell swoop.

    There has already been a mutual benefit to this linking. WhatDoTheyKnow volunteer Matt has been able to use examples of failed matches to find cases where our database needed to be brought up to date with name changes. At the same time, Andy says it has helped him and his fellow Wikidata volunteers to create new items about councils and other bodies that were in WhatDoTheyKnow but not Wikidata.

    Richard, also one of WhatDoTheyKnow’s volunteer team, says, “I’ve often thought there’s a lot of overlap between what we do on WhatDoTheyKnow and what Wikipedia volunteers are doing — we’re both maintaining lists of public bodies — so any tools for closer collaboration are great.”


    Image: Carl Nenzen Loven