1. Participation vs representation: Councillor attitudes towards citizen engagement

    In 2019, mySociety was involved in several projects working with local councils around using participatory or deliberative democracy to address a local issue (Public Square and the Innovations in Democracy Programme). Something that kept coming up at the fringes of these projects was the set of political considerations that led councils to find the idea of alternate forms of democracy appealing in the first place.

    Understanding more about this seemed important to the future spread of these ideas, and so as part of the Public Square project, we set out to find out how local councillors viewed ‘new’ forms of democracy, and how these views varied by the political situation of the councils and of the councillors themselves.

    Using a survey of local councillors, we tried to learn about levels of awareness of, and attitudes towards, deliberative or participative exercises. We found that partisan and structural factors shape local representatives’ perceptions of citizen participation, and that there is a wide range of views among local councillors. Some were supportive of more weight being put on citizen participation, while others argued that if decisions are made by elected councillors there is someone to hold accountable. Both awareness of and support for participatory methods increased if there was local experience of an exercise. Even opposition councillors tended to be quite supportive (76%) of participatory processes when run by the current leadership of the council.

    There was more disagreement over how the outcomes of these processes should be handled. Very few councillors favour approaches where the result is authoritative or binding. Councillors in councils where no one party has an overall majority are more likely to give greater weight to participatory exercises (59%) than those where there is a single-party majority (38%). Every policy area except Children’s Social Care had over 50% acceptance that a participatory exercise could be appropriate. Environmental and cultural programmes rated highly, while programmes concerning social care scored lower. For all categories except planning and public health, councillors rated these activities as more appropriate if their council had previously engaged in such an exercise.

    Overall, this survey told us that councillors make personal evaluations of participatory exercises based on a mix of political and practical factors. While there is a tension between participatory and representative democratic structures, in practice this tension can lead to a variety of outcomes. The success or failure of future participation depends on understanding how this tension affects not just the form of deliberative exercises, but also how results will be interpreted and implemented.

    The full report can be read online or downloaded as a PDF.

    Image: Lucas Benjamin

  2. Tracking carbon for mySociety

    As we explore projects where mySociety can help address the climate crisis, as an organisation we’ve also been trying to understand the carbon impact of our existing work.

    Using Code for Australia’s carbon calculator as a really helpful guide and starting point, we’ve estimated mySociety’s 2019 and 2020 carbon footprints.

    In 2019 this was 74 tonnes of CO2. So far in 2020 it is substantially lower, at around 23 tonnes, as you’d expect in a year that includes several months of lockdown.

    It’s proving frustratingly difficult to place these figures in context: even while using their methodology, we can’t accurately compare the outcome to Code for Australia’s given their very different geographical situation and activities; and as a remote organisation where all employees work from home, our footprint is always going to be different from more conventional set-ups. If you think your organisation bears similarities to ours, and you’ve also calculated your emissions, please do let us know!

    As for addressing our output, we are pushing a two-pronged approach. We’ve already changed staff policies to encourage more sustainable working methods and to ensure a significant reduction in our future emissions. And, having learned of disturbing failings in even the most-recommended offsetting services, we are currently researching where we might be able to make direct payments to mitigate the carbon we produce.

    mySociety 2019 carbon footprint
    Item                     Total CO2 (metric tonnes)   Percentage of total
    Flights                  40.663                      55.31%
    Accommodation            9.545                       12.98%
    Ground transport         6.198                       8.43%
    Electronics              0.695                       0.95%
    Servers – manufacture    5.120                       6.96%
    Servers – electricity    7.199                       9.79%
    Laptop – manufacture     1.655                       2.25%
    Laptop – electricity     0.475                       0.65%
    Catering                 1.967                       2.68%
    Everything else          0.002                       0.00%
    Total                    73.56                       100.00%

     

    The biggest contribution to carbon expenditure in 2019 was travel. mySociety is a distributed organisation, with staff all around the UK. While on a daily basis that means very little commuting, we do (or did pre-COVID) meet up frequently in teams, and three to four times a year the entire organisation convenes in one place. International research contracts that require onsite interviews can mean long-haul plane journeys, and travelling to the international events that we organise requires some air travel as well.

    As an organisation we produced 47 tonnes of carbon through travel in 2019, with 75% of that coming from a relatively small number of long-haul flights. The overall contribution of train travel is relatively low despite a large number of journeys (349). There were far fewer domestic plane journeys, but even so they accounted for almost as much carbon as train trips within the UK.

    Mode                  Journeys (one way)   CO2 (kg)   % CO2   Total distance   % Total distance   Average CO2 per journey (kg)
    Long distance plane   24                   35,297     75%     73,201           63%                2,941
    Short hop plane       31                   5,366      11%     11,938           10%                298
    Train                 349                  3,068      7%      24,035           21%                17
    UK plane              15                   2,156      5%      2,964            3%                 270
    Car                   39                   887        2%      1,359            1%                 39
    Bus                   25                   36         0%      397              0%                 3
    Eurostar              9                    29         0%      1,830            2%                 5
    Grand total           492                  46,839     100%    115,724          100%               181
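
    The percentage column in the travel table can be recomputed from the CO2 column. The following is a minimal sketch of that arithmetic: the figures are copied from the table (in kg, consistent with the 47 tonne total quoted above), while the variable and function names are illustrative and not taken from any mySociety spreadsheet or tool.

```python
# CO2 per travel mode in kg, copied from the travel table.
travel_co2_kg = {
    "Long distance plane": 35297,
    "Short hop plane": 5366,
    "Train": 3068,
    "UK plane": 2156,
    "Car": 887,
    "Bus": 36,
    "Eurostar": 29,
}


def co2_shares(co2_by_mode):
    """Return each mode's share of total CO2 as a percentage (0-100)."""
    total = sum(co2_by_mode.values())
    return {mode: 100 * kg / total for mode, kg in co2_by_mode.items()}


shares = co2_shares(travel_co2_kg)
total_kg = sum(travel_co2_kg.values())

# Print the shares rounded to whole percentages, as in the table.
for mode, share in shares.items():
    print(f"{mode}: {share:.0f}%")
```

    Adding the three kinds of flight together on the same basis gives roughly 91% of travel emissions from 70 of the 492 one-way journeys.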

     

    While for obvious reasons our 2020 travel costs are much lower, we are keen to avoid a return to the ‘old normal’.

    Over the last year, our policy towards ‘short’ plane journeys has changed. When staff do travel, if their destination can be reached within 7.5 hours door-to-door by train (or other forms of sustainable public transport) they should take this option rather than flying, except in mitigating circumstances around safety or accessibility.

    Additionally, if staff choose low-carbon holiday travel they are entitled to claim additional annual leave, as part of mySociety’s involvement in the Climate Perks scheme.

    Our wider environmental policy can be read on our website.

    Image: Providence Doucet

  3. Do photos help resolution of FixMyStreet reports?

    Summary

    FixMyStreet allows people to upload images along with a report. This can quickly provide the authority with more details of the issue than might be passed along in the written description, and lead to quicker evaluation and prioritisation of the repair. For problems that are hard to locate geographically by description (or where the pin has been dropped inaccurately), images might also help council staff locate and deal with the problem correctly.

    In 2019, 35% of reports included photos. Accounting for several other possible factors, reports with photos were around 15% more likely to be recorded as fixed than reports without a photo. In absolute terms, reports with photos were fixed at a rate two percentage points higher. This varies by category: in some categories photos have a much stronger positive effect (highways enquiries and reports made in parks and open space), while in others they have a small negative effect on resolution (reports of pavement issues and rights of way).

    In general, these results suggest that attaching photos is not only useful for authorities, but can make it more likely that reporters have their problem resolved. A significant caveat is that photos are much more useful for some kinds of reports than others. In terms of impact on the service, where photos can convey useful information that helps lead to a resolution, users should be encouraged to attach them. Where photos are less helpful (such as for problems encountered mostly at night), other prompts or asset selection tools may help lead to more repairs.


  4. Publishing less: our current thinking about comparative statistics

    Over the last few years we have stopped publishing several statistics on some of our services, but haven’t really talked publicly about why. This blog post is about the problems we’ve been trying to address and why, for the moment, we think less is better.

    TheyWorkForYou numerology 

    TheyWorkForYou launched with explicit rankings of MPs but these were quickly replaced with more “fuzzy” rankings, acknowledging the limitations of the data sources available in providing a concrete evaluation of an MP. Explicit rankings on the ‘numerology’ section of an MP’s profile were removed in 2006. In July 2020, we removed the section altogether.

    This section covered the number of speeches in Parliament this year, answers to written questions, attendance at votes, alongside more abstract metrics like the reading age of the MP’s speeches, and ‘three-word alliterative phrases’ which counted the number of times an MP said phrases like ‘she sells seashells’.

    This last metric was intended to make a point about the limits of the data, along with a disclaimer that countable numbers reflect only part of an MP’s job.

    Our new approach is based on the idea that, while disclaimers may make us feel we have adequately reflected nuance, we don’t think they are really read by users. Instead, if we do not believe data can support meaningful evaluations, or if it requires large qualifications, we should not highlight it.

    Covid-19 and limits on remote participation also mean that the significance of some participation metrics is less clear for some periods. We’re open to the idea that some data may return in the future, if a clear need arises that we think we can fill with good information. In the meantime the raw information on voting attendance is still available on Public Whip. 

    WriteToThem responsiveness statistics

    When someone sends a message to a representative through WriteToThem, we send a survey two weeks later to ask if this was their first time writing, and whether they got a response.

    The answer to the second question was used annually to generate a table ranking MPs by responsiveness. In 2017 we stopped publishing the WriteToThem stats page. The concerns that led to this were:

    • There are systemic factors that can make MPs more or less likely to respond to correspondence (eg holding ministerial office).
    • As the statistics only cover the last year, this can lead to MPs moving around the rankings significantly, calling into question the value of a placement in any particular year. Does it represent improvement/decline, or is the change random?
    • MPs receive different types of communication and may prioritise some over others (for example, requests for intervention rather than policy lobbying). Different MPs may receive different types of messages, making comparisons difficult.
    • The bottom rankings may be reflecting factors outside MPs’ control (eg a technical problem with the email address, or health problems), which can invalidate the wider value of the rankings.

    The original plan was to turn this off temporarily while we explored how the approach could be improved. Digging into the complexity has led to the issue dragging on, and at this point it is best to say that the rankings are unlikely to return in a similar form any time soon.

    The reasons for this come from our research on WriteToThem and the different ways we have tried to explore what these responsiveness scores mean.

    Structural factors

    There are structural factors that make direct comparisons between MPs more complicated. For instance, we found that when people write to Members of the Scottish Parliament there are different response rates for list and constituency members. What we don’t know is whether this reflects different behaviour in response to the same messages, or whether list and constituency MSPs were getting different kinds of messages, some of which are easier to respond to. Either way, this would suggest an approach where we judge these separately or need to apply a correction for this effect (and we would need to have different processes for different legislatures).

    There are also collective factors that individual representatives do contribute to. For instance, if MPs from one party are more responsive to communication, controlling for this factor to make them easier to compare to other MPs individually is unfair as it minimises the collective effort. Individuals are part of parties, but also parliamentary parties are a collection of individuals. Clear divides are difficult in terms of allocating agency.

    Gender

    One of the other findings of our paper on the Scottish Parliament was that, in both the Holyrood and Westminster Parliaments, female representatives had a systematically lower responsiveness score than male representatives (roughly 7% lower in both cases, and this remains when looking at parties in isolation). Is this a genuine difference in behaviour, or does it reflect a deeper problem with the data? While responsiveness scores are not quite evaluations, it seems reasonable to be cautious about user-generated data that systematically leads to lower rankings for women, especially when the relevant literature suggests that women MPs spent more time on constituency service when the question was studied in the 1990s.

    One concern was whether abusive messages sent through the platform were leading to more emails not worth responding to. This was of special concern given online abuse against women MPs through other platforms. While WriteToThem only accounts for 1-2% of emails to MPs, it is a problem if we cannot rule out that a gendered difference in abusive messages contributes to a difference in a metric we would then use to make judgements about MPs.

    Our research in this area has found some interaction between the gender of the writer and recipient of a message. We found a (small) preference for users to write to representatives who shared their gender, but without more knowledge of the content of messages we cannot really understand whether the responsiveness difference results from factors that are fair or unfair to judge individual representatives on. Our policy that we should maintain the privacy of communications between people and their MP as much as possible means direct examination is not possible for research projects, and returning to publishing rankings without more work to rule this out would be problematic. We are exploring other approaches to understand more about the content of messages.

    Content and needs

    We could in principle adjust for differences that can be identified, but we also suspect there are other differences that we cannot detect and remove. For instance, constituents in different places have different types of problems, and so have different needs from their MP. If these different kinds of problems have different levels of responsiveness, what we are actually judging an MP on is their constituents, rather than their own behaviour.

    A finding from our analysis of how the index of multiple deprivation (which ranks the country on a variety of different possible measures of deprivation) relates to data in WriteToThem is that messages to MPs from more deprived areas are less likely to get a response than those from less deprived areas. The least deprived decile has a response rate about 7% higher than the average and the most deprived decile is 6% lower. However, when looking at rates per decile per individual MP there is no pattern. This suggests this is a feature of different MPs covering different areas (with different distributions of deprivation), rather than individual MPs responding differently to their own constituents.

    At the end of last year, we experimented with an approach that standardised the scores via a hypothetical average constituency. This was used by change.org as one metric among many in a People-Power index. While this approach addresses a few issues with the raw rankings, we’re not happy with it. In particular, there was an issue with an MP who was downgraded because more of their responses were in a more deprived decile, and this was averaged down by lower responses in higher deciles.
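
    To make the problem concrete, here is a minimal sketch of this kind of standardisation. Everything in it is an assumption for illustration: the numbers are invented, the function names are hypothetical, three deprivation groups stand in for the ten deciles, and an equal weighting of groups stands in for whatever ‘average constituency’ was actually used.

```python
def raw_rate(messages, responses):
    """Overall response rate: all responses over all messages."""
    return sum(responses.values()) / sum(messages.values())


def standardised_rate(messages, responses, weights):
    """Weighted average of per-group response rates.

    `weights` describes the hypothetical average constituency: the share
    of messages expected from each deprivation group (summing to 1).
    Every group in `weights` must have at least one message.
    """
    return sum(
        weights[group] * responses[group] / messages[group]
        for group in weights
    )


# Invented example: an MP whose mail comes mostly from the most deprived
# group, and who answers that group well.
messages = {"most deprived": 80, "middle": 15, "least deprived": 5}
responses = {"most deprived": 64, "middle": 6, "least deprived": 1}

# Hypothetical average constituency: equal mail from each group.
weights = {"most deprived": 1 / 3, "middle": 1 / 3, "least deprived": 1 / 3}

raw = raw_rate(messages, responses)
std = standardised_rate(messages, responses, weights)
print(f"raw: {raw:.0%}, standardised: {std:.0%}")
```

    In this invented case the standardised score is well below the raw response rate, because the handful of messages from the less deprived groups counts for as much as the large, well-answered group.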

    If we were to continue with that approach, a system that punishes better responsiveness to more deprived areas is a choice that would need a strong justification. The approach is also becoming more abstract as a measure, making it harder to explain what the ranking represents. Are we aiming to provide useful comparisons by which to judge an MP, or a guide to WriteToThem users as to whether they should expect a reply? These are two different problems.

    We are continuing to collect the data, because it is an interesting dataset and we’re still thinking about what it can best be used for, but do not expect to publish rankings in their previous form again.

    FixMyStreet and WhatDoTheyKnow

    Other services are concerned with public authorities rather than individual representatives. In these cases, there is a clearer (and sometimes statutory) sense of what good performance looks like.

    Early versions of FixMyStreet displayed a “league table”, showing the number of reports sent to each UK council, along with the number that had been fixed recently. A few years ago we changed this page so that it only lists the top five most responsive councils.

    There were several reasons for this: FixMyStreet covers many different kinds of issues that take different amounts of time to address, and different councils have more of some of these issues than others. Additionally, even once a council resolves an issue, not all users come back to mark their reports as fixed.

    As a result the information we have on how quickly problems are fixed may vary for reasons out of a council’s control. And so while we show a selection of the top five “most responsive” councils on our dashboard page, as a small way of recognising the most active councils on the site, we don’t share responsiveness stats for all councils in the UK. More detail on the difference in the reported fix rate between different kinds of reports can be seen on our data explorer minisite.

    WhatDoTheyKnow similarly has some statistics summary pages for the FOI performance of public authorities. We are reviewing how we want to generate and use these stats to better reflect our goals of understanding and improving compliance with FOI legislation in the UK, and as a model for our partner FOI platforms throughout the world.

    In general, we want to be confident that any metric is measuring what we want to measure, and we are providing information to citizens that is meaningful. For the moment that means publishing slightly less. In the long run, we hope this will lead us to new and valuable ways of exploring this data.

  5. New kinds of geography in MapIt

    MapIt is a mySociety service that can take UK postcodes and return the administrative boundaries those postcodes are inside. This can be used to find out what council or constituency an area is in; you can test it at https://mapit.mysociety.org/.

    Over the last few months we’ve been updating some existing boundaries in MapIt, and adding new kinds of geographies.

    LSOA boundary on OpenStreetMap

    What has changed

    Local authorities and Clinical Commissioning Groups have been updated to their new April 2020 boundaries. Small census areas have been updated to their latest version across all UK nations.

    There is also a new kind of statistical geography: Travel To Work areas. These are areas that include the home and work location of 75% of people inside them. They are a way of visualising the commuting boundaries of an area, which may be significantly different from the administrative ones (see maps of all Travel To Work areas in England).

    Small census areas are statistical areas that cover neighbourhood-sized areas (although what locals consider the neighbourhood to be may vary). Many sets of official statistics are mapped against these small areas, making them an important intermediary between postcode or coordinate data and measures such as the indices of multiple deprivation.

    As many statistics are produced separately for different UK nations, there are different kinds of small areas in different nations.

    Smaller:

    • Lower Layer Super Output Areas (LSOA) – England and Wales
    • Datazones (DZ) – Scotland
    • Super Output Areas (SOA) – Northern Ireland

    Larger:

    • Middle Layer Super Output Areas (MSOA) – England and Wales
    • Intermediate Zones (IZ) – Scotland

    In MapIt, all of these boundaries are present, available under the English geography names (LSOA/MSOA) to avoid needing more complicated lookups when working with postcode data from across different nations.

    What can I use this for?

    Mapping from user postcode data to LSOA helps build a picture of the environment of users. As we’ve done with FixMyStreet, this can be used to understand patterns of use. It can also help researchers with existing postcode datasets to find the equivalent statistical areas to expand the dataset.

    You can see some of these areas in practice powering the postcode lookup on this minisite looking at the new 2019 maps of multiple deprivation in England.

    You can access MapIt from an application through an API, or use the bulk upload tool to convert an existing dataset.
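
    As a rough sketch of the API route, the snippet below looks up a postcode and pulls out the areas of a given type. The `/postcode/<postcode>` path, the `areas` key in the JSON response, and the `OLF` type code for LSOAs are all assumptions here and should be checked against the MapIt documentation (along with any usage limits or API key requirements) before relying on them. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

MAPIT_BASE = "https://mapit.mysociety.org"


def fetch_postcode(postcode):
    """Fetch the MapIt record for a postcode (makes a network request).

    Assumes a `/postcode/<postcode>` endpoint that returns JSON.
    """
    compact = postcode.replace(" ", "").upper()
    url = f"{MAPIT_BASE}/postcode/{urllib.parse.quote(compact)}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def area_names(record, area_type):
    """Return the sorted names of areas of one type in a MapIt-style record.

    Assumes `record["areas"]` maps area IDs to dicts with `type` and `name`.
    """
    areas = record.get("areas", {})
    return sorted(a["name"] for a in areas.values() if a.get("type") == area_type)


# Hand-written sample in the assumed response shape, so the parsing can be
# tried without calling the API. The postcode, IDs and names are made up.
sample = {
    "postcode": "AB1 2CD",
    "areas": {
        "1": {"name": "Example Council", "type": "DIS"},
        "2": {"name": "Example 001A", "type": "OLF"},
    },
}

print(area_names(sample, "OLF"))
```

    Separating the fetch from the parsing means the same `area_names` helper could be applied to records from the API or to rows produced by the bulk upload tool, provided they are first put into the same shape.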

  6. Beneficial ownership blog series

    Over the last few months, mySociety and SpendNetwork have been working on a project for the UK Government Digital Service (GDS) Global Digital Marketplace Programme and the Prosperity Fund Global Anti-Corruption programme, led by the Foreign & Commonwealth Office (FCO), around beneficial ownership in public procurement.

    We’ve gathered some of the things we learned in a series of blog posts.

    The entire series can be viewed here.

    Header image: Photo by Olga O on Unsplash

  7. Beneficial ownership data and preferential procurement

    Header image: Photo by Ricardo Rocha on Unsplash

    mySociety and SpendNetwork have been working on a project for the UK Government Digital Service (GDS) Global Digital Marketplace Programme and the Prosperity Fund Global Anti-Corruption programme, led by the Foreign & Commonwealth Office (FCO), around beneficial ownership in public procurement. This is one of a series of posts about that work.

    While the main purpose of collecting beneficial ownership information is as part of an anti-corruption agenda, ownership information can also be used in public procurement as part of preferential procurement programmes. These are meant to increase the distribution of government contracts among different groups in a country. 

    South Africa is an example of a country with a system of preferential procurement through the Broad-Based Black Economic Empowerment (B-BBEE) programme. This programme gives preference to companies that (amongst other criteria) have more Black people and/or women in ownership and management.

    This works through a certification process where auditors convert evidence of ownership and management into a certification for the company, which is then used in the procurement process. While conceptually similar to beneficial ownership in many ways, this methodology differs from the requirement of disclosure of ownership that tends to be used in beneficial ownership. 

    Public disclosure of ownership could be made a component of preferential procurement or similar schemes, but this would also require understanding of ownership at lower thresholds than is currently common. Understanding the demographics of ownership requires a full picture of shareholders, and that may include adding up many with small shares. The Beneficial Ownership Data Standard (BODS) does allow for anonymous persons where a reason is given, and so information could be captured and released for demographic analysis while not disclosing the identities of owners below a threshold.

    BODS does not currently cover demographic information for individuals or certification for companies. Doing so could increase its applicability to broader procurement objectives such as B-BBEE. There is discussion on OpenOwnership’s BODS repository of what the inclusion of additional personal data fields would involve. In general BODS approaches field inclusion using the principle of data minimisation, where the data collected should be the smallest amount of personal information required to fulfil a valid purpose. There is an intentional decision to exclude gender information from the global standard/data store, with the argument that personal information included in the overall standard should be demonstrably useful for the purposes of disambiguation. This is seen as the main purpose of ownership information on a global scale, rather than demographic analysis. 

    Rather than inclusion in the global standard, localised extensions are seen as more appropriate for demographic information, as what is of interest will vary from place to place. While a gender field could be relatively universal, understandings of ethnicity are often culturally specific and a universal standard would be inappropriate. For instance, Australia’s Indigenous Procurement Policy (IPP) recommends the use of an Indigenous business register that in turn uses a ‘Proof of Aboriginality’ process that is more involved than self-certification. 

    The data standard would benefit from some abstract thinking about how country-specific demographic needs should best be reflected within BODS-formatted data. The specific questions are:

    • What should the general pattern be for extending BODS data with demographics, bearing in mind that demographics may be for individuals or organisations?
    • Should self-certified data be logged differently from certified data? How should certification be acknowledged? (Often a ‘certifying agency’ is available, but sometimes the certificate may have an ID number.)
    • Should there be a flag on demographic information that is stored in BODS, but shouldn’t be released publicly? Or does this logic belong outside the standard? If so, is there a generalised need for a ‘privacy schema’ and tool that can be applied to BODS to remove/anonymise particular fields?

    Demographic certification is a system of ownership collection and verification, and a general understanding of the ways in which BODS should and shouldn’t be a part of that would be useful for the future of the standard.

    See all posts in this series.

  8. Unequal impacts of open registers of ownership

    Header image: Photo by Erol Ahmed on Unsplash

    mySociety and SpendNetwork have been working on a project for the UK Government Digital Service (GDS) Global Digital Marketplace Programme and the Prosperity Fund Global Anti-Corruption programme, led by the Foreign & Commonwealth Office (FCO), around beneficial ownership in public procurement. This is one of a series of posts about that work.

    A key privacy concern with beneficial ownership, and especially with open registers of beneficial ownership, is that it makes private information publicly accessible. As an Engine Room/OpenOwnership report on the subject says:

    Justifying open registers therefore depends on answering two important questions: first, why is a central register necessary, as opposed to company reporting obligations, or trusts and corporate service providers (‘TCSP’) regulation? Second, why must the central register be publicly accessible, rather than closed or limited-access?

    Common across the countries we looked at as part of this research was concern from government stakeholders and the private sector about open registers, even while there is enthusiasm for them from civil society.

    The case for open registers is, broadly, that they allow many eyes to look at the data. This creates greater oversight and scope for investigations from civil society (NGOs, journalists and members of the public), as well as feedback mechanisms to improve the quality of the data. There are multiplier effects when multiple open registers are merged, allowing the same beneficiaries to be followed across borders. Making these datasets easier to access also makes it easier for official bodies to pursue investigations by increasing discoverability and removing obstacles to use.

    A key benefit of forming a company is that it provides limited liability, which protects the assets of shareholders from the legal liabilities or debts of the company beyond the size of their ownership of the company. The argument justifying releasing the personal information of owners is that this is a privacy trade-off made by individuals in exchange for the substantial benefits of limited liability.

    The resulting information is a safeguard against the use of legal entities in a way that is against the public interest because it allows investigation and discovery of abuses.

    Where this becomes more complicated is that the costs of that loss of privacy are not the same for everyone. Where privacy loss leads to greater risk, this may result in harm to individuals, or the fear of that harm may mean people avoid forming companies or tendering for government contracts. As such, the collection and distribution of data needs to acknowledge the different costs of disclosing information, and allow exceptions. From the Engine Room/OpenOwnership report:

    Governments and companies should not collect and disclose data beyond the minimum that is necessary to achieve their aim, or data that poses a significant risk of harm. The risk associated with different types of information will depend on the context of both the individual and the country where they reside. This highlights the need for carefully designed exceptions regimes tailored to risks in that context.

    A key potential risk of address information being public is stalking, and this is a risk that falls more on women than men. The UK has an open register of directors and persons with significant control (PSC), and the discussion around it reflects possible risks of open registers more broadly. The comments under a Companies House blog post about GDPR feature people saying they were surprised that personal information such as signatures, month and year of birth, and addresses is publicly available. One commenter explicitly said the experience of being stalked made her terrified about her address information being made available. While disclosure requirements often distinguish between company registration and home addresses, micro-businesses may be more likely to be registered from home, and so carry an increased privacy cost to the owner.

    In the UK, there has been an exception regime that allows information to be concealed from the public register if the personal characteristics of a person, when associated with a company, put that person “or any person living with them, at serious risk of violence or intimidation”. This was amended in 2018 to remove the need for evidence for certain kinds of changes and to allow people to remove home addresses (for a fee) from register documents without the need for exceptions or evidence. Current directors have to substitute another correspondence address; former directors can have the information reduced to the first half of the postcode. This was explicitly fast-tracked without consultation as a “number of cases have been raised […] where the people involved are at risk of violence or intimidation yet cannot have their address information protected.”

    A related problem involves changes of name. A requirement that directors list former names is a common-sense requirement which prevents people with bad reputations from avoiding scrutiny. But for transgender directors this is a public record of their transition that may either expose them to harm, or discourage company formation in the first place. This issue is one of the reasons for the exclusion of gender from the BODS standard, as a structure where old information is superseded but not removed raises this exact issue. We also heard of a similar problem when gender is encoded into ID numbers, and these ID numbers are used in public.

    While there are situations where the risk is foreseeable and evidenced (a domestic violence victim starting a company at a new home, but needing to conceal their address), in other cases the damage may already be done when the risk becomes apparent. Even if information is successfully removed from the original source, where data has been released and incorporated into other products, retrospective redaction is more difficult.

    This problem is analogous to one faced by political candidates in the UK, where a report about intimidation and harassment of candidates and politicians led to the removal of a requirement to have home addresses printed on the ballot paper. Increased acknowledgement of the risks posed to individuals, as a more diverse set of people enter into registerable roles, can require re-examination of previous standards. This is especially important if it is happening alongside the opening up of information that was previously legally (but not easily) accessible.

    While privacy risks of open registers have to be accounted for in their design, closed registries might still be a privacy/security risk. One concern raised by an interviewee was that even closed registers can leak, or that access could be obtained through bribery. If a cache of data is too sensitive to publicly release, and there isn’t the capacity to properly secure it, the information may be too sensitive to gather at all. The capacity to secure and manage access to personal information is an essential component of any register.

    These problems demonstrate the importance of finding methods of delivering the public benefits of having collected private identifying information, while minimising the amount of personal information that is released. We have explored possible design patterns to help accomplish this where unique identifiers are available.

     

    See all posts in this series.

  9. Visualising conflicts of interests

    Header image: Photo by David Cook on flickr under a CC BY-NC 2.0 licence

    mySociety and SpendNetwork have been working on a project for the UK Government Digital Service (GDS) Global Digital Marketplace Programme and the Prosperity Fund Global Anti-Corruption programme, led by the Foreign & Commonwealth Office (FCO), around beneficial ownership in public procurement. This is one of a series of posts about that work.

    As part of our research into beneficial ownership in procurement, we found several potential uses of better ownership data in the procurement process:

    • The identification of bidding cartels through revealing common beneficial ownership of tenderers to procurement processes.
    • The identification of high risk or fraudulent suppliers through non-existent or suspicious beneficial owners, such as professional intermediaries, or the presence of sanctioned individuals and companies in the ownership chains.
    • There is also an appetite from both government and civil society to use beneficial ownership in the identification of conflicts of interest in conjunction with information on procurement officers and politically exposed people.

    To explore this area we built a prototype, ‘Bluetail’, to test options for a visual interface for use by procurement officers. It demonstrates the ways in which beneficial ownership data could be used to address some of the key procurement use cases we had found as part of our research.

    Diagram showing how contract data, ownership data and PEP data are combined into a single datastore and interface

    Our demo sites and source materials are available in public:

    This prototype is a demonstration of processing data in three relevant standards: BODS, OCDS, and Popolo.

    Bluetail integrates this data by identifier matching. We reviewed options for the alternative approach of attribute-based matching, and identified relevant open source tools with which to achieve this. However, the goal would be to avoid this kind of matching wherever possible as it is a time- and resource-intensive process, with many possible inaccuracies and difficulties in scaling. That being the case, we also explored different methods for releasing ID information that can improve the effectiveness of this process.
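
    As a rough illustration of what identifier matching involves, the sketch below joins three small, made-up datasets on shared IDs and flags two of the use cases described above: common ownership between bidders on the same tender, and owners who appear on a list of politically exposed people. This is not Bluetail’s code. The record shapes, field names, ID formats (such as `GB-COH-001`) and the `find_flags` function are all assumptions made for the example, and real BODS, OCDS and Popolo documents are considerably richer.

```python
from collections import defaultdict

# Hypothetical, heavily simplified records (not real BODS/OCDS/Popolo structures).
tenderers = [  # contract data: which company bid on which tender
    {"tender": "T-001", "company_id": "GB-COH-001"},
    {"tender": "T-001", "company_id": "GB-COH-002"},
    {"tender": "T-002", "company_id": "GB-COH-003"},
]
ownership = [  # ownership data: which person owns which company
    {"company_id": "GB-COH-001", "person_id": "P-1"},
    {"company_id": "GB-COH-002", "person_id": "P-1"},
    {"company_id": "GB-COH-003", "person_id": "P-2"},
]
peps = [  # politically exposed people
    {"person_id": "P-2", "role": "Councillor"},
]


def find_flags(tenderers, ownership, peps):
    """Join the three datasets on exact identifiers and return a list of flags."""
    # Index owners by company so each lookup is a simple dictionary access.
    owners = defaultdict(set)
    for statement in ownership:
        owners[statement["company_id"]].add(statement["person_id"])

    pep_ids = {pep["person_id"] for pep in peps}

    # Group bidding companies by tender.
    bidders = defaultdict(set)
    for bid in tenderers:
        bidders[bid["tender"]].add(bid["company_id"])

    flags = []
    for tender in sorted(bidders):
        # For this tender, record which bidding companies each person owns.
        companies_by_person = defaultdict(set)
        for company in bidders[tender]:
            for person in owners.get(company, ()):
                companies_by_person[person].add(company)

        for person in sorted(companies_by_person):
            companies = sorted(companies_by_person[person])
            if len(companies) > 1:
                flags.append((tender, "common_owner", person, companies))
            if person in pep_ids:
                flags.append((tender, "pep_owner", person, companies))
    return flags


for flag in find_flags(tenderers, ownership, peps):
    print(flag)
```

    Because every join here is an exact lookup on an ID, there is no fuzzy comparison of names or addresses to tune or review, which is the main reason to prefer this approach where identifiers are available.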

    More information on the process and running locally can be found in the repository readme file.

    See all posts in this series.

  10. Getting public benefit from private IDs

    Header image: Photo by Meagan Carsience on Unsplash

    mySociety and SpendNetwork have been working on a project for the UK Government Digital Service (GDS) Global Digital Marketplace Programme and the Prosperity Fund Global Anti-Corruption programme, led by the Foreign & Commonwealth Office (FCO), around beneficial ownership in public procurement. This is one of a series of posts about that work.

    Once collected, a key issue in analysis of company ownership data is correctly identifying when the same individual is connected with multiple companies. While name matching is viable in small datasets, it increases the amount of work required to remove false positives in larger datasets.

    For instance, while the UK’s Persons of Significant Control (PSC) register has a unique ID for each instance of a person having ownership, reconciling where an individual exists in multiple ownerships requires additional data processing, and introduces possible inaccuracy. An approach developed for this dataset might not travel well to others, where address data may be less consistent (or lack an equivalent of, for example, a postcode). This problem extends beyond ownership data, and is a general issue in reconciling different datasets about people.

    The exact challenges of name reconciliation vary with the naming conventions in a country. Just as there can be no universal standard on storing name information, shortcuts to reduce ‘noise’ in a name (removing common typos, or sound-alikes) differ by language. For instance, the process to generate a CURP (ID) number in Mexico (which, by default, incorporates an individual’s first name) has explicit exceptions for very common first names, requesting use of the individual’s second name instead. Approaches within a country can also vary: Indonesia has a wide range of ethnic and language groups, and so several different sets of common naming conventions.

    Given this problem, it is useful to be able to make use of other unique identifiers for an individual (a national ID or tax number). However, these are often seen as personal data that cannot be released as part of open data. We have produced a short paper outlining the possible ways these private identifiers can be released.

    Different approaches are practical in different contexts, but at a minimum it should always be viable (and should be encouraged) to collect private identification information, and release an ID fragment to aid reconciliation. This is a short code that is derived from an ID but is not in itself unique. It can be used to more accurately group similar names into unique people. In this way, private information can add information about uniqueness to the process without being revealed publicly.
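
    To make the idea concrete, here is a minimal sketch of how an ID fragment might be derived and then used in reconciliation. The derivation shown (the first two hexadecimal characters of a SHA-256 hash of the ID) is an assumption chosen for illustration and is not necessarily the method described in the paper. The function names, field names and sample records are likewise hypothetical.

```python
import hashlib
from collections import defaultdict


def id_fragment(private_id: str, length: int = 2) -> str:
    """Derive a short, non-unique code from a private ID.

    Assumption for this sketch: the fragment is the start of a SHA-256 hash.
    With two hex characters there are only 256 possible values, so many
    different IDs share each fragment.
    """
    digest = hashlib.sha256(private_id.encode("utf-8")).hexdigest()
    return digest[:length]


def group_people(records):
    """Group published records by normalised name plus ID fragment."""
    groups = defaultdict(list)
    for record in records:
        key = (record["name"].strip().lower(), record["id_fragment"])
        groups[key].append(record["company"])
    return dict(groups)


# Hypothetical private register: held by the registrar, never published.
private_register = [
    {"name": "Maria Garcia", "national_id": "ID-0001", "company": "Company A"},
    {"name": "Maria Garcia", "national_id": "ID-0001", "company": "Company B"},
    {"name": "Maria Garcia", "national_id": "ID-0002", "company": "Company C"},
]

# Published version: the national ID is replaced by its fragment.
published = [
    {
        "name": row["name"],
        "id_fragment": id_fragment(row["national_id"]),
        "company": row["company"],
    }
    for row in private_register
]

for key, companies in group_people(published).items():
    print(key, companies)
```

    Records for the same person always share a fragment, so they stay grouped together. Two different people with the same name will usually, though not always, receive different fragments, which is what reduces false positives compared with matching on name alone.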

    Read the paper

    See all posts in this series.