1. New in Alaveteli: hiding individual attachments

    At our Freedom of Information service WhatDoTheyKnow, when faced with requests to remove material, we operate on the principle of removing the minimum amount possible.

    Alaveteli, the codebase which underlies WhatDoTheyKnow and a number of other FOI sites around the world, gives moderators a range of options for removing content – from surgically excising individual words and phrases, to hiding whole messages, or even entire request threads. This is useful when we spot misuse of our service, for example.

    What we’ve been lacking, up until now, was a way to apply these types of removals to attachments.

    Back in the early days of WhatDoTheyKnow, attachments were less common, but now we see many more: there can often be several attachments to one individual message.

    Over the last few years, there have been occasions where we’ve had to remove an entire message, which may contain several useful attachments, just because of a small issue with one of them.

    We’d then go through an annoying manual process to download the publishable ones, upload them to our file server, and then annotate the request with the links – here’s an example.

     An FOI response, above which is the annotation: We have redacted a name from one of the released documents, acting in line with our published policies on how we run our service. We have republished the response and attachments in an annotation below.

    Back in 2013, when the original suggestion for enabling finer-grained control was raised, the site contained around 400,000 attachments. There are now more than 3,500,000! We don’t remove content often, but at this scale it’s inevitable that we need to intervene now and then.

    After a little code cleanup we were able to make individual attachment removal a reality. This gives us much more control over how we balance preserving a historic archive of information released under Access to Information laws with running the site responsibly and meeting our legal obligations under GDPR.

    As an example, let’s imagine that the FOI officer replying to our request inadvertently causes a data breach when releasing some organisation charts in `organisation chart b.pdf`.

    A fake FOI response in which the officer releases an organisational chart.

    Previously we’d have had to hide the whole response. Now, we can go into the admin interface and inspect each individual attachment.

    A list of file attachments

    We can then set our usual “prominence” value – offering a few options from fully visible to completely hidden – and include a reason why the content has been hidden. We always seek to run the site transparently and explain any actions taken.

    Prominence: Hidden. Reason for prominence: attachment contains significant data breach

    On saving the form, you can see that only the problematic attachment has been removed, with the remainder of the response intact. This saves us considerable time when reviewing and handling material with potential data issues, and keeps as much information published as possible while we do so.
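    Conceptually, the change moves the prominence setting down from the message level to each individual attachment. Here’s a minimal sketch of that data shape, in hypothetical Python (Alaveteli itself is Ruby on Rails, and its real models differ):

    ```python
    from dataclasses import dataclass
    from enum import Enum

    # Illustrative names only; Alaveteli's actual prominence states differ.
    class Prominence(Enum):
        NORMAL = "normal"                  # fully visible
        REQUESTER_ONLY = "requester_only"  # visible only to the request maker
        HIDDEN = "hidden"                  # completely hidden

    @dataclass
    class Attachment:
        filename: str
        prominence: Prominence = Prominence.NORMAL
        prominence_reason: str = ""  # published, so our actions stay transparent

    # Hiding one attachment leaves its siblings untouched.
    attachments = [
        Attachment("organisation chart a.pdf"),
        Attachment("organisation chart b.pdf",
                   prominence=Prominence.HIDDEN,
                   prominence_reason="attachment contains significant data breach"),
    ]
    visible = [a for a in attachments if a.prominence is Prominence.NORMAL]
    print([a.filename for a in visible])  # only "organisation chart a.pdf"
    ```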

    Response with one hidden attachment

    As an extra bonus, since the main body text of emails is also treated as an “attachment” in Alaveteli, we’re now able to hide potentially problematic material there without affecting the attachments we present.

    A list of file attachments

    We’ve already used this feature several times to republish material where we’d previously had to hide the entire message due to the technical limitations at the time.

    Image: Kenny Eliason

  2. Freedom of Information requests around the academic status of Dr Tsai Ing-wen

    We recently became aware of extensive misuse of our Freedom of Information site WhatDoTheyKnow, in connection with the academic status of Taiwanese politician Dr Tsai Ing-wen.

    This activity became apparent through a very large quantity of correspondence being sent through the site, all focusing on the validity of Dr Tsai’s qualification from the London School of Economics and Political Science (LSE).

    The majority of this material repeated the same or very similar FOI requests, and some of it did not constitute valid requests at all. We also saw mass posting of annotations, some on completely unrelated requests, and new requests which copied the titles of unrelated existing requests in an apparent attempt to evade our attention.

    Running the service responsibly

    As an organisation, we positively and passionately support the citizens’ right to access information and to hold organisations accountable: this is the very foundation that WhatDoTheyKnow is built upon, and its reason for existing. 

    Over time, we’ve formulated and consolidated policies to ensure that information on the site is preserved, as far as possible, as a permanent archive. We robustly contest unjustified requests to remove material from our service, and will only remove any substantive Freedom of Information requests and responses if we absolutely have to. 

    We initially responded to this misuse on the assumption of good faith, putting significant effort into removing problematic material from correspondence while continuing to publish elements which could have amounted to a valid Freedom of Information request.

    Understanding the problem 

    Several users took the time to report the misuse of our service to us, for which we are thankful. As a matter of course, we review all material reported to us and assess it before making a decision on what to do. It took our small team of staff and volunteers a significant amount of time to respond to the number of reports made in this case.

    Researching the topic more deeply, we discovered a statement from the Information Commissioner on requests they’ve also received on this subject, in which they say:

    The intent of these requests is clearly to try to add weight to theories around the falsification of President Tsai’s PhD, which have already been considered at length by the Commissioner and the Tribunal and found to be entirely lacking in substance.

    Further, both the LSE and the University of London have published their own statements, and a copy of the PhD thesis in question is now available online via LSE’s website.

    While rejecting one FOI request on this subject as vexatious, LSE raised the possibility that people in China could be making requests to benefit from the country’s citizen evaluation system, stating:

    “We have been made aware that there is the possibility that the LSE has been added to a list of targets to gain social credits in China. As such we believe that your request and the others we received in this time period have not been made for just the purpose of receiving information but for personal gain.” 

    With this information in hand, we were confident in treating the issue as mass misuse, more akin to spam or even a disinformation attack than to people making misguided requests.

    Taking action

    During the course of this situation, we have banned 108 user accounts, most of which had been created to circumvent previous bans and to post inappropriate material to our site. We removed more than 300 requests from the site and 1,640 comments from pages.

    To put this in context, we only banned 126 newly created user accounts in the whole of 2021, mainly for spamming (see more details in our 2021 Transparency Report). 

    Current approach to misuse of the service

    As a result of this misuse we are taking the following actions. 

    While we will continue to adhere to our reactive moderation policy in most instances, we may occasionally review activity by new users while this incident is ongoing. When we are alerted to correspondence on the subject in question, we will not take our usual approach of trying to preserve any valid FOI request contained within broader correspondence. Instead, we will make a very quick assessment of whether it appears to be a genuine request for information or part of the concerted misuse campaign; if the latter, the request will be hidden.

    The users making these requests will then be banned without warning or notification. The same will apply to any comments made on existing requests. It will be up to any users banned in this process to make a case to us that they are making genuine FOI requests.

    This approach is in line with the one we have taken in other instances of misuse of our service.

    We have also enabled enhanced anti-spam measures on the site, which will help us deal with other instances of misuse more efficiently.

    We may never fully understand what exact circumstances instigated this wave of misuse, but it has been instructive, and has helped us formulate new ways to tackle the always surprising means by which our work – to help citizens make valid requests for information in public – can be temporarily derailed.


    Image: Olga Safronova

  3. WhatDoTheyKnow Transparency Report

    In 2021 WhatDoTheyKnow users made 100,092 Freedom of Information requests.

    Those requests, and the responses they received, are public on the website for anyone to see. But what’s not quite so visible is the work the WhatDoTheyKnow team do behind the scenes — answering users’ questions, removing inappropriate content and keeping everything ticking over.

    Some of the team’s most difficult calls arise around the removal of information. WhatDoTheyKnow’s guiding principle is that it is a permanent, public archive of Freedom of Information requests and responses, open to all.

    For this reason, the default position is not to remove substantive public information requests and responses; however, we act quickly if problematic content is reported to us. And, to help everyone understand exactly what has been removed and why, where possible we record these details on the request page.

    This year, for the first time, we’re extending our efforts towards transparency even further, with this report in which we’ll summarise the information removal requests and actions taken during the last twelve months.

    To allow for a full 12 months of data, the date range used throughout this report is 1 November 2020 to 31 October 2021.

    Headline facts and figures

    • 20,714,033 visitors to WhatDoTheyKnow.com this year
    • 22,847 new WhatDoTheyKnow user accounts this year, taking the total to 222,694
    • 7,971 email threads in the support inbox in 2021
    • 822 requests hidden from WhatDoTheyKnow in 2021
      …in the context of 100,092 requests made in the year, and a total of 772,971 requests now published on the site
    • 196 published requests where we redacted some material in 2021
      …usually due to the inappropriate inclusion of personal information, or defamation.
    • 126 users who created accounts this year were banned
      …that’s around 0.6% of new users.
    • WhatDoTheyKnow is a project of mySociety, run by a small team of staff and dedicated volunteers.

    And in more detail…

    Requests flagged for our attention

    The table below shows the reasons that requests were reported for admin attention this year. Note that we also receive many reports directly by email, so while not comprehensive, this is indicative.

    | Reason for attention request | Total number |
    | --- | --- |
    | Contains personal information | 143 |
    | Not a valid request | 108 |
    | Vexatious | 94 |
    | Request for personal information | 85 |
    | Contains defamatory material | 51 |
    | Other | 287 |
    | Total | 768 |

    Material removed from the site

    The following tables show where members of the support team have acted to remove or hide requests from WhatDoTheyKnow in the last year, and the reason why.

    There is a range of options available to moderators, from ‘hidden’ (the most extreme) to ‘discoverable with link’. This is in addition to the censor rules that are used to hide certain information within a request or response.

    | Request visibility | Total number |
    | --- | --- |
    | Visible only to the request maker | 805 |
    | Discoverable only to those who have the link to the request | 11 |
    | Hidden | 8 |

    | Reason for removing from public view | Total number |
    | --- | --- |
    | Not a valid FOI request | 701 |
    | Vexatious use of FOI | 29 |
    | Other (reason not programmatically recorded*) | 124 |

    *Current processes do not create an easily retrievable list of reasons beyond the two above, but we are hoping to improve our systems so future transparency reports can include a more detailed breakdown.

    | Censor rules (programmatically hiding the problematic part/s of a request) | Total number |
    | --- | --- |
    | Number of censor rules applied | 881 |
    | Number of requests with censor rules applied | 196 |
    | Number of requests with censor rules applied which are still publicly visible, but with problematic material hidden | 188 |
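    To make “censor rule” concrete: conceptually, a rule pairs a pattern with a replacement, applied to the published text so that only the matching part is hidden. A rough, simplified sketch in hypothetical Python (Alaveteli’s real implementation is more involved):

    ```python
    import re

    # Hypothetical sketch of a censor rule: only the matching text is
    # hidden; the rest of the request stays publicly visible.
    def apply_censor_rule(text: str, pattern: str,
                          replacement: str = "[redacted]") -> str:
        return re.sub(pattern, replacement, text)

    print(apply_censor_rule(
        "Please contact Jane Doe on 01234 567890.",
        r"\b\d{5} \d{6}\b"))
    # -> "Please contact Jane Doe on [redacted]."
    ```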

    Data protection issues raised to the WhatDoTheyKnow user support inbox 

    The following data shows the number of email threads* received into the WhatDoTheyKnow user support inbox regarding the most common types of concern around information published on the site. Not all issues raised resulted in material being removed from the site.

    GDPR: UK General Data Protection Regulation
    DPA: Data Protection Act

    | Label | Total number of threads |
    | --- | --- |
    | GDPR Right to Erasure | 317 |
    | Defamation | 130 |
    | Data breach | 96 |
    | GDPR & DPA concerns (type not specified) | 42 |
    | GDPR Right to Rectification | 33 |
    | GDPR Right of Access | 21 |
    | Harassment | 17 |
    | GDPR Right to Object | 12 |
    | Data breach – internal** | 2 |
    | Impersonation | 1 |
    | Total | 674 |

    * Email threads may be either automatically categorised by the system, or manually categorised by the WhatDoTheyKnow support team on the basis of the information given by the person reporting them.

    ** “Data Breach – internal” refers to cases where WhatDoTheyKnow has identified that a data breach may have been caused due to our own staff actions. We take our obligations seriously, and use such instances as a learning opportunity, so these are reported even if very minor, and often when they’re nothing more than a near miss — which both of these cases were.

    High risk concerns raised for review 

    Our policies ensure that certain issues can be escalated for review by the wider team and, where more complex, by a review panel that includes mySociety’s Chief Executive and the Chair of the Trustees. Escalation is typically prompted by threats of legal action, complaints, notifications of serious data breaches, complex GDPR cases, or cases that raise significant policy questions.

    | Case type* | Total number |
    | --- | --- |
    | Defamation | 66 |
    | GDPR Right to Erasure | 42 |
    | Data breach | 40 |
    | Complaints | 33 |
    | GDPR & DPA concerns | 11 |
    | GDPR Right of Access | 6 |
    | Harassment | 5 |
    | Takedown | 2 |
    | GDPR Right to Object | 2 |
    | GDPR Right to Rectification | 1 |
    | Other | 78 |
    | Total | 286 |

    * Email threads may be either automatically categorised by the system, or manually categorised by the WhatDoTheyKnow support team on the basis of the information given by the person reporting them.

    Users

    | User accounts | Total |
    | --- | --- |
    | WhatDoTheyKnow users with activated accounts | 222,694 |
    | New user accounts activated in 2021 | 22,847 |

    | Reason for banning users in 2021 | Total |
    | --- | --- |
    | Spam | 3,936 |
    | Other site misuse | 166 |
    | Total number of users banned in 2021 | 4,102 |

    | Anonymisation* | Total |
    | --- | --- |
    | Accounts anonymised in 2021 | 170 |

    * Where accounts have been anonymised this is at the user’s request, generally to comply with GDPR Right to Erasure requests.

    Users are banned and their accounts may be closed due to site misuse and breach of the House Rules. Anonymised and banned users are no longer able to make requests or use their accounts.

    Thank you for reading

    This is the first time we’ve compiled a Transparency Report like this for WhatDoTheyKnow, but it’s something we’ve been wanting to do for some time. We demand transparency from public authorities, and it’s only right that we also practise it ourselves.

    Additionally, we hope that the report goes some way to showing the type of work the team do behind the scenes, and that moderating a well-used site like WhatDoTheyKnow is not without challenges.

    In future years, we hope to build on this initial report, ideally automating many of the stats so that they can be seen on a live dashboard. For now, we thought it was worthwhile making a manually-compiled proof of concept. 

    If there are specific statistics that you’d like to see in subsequent Transparency Reports, or you’d like to know more about any of those above, do drop the team a line. They’ll get back to you as soon as the urgent moderation work is done!

    See mySociety’s 2021 annual review

    Image: Create & Bloom

  4. New functionality for FixMyStreet for Councils

    We’re pleased to announce new moderation features for clients of FixMyStreet for Councils.

    This new functionality enables nominated members of staff to edit user reports from within the FixMyStreet front end.

    It’s quick and easy, and allows you to react immediately to unwanted content on your site. Read on to find out more.

    Screenshot of a problematic report in FMS

    What’s wrong with this report?

    So what is wrong with the report in the screenshot above?

    If you run a site on the FixMyStreet platform, you’ll be familiar with this kind of report, and the chances are that you’ll already be twitching to edit it.

    User-generated content is wonderful in many ways – but it can also present problems on a public-facing site. Let’s look at a few of the potential issues in the report above:

    • The user has included his phone number in the report description, and now it’s available for anyone to see.
    • The user’s name is also public. While this is the default option on FixMyStreet, users often get in touch to say that they meant to make their report anonymously (an option on FixMyStreet, but one which the user can only access at the point of submission).
    • There’s an inappropriate photo. This one is a statue of Carl Jung, which obviously has nothing to do with the report. But even relevant photos can be problematic: imagine if it was a graphic depiction of a dead animal, or some rude graffiti.
    • Profanity: in the example above, we’ll imagine that “pesky” is a mild profanity, but experience tells us that users don’t always hold back on their language.

    There are other common problems too, not represented in this report. Users sometimes post potentially libellous information: naming someone they suspect of flytipping, for example, or giving an address where they believe planning permission has been flouted.

    In the run-up to local elections, councils have to be particularly sensitive to any content that might be construed as political – commonly they wish to remove any mention of any candidate.

    Moderation in all things

    New FMS Moderation panel

    Up until now, we’ve edited reports for our council clients, on request. However, this is clearly a long-winded way of getting sensitive material off the site, especially when time is of the essence.

    So we’ll shortly be introducing client moderation: councils or other bodies who run FixMyStreet will be able to nominate trusted users and give them the ability to edit problematic reports from within the report page.

    When logged in, these users will see a “moderate” button on every report – this feature will not be available to any user unless explicitly authorised.

    As you can see, this panel provides the ability to:

    • Hide the report completely
    • Hide the name of the poster
    • Hide or show a photo (if one was originally provided)
    • Edit the title and body of the report.
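    To give a feel for what one saved moderation action might bundle together, here is a hedged sketch in Python; the field names are purely illustrative, and FixMyStreet’s actual schema will differ:

    ```python
    from dataclasses import dataclass, field
    from datetime import datetime, timezone
    from typing import Optional

    # Hypothetical shape for a single moderation action; the names are
    # illustrative, not FixMyStreet's real schema.
    @dataclass
    class ModerationEdit:
        report_id: int
        hide_report: bool = False          # hide the report completely
        hide_name: bool = False            # hide the name of the poster
        show_photo: Optional[bool] = None  # None = leave the photo as submitted
        new_title: Optional[str] = None    # None = title unchanged
        new_body: Optional[str] = None     # None = body unchanged
        reason: str = ""                   # recorded for colleagues reviewing later
        moderated_at: datetime = field(
            default_factory=lambda: datetime.now(timezone.utc))

    # For example, hiding a phone number and an unrelated photo in one go:
    edit = ModerationEdit(
        report_id=1234,
        hide_name=True,
        show_photo=False,
        new_body="Someone should fix this pesky pothole outside number 5.",
        reason="Removed personal phone number and unrelated photo",
    )
    ```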

    For some reports, it might be necessary to make a number of edits, and finally submit the changes:

    FMS Moderation in Progress

    The moderator can also add a reason for the changes, so it’s recorded if a colleague needs to know the history of the report in the future.

    This functionality gives a lot of power to admins to remove inappropriate information – but the user took the time to submit their report, and it’s only fair to let them know it’s been changed. So the system sends them an automatic email, as below:

    FMS Moderation Email

    Finally, the system automatically updates the report to show that it has been moderated. As well as a timestamp, it signals where any information has been removed in the title or body of the report.
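    One simple way to signal an in-text removal is to swap the removed span for a visible marker. A minimal, hypothetical sketch (not FixMyStreet’s actual rendering code):

    ```python
    # Hypothetical helper: replace a removed span with a visible marker,
    # so readers can see that something was taken out, but not what.
    def redact(text: str, removed: str, marker: str = "[...]") -> str:
        return text.replace(removed, marker)

    print(redact("Pothole outside my house, call me on 01234 567890.",
                 "call me on 01234 567890"))
    # -> "Pothole outside my house, [...]."
    ```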

    FMS Moderation Displayed on Report

    Updates can be just as problematic as reports, so the same functionality will apply to them.

    We’d welcome feedback on this mechanism, so please let us know if you think we’ve missed any features.

    Note: These screenshots are from our work in progress and do not yet display the slick design that we habitually apply right at the end of the build process. Please regard them as preview shots only!