At our Freedom of Information service WhatDoTheyKnow, when faced with requests to remove material, we operate on the principle of removing the minimum amount possible.
Alaveteli, the codebase which underlies WhatDoTheyKnow and a number of other FOI sites around the world, gives moderators a range of options for removing content – with the ability to surgically remove text ranging from individual words and phrases, to individual messages, or even entire request threads. This is useful when we spot misuse of our service, for example.
What we’ve been lacking, up until now, is a way to apply these types of removals to attachments.
Back in the early days of WhatDoTheyKnow, attachments were less common, but now we see many more: there can often be several attachments to one individual message.
Over the last few years, there have been occasions where we’ve had to remove an entire message, which may contain several useful attachments, just because of a small issue with one of them.
We’d then go through an annoying manual process to download the publishable ones, upload them to our file server, and then annotate the request with the links – here’s an example.
Back in 2013, when the original suggestion for enabling finer-grained control was raised, the site contained around 400,000 attachments. There are now more than 3,500,000! We don’t remove content often, but at this scale it’s inevitable that we need to intervene now and then.
After a little code cleanup we were able to make individual attachment removal a reality. This gives us much more control over how we balance preserving a historic archive of information released under Access to Information laws against running the site responsibly and meeting our legal obligations under GDPR.
As an example, let’s imagine that the FOI officer replying to our request inadvertently causes a data breach when releasing some organisation charts in `organisation chart b.pdf`.
Previously we’d have had to hide the whole response. Now, we can go into the admin interface and inspect each individual attachment.
We can then set our usual “prominence” value – offering a few options from fully visible to completely hidden – and include a reason for why the content has been hidden. We always seek to run the site transparently and explain any actions taken.
On saving the form, you can see that only the problematic attachment has been removed, with the remainder of the response intact. This saves us considerable time when reviewing and handling material with potential data issues, and keeps as much information published as possible while we do so.
As an extra bonus, since the main body text of emails is also treated as an “attachment” in Alaveteli, we’re now able to hide potentially problematic material there without affecting the attachments we present.
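For readers who think in code, here is a minimal sketch of the idea of per-attachment prominence. It is not Alaveteli's actual implementation (Alaveteli is a Ruby on Rails application); the class names, the three prominence values, the `render` helper and the `organisation chart a.pdf` filename are all assumptions made for illustration. The point is simply that each part of a message, including the body text, carries its own visibility setting and reason, so hiding one part leaves the rest published.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical prominence levels, from fully visible to completely hidden.
# The real set of options in Alaveteli may differ.
NORMAL = "normal"
REQUESTER_ONLY = "requester_only"
HIDDEN = "hidden"


@dataclass
class Attachment:
    """One part of a message. The email body is modelled as just another part."""
    filename: str
    prominence: str = NORMAL
    prominence_reason: Optional[str] = None


@dataclass
class Message:
    attachments: List[Attachment] = field(default_factory=list)


def can_view(attachment: Attachment, is_requester: bool = False) -> bool:
    """Decide visibility for a single attachment, independent of its siblings."""
    if attachment.prominence == NORMAL:
        return True
    if attachment.prominence == REQUESTER_ONLY:
        return is_requester
    return False


def render(message: Message, is_requester: bool = False) -> List[str]:
    """List what a visitor would see: filenames, or a notice explaining a removal."""
    lines = []
    for attachment in message.attachments:
        if can_view(attachment, is_requester):
            lines.append(attachment.filename)
        else:
            reason = attachment.prominence_reason or "no reason given"
            lines.append(f"[attachment hidden: {reason}]")
    return lines


# Illustrative data only: a response where one chart was released in error.
response = Message([
    Attachment("body text"),
    Attachment("organisation chart a.pdf"),
    Attachment(
        "organisation chart b.pdf",
        prominence=HIDDEN,
        prominence_reason="Contains personal data released in error",
    ),
])

for line in render(response):
    print(line)
```

In this sketch, only the second chart is replaced by a notice giving the reason, while the body text and the other chart stay visible, which mirrors the behaviour described above.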
We’ve already used this feature several times to republish material where we’d previously had to hide the entire message due to the technical limitations at the time.
Image: Kenny Eliason