Why SayIt is (partly) a statement about the future of Open Data

Open Here by Troy J Morris

Until about two years ago I was quite actively involved in the Open Data movement. I sat in on the 2007 gathering in California where the first Open Data Principles were drafted, and later sat on the Transparency Board at the UK government.

I stopped being involved in early 2012 because I saw a couple of things happening. First, the Open Data baton had been picked up by dedicated, focused advocates like the Open Data Institute and the Open Knowledge Foundation, who could give 100% to fighting this fight (I always had to fit it around managing a growing organisation with other goals). And second, I felt that the surge of relatively meaningful data releases in the country I live in (the UK) had pretty much come to an end. The real policy action and innovation will now happen in more rapidly-changing countries where transparency is a more visceral issue.

Still, despite walking away, I remained optimistic. It seemed more or less impossible to imagine that in twenty years’ time there wouldn’t be quite a bit more Open Data around, especially in rich countries. But given the virtually-zero political gain to be had from this agenda in countries like the UK, where is said data actually going to come from?

Learning from Microsoft (really)

The more I thought about it, the more I realised that we’d already seen the answer in the form of Microsoft. Throughout the 1990s the .doc and .xls standards rose and took over governments around the world, even though there was never anything like a clear policy process that drove that decision.

There was certainly no high-profile ‘Microsoft Government Partnership’ with international conferences and presidential speeches. Instead there was a safe, ‘no brainer’ product that governments bought to solve their problems, and these data standards came with it. The pressure on governments to do anything at all probably came from the fact that the private sector had widely adopted Office first.

I think that a recurrence of this phenomenon – change-through-replacing-old-computers – is where Open Data at real scale is going to come from. I think it’s going to come from old government computers being thrown away at their end-of-life and replaced with new computers that have software on them that produces Open Data more or less by default.

The big but

However, there’s a big BUT here. What if the new computers don’t come with tools that produce Open Data? This is where SayIt comes in, as an example of a relatively low-cost approach to making sure that the next generation of government IT systems do produce Open Data.

SayIt is a newly launched open source tool for publishing transcripts of trials, debates, interviews and so on. It publishes them online in a way that matches modern expectations about how stuff should work on the web – responsive, searchable and so on. It’s being built as a Poplus Component, which means it’s part of an international network of groups collaborating on shared technologies. Here’s JK Rowling being interviewed, published via SayIt.

But how does this little tool relate to the business of getting governments to release more Open Data? Well, SayIt isn’t just about publishing data, it’s about making it too – in a few months we’ll be sharing an authoring interface for making new transcripts from whatever source a user has access to.

We hope that, having iterated and improved this authoring interface, SayIt can become the tool of choice for public sector transcribers, replacing whatever tool they use today (almost certainly Word). Then, if they use SayIt instead of Word to make a transcript, it will produce new, instantly-online Open Data every time they use it.

The true Open Data challenge is building brilliant products

But we can’t expect the public sector to use a tool like SayIt to make new Open Data unless it is cheaper, better and less burdensome than whatever they’re using now. We can’t – quite simply – expect to sell government procurement officers a new product mainly on the virtues of Open Data. This means taking on the tough task of persuading government employees that there is a new tool that is head-and-shoulders better than Excel or Word for certain purposes: formidable, familiar products that are much better than their critics like to let on.

So in order for SayIt to replace the current tools used by any current transcriber, it’s going to have to be really, really good. And really trustworthy. And it’s going to have to be well marketed. And that’s why we’ve chosen to build SayIt as an international, open source collaboration – as a Poplus Component. Because we think that without the billions of dollars it takes to compete with Microsoft, our best hope is to develop very narrow tools that do 0.01% of what Word does, but which do that one thing really, really well. And our key strategic advantage, other than the trust that comes with Open Source and Open Standards, is the energy of the global civic hacking and government IT reform sector. SayIt is far more likely to succeed if it has ideas and inputs from contributors from around the world.

Regardless of whether or not SayIt ever succeeds in penetrating inside governments, this post is about the idea that such an approach represents. The idea is that people can advance the Open Data agenda not just by lobbying, but also by building and popularising tools that mean that data is born open in the first place. I hope this post will encourage more people to work on such tools, either on their own, or via collaborations like Poplus.


Photo by Troy Morris (CC)