Transcripts – the written records of who says what in a conversation – aren’t sexy.
However, they can be very important, or even historic. They can reveal big plans that will affect lots of people, and they are a basic requirement of political accountability.
But the way in which transcripts are made available online today doesn’t reflect this importance. They tend to be published as hundreds of PDFs, and look more or less like they were made in the 1950s.
We think that the people who are affected by the decisions and plans announced in transcribed meetings deserve better.
What is SayIt?
SayIt is an open source tool for publishing speeches, discussions and dialogues, simply and clearly, online. Search functionality is built in, you can link to any part of a transcript, and the whole thing works nicely on mobile devices.
SayIt can be used either as a hosted service, or it can be built directly into your own website, as a Django app. Here are some examples of what it looks like in its hosted, standalone form:
However, SayIt’s main purpose is to be built into other sites and apps. We don’t have a live demo of this today, but one of our international partners will soon be launching a new Parliamentary Monitoring site which uses SayIt to publish years of parliamentary transcripts.
SayIt is also 100% open data compatible, and we use a cut-down version of the Akoma Ntoso open standard for data import.
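For coders wondering what that import data might look like, here is a minimal sketch in Python. The element names (akomaNtoso, debate, debateBody, debateSection, speech, from, p) come from the Akoma Ntoso standard as we understand it, but exactly which subset SayIt accepts is not spelled out in this post, so treat the structure as an assumption. The build_transcript function and the speaker ids are made up for illustration, and a real document would also need the standard's namespace and metadata.

```python
import xml.etree.ElementTree as ET


def build_transcript(speeches):
    """Build a minimal Akoma Ntoso-style debate from (speaker_id, name, text) tuples."""
    root = ET.Element("akomaNtoso")
    debate = ET.SubElement(root, "debate")
    body = ET.SubElement(debate, "debateBody")
    section = ET.SubElement(body, "debateSection")
    for speaker_id, name, text in speeches:
        # Each speech points at its speaker with a "#id" reference.
        speech = ET.SubElement(section, "speech", {"by": "#" + speaker_id})
        ET.SubElement(speech, "from").text = name
        ET.SubElement(speech, "p").text = text
    return ET.tostring(root, encoding="unicode")


# Hypothetical input: two speeches in the spirit of the Shakespeare demo.
xml_text = build_transcript([
    ("witch1", "First Witch", "When shall we three meet again?"),
    ("witch2", "Second Witch", "When the hurlyburly's done."),
])
print(xml_text)
```

The point of the sketch is only that a transcript reduces to a list of speeches, each with a speaker and some text, which is why converting existing records tends to be a scripting job rather than a big engineering one.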
What isn’t SayIt?
Not a site full of data curated and uploaded by mySociety – it’s a tool for redeployment all over the net. We’ll host deployments where that’s helpful to people, though.
Not primarily about Britain – whilst we’re a social enterprise based in the UK, SayIt has been built with an international perspective. We hope it will serve the needs of people watching politicians in places like Kenya and South Africa.
Not solely a mySociety project – it’s actually an international collaboration, via the Poplus network (see more below).
Not (yet) a tool to replace Microsoft Word as the way you write down transcripts in the first place. This is coming as we move from Alpha to Beta, though.
Why are we building SayIt?
SayIt is one of the Poplus Components. Poplus is a global collaboration of groups that believe it is currently too difficult and expensive to build effective new digital tools to help citizens exert power over institutions.
Poplus Components are loosely joined tools, mostly structured as web services, that can be used to radically decrease the development time of empowerment sites and apps.
SayIt is the newest component, and aims to reduce the difficulty and cost of launching services that contain transcripts – in particular websites that allow people to track the activities of politicians. Using SayIt or other Poplus components you can build your site in whatever language and framework suits your wishes, but save time by using the components to solve time-consuming problems for you.
The founders of Poplus are FCI in Chile, and mySociety in the UK – and we are hoping that the launch of SayIt will help grow the network. The project has been made possible by a grant from Google.org, while early iterations were aided by the Technology Strategy Board.
Interested in publishing transcripts via SayIt? Here’s what to do…
Having taken a look at the demos, we hope at least some of you are thinking ‘I know of some transcripts that would be better if published like this’.
If you are interested, then there are two approaches we’d recommend:
If you’re a coder, or if you have access to technical skills, read about how to convert your data into the open standard we use. Then talk to us about how to get this data online.
If you don’t have access to technical skills, get in touch about what you’re interested in publishing, and we’ll explore the options with you.
Note to coders – We’ve not yet spent a lot of time making SayIt easy to deploy locally, so we know it may be a challenge. We’re here to help.
Where might SayIt help?
SayIt comes from a desire to publish the speeches of politicians. But we know that there are many other possible uses, which is why we built the Shakespeare demo.
We think SayIt could be useful for publishing and storing transcripts of:
Local council meetings
Academic research interviews and focus groups
Academic seminars, lectures, etc
Market research focus groups
Historic archives of events such as a coronation or key debate
These are just a few of our ideas, but we bet you have others – please do tell us in the comments below.
What’s coming next
At the moment, SayIt only covers publishing transcripts, not creating them. Needless to say, this lack of an authoring interface is a pretty big gap, but we are launching early (as an Alpha) because we want to know how you’ll use it, what features you want us to build, and what doesn’t work as well as we anticipated. We also want to see if we can attract other people to co-develop the code with us, which is the real spirit of the Poplus network.
We’ll also be adding the ability to subscribe to alerts so that you’ll get an email every time a keyword occurs (just as you can on our other websites, such as FixMyStreet, TheyWorkForYou and WhatDoTheyKnow). This feature will come into its own for ongoing series of transcripts such as council meetings.
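To give a feel for how keyword alerts could work, here is a rough Python sketch. It is not how SayIt or our other sites implement alerts: the find_alerts function, the subscriptions and the speeches are all hypothetical, and a real service would also need to track which speeches are new and actually send the emails.

```python
import re


def find_alerts(subscriptions, speeches):
    """Return (email, speech_id) pairs where a subscribed keyword appears as a whole word."""
    alerts = []
    for email, keyword in subscriptions:
        # Match the keyword case-insensitively, on word boundaries only.
        pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        for speech_id, text in speeches:
            if pattern.search(text):
                alerts.append((email, speech_id))
    return alerts


# Hypothetical subscribers and council meeting speeches.
subscriptions = [
    ("resident@example.org", "Mill Lane"),
    ("cyclist@example.org", "cycle path"),
]
speeches = [
    (1, "The resurfacing of Mill Lane will begin in March."),
    (2, "Funding for the new cycle path has been approved."),
    (3, "The committee noted the minutes of the last meeting."),
]
alerts = find_alerts(subscriptions, speeches)
print(alerts)
```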
Image by Columbia Phonograph Co. [Public domain], via Wikimedia Commons
We’re starting the year with some really wonderful news: Google.org is granting us a fantastic $1.6m, to be spent over two years.
Clearly, this is a significant sum of money, which will really turbo-charge our efforts to build technologies to help groups like mySociety in countries around the world.
We will be using the money to provide developers with open source technologies to help them to more easily and quickly launch new civic apps and services. We will also be working with lots of other groups to promote greater knowledge and technology sharing amongst civil society groups of all kinds, especially in the accountability sector.
What’s the problem being tackled?
Currently, it can take a great deal of work to launch even relatively simple sites or apps with civic purposes, because the sector is not rich with mature, sector-specific tools and technologies. This high barrier to getting started has a bad effect on the range and strength of popular, impactful civic sites and apps online, globally.
Working with international partners we plan to develop some common, open source components that will reduce the effort required to launch new services in a broad range of areas: including accountability, legal, environmental, political, and more.
mySociety will work with local partners in various targeted regions to help those partners get the greatest possible benefit from using these new, common, collaboratively-developed open source components. And we’ll be working to help them contribute back, both in terms of shared code and shared knowledge.
The project will also develop new approaches to bringing together the global civic-technology community, so that it can collaborate more easily on new projects.
We’re really excited to see where this project will take us next – and we are very grateful to Google.org for the increased opportunities their funding brings us.
Photo by KayVee INC (CC)
Summer may seem like a long time ago, but despite the cold outside, we’ve been looking back over our participation in Google’s Summer of Code project. It’s almost enough to warm us up!
This post is an attempt to record the process from our point of view. We hope it will be useful for other organisations considering participating next year, and for students who want to know more about how the scheme works.
What is Google Summer of Code?
It’s a programme sponsored by Google’s philanthropic arm, giving students the chance to experience real-life coding on open source software.
The scheme is open to students all over the world, who are then paired up with open source organisations like us. The students gain paid work experience and mentoring; the organisations gain willing workers and some fresh new perspectives; the world gains some more open source code to use or develop further.
Everyone’s a winner, basically.
2012 was our first year on the programme: once we had been accepted on the scheme, we were given two student slots – the maximum allowed for a first-time organisation.
Given mySociety’s wide suite of codebases, there were several projects that could have benefited. We listed all our ideas, and let people apply for the ones they found appealing.
Goodness, there were a lot of applicants! It was very heartening to discover that there is such an enthusiastic community of young coders all around the world – even if it did take us a long time to sift through them all and make our choices.
You might remember our post back in May, when we announced that we’d made our choices. We were delighted to get working with Dominik from Germany and Chetan from India.
As things turned out, our students ended up working on a project that wasn’t even on our original list: PopIt, our super-easy ‘people and positions’ software.
That’s because once we spoke to our chosen students, we realised they had the skills that could really help us forge ahead with this project – and once we discussed it with them, they were keen. So PopIt it was.
Germany and India are a bit of a commute away, but fortunately development work can be managed remotely. We know this particularly well at mySociety: our core team work from home and are scattered across the UK.
The only complication here was the 6+ hour time difference between us and India: it was important to be rigorous about checking in at times when Chetan would be awake!
We communicated via IRC (instant chat), email, and occasionally Skype, and it all worked well.
Edmund, the team member chosen to be mentor, broke the required tasks down into big pieces so that the students would have realistic work units of several days each.
What was achieved
PopIt is primarily a tool for helping people create and run parliamentary monitoring websites (like TheyWorkForYou) with minimal coding knowledge/effort, though we anticipate that it will have many other uses too.
Our students spent the first half of the summer learning and improving the PopIt codebase. Once they were confident in it, they created their own sites using PopIt as a datasource to test the API, and, hopefully, create a valuable reference resource for the community.
Dominik added a migration tool to PopIt, which lets you upload data as a CSV. This means that you can start a site with a database of names, positions and dates at its heart – within seconds.
His test site was a professors’ database (the code is here). Dom also wrote some helpful posts on the dev blog like this one.
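As a rough illustration of the idea behind the CSV import Dominik added, the sketch below turns CSV rows into simple person records. It is not PopIt's actual code: the column names (name, position, start_date), the load_people function and the sample rows are assumptions made up for this example.

```python
import csv
import io


def load_people(csv_text):
    """Parse CSV rows into person records, skipping rows without a name."""
    people = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get("name") or "").strip()
        if not name:
            continue  # A person record is not much use without a name.
        people.append({
            "name": name,
            "position": (row.get("position") or "").strip(),
            "start_date": (row.get("start_date") or "").strip(),
        })
    return people


# Hypothetical data in the spirit of a professors' database.
sample = """name,position,start_date
Ada Example,Professor of Mathematics,2009-10-01
,Lecturer,2011-01-15
Ben Sample,Professor of History,2010-09-01
"""
people = load_people(sample)
print(people)
```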
Chetan created an image proxy that lets us serve images in a smart way that makes sense for APIs. His test site was for Indian representatives (here’s the code).
Neither site is being maintained now, which just confirms that it is harder to run a site than to start it. This is not a failing, though. The creation of these sites, along with Chetan and Dom’s feedback, helped us understand where improvements needed to be made. In the course of one summer, PopIt became much more mature.
Looking back on the Summer of Code
Edmund attended a follow-up ‘mentors’ summit’ at the Googleplex in California – he found it very helpful to compare notes with other organisations and find out what had worked best for them all, and he made some good contacts too.
Assuming we get the chance again, would we participate in 2013? Our experience was very positive, but as yet we are undecided, purely because of the fluid nature of our workflow: we don’t yet know whether time and resources will permit.
Obviously, we have enjoyed great benefits from the scheme, but that has depended on quite a bit of input from our side, and we need to be sure that we can commit the same effort again.
Edmund has compiled a list of advice, from the practical (ask students to treat the placement like a full-time job; test coding skills before acceptance) to the desirable (a weekly blog post from participants; make sure you over-estimate the time you’ll spend mentoring). If you’re thinking of participating next year, he’d be happy to pass on his tips for ensuring that you, and your assigned students, get the best out of the Google Summer of Code. Just drop him a line.
One of the key differences between the UK’s national parliament and its local governments is that Parliament produces a written record of what gets said – Hansard.
This practice – which has no actual legal power – still has a huge impact on the successful functioning of Parliament. MPs share their own quotes, they quote things back to one another, journalists cite questions and answers, and every day TheyWorkForYou sends tens of thousands of email alerts to people who want to know who said what yesterday in Parliament. Without freely available transcripts of Parliamentary debates, it is likely that Parliament would not be anything like as prominent an institution in British public life.
No Local Hansards
Councils, of course, are too poor to have transcribers, and so don’t produce transcripts. Plus, nobody wants to know what’s going on anyway. Those are the twin beliefs that ensure that verbatim transcripts are an exceptional rarity in the local government world.
At mySociety we think the time has come to actively challenge these beliefs. We are going to be building a set of technologies whose aim is to start making the production of written transcripts of local government meetings a normal practice.
We believe that being able to get sent some form of alert when a council meeting mentions your street is a gentle and psychologically realistic way of engaging regular people with the decisions being made in their local governments. We believe transcripts are worth producing because they show that local politics is actually carried out by humans.
The State of the Art Still Needs You
First, though – a reality check. No technology currently exists that can entirely remove human labour from the production of good quality transcripts of noisy, complicated public meetings. But technology is now at a point where it is possible to substantially collapse the energy and skills required to record, edit and publish transcripts of public meetings of all kinds.
We are planning to develop software that uses off-the-shelf voice recognition technologies to produce rough drafts of transcripts that can then be edited and published through a web browser. Our role will not be in working on the voice recognition itself, but rather on making the whole experience of setting out to record, transcribe and publish a speech or session as easy, fast and enjoyable as possible. And we will build tools to make browsing and sharing the data as nice as we know how. All this fits within our Components strategy.
But we at mySociety cannot go to all these meetings ourselves. And it appears exceptionally unlikely that councils will want to pay for official transcribers at this point in history. So what we’re asking today is for interest from individuals – inside or outside councils – willing to have a go at transcribing meetings as we develop the software.
It doesn’t have to be definitive to be valuable
Hansard is the record of pretty much everything that gets said in Parliament. This has led to the idea that if you don’t record everything said in every session, your project is a failure. But if Wikipedia has taught us anything, it is that starting small – producing little nuggets of value from the first day – is the right way to get started on hairy, ambitious projects. We’re not looking for people willing to give up their lives to transcribe endlessly and for free – we’re looking for people for whom having a transcript is useful to them anyway, people willing to transcribe at least partly out of self interest. We’re looking for these initial enthusiasts to start building up transcripts that slowly shift the idea of what ‘normal’ conduct in local government is.
Unlike Wikipedia we’re not really talking about a single mega database with community rules. Our current plans are to let you set up a database which you would own – just as you own your blog on Blogger or WordPress, perhaps with collaborators. Maybe you just want to record each annual address of the Lord Mayor – that’s fine. We just want to build something that suits many different people’s needs, and which lifts the veil on so much hidden decision making in this country.
Get in touch
The main purpose of this post is to tell people that mySociety is heading in this direction, and that we’d like you along for the ride. We won’t have a beta to play with for a good few months yet, but we are keen to hear from anyone who thinks they might be an early adopter, or who knows of other people who might want to be involved.
And we’re just as keen to hear from people inside councils as outside, although we know your hands are more tied. Wherever you sit – drop us a line and tell us what sort of use you might want to make of the new technology, and what sort of features you’d like to see. We’ll get back in touch when we’ve something to share.
All of us at mySociety love the fact that there are so many interesting new civic and democratic websites and apps springing up across the whole world. And we’re really keen to do what we can to help lower the barriers for people trying to build successful sites, to help citizens everywhere.
Today mySociety is unveiling MapIt Global, a new Component designed to eliminate one common, time-consuming task that civic software hackers everywhere have to struggle with: the task of identifying which political or administrative areas cover which parts of the planet.
As a general user this sort of thing might seem a bit obscure, but you’ve probably indirectly used such a service many times. So, for example, if you use our WriteToThem.com to write to a politician, you type in your postcode and the site will tell you who your politicians are. But this website can only do this because it knows that your postcode is located inside a particular council, or constituency or region.
Today, with the launch of MapIt Global, we are opening up a boundaries lookup service that works across the whole world. So now you can look up a random point in Russia or Haiti or South Africa and find out about the administrative boundaries that surround it. And you can browse and inspect the shapes of administrative areas large and small, and perform sophisticated lookups like “Which areas does this one border with?”. And all this data is available both through an easy to use API, and a nice user interface.
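For the curious, the core question behind a point lookup ("which areas contain this point?") can be sketched in a few lines of Python using the classic ray casting test. This is an illustration only, not a description of how MapIt Global is implemented: the areas below are made-up rectangles, the function names are hypothetical, and the code treats longitude and latitude as flat coordinates and ignores holes and multi-part boundaries.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: toggle 'inside' for each edge crossed by a ray heading east from the point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses that latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def areas_covering(lon, lat, areas):
    """Return the names of all areas whose boundary contains the point."""
    return [name for name, polygon in areas.items()
            if point_in_polygon(lon, lat, polygon)]


# Hypothetical nested administrative areas as lists of (lon, lat) corners.
areas = {
    "Example Region": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "Example District": [(2, 2), (6, 2), (6, 6), (2, 6)],
    "Other District": [(6, 2), (9, 2), (9, 6), (6, 6)],
}
print(areas_covering(3, 3, areas))
```

A point usually sits inside several areas at once (a district, a region, a country), which is why the lookup returns a list rather than a single answer.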
We hope that MapIt Global will be used by coders and citizens worldwide to help them in ways we can’t even imagine yet. Our own immediate use case is to use it to make installations of the FixMyStreet Platform much easier.
We’re able to offer this service only because of the fantastic data made available by the amazing OpenStreetMap volunteer community, who are constantly labouring to make an ever-improving map of the whole world. You guys are amazing, and I hope that you find MapIt Global to be useful to your own projects.
The people who made it possible were developers Mark Longair and Matthew Somerville, and designer Jedidiah Broadbent. And, of course, we’re also only able to do this because the Omidyar Network is supporting our efforts to help people around the world.
From Britain to the World
For the last few years we’ve been running a British version of the MapIt service to allow people running other websites and apps to work out what council or constituency covers a particular point – it’s been very well used. We’ve given this a lick of paint and it is being relaunched today, too.
MapIt Global is also the first of The Components, a series of interoperable data stores that mySociety will be building with friends across the globe. Ultimately our goal is to radically reduce the effort required to launch democracy, transparency and government-facing sites and apps everywhere.
If you’d like to install and run the open source software that powers MapIt on your own servers, that’s cool too – you can find it on GitHub.
About the Data
The data that we are using is from the OpenStreetMap project, and has been collected by thousands of different people. It is licensed for free use under their open license. Coverage varies substantially, but for a great many countries the coverage is fantastic.
The brilliant thing about using OpenStreetMap data is that if you find that the boundary you need isn’t included, you can upload or draw it direct into OpenStreetMap, and it will subsequently be pulled into MapIt Global. We are planning to update our database about four times a year, but if you need boundaries adding faster, please talk to us.
If you’re interested in the technical aspects of how we built MapIt Global, see this blog post from Mark Longair.
Commercial Licenses and Local Copies
MapIt Global and UK are both based on open source software, which is available for free download. However, we charge a license fee for commercial usage of the API, and can also set up custom installs on virtual servers that you can own. Please drop us a line for any questions relating to commercial use.
As with any new service, we’re sure there will be problems that need sorting out. Please drop us an email, or tweet us @mySociety.
As you may already be aware, mySociety is putting considerable effort into making it super-easy to set up versions of our websites FixMyStreet and WhatDoTheyKnow in other countries.
These ‘websites in a box’ are a key part of our strategy to help people develop more successful civic and democratic websites around the world, but they are only the first half of our plan. Today I wanted to talk about the other half.
There are some use-cases for software in which most people are entirely happy to take some software off the shelf, press ‘Go’, and start using it. WordPress is a good example, and so is Microsoft Office.
However, there are some kinds of social issues that vary so much between different countries and regions that we believe one-size-fits-all tools for attacking them are impracticable.
This problem is particularly acute in the arena of sites and apps that allow people to track the activities of politicians. In this area there are several dozen different sites globally, almost all of which are powered by software that was written bespoke for that particular usage.
What drives this pattern of people re-writing every site from scratch is that people in different places care about different aspects of politics. In some countries what really counts is how politicians vote, in others the crux is campaign finance contributions, in others it is information on who has criminal records, and in others still it is whether public money has been vanishing suspiciously.
To build an off-the-shelf software platform that could handle all this data equally well in every country would be an immense coding task. And more important than that, we believe that it would create a codebase so huge and complex that most potential reusers would run away screaming. Or at least ignore it and start from scratch.
In short – we don’t believe there can be a WordPress for sites that monitor politicians, nor for a variety of other purposes that relate to good governance and stronger democracies.
We believe that the wrong answer to this challenge is to just say “Well then, everyone should build their own sites from scratch.” Over the years we at mySociety have been witness to the truly sad sight of people and organisations around the world wearing themselves out and blowing their budgets just trying to get the first version of a transparency website out the door. All too often they fail to create popular, long lasting sites because the birthing process is just so exhausting and resource-consuming that there’s nothing left to drive the sites to success. Often they don’t even get to launch.
A painful aspect of this problem is that the people who work on such sites are genuine altruists who are trying to solve serious problems in their part of the world; too much of their passion and energy is used up on building tools, when there’s still so much work beyond that that’s needed to make such sites successful. However, as we pointed out above, giving them a complete package on a plate isn’t an option. So what can we do?
Our Proposed Answer – The Components
We start from the following observation: coders and non-coders like simple, minimal, attractive tools that help them achieve bigger goals. Simple tools don’t make anyone run away screaming – they encourage exploration and deliver little sparkles of satisfaction almost immediately. But simple tools have to be highly interoperable and reliable to form the foundation of complex systems.
Our plan is to collaborate with international friends to build a series of components that deliver quite narrow little pieces of the functionality that make up bigger websites. These include:
- PopIt – A Component to store and share the names of politicians, and the jobs they have.
- MapIt – A Component to store and share information on the locations of administrative boundaries, like counties, regions or cities.
- SayIt – A Component to store and share information on the words that public figures say or put out in writing.
There will be more, possibly many more. Our goal is to radically collapse the time it takes to build new civic and democratic (and possibly governmental) websites and apps, without putting constraints on creativity.
Characteristics of each Component
There are some crucial architecture decisions that have been baked into the Components, to truly make them ‘small pieces loosely joined’.
- Each Component is fundamentally a tool for storing and sharing one or two kinds of common data – they’re intentionally minimalist.
- As a developer, you just use the Components that make sense for your goals – you simply don’t have to look at or learn about the Components that contain functionality that doesn’t matter to you.
- You don’t have to install anything to get started – you can always begin by playing with a hosted Component.
- We won’t impose our taste in programming languages on you. You can code your website in whatever language you want. The Components are not ‘modules’ – they don’t plug into some overarching framework like Drupal or WordPress. They are stand-alone tools which just present you data over REST APIs, and which you can write data into using REST APIs.
- Each Component’s data structures will offer as much flexibility as makes sense given the goal of keeping each Component really good at one or two tasks. We’ll listen to feedback carefully to get this right.
- Each Component has a clean, simple web front end so you can explore the data held in a store without having to write lots of SQL queries. Often you will be able to edit the data this way, too.
- Get started in seconds – each Component offers at least some functionality which is available within a minute of getting started.
- Non coders are welcome – we are building the Components so that non-coders can start gathering, editing and sharing data straight away, possibly long before they are in a position to launch a ‘real site’.
- Data can be added to the Components both through write APIs and through manual editing interfaces, suitable for non-coders.
- Learn from our mistakes – it is really easy to get the wrong data structure for civic, democratic or governmental data. Good practice data structures are baked into the Components, to save you pain later.
- Use our hosted versions, or install open source code locally. It will normally be quicker to get started using the Components in a hosted environment, but if you want to run them locally, you’re entirely welcome. The code will be open source, and we’ll work hard to make sure it’s attractive and easy to install.
- The Components will talk to each other, and to the rest of the web using simple open schemas which will evolve as they are built. Where possible we’ll pick up popular data standards and re-use those, rather than building anything ourselves.
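To make ‘small pieces loosely joined’ a little more concrete, here is a hedged Python sketch of a site combining data from two Components. Everything in it is hypothetical: the JSON payloads merely stand in for what a people store and a speech store might return over their REST APIs, and the field names (id, speaker_id, name, position, text) are assumptions, since the real schemas are still evolving.

```python
import json

# Hypothetical JSON payloads, standing in for responses from two hosted Components.
people_payload = json.dumps([
    {"id": "p1", "name": "Ada Example", "position": "Councillor"},
    {"id": "p2", "name": "Ben Sample", "position": "Mayor"},
])
speeches_payload = json.dumps([
    {"speaker_id": "p2", "text": "I declare this meeting open."},
    {"speaker_id": "p1", "text": "I move that the budget be approved."},
    {"speaker_id": "p9", "text": "A speech by someone not in the people store."},
])


def attribute_speeches(people_json, speeches_json):
    """Join speeches to people on a shared id, tolerating unknown speakers."""
    people = {p["id"]: p for p in json.loads(people_json)}
    lines = []
    for speech in json.loads(speeches_json):
        person = people.get(speech["speaker_id"])
        if person:
            label = f'{person["name"]} ({person["position"]})'
        else:
            label = "Unknown speaker"
        lines.append(f'{label}: {speech["text"]}')
    return lines


for line in attribute_speeches(people_payload, speeches_payload):
    print(line)
```

The only thing the two stores need to agree on is the identifier used to refer to a person. Neither needs to know how the other is built, or what language it is written in.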
What the Components Aren’t
Sometimes in life it can be easier to describe things by what they aren’t:
- The Components are definitively not modules in a framework or platform. Each one is totally independent, and they will frequently be written in different languages – partly to force us to ensure that the APIs are truly excellent.
- The Components aren’t either Hosted or Local, they’re both. We’ll always offer a hosted version and a downloadable version, and you’ll always be able to move any data you have stored on the hosted versions down to your local copies.
- The Components aren’t all about mySociety. We’re planning to build the first ones in conjunction with some friends, and we’ll be announcing more about this soon. We want the family of Components to be jointly owned by a group of loving parents.
When can I see some of the Components in Action?
We’ll be blogging more about that tomorrow…
Footnote – To see the provenance of the extremely useful ‘small pieces loosely joined’ concept, see this.
Ah, summer: walks in the park, lazing in the long grass, and the sound of chirping crickets – all overlaid with the clatter of a thousand keyboards.
That may not be your idea of summer, but it’s certainly the way ours is shaping up. We’re participating in Google’s Summer of Code, which aims to put bright young programmers in touch with Open Source organisations, for mutual benefit.
What do the students get from it? Apart from a small stipend, they have a mentored project to get their teeth into over the long summer hols, and hopefully learn a lot in the process. We, of course, see our code being used, improved and adapted – and a whole new perspective on our own work.
Candidates come from all over the world – they’re mentored remotely – so for an organisation like mySociety, this offers a great chance to get insight into the background, politics and technical landscape of another culture. Ideas for projects that may seem startlingly obvious in, say, Latin America or India would simply never have occurred to our UK-based team.
This year, mySociety were one of the 180 organisations participating. We had almost 100 enquiries, from countries including Lithuania, India, Peru, Georgia, and many other places. It’s a shame that we were only able to take on a couple of the many excellent applicants.
We made suggestions for several possible projects to whet the applicants’ appetite. Mobile apps were popular, in particular an app for FixMyTransport. Reworking WriteToThem, and creating components to complement MapIt and PopIt also ranked highly.
It was exciting to see so many ideas, and of course, hard to narrow them down.
In the end we chose two people who wanted to help improve our nascent PopIt service. PopIt will allow people to very quickly create a public database of politicians or other figures. No technical knowledge will be needed – where in the past our code has been “Just add developers”, this one is “Just add data”. We’ll host the sites for others to build on.
Our two successful applicants both had ideas for new websites that would use PopIt for their datastore, exactly the sort of advanced usage we hope to encourage. As well as making sure that PopIt actually works by using it, they’ll both be creating transparency sites that will continue after their placements end. They’ll also have the knowledge of how to set up such a site, and in our opinion that is a very good thing.
We hope to bring you more details as their projects progress, throughout the long, hot (or indeed short and wet) summer.
PS: There is a separate micro-blog where we’re currently noting some of the nitty gritty thoughts and decisions that go into building something like PopIt. If you want to see how the project goes please do subscribe! The Components mailing list is also a good way of staying in touch.
Top image by Elaine Millan, used with thanks under the Creative Commons licence.