At the recent Open Government Data Camp in London I gave the first of a series of lightning talks, together with Bill Roberts. I chose to address “socially open data”.
For the most part, when we discuss opening up data, ‘open’ is defined in terms of technical aspects (formats etc.) and legal aspects (mostly licensing). Social and organizational aspects are usually seen as part of the adoption process that comes after the release of data, and not as part of how we perceive what open data is. They are treated as something that is not connected to the characteristics of the data set itself.
I would like to argue, however, that certain social aspects need to be part of how we define ‘open data’. I guess they are absent from current discussions because the social side of things is where it can get complex and messy. But we can make that human complexity more manageable if we look at it within the scope of single data sets. On that level it is all about adding context.
Socially open data is data that comes with contextual information, alongside the right data formats and an open license. Socially open data = just add context.
I think that context can be added in several ways.
Part of it lies in the metadata that comes with the data set:

  • A contact person
  • An address for feedback
  • When was the data generated/collected, and when will it be updated?
  • What was the data used for within government?
  • How was the data used for its government task?
  • How was the data generated / collected?
(Those last three points will tell you a lot about the background and possibilities of a data set.)
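As a rough illustration, the metadata points above could be captured in a small record that travels with the data set. This is only a sketch: the class name, the field names and the example values are hypothetical, not part of any existing standard or catalogue.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetContext:
    """Hypothetical contextual metadata published alongside a data set."""
    contact_person: Optional[str] = None
    feedback_address: Optional[str] = None
    collected_on: Optional[str] = None        # when the data was generated/collected
    next_update: Optional[str] = None         # when an update is expected
    government_purpose: Optional[str] = None  # what the data was used for
    government_usage: Optional[str] = None    # how it was used for that task
    collection_method: Optional[str] = None   # how it was generated/collected


def missing_context(ctx: DatasetContext) -> List[str]:
    """Return the names of the context fields that are still empty."""
    return [f.name for f in fields(ctx) if not getattr(ctx, f.name)]


# Example with made-up values: only the first two points are filled in.
ctx = DatasetContext(contact_person="Jane Doe",
                     feedback_address="opendata@example.org")
print(missing_context(ctx))
```

A publisher could run a check like this before release to see which pieces of context a data set still lacks.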
Other aspects have to do with making access to your data more likely:
  • Make data sets findable
  • Point to your data set often (whenever you e.g. cite/use that data yourself)
  • Do PR for your data set
  • Announce your data release in relevant community ‘hang-outs’ (on-line / off-line) of people you think might be interested
  • Add the data set to a data catalogue (like
All these points basically say ‘if you do not make sure people know your data exists and is available, for all intents and purposes it doesn’t and it’s not.’
Thirdly, part of making data socially open is readying the environment for release:
  • Engage in dialogue with likely and emerging re-users. Make them visible if possible in the context of the data set. (this helps new re-users see the potential of the data, and turns the data set into a social object creating new connections)
  • Engage in dialogue with those that the data describes or affects. Make them visible if possible in the context of the data set. (if your data is about agriculture, talk to the farmers described by the data about the way and form in which the data release may be helpful to them)
  • Make the release process of data transparent from its inception to its conclusion.
All of these points help address the objections and obstacles that may come up when opening up data. In my experience, those cannot be dealt with on a generic level at all, but they can be dealt with, and straightforwardly, in the context of specific data sets. This makes them part of what precedes data release and of the data release itself, not of the adoption process after the release of data.
By focusing on just the technical and legal aspects when defining what open data is, we overlook that the needed change of mindset about opening up government and its data is only addressed by the social aspects. If we leave those out of how we define open data, and relegate them to what happens after the release of data that is already deemed ‘open’, instead of making them part of how we get to labeling data as ‘open’, we are simply not addressing the purpose of it all.
Below are the slides I used at OGD Camp in London to convey my point.

In the presentation below, which Tim Berners-Lee gave last February at the TED conference, the creator of the Web talks about what needs to come next: linked data. This is Berners-Lee’s explanation of the semantic web.

The internet used to only connect servers to each other. I remember how in the very late ’80s I logged onto a Unix machine at a US university to get some material from that machine using command line entries.
Berners-Lee found it frustrating that you would find documents and files in all kinds of formats for which you might not have the right software to read them. Out of that frustration came the Web. It didn’t look all that great initially (see pic below), but it meant you could open a document from any machine, and have it link to other documents. The Web connects documents.

Early version of the CERN website.

Now he proposes to link data to each other, much like we now link documents and used to link servers to each other, as the next step in the evolution of the internet.
You can see how he imagines that in the video. It needs loads of raw data, however: data that follows three rules. It is available in open formats, it has a URI, and it links to other data. Hence his call to arms: Raw Data Now!
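To make those three rules a bit more tangible, here is a minimal Python sketch that checks a description of a data set against them. Everything in it is an assumption for illustration: the `DataRecord` class, the short list of formats treated as open, and the simplification that a URI must have a scheme and a host.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlparse

# Assumption: a small illustrative set of open formats, not an official list.
OPEN_FORMATS = {"csv", "json", "xml", "rdf", "ttl"}


@dataclass
class DataRecord:
    """Hypothetical description of one published data set."""
    data_format: str
    uri: str
    links: List[str] = field(default_factory=list)  # URIs of other data it points to


def looks_like_uri(value: str) -> bool:
    """Simplified check: the value has both a scheme and a host."""
    parsed = urlparse(value)
    return bool(parsed.scheme and parsed.netloc)


def unmet_rules(record: DataRecord) -> List[str]:
    """Return which of the three rules the record does not satisfy."""
    problems = []
    if record.data_format.lower() not in OPEN_FORMATS:
        problems.append("not in an open format")
    if not looks_like_uri(record.uri):
        problems.append("has no URI")
    if not any(looks_like_uri(link) for link in record.links):
        problems.append("does not link to other data")
    return problems
```

An empty result means the record passes all three checks; anything else names the rule that still needs work.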
Given the work I currently do on opening up public service information (PSI) in the Netherlands, I can only subscribe to that call. In his presentation Berners-Lee talks a bit more about what is important about opening up government data.