Today I gave a short presentation at the Citizen Science Koppelting conference in Amersfoort. Below are the transcript and the slide deck.

I’ve worked on opening data, mainly with governments worldwide, for the past decade. I’ve been living in Amersfoort for two years, and since moving here I’ve been a participant in the Measure Your City network, with a sensor kit. I also run a LoRaWAN gateway to provide additional infrastructure to people wanting to collect sensor data. Today I’d like to talk to you about using open data: what it is, what exists, where to find it, and how to get it. Because I think it can be a useful resource in citizen science.

What is open data? It is data published by whoever collected it in such a way that anyone is permitted to use it, without any legal, technical or financial barriers.

This means an open license, such as Creative Commons 0, open standards, and machine readable formats.
Anyone can publish open data, simply by making it available on the internet. And plenty of people, academics, and companies do. But mostly open data means we’re looking at government for data.

That’s because we all have a claim on our government, we are all stakeholders. We already paid for the data as well, so it’s all sunk costs, while making it available to all as infrastructure does not increase the costs a lot. And above all: governments have many different tasks, and therefore lots of different data. Usually over many years and at relatively good quality.

The legal framework for open data consists of two parts. The national access to information rules, in the Netherlands the WOB, which say everything government has is public, unless it is not.
And the EU-initiated regulation on re-using, not just accessing, government material. That says everything that is public can be re-used, unless it can’t. Both these elements are passive: you need to request material.

A new law, the WOO, makes publication mandatory for more things. (For some parts publication is already mandated in laws, like in the WOB, the Cadastre law, and the Company Register.)

Next to that there are other elements that play a role. Environmental data must be public (Aarhus Convention), and INSPIRE makes it mandatory for all EU members to publish certain geographic data. A new EU directive is in the works, making it mandatory for more organisations to publish data, and for some key data sets to be free of charge (like the company register and meteo data).

Next to the legal framework there are active Dutch policies towards more open data: the Data Agenda and the Open Government action plan.

Open data is important because it allows people to do new things, and more importantly it allows new people, who did not have that access before, to do new things. It democratises data sources that were previously only available to a select few, often those big enough to be able to pay for access. This has now been a growing movement for 10-15 years.

That new agency has visible effects, economically and socially. In fact you probably already use open data on a daily basis without noticing. When you came here today by bike, you probably checked Buienradar, which is based on the open data of the KNMI. Whenever in Wikipedia you find additional facts in the right hand column, that information doesn’t come from Wikipedia but is often directly taken from government databases. The same is true for a lot of the images in Wikipedia, of monuments, historic events etc. They usually come from the open collections of national archives, etc.

When Google presents you with traffic density, like here the queues in front of the traffic lights on my way here, it’s not Google’s data. It’s government data, that is provided in near real-time from all the sensors in the roads. Google just taps into it, and anyone could do the same. You could do the same.

There are many big and small data sets that can be used for a new specific purpose. Like when you go to get gas for the car. You may have noticed at manned stations it takes a few seconds for the gas pump to start? That’s because they check your license plate against the make of the car, in the RDW’s open database. Or for small practical issues. Like when looking for a new house, how much sunshine does the garden get. Or can I wear shorts today (No!).

But more importantly for today’s discussion, it can be a powerful tool for citizen scientists as well. Such as in the public discussion about the Groningen earthquakes. Open seismological data from the KNMI allowed citizens to show that their intuition, that the strength and frequency of quakes were increasing, was real. Or you can use it to explore the impact of certain things or policies, like analysing the usage statistics of the Utrecht bicycle parking locations. A key role open data can play is to provide context for your own questions. Core registers serve as infrastructure, key datasets on policy domains can be the source for your analysis. Or just a context or reference.
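To make the earthquake example a bit more concrete, here is a minimal sketch of the kind of check a citizen scientist could run on such a data set: count the quakes per year and note the strongest one. The records below are made-up placeholders, not KNMI measurements, and the function name is my own; real data would have to be downloaded first.

```python
from collections import defaultdict

# Made-up placeholder records as (year, magnitude); NOT real KNMI data.
events = [
    (2001, 1.2), (2001, 1.8),
    (2002, 1.5), (2002, 2.1), (2002, 1.1),
    (2003, 2.4), (2003, 1.9), (2003, 2.0), (2003, 3.0),
]

def yearly_summary(records):
    """Return {year: (number_of_quakes, strongest_magnitude)}, sorted by year."""
    grouped = defaultdict(list)
    for year, magnitude in records:
        grouped[year].append(magnitude)
    return {year: (len(mags), max(mags)) for year, mags in sorted(grouped.items())}

summary = yearly_summary(events)
for year, (count, strongest) in summary.items():
    print(year, count, strongest)
```

If both the count and the strongest magnitude rise year over year in the real data, that supports the intuition; a proper analysis would of course need more care than this.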

Here is a range of examples. The AHN gives you heights of everything, buildings, landscape etc.
But it also allows you to track growth of trees etc. Or estimate if your roof is suitable for solar panels. This in combination with the BAG and the TOP10NL makes the 3D image I started with possible. It is constructed from multiple data sources: it is not a photograph but a constructed image.

The Sentinel satellites provide you with free high resolution data. Useful for icebreakers at sea, precision agriculture, forest management globally, flooding prevention, health of plants, and even to see if grasslands have been damaged by feeding geese or mice. Gas mains maintainer Stedin uses this to plan preventative maintenance on the grid, by looking for soil subsidence. Same is true for dams, dikes and railroads. And that goes for many other subjects. The data is all there. Use it to your advantage. To map your measurements, to provide additional proof or context, to formulate better questions or hypotheses.

It can be used to build tools that create more insight. Here decision-making documents are tied to locations. 38 Amersfoort council issues are tied to De Koppel, the area we are in now. The same is true for many other subjects. The data is all there. Use it to your advantage. To map your measurements, to provide additional proof or context, to formulate better questions or hypotheses.

Maybe the data you need isn’t public yet. But it might be. So request it. It’s your right. Think about what data you need or that might be useful to you.
Be public about your data requests. Maybe we can form a Koppelting Data Team. Working with data can be hard and disappointing; doing it together goes some way to mitigate that.

[This post was created using a small hack to export the speaking notes from my slidedeck. Strangely enough, Keynote itself does not have such an option. Copying by hand takes time, by script it is just a single click. It took less than 10 minutes to clean up my notes a little bit, and then post the entire thing.]

Some links I thought worth reading the past few days

Jonathan Gray has published an article on Data Worlds, as a way to better understand and experiment with the consequences of the datafication of our lives. The article appeared in Krisis, an open access journal for contemporary philosophy, in its latest edition dealing with Data Activism.

Jonathan Gray writes

The notion of data worlds is intended to make space for thinking about data as more than simply a representational resource, and the politics of data as more than a matter of liberation and protection. It is intended to encourage exploration of the performative capacities of data infrastructures: what they do and could do differently, and how they are done and could be done differently. This includes consideration of, as Geoffrey Bowker puts it, “the ways in which our social, cultural and political values are braided into the wires, coded into the applications and built into the databases which are so much a part of our daily lives”

He describes 3 ‘data worlds’, and positions them as an instrument intended for practical usage.

The three aspects of data worlds which I examine below are not intended to be comprehensive, but illustrative of what is involved in data infrastructures, what they do, and how they are put to work. As I shall return to in the conclusion, this outline is intended to open up space for not only thinking about data differently, but also doing things with data differently. The test of these three aspects is therefore not only their analytical purchase, but also their practical utility.

Those 3 worlds mentioned are

  1. Data Worlds as Horizons of Intelligibility, where data plays a role in changing what is sayable, knowable, intelligible and experienceable, where data allows us to explore new perspectives, arrive at new insights or even new overall understanding. Hans Rosling’s work with Gapminder falls in this space, as do data visualisations that combine time and geography. To me this feels like approaching what John Thackara calls Macroscopes, where one finds a way to understand complete systems and one’s own place and role in them, and not just the position of oneself. (a posting on Macroscopes will be coming)
  2. Data Worlds as Collective Accomplishments, where consequences (political, social, economic) result from not just one or a limited number of actors, but from a wide variety of them. Open data ecosystems and the shifts in how civil society, citizens and governments interact, but also big data efforts by the tech industry are examples Gray cites. “Looking at data worlds as collective accomplishments includes recognising the role of actors whose contributions may otherwise be under-recognised.”
  3. Data Worlds as Transnational Coordination, in terms of networks, international institutions and norm setting, which aim to “shape the world through coordination of data”. In this context one can think of things like IATI, a civic initiative bringing standardisation and transparency to international aid globally, but also the GDPR through which the EU sets a new de-facto global standard on data protection.

This seems at first reading like a useful thinking tool in exploring the consequences and potential of various values- and ethics-related design choices.

(Disclosure: Jonathan Gray and I were both active in the early European open data community, and are co-authors of the first edition/iteration of the Open Data Handbook in 2010)

Last week saw the end of an era. The program manager for open data of the Flemish government retired. While parts of the work will go on, no direct successor will be named to the role. At the annual conference of Information Flanders (#tiv2017), Noël van Herreweghe, after 6 years of being the driving force behind Flanders’ open data team, said his goodbye during the opening plenary. His main and clearly heard message was that much is still to be done, and we’ve barely started on the path towards open by design. I hope the Flemish government and civil service will take this to heart. Now is not the time to reduce efforts, as the transition is only just in motion.


Noël telling us and the Flemish government to stay the course (Tweet and photo by @toon, Toon Vanagt)

In the past 6 years Flanders has taken several steps that I think the Netherlands should follow. Based on the underlying legal framework, the Flemish government has taken pre-emptive decisions for all government entities within their scope about the ways in which data can and should be published. It is no longer up to the individual agencies: if you decide to publish you must follow the established principles. In the Netherlands that is all still voluntary, and the principles are put forward as guidelines, not as must-follow rules. Similarly the Flemish government has adopted a URI strategy, using both machine and human readable URI conventions, which in the Netherlands is lacking.

It’s been a pleasure to work with Noël and his team in these past 6 years, whether it was in helping decide on which local and regional open data projects to fund from the Flemish government, translating research on the economic impact of open data to the Flemish and Belgian context, providing scenarios to the Flemish Chancellery for opening up Flemish consolidated laws and regulations as open data, or providing open data training together with Noël to a joint session of the Dutch and Belgian/Flemish supreme audit authorities.

For each of those 6 years my colleague Paul, representing the Dutch government open data team, and I participated in the Flemish open government days, and its successor the annual Information Flanders Meet-up. It gave us the opportunity to keep comparing Dutch and Flemish open data efforts, to learn from each other as well as laugh about the differences. A fixed feature on the agenda was eating a Portuguese fish soup the evening before the event in Brussels with Noël and his colleagues.

Portuguese Fish Soup Open data dag Vlaanderen
A ‘small bowl’ of fish soup, 2012 and 2015 editions

As Noël said, the work isn’t remotely done, and judging from the conversations we had with Noël last week, he isn’t likely to stop being active either. So I trust we will find ways of working together again in a different setting in the near future.

Last week ten of the twelve Dutch Provinces met at the South-Holland Provincial government to discuss open data, and exchange experiences, seeking to inspire each other to do more on open government data. I participated in my roles as open data project lead for both the Province of Overijssel and the Province of Fryslân.

There were several topics of discussion.

  • The National Open Government Action Plan (part of the OGP effort), a new version of which is due next spring, and for which input is currently sought by the Dutch government.
  • A proposal by the team behind the national open data platform to form a ‘high value data list’ for provincial data sets.
  • Several examples were discussed of (open) data being used to enhance public interaction.

I want to briefly show those examples (and might blog about the other two later).

Make it usable, connect to what is really of significance to people
Basically the three examples that were presented during the session present two lessons:

1) Make data usable, by presenting them better and allowing for more interaction. That way you more or less take up a position half-way between what is/was common (presenting only abstracted information), and open data (the raw detailed data): presenting data in a much more detailed way, and making it possible for others to interact with the data and explore.

2) Connect to what people really care about. It is easy to assume what others would want to know or would need in terms of data; it is less easy to actually go outside and first ask people and entrepreneurs what type of data they need around specific topics. However, it does provide lots of vital clues as to what data will actually find usage, and what type of questions people want to be able to solve for themselves.

That second point is something we always stress in our work with governments, so I was glad to hear it presented at the session.

There were three examples presented.

South-Holland put subsidies on a map
The Province of South-Holland made a map that shows where subsidies are provided and for what. It was made to better present to the public the data that exists about subsidies, also in order to stimulate people to dive deeper into the data. The map links to where the actual underlying data should be found (but as far as I can tell, the data isn’t actually provided there). A key part of the presentation was about the steps they took to make the data presentable in the first place, and how they created a path for doing that which can be re-used for other types of data they are seeking to house in their newly created data warehouse. This way presenting other data sources in similar ways will be less work.


The subsidy map

Gelderland provides insight into their audit-work
Provinces have a task in auditing municipal finances. The Province of Gelderland has used an existing tool (normally used for presenting statistical data) to provide more detail about the municipal finances they audited. Key point here again was to show how to present data better to the public, how that plays a role in communicating with municipalities as well, and how it provides stepping stones to entice people to dive deeper. The tool they use provides download links for the underlying data (although the way that is done can still be significantly improved, as it currently only allows downloads of selections you made, so you’d have to stitch them back together to reconstruct the full data set).



Screenshot of the Gelderland audit data tool
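Stitching such partial downloads back together does not have to be much work. Below is a minimal sketch, assuming each selection is downloaded as CSV with the same header row; I have not checked which formats the tool actually offers, and the column names and values are hypothetical placeholders.

```python
import csv
import io

def stitch_selections(csv_texts):
    """Merge CSV selections that share a header into one list of rows,
    skipping rows that already appeared in an earlier selection."""
    header, merged, seen = None, [], set()
    for text in csv_texts:
        reader = csv.reader(io.StringIO(text))
        file_header = next(reader)
        if header is None:
            header = file_header
        elif file_header != header:
            raise ValueError("selections do not share the same columns")
        for row in reader:
            key = tuple(row)
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return header, merged

# Hypothetical selections with overlapping rows (placeholder values).
selection_a = "municipality,year,amount\nA,2015,10\nB,2015,20\n"
selection_b = "municipality,year,amount\nB,2015,20\nC,2015,30\n"

header, rows = stitch_selections([selection_a, selection_b])
print(header)
print(rows)
```

Dropping exact duplicate rows matters here because overlapping selections would otherwise count the same municipality twice.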

Flevoland listens first, then publishes data
The last example presented was much less about the data, and much more about the ability to really engage with citizens, civil society and businesses and to stimulate the usage of open data that way. The Province of Flevoland is planning major renovation work on bridges and water locks in the coming years, and their aim is to reduce hindrance. Therefore, already now, before work starts, they are having conversations with various people that live near or regularly pass by the objects that will be renovated, to hear what type of data might help them keep the disruption of their normal routines to a minimum. Resulting insights are that where currently plans are published in a generic way, much more specific localized data is needed, as well as much more detailed data about what is going to happen in a few days’ time. This allows people to be flexible, such as a farmer deciding to harvest a day later, or to move the harvest away over water and not the road. Detailed data also means communicating small changes and delays in the plans. Choosing the right channels is important too. Currently e.g. the Province announces construction works on Twitter, but no local farmer goes there for information. They do use a specific platform for farmers where they also get detailed data about weather, water etc, and distributing localized data on construction works there would be much more useful. So now they will collaborate with that platform to reach farmers better. (My company The Green Land is supporting the Province, 2 municipalities and the water board in the province, in this project.)


Overview of the 16 bridges and waterlocks that will be renovated in the coming years


Various stakeholders around each bridge or waterlock are being approached