June is a good month for open data this year.

First, last week on Tuesday 4 June the Dutch Senate (Eerste Kamer) approved the law that implements the European open data directive in the Dutch Wet Hergebruik Overheidsinformatie (the law on re-use of government information). Although the law has not yet been published, and is therefore not yet in force, this brings three years of delay to an end. The law should have taken effect by July 2021: the European directive entered into force in July 2019 and gave Member States two years to transpose it into national law.

Second, last Sunday 9 June the obligation took effect for governments to actively publish important data on six themes through APIs. That European regulation was adopted at the end of 2022, entered into force in early February 2023, and gave governments 16 months, i.e. until Sunday, to comply. Member States must deliver their first report on implementation in February 2025, so I assume many countries will still use that period to meet the obligations. But a start has been made. In the Netherlands the impact of this High Value Data regulation is relatively small, because most of the data it covers was already open here. That was not always the case in other EU countries, however. So now you can compile datasets with Europe-wide coverage.

Today I gave a short presentation at the Citizen Science Koppelting conference in Amersfoort. Below is the transcript and the slide deck.

Using open data for citizen science, by Ton Zijlstra at Koppelting Mee Je Stad

I’ve worked on opening data, mainly with governments worldwide, for the past decade. I’ve been living in Amersfoort for 2 years now, and since moving here I’ve been a participant in the Measure Your City network, with a sensor kit. I also run a LoRaWAN gateway to provide additional infrastructure to people wanting to collect sensor data. Today I’d like to talk to you about using open data. What it is, what exists, where to find it, and how to get it. Because I think it can be a useful resource in citizen science.

What is open data? It is data that is published by whoever collected it in such a way that anyone is permitted to use it, without any legal, technical or financial barriers.

This means an open license, such as Creative Commons 0, open standards, and machine readable formats.
Anyone can publish open data, simply by making it available on the internet. And plenty of people, academics, and companies do. But mostly open data means we’re looking at government for data.

That’s because we all have a claim on our government, we are all stakeholders. We already paid for the data as well, so it’s all sunk costs, while making it available to all as infrastructure does not increase the costs a lot. And above all: governments have many different tasks, and therefore lots of different data. Usually over many years and at relatively good quality.

The legal framework for open data consists of two parts. The national access to information rules, in NL the WOB, which says everything government has is public, unless it is not.
And the EU initiated regulation on re-using, not just accessing, government material. That says everything that is public can be re-used, unless it can’t. Both these elements are passive, you need to request material.

A new law, the WOO, makes publication mandatory for more things. (For some parts publication is already mandated in laws, like in the WOB, the Cadastre law, and the Company Register)

Next to that there are other elements that play a role. Environmental data must be public (Aarhus convention), and INSPIRE makes it mandatory for all EU members to publish certain geographic data. A new EU directive is in the works, making it mandatory for more organisations to publish data, and for some key data sets to be free of charge (like the company register and meteo data).

Next to the legal framework there are active Dutch policies towards more open data: the Data Agenda and the Open Government action plan.

The reason open data is important is because it allows people to do new things, and more importantly it allows new people, who did not have that access before, to do new things. It democratises data sources, that were previously only available to a select few, often those big enough to be able to pay for access. This has now been a growing movement for 10-15 years.

That new agency has visible effects. Economically and socially. In fact you probably already use open data on a daily basis without noticing. When you came here today by bike, you probably checked Buienradar. Which is based on the open data of the KNMI. Whenever in Wikipedia you find additional facts in the right hand column, that information doesn’t come from Wikipedia but is often directly taken from government databases. The same is true for a lot of the images in Wikipedia, of monuments, historic events etc. They usually come from the open collections of national archives, etc.

When Google presents you with traffic density, like here the queues in front of the traffic lights on my way here, it’s not Google’s data. It’s government data, that is provided in near real-time from all the sensors in the roads. Google just taps into it, and anyone could do the same. You could do the same.

There are many big and small data sets that can be used for a new specific purpose. Like when you go to get gas for the car. You may have noticed at manned stations it takes a few seconds for the gas pump to start? That’s because they check your license plate against the make of the car, in the RDW’s open database. Or for small practical issues. Like when looking for a new house, how much sunshine does the garden get. Or can I wear shorts today (No!).

But more importantly for today’s discussion, it can be a powerful tool for citizen scientists as well. Such as in the public discussion about the Groningen earthquakes. Open seismological data from the KNMI allowed citizens to show that their intuition was right: the strength and frequency of quakes were increasing. Or you can use it to explore the impact of certain things or policies, like analysing the usage statistics of the Utrecht bicycle parking locations. A key role open data can play is to provide context for your own questions. Core registers serve as infrastructure, key datasets on policy domains can be the source for your analysis. Or just a context or reference.

Here is a range of examples. The AHN gives you heights of everything, buildings, landscape etc.
But it also allows you to track growth of trees etc. Or estimate if your roof is suitable for solar panels. This in combination with the BAG and the TOP10NL makes the 3d image I started with possible. It is constructed from multiple data sources: it is not a photograph but a constructed image.

The Sentinel satellites provide you with free high resolution data. Useful for icebreakers at sea, precision agriculture, forest management globally, flooding prevention, health of plants, and even to see if grasslands have been damaged by feeding geese or mice. Gas mains maintainer Stedin uses this to plan preventative maintenance on the grid, by looking for soil subsidence. Same is true for dams, dikes and railroads.

It can be used to build tools that create more insight. Here decision making docs are tied to locations. 38 Amersfoort council issues are tied to De Koppel, the area we are in now. The same is true for many other subjects. The data is all there. Use it to your advantage. To map your measurements, to provide additional proof or context, to formulate better questions or hypotheses.

Maybe the data you need isn’t public yet. But it might be. So request it. It’s your right. Think about what data you need or might be useful to you.
Be public about your data requests. Maybe we can form a Koppelting Data Team. Working with data can be hard and disappointing; doing it together goes some way to mitigate that.

[This post was created using a small hack to export the speaking notes from my slidedeck. Strangely enough, Keynote itself does not have such an option. Copying by hand takes time, by script it is just a single click. It took less than 10 minutes to clean up my notes a little bit, and then post the entire thing.]

Some links I thought worth reading the past few days

Jonathan Gray has published an article on Data Worlds, as a way to better understand and experiment with the consequences of the datafication of our lives. The article appeared in Krisis, an open access journal for contemporary philosophy, in its latest edition dealing with Data Activism.

Jonathan Gray writes

The notion of data worlds is intended to make space for thinking about data as more than simply a representational resource, and the politics of data as more than a matter of liberation and protection. It is intended to encourage exploration of the performative capacities of data infrastructures: what they do and could do differently, and how they are done and could be done differently. This includes consideration of, as Geoffrey Bowker puts it, “the ways in which our social, cultural and political values are braided into the wires, coded into the applications and built into the databases which are so much a part of our daily lives”

He describes 3 ‘data worlds’, and positions them as an instrument intended for practical usage.

The three aspects of data worlds which I examine below are not intended to be comprehensive, but illustrative of what is involved in data infrastructures, what they do, and how they are put to work. As I shall return to in the conclusion, this outline is intended to open up space for not only thinking about data differently, but also doing things with data differently. The test of these three aspects is therefore not only their analytical purchase, but also their practical utility.

Those 3 worlds mentioned are

  1. Data Worlds as Horizons of Intelligibility, where data plays a role in changing what is sayable, knowable, intelligible and experienceable, where data allows us to explore new perspectives, arrive at new insights or even new overall understanding. Hans Rosling’s work with Gapminder falls in this space, as do data visualisations that combine time and geography. To me this feels like approaching what John Thackara calls Macroscopes, where one finds a way to understand complete systems and one’s own place and role in them, and not just the position of oneself. (a posting on Macroscopes will be coming)
  2. Data Worlds as Collective Accomplishments, where consequences (political, social, economic) result from not just one or a limited number of actors, but from a wide variety of them. Open data ecosystems and the shifts in how civil society, citizens and governments interact, but also big data efforts by the tech industry are examples Gray cites. “Looking at data worlds as collective accomplishments includes recognising the role of actors whose contributions may otherwise be under-recognised.”
  3. Data Worlds as Transnational Coordination, in terms of networks, international institutions and norm setting, which aim to “shape the world through coordination of data”. In this context one can think of things like IATI, a civic initiative bringing standardisation and transparency to international aid globally, but also the GDPR through which the EU sets a new de facto global standard on data protection.

At first reading this seems like a useful thinking tool for exploring the consequences and potential of various design choices related to values and ethics.

(Disclosure: Jonathan Gray and I were both active in the early European open data community, and are co-authors of the first edition/iteration of the Open Data Handbook in 2010)

Last week saw the end of an era. The program manager for open data of the Flemish government retired. While parts of the work will go on, no direct successor will be named to the role. At the annual conference of Information Flanders (#tiv2017), Noël van Herreweghe, after 6 years of being the driving force behind Flanders’ open data team, said his goodbye during the opening plenary. His main and clearly heard message was that much is still to be done, and we’ve barely started on the path towards open by design. I hope the Flemish government and civil service will take this to heart. Now is not the time to reduce efforts, as the transition is only just in motion.


Noël telling us and the Flemish government to stay the course (Tweet and photo by @toon, Toon Vanagt)

In the past 6 years Flanders has taken several steps that I think the Netherlands should follow. Based on the underlying legal framework, the Flemish government has taken pre-emptive decisions, for all government entities within its scope, about the ways in which data can and should be published. It is no longer up to the individual agencies: if you decide to publish, you must follow the established principles. In the Netherlands that is all still voluntary, and the principles are put forward as guidelines, not as must-follow rules. Similarly the Flemish government has adopted a URI strategy, using both machine and human readable URI conventions, which in the Netherlands is lacking.

It’s been a pleasure to work with Noël and his team in these past 6 years, whether it was in helping decide on which local and regional open data projects to fund from the Flemish government, translating research on the economic impact of open data to the Flemish and Belgian context, providing scenarios to the Flemish Chancellery for opening up Flemish consolidated laws and regulations as open data, or providing open data training together with Noël to a joint session of the Dutch and Belgian/Flemish supreme audit authorities.

For each of those 6 years my colleague Paul, representing the Dutch government open data team, and I participated in the Flemish open government days, and its successor the annual Information Flanders Meet-up. It gave us the opportunity to keep comparing Dutch and Flemish open data efforts, to learn from each other as well as laugh about the differences. A fixed feature on the agenda was eating a Portuguese fish soup the evening before the event in Brussels with Noël and his colleagues.


A ‘small bowl’ of fish soup, 2012 and 2015 editions

As Noël said, the work isn’t remotely done, and judging from the conversations we had with Noël last week, he isn’t likely to stop being active either. So I trust we will find ways of working together again in a different setting in the near future.