I'm pleased that at the end of September I will be speaking about open data at the Energie.Digital conference in Münster. It is a good opportunity to share our experiences with open data as an instrument for public administration and as a match that ignites the agency of very different stakeholders, especially with regard to the energy transition and municipal utilities (Stadtwerke), the themes of the conference. Also, the invitation came from Max, and Robert will be there as well.

Great initiative. My colleague @palinuro sometimes wears a #missingdata hoodie to get this discussed. A Dutch example, now solved MacGyver-like, is election results per candidate per polling station, which aren't collected or kept by the election council, only aggregates per municipality. See https://www.zylstra.org/blog/2019/05/missing-numbers-the-gaps-in-government-data/

A new weblog has been started by Anna Powell-Smith, called Missing Numbers:

Missing Numbers is a blog about the data that the government should collect and measure in the UK, but doesn’t.

I expect that whatever she finds in missing data within the UK public sector, similar or matching examples can be found in other countries, such as here in the Netherlands.

One such Dutch example is the election results per candidate per polling station. The election council (Kiesraad) that certifies election results only needs the aggregated results per municipality, and that is what it keeps track of. Local governments of course have this data immediately after counting the votes, but once they have provided the aggregates to the Kiesraad their role is finished.
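
To make concrete what gets lost in that aggregation step, here is a minimal Python sketch. All names and numbers in it are invented for illustration, and the function is a hypothetical stand-in, not the Kiesraad's actual process.

```python
from collections import defaultdict

# Hypothetical polling-station counts: (municipality, station, candidate, votes).
station_results = [
    ("Gemeente A", "Station 1", "Candidate X", 120),
    ("Gemeente A", "Station 1", "Candidate Y", 80),
    ("Gemeente A", "Station 2", "Candidate X", 40),
    ("Gemeente A", "Station 2", "Candidate Y", 160),
]


def aggregate_per_municipality(rows):
    """Sum votes per (municipality, candidate), dropping the station level."""
    totals = defaultdict(int)
    for municipality, _station, candidate, votes in rows:
        totals[(municipality, candidate)] += votes
    return dict(totals)


municipal_totals = aggregate_per_municipality(station_results)
print(municipal_totals)
```

In this made-up case Candidate X wins Station 1 and loses Station 2, but the municipal totals no longer show that. Once only the aggregate is kept, the per-station picture cannot be reconstructed from it.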

The Open State Foundation (disclosure: I’m its current chairman of the board) has in recent years worked towards ensuring results per polling station are available as open data. In the recent provincial and water authority elections the Minister for the Interior called upon municipalities to publish these results as machine-readable data. About 25% complied. The remaining data files were requested by the Open State Foundation in collaboration with national media to get to a complete data set. This way, for the first time, this data now exists as a national data set and is available to the public.

Viz of all polling station results of the recent elections by the Volkskrant national paper

Added Missing Numbers to my feedreader.

The Netherlands has the lushest and tastiest grass in the world according to discerning geese, and millions flock to Dutch fields because of it. Farmers would rather use the grass for their dairy cows, and don’t like the damage the geese cause to their fields. To reduce damage, geese are scared away and hunted, and their nests are spiked. Each year some 80.000 geese are shot in the Province South-Holland alone. The issue is that the Dutch don’t eat much wild goose, and hunters don’t like to hunt if they know the game won’t be eaten. The role of the provincial government in the case of these geese is to compensate farmers for damage to their fields.

20190414 005 Cadzand, Grote Canadese gans
“All your base belong to us…”, Canada geese in a Dutch field (photo Jac Janssen, CC-BY)

In our open data work with the Province South-Holland we’re looking for opportunities where data can be used to increase the agency of both the province itself and external stakeholders. Part of that is talking to those stakeholders to better understand their work, the things they struggle with, and how that relates to the policy aims of the province.

So a few days ago, my colleague Rik and I met up on a farm outside Leiden, in the midst of those grass fields that the geese love, with several hunters, a civil servant, and the CEO of Hollands Wild, which sells game meat to both restaurants and retail. We discussed the particular issues of hunting geese (and inspected some recently shot ones), the effort of dressing game, and the difficulties of cultivating demand for geese. Although a goose fetches a hunter just 25 cents, butchering geese is very labour-intensive and not automated, which means that consumable meat is very expensive. Too expensive for low end use (e.g. in pet food), and even for high end use, where it needs to compete with much more popular types of game, such as hare, venison and wild duck. We tasted some marinated raw goose meat and goose carpaccio. Data isn’t needed to improve communication between stakeholders on the production side (unless a market emerges for fresh game, in contrast to the current distribution of only frozen products), but might play a role in the distribution part of the supply chain.

Today with the little one I sought out a local shop that carries Hollands Wild’s products. I bought some goose meat, and tonight we enjoyed some cold smoked goose. One goose down, 79.999 to go.

Open Nederland has produced its first podcast. Sebastiaan ter Burg is the host, and Maarten Brinkerink did the production and music.

In the Open Nederland podcast, people who share knowledge and creativity to build a fair, accessible and innovative world have their say. This first episode is about openness in different domains, such as open government and open education, and how these connect to each other.

The guests in this episode are:

  • Wilma Haan, managing director of the Open State Foundation,
  • Jan-Bart de Vreede, domain manager for learning materials and metadata at Kennisnet, and
  • Maarten Zeinstra of Vereniging Open Nederland, and Chapter Lead of Creative Commons Nederland.

(full disclosure: I am both a board member of Open Nederland and chairman of the board of the Open State Foundation, whose CEO Wilma Haan takes part in this podcast.)

Two years ago a colleague let their dog swim in a lake without paying attention to the information signs. It turned out the water was infested with a type of algae that caused the dog irritation. Since then my colleague has thought it would be great if you could somehow subscribe to notifications of when the quality or status of some nearby surface water changes.

Recently this colleague took a look at the provincial external communications concerning swimming waters. A provincial government has specific public tasks in designating swimming waters and monitoring their quality. It turns out there are six (6) public information or data sources concerning swimming waters from the particular province my colleague lives in.

My colleague compared those 6 datasets on a number of criteria: factual correctness, comparability based on an administrative index or key, and comparability on spatial / geographic aspects. Factual correctness here means whether the right objects have been represented in the data sets. Are the names, geographic location, status (safe, caution, unsafe) correct? Are details such as available amenities represented correctly everywhere?

Als ze me missen, ben ik vissen
A lake (photo by facemepls, license CC-BY)

As it turns out each of the 6 public data sets contains a different number of objects. The 6 data sets cannot be connected based on a unique key or ID. Slightly more than half of the swimming waters can be correlated across the 6 data sets by name, but a spatial/geographic connection isn’t always possible. 30% of swimming waters have the wrong status (safe/caution/unsafe) on the provincial website! And 13% of swimming waters are wrongly represented geometrically, meaning they end up in completely wrong locations and even municipalities on the map.
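
As a rough illustration of what correlating by name looks like when there is no shared key, here is a minimal Python sketch. It is not my colleague's actual method. The data set labels and swimming water names are invented, only three data sets are used instead of six to keep it short, and `normalise`, `name_coverage` and `matched_everywhere` are hypothetical helper names.

```python
from collections import Counter


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivial spelling variants match."""
    return " ".join(name.lower().split())


def name_coverage(datasets: dict[str, list[str]]) -> dict[str, int]:
    """Count in how many data sets each normalised name occurs."""
    counts = Counter()
    for names in datasets.values():
        # Use a set so a name repeated within one data set counts once.
        counts.update({normalise(n) for n in names})
    return dict(counts)


def matched_everywhere(datasets: dict[str, list[str]]) -> list[str]:
    """Names that can be found in every data set."""
    coverage = name_coverage(datasets)
    return sorted(name for name, n in coverage.items() if n == len(datasets))


# Invented example data, standing in for the real provincial sources.
datasets = {
    "website": ["Zwemplas Noord", "Grote Plas", "Strandbad Oost"],
    "monitoring": ["zwemplas  noord", "Grote Plas"],
    "signage": ["Zwemplas Noord", "Grote plas", "Recreatieplas West"],
}

common = matched_everywhere(datasets)
share = len(common) / len(name_coverage(datasets))
print(common, f"{share:.0%}")
```

Name matching like this is fragile: it depends entirely on how forgiving the normalisation is, and says nothing about whether two entries with the same name refer to the same location. That is why a shared administrative key, or reliable geometry, matters.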

Every year at the start of the year the provincial government takes a decision which designates the public swimming waters. Yet the decision from this province cannot be found online (even though it was taken last February, and publication is mandatory). Only a draft decision can be found on the website of one of the municipalities concerned.

The differences between the 6 data sets more or less reflect the internal division of tasks within the province. Every department keeps its own files and its own data set. One is responsible for designating public swimming waters, another for monitoring swimming water quality. Yet another is responsible for making sure those swimming waters are represented in overall public planning / environmental plans, another for the placement and location of information signs about the water quality, and still another for placing that same information on the website of the province. Every unit has its own task and keeps its own data set for it.

Which ultimately means large inconsistencies internally, and a confusing mix of information being presented to the public.