The fate of anyone working to change something in how government works, or any larger organisation or system really, is that most often you’re not around to see the effects. Small course changes can take years to become noticeable shifts, and by that time no-one will remember where that started or who helped start it. So any glimpse you do get of how something played out over a longer time frame is a rare gift and compliment. Yesterday I was moved when I was given such a glimpse by a civil servant publicly singling me out in a session that I attended.

A little over nine years ago (my notes tell me), I reached out to a provincial civil servant to tell them I thought it a disgrace how they were treated by their director and that in my view they’d done everything right. They had received a public data request on a (to this day) politically highly sensitive topic. The province, not having the data, was the wrong ‘door’ for the request, so the civil servant did the right thing and connected the requester to the government entity that had the requested data. A director in the provincial organisation then reprimanded them for doing this, even though their actions were by the book. I had heard about this through my network and contacted the civil servant in question to tell them they did what was proper, and to show my support. They thanked me and said they had briefly considered quitting, but wouldn’t, and would continue to push open government data forward.

Yesterday I was in a session about the use of algorithms in public sector decision making, led by that same civil servant. One of the cases they mentioned was how a city government had used an algorithm but then internally had come to the conclusion it was discriminatory, stopped its use and documented the entire process transparently to learn from it. Recently they got vilified in the press (only possible because of their own documentation and transparency), and caught flak. The leader of the session described how useful that well documented case actually was for the entire community of civil servants working on how to responsibly use algorithms in public service, because of the issues it surfaced, the process it had followed, and so on. Another participant in the session was using the case as part of their PhD research into these topics.

Then the civil servant leading the session turned to me and said "Then I did what you did for me years ago. I reached out to those civil servants to show my support and tell them how valuable their work was for others and that they did the right thing. It always stayed with me that you contacted me to support me, how important that was for me, and now I paid it forward." When they said that I remembered it, but otherwise had forgotten it happened. It moved me to hear it, and it makes me grateful. Nine years ago I moved a small pebble in a river bed out of basic empathy, and yesterday I got to see how the river that is public sector culture and attitude w.r.t. openness runs a bit differently because of it. It’s a gem of a gift to hear what it meant for the civil servant involved.

Thank you, W., for paying it forward, and for letting me know. It means a lot to me.

In October 2017 Maltese investigative journalist Daphne Caruana Galizia was assassinated with an explosive planted in her car. She worked on exposing financial and political corruption, and worked on the Panama Papers. In October last year the murderers were convicted, while the court case against the political and business principal is still ongoing. Four months after her assassination Slovak journalist Ján Kuciak was gunned down alongside his life partner Martina Kušnírová, possibly after his name leaked from freedom of information requests. Kuciak was also involved in reporting on the Panama Papers. In that case the gunmen have been convicted, whereas, similarly to Caruana Galizia’s case, the court case against the principal, also a politically connected businessman, is still ongoing. That is to say, both were murdered by the klept, to use William Gibson’s label, for their work on transparency. As someone who has worked on government transparency for some 15 years from a different angle, I always find some overlaps between my (net)work and European investigative journalists, the organisations they work with and the projects they work on.

A large part of this week I was in Valletta, Malta, for the EuroGeographics General Assembly. Tuesday afternoon I strolled around town, and made sure to visit the impromptu memorial for Daphne Caruana Galizia that is on Republic Street in front of the courts of justice. I did so because the case is still ongoing, and the corruption therefore still in place, and because upon arrival the very first Maltese newspaper I saw on Sunday evening carried a headline about the assassination, still in the spotlight five and a half years after the fact.


The improvised memorial for Daphne Caruana Galizia, calling for justice to be done almost 6 years on. Located on the square in front of St. John’s Cathedral along Republic Street, facing the front doors of the Courts of Justice. (Photo by me)


The Sunday Malta Times of 19 March 2023, with a headline connected to the 2017 assassination of Daphne Caruana Galizia.

A few weeks ago I had an hour-long conversation with Bart Ensink of Little Rocket about my work and my company The Green Land. That conversation is now available as the sixth episode of the Datadriftig podcast. We ‘ran into each other’ while interacting in a thread on Mastodon in December. Little Rocket is an e-business company that makes data more usable for its business clients. It is based in Enschede, so I paid a visit to the city where I lived until 6 years ago, and Enschede and the University of Twente came up repeatedly in the conversation.

Today I heard the EU High Value Data list in its first iteration is finally decided upon. In September 2020 we submitted our advice on what data to include in the thematic areas of geographic data, statistics, mobility, company information, meteorology, and earth observation and environment. Last week the Member States submitted their final yes/no vote, and the final text was approved. The EC will now finalise the text for publication, and it should be published before the end of the year. It will enter into force 20 days after publication, and government data holders then have 16 months, until April/May 2024, to ensure compliance. It’s been a long path, and this first list could have been better concerning company information. Yet, when it comes to geographic data (addresses, buildings, land parcels, topography), meteorology and that same company information, it draws a line under two decades of discussion, court cases and studies to help dismantle the revenue model of charging at the point of use. Such charges are a threshold to market entry, and are generally lower than the tax revenue otherwise gained from the activities they are a threshold to.
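
The timeline above (20 days from publication to entry into force, then 16 months to comply) is simple date arithmetic, and a minimal Python sketch makes it concrete. The publication date used here is a made-up placeholder, since the actual date wasn’t known yet; the function names are mine, and the way days and months are counted is a simplifying assumption, not the legal reckoning of such terms.

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Return the date `months` calendar months after `d`.

    The day of the month is clamped when the target month is shorter
    (e.g. 31 October + 4 months gives the last day of February).
    """
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def hvd_deadlines(publication: date, grace_months: int = 16) -> tuple[date, date]:
    """Return (entry into force, compliance deadline) for a publication date.

    Assumes entry into force exactly 20 days after publication and a
    compliance period counted in whole calendar months after that.
    """
    entry_into_force = publication + timedelta(days=20)
    compliance = add_months(entry_into_force, grace_months)
    return entry_into_force, compliance


# Hypothetical publication date, chosen only to be "before the end of the year".
entry, deadline = hvd_deadlines(date(2022, 12, 1))
print(f"Entry into force: {entry.isoformat()}, compliance by: {deadline.isoformat()}")
```

With a publication date in December 2022 the compliance deadline lands in April 2024, which matches the April/May 2024 window mentioned above.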

It’s easy to just move ahead and think about how this is not enough, what still needs doing, how to implement this etc. But it’s good to acknowledge that when I first started working on open government data in 2008 I heard the stories of those who had been at it for many years since well before the first PSI Directive was agreed in 2003. Some of those people have by now been retired for quite some time already, and I worked on it standing on their shoulders. The implementation act for EU high value data sets is a big step, even if in the field we thought it a no-brainer for decades already.

Today a colleague at the Netherlands Space Office showed me a new Copernicus service, the ground motion service (EGMS). Quite an amazing data service to explore. Earlier I wrote about the European forest fire information service (EFFIS), and its use as a proxy for the fighting going on due to the Russian invasion of Ukraine. EGMS is another service based on satellite remote sensing, here radar telemetry tracking the subsidence or rising of the ground. As far as I understand it can’t ‘see’ soft materials (peat land subsiding e.g.), only sees hard materials (solid ground, or buildings on softer grounds).
The images are quite amazing, and the data is provided right alongside it.

First an overview of northern Europe. Blue is rising ground, red is sinking ground. Sweden and Finland show rising ground: this is still the bounce back of the earth after the last ice age ended and the tremendous weight of the glaciers was removed. At the tip of the arrow you see subsiding ground, which is the result of gas extraction in Groningen province.

Zooming in on Groningen province, here’s the data for a single house, subsiding 4 centimeters in the past 6 years. No wonder many homes are getting damaged in that area, both from subsidence and from the earthquakes that accompany it.

For comparison, here’s the data from the street I live on. It shows a subsidence of 6 millimeters in the past 6 years.

And here’s the same data as in the graph in the image above, but exported from the Copernicus services as an SVG, and pasted here as text.

[Graph: displacement (mm) against measurement date; vertical axis from -14 to 14 mm, date ticks from 20160111 to 20201010. Series: ORTHO Vertical: 20dXRnBSzz]

Dataset: ORTHO Vertical
Point ID: 20dXRnBSzz
Position: 3242050.00 N 4007550.00 E -0.60 m
Mean velocity: -1.10 mm/year
RMSE: 0.40 mm
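
To relate the numbers above to each other, here is a small arithmetic sketch in Python. It assumes the ground moves at a constant rate over the 6 years, which is only an approximation of the measured time series, and it uses only figures mentioned in this post: 4 centimeters in 6 years for the house in Groningen, and the mean velocity of -1.10 mm/year from the export for my street. The function names are mine.

```python
def mean_velocity_mm_per_year(displacement_mm: float, years: float) -> float:
    """Average vertical velocity, assuming constant motion over the period."""
    return displacement_mm / years


def displacement_mm(velocity_mm_per_year: float, years: float) -> float:
    """Total vertical displacement, assuming constant velocity over the period."""
    return velocity_mm_per_year * years


YEARS = 6.0

# Groningen house: 4 cm (40 mm) of subsidence in 6 years; negative means sinking.
groningen_velocity = mean_velocity_mm_per_year(-40.0, YEARS)

# My street: mean velocity of -1.10 mm/year taken from the exported data.
my_street_velocity = -1.10
my_street_displacement = displacement_mm(my_street_velocity, YEARS)

# How many times faster the Groningen house sinks compared to my street.
ratio = groningen_velocity / my_street_velocity

print(f"Groningen house: {groningen_velocity:.2f} mm/year")
print(f"My street over {YEARS:.0f} years: {my_street_displacement:.1f} mm")
print(f"Ratio: {ratio:.1f}x")
```

Under that constant-rate assumption the mean velocity for my street is in line with the roughly 6 millimeters read from the graph, and the house in Groningen sinks about six times as fast.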

This week the draft implementation act (PDF) and the annex listing the first batch of European High Value Data sets (PDF) have finally been published. In the first half of 2020 I was involved in preparatory research to advise on what data, spread across six predetermined themes, should be put on this mandatory list. It’s the first time open data policy makes the publication of certain data mandatory through an API. Until now European open data policy built upon the freedom of information measures of each EU Member State (MS), and added mandatory conditions to what MS published voluntarily, and to how to respond to public data re-use requests. This new law arranges for the pro-active publication of certain data sets.

In the 2020 research I was responsible for the sections about earth observation, environmental, and meteorological data. We submitted our final report in September 2020, and since then there had been total silence w.r.t. the progress in negotiating the list with the MS, and putting together the implementation act. I knew that at least the earth observation and environmental data would largely be included the way I suggested, when last summer I got a sneak preview of the adaptation of the INSPIRE portal where such data is made available.

The Implementation Act

In the Open Data Directive there’s a provision that the European Commission can, through a separate implementation act, set mandatory open data requirements for data belonging to themes listed in the Directive’s Annex. At launch in 2019, 6 such themes were listed: Geo-information, statistics, mobility, company information, earth observation / environment, and meteorology.
The list of themes can also be amended, through another separate implementation act, and a process to determine the second set of themes is currently underway.

The draft implementation law (PDF) states that government-held datasets mentioned in its Annex must be published through APIs, under an open license such as Creative Commons Zero (CC0), Creative Commons Attribution (CC BY), or an equivalent or less restrictive license. Governments must publish the terms of use for such APIs and these terms may not be used to discourage re-use. APIs must also be fully publicly documented, and a point of contact must be provided.
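
To make those requirements tangible, here is a minimal sketch in Python of how a data holder might check a dataset description against them. Everything in it is my own illustration: the record structure, the field names and the function are hypothetical and not taken from the implementation act, and the license check only recognises the two licenses named explicitly, leaving ‘equivalent or less restrictive’ licenses to human judgement.

```python
from dataclasses import dataclass

# Licenses the draft names explicitly. Whether another license counts as
# "equivalent or less restrictive" needs a case-by-case judgement that this
# sketch does not attempt.
NAMED_OPEN_LICENSES = {"CC0", "CC BY"}


@dataclass
class DatasetRecord:
    """Hypothetical description of one high value dataset (illustrative fields)."""
    name: str
    api_url: str = ""
    license: str = ""
    terms_of_use_url: str = ""
    api_documentation_url: str = ""
    contact_point: str = ""


def missing_requirements(record: DatasetRecord) -> list[str]:
    """List the requirements described above that the record does not yet meet."""
    gaps = []
    if not record.api_url:
        gaps.append("published through an API")
    if record.license not in NAMED_OPEN_LICENSES:
        gaps.append("open license (CC0, CC BY or equivalent)")
    if not record.terms_of_use_url:
        gaps.append("published API terms of use")
    if not record.api_documentation_url:
        gaps.append("public API documentation")
    if not record.contact_point:
        gaps.append("point of contact")
    return gaps


# Example with made-up values: an API and license exist, the rest is missing.
example = DatasetRecord(
    name="Addresses",
    api_url="https://example.org/api/addresses",
    license="CC0",
)
for gap in missing_requirements(example):
    print(f"{example.name}: still missing {gap}")
```

A check like this can’t establish whether terms of use discourage re-use; that part of the requirement remains a matter of reading the terms.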

MS can temporarily exempt some of the high value datasets, a decision that must be made public, but only for up to two years after entry into force of this implementation act. Additional usage restrictions are allowed for personal data within the data sets concerned, but only to the extent needed to protect personal data of individuals (so not as an excuse to disallow re-use and access to the data as a whole).

MS must report on their implementation actions every two years, listing the actual data sets opened, the links to licenses, APIs and documentation, and the exemptions still in place. The implementation act is immediately binding for all MS (no need to first transpose it into national law for it to be enforceable), will apply 20 days after publication in the Official Journal of the EU, and MS have 6 months to comply.

The Data Sets Per Theme

In this first batch of mandatory open data, 6 themes are covered (PDF). Some brief remarks on all of them.

Mobility

This is, contrary to what you’d expect, the smallest theme of the six covered, because everything that is already covered in the Intelligent Transport Systems (ITS) Directive is out of scope, and that is most of what concerns land based mobility. What remains for the High Value Data list is data on transport networks contained in the INSPIRE Annex I theme Transport Networks, and static and dynamic data about inland waterways, as well as the electronic navigational charts (ENC) for inland waterways. This is much in line with the 2020 study report. There was some concern with national hydrographical services about ENCs for seas being included (making it harder to force sea going vessels to use the latest version), but my reassurances that this would be unlikely held true.

Geospatial data

Geospatial data is, I would say, the ‘original’ high value government data, and has been for centuries. The data sets from the four INSPIRE Annex I themes Administrative Units, Geographical Names, Addresses and Cadastral Parcels are within scope, as is the Annex III theme Buildings. Additionally reference parcels and agricultural parcels as described in the 1306/2013 and 1307/2013 Regulations on the Common Agricultural Policy (CAP) are on the list.

Earth Observation and Environment

This was a theme I was responsible for in the 2020 study. It is an extremely broad category, covering a very wide spectrum of types of data. It was basically impossible to choose something from this list, not least because re-use value usually comes from combinations of data, not from any single source used. Therefore my proposed solution was to not choose, and advise to treat it as a coherent whole needed in addressing the EU goals concerning environment/nature, climate adaptation, and pollution. The High Value Data list adopts this approach and puts 19 INSPIRE themes within scope. These are:

  • Annex I: Hydrography, and Protected Sites
  • Annex II in full: Elevation, Geology, Land Cover, and Ortho-imagery
  • Annex III: Area management, Bio-geographical regions, Energy resources, Environmental monitoring facilities, Habitats and biotopes, Land use, Mineral resources, Natural risk zones, Oceanographic geographical features, Production and industrial facilities, Sea regions, Soil, and Species distribution

Additionally all environmental information as covered by the 2003/4/EC Directive on public access to such information is added to the list, and all data originating in the context of a wide range of EU Regulations and Directives on air, climate, emissions, nature preservation and biodiversity, noise, waste and water. I miss soil in this environmental list, but perhaps the Annex III INSPIRE theme is seen as sufficiently covering it. I still need to follow up on the precise formulations w.r.t. data in 31 additionally referenced regulations and directives.

What to me is a surprising phrasing is that earth observation is defined here including satellite based data. Not surprising in terms of earth observation itself, but because satellite data was specifically excluded from the scope of our 2020 study. First because the EU level satellite data is already open. Second because this list deals with data from MS, and not many MS have their own satellite data. When they do it is usually the result of public private collaborative investment, and such private investment may dry up if there are no longer temporary exclusive access arrangements possible, which would have resulted in considerable political objections. Perhaps adding space based data collection is currently being well enough watered down by defining the INSPIRE themes as its scope, while at the same time future proofing the definition for when satellite data does become part of INSPIRE themes.

Together these first three, mobility, geospatial, and EO/environment, place a full 24 out of 34 INSPIRE themes on the list for mandatory open data. This basically amounts to adding an open data requirement to INSPIRE. It places MS’ INSPIRE compliance very much in the focus of attention, which now often is limited, and further positions INSPIRE as a key building block in the coming Green Deal dataspace. It will be of high interest to see what the coming new version of the INSPIRE directive, currently under review, makes of all that.

Statistics

This topic is covered more widely in the High Value Data list than it was in the 2020 study, both in the types of statistics included and in the demands made of those types of statistics. Still, there are lots of statistics that MS hold that aren’t included here (while some MS do publish most of their statistics already, btw): the selection is based on European reporting obligations that follow from a list of various European laws.
Topics for which statistics must be published as open data in a specified way:

  • Industrial production
  • Industrial producer price index, by activity
  • Volume of sales by activity
  • EU international trade in goods
  • Tourism flows in Europe
  • Harmonised consumer price indices
  • National accounts: GDP, key indicators on corporations and households
  • Government expenditure and revenue, government gross debt
  • Population, fertility, mortality
  • Current healthcare expenditure
  • Poverty
  • Inequality
  • Employment, unemployment, potential labour force

Data for these reporting obligations should be available from the moment the law creating them has been in force. That means for instance that healthcare expenditure should be available from at least 2008, whereas employment statistics must be available from at least 2019, because of the different years in which these laws were enacted.

Company information

Company information has from the start been the most controversial theme of the six covered by this implementation act. I assume this theme has also been the prime political reason for the long delay in the proposal being published, in my perception because this is the only data set that actually might end up challenging the status quo in society (as it involves ownership and power structures, and touches tax evasion). In the 2020 study four aspects were considered: basic company information, company documents and accounts, ownership information, and insolvency status. Two ended up in the draft law: basic company information and company documents. Opening up ownership information, and that was not even the ultimate beneficial ownership (UBO) information, drew vehement objections from the start (including from the Dutch government). Many stakeholders (including the NGO I chair) are disappointed with the current outcome. (Here’s an old blogpost where I explain UBO, and here’s SF writer Brin on what transparent UBO might mean to our societies.) The data that will become open data may still be 2 years away: the Open Data Directive allows a 2 year exemption, and this is the data where that exemption will be used I think.
That said, mandatory open company data and documents, even with the delay through exemptions, is already a step forward that puts an end to literally decades of court cases, obstruction, and lobbying for more openness. The very first PSI Directive in 2003 was already an expression of a broad demand for this data; now, 20 years on, it finally becomes mandatory across the EU. Some people I know have been after this for their entire professional careers and have already retired. It’s easy to lose sight of that win when we only focus on not having (ultimate) ownership data included.

Meteorological data

This is the other theme I was responsible for in the 2020 study. As with company information, this is an area where the discussion about making it available for re-use is decades old and precedes digitisation becoming ubiquitous. When I started my open data work in 2008, most of the existing documentation and argumentation for the value of and need for open data concerned meteorological data. A range of EU countries already have this as open data, others not at all. While progress has been made in the past decades, the High Value Data list provides a blanket obligation for all EU MS, a result that would otherwise still be a very long time away if left entirely voluntary for the MS involved.
The data in scope includes all weather observation data, validated observations / climate data, radar data (useful for things like cloud heights, precipitation and wind), and numerical weather prediction data (these are the outputs of the combined models used for predictions).

The implementation act is up for public feedback until 21 June, but likely will retain its current form. I think it’s a pretty good result, and I am happy that I have been able to contribute to it.