This week the draft implementation act (PDF) and the annex listing the first batch of European High Value Data sets (PDF) have finally been published. In the first half of 2020 I was involved in preparatory research to advise on what data, spread across six predetermined themes, should be put on this mandatory list. It’s the first time open data policy makes the publication of certain data mandatory through an API. Until now European open data policy built upon the freedom of information measures of each EU Member State (MS), adding mandatory conditions to what MS published voluntarily and to how they respond to public data re-use requests. This new law arranges for the pro-active publication of certain data sets.
In the 2020 research I was responsible for the sections about earth observation, environmental, and meteorological data. We submitted our final report in September 2020, and since then there had been total silence w.r.t. the progress in negotiating the list with the MS and putting together the implementation act. I did learn last summer that at least the earth observation and environmental data would largely be included the way I suggested, when I got a sneak preview of the adaptation of the INSPIRE portal where such data is made available.
The Implementation Act
In the Open Data Directive there’s a provision that the European Commission can, through a separate implementation act, set mandatory open data requirements for data belonging to themes listed in the Directive’s Annex. At launch in 2019, 6 such themes were listed: Geo-information, statistics, mobility, company information, earth observation / environment, and meteorology.
The list of themes can also be amended, through another separate implementation act, and a process to determine the second set of themes is currently underway.
The draft implementation law (PDF) states that government-held datasets mentioned in its Annex must be published through APIs, under an open license such as Creative Commons Zero or Attribution (CC BY), or an equivalent or less restrictive license. Governments must publish the terms of use for such APIs, and these terms may not be used to discourage re-use. APIs must also be fully publicly documented, and a point of contact must be provided.
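To make those requirements a bit more concrete, here is a minimal sketch of how a data holder might check one catalogue record against them. The record fields, the license identifiers and the example URL are my own illustration and do not come from the draft act.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: identifiers for the licenses named above (CC0, CC BY).
# "Equivalent or less restrictive" licenses would need to be added by hand.
ACCEPTED_LICENSES = {"CC0-1.0", "CC-BY-4.0"}


@dataclass
class DatasetRecord:
    """Hypothetical catalogue record for one high value dataset."""
    title: str
    license_id: Optional[str] = None
    api_url: Optional[str] = None
    api_docs_url: Optional[str] = None
    terms_of_use_url: Optional[str] = None
    contact_point: Optional[str] = None


def missing_requirements(record: DatasetRecord) -> List[str]:
    """List the requirements described above that this record does not yet meet."""
    problems = []
    if record.license_id not in ACCEPTED_LICENSES:
        problems.append("open license")
    if not record.api_url:
        problems.append("API access")
    if not record.api_docs_url:
        problems.append("public API documentation")
    if not record.terms_of_use_url:
        problems.append("published terms of use")
    if not record.contact_point:
        problems.append("point of contact")
    return problems


# Example: a record that has a license and an API, but nothing else yet.
record = DatasetRecord(
    title="Inland waterways",
    license_id="CC0-1.0",
    api_url="https://example.org/api",  # placeholder URL
)
print(missing_requirements(record))
```

A check like this says nothing about whether the terms of use discourage re-use, which remains a matter of reading them.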
MS can temporarily exempt some of the high value datasets. Such a decision must be made public, and exemptions are limited to two years after entry into force of this implementation act. Additional usage restrictions are allowed for personal data within the data sets concerned, but only to the extent needed to protect personal data of individuals (so not as an excuse to disallow re-use and access to the data as a whole).
MS must report on their implementation actions every two years, listing the actual data sets opened, the links to licenses, APIs and documentation, and the exemptions still in place. The implementation act is immediately binding for all MS (no need to first transpose it into national law for it to be enforceable), will apply 20 days after publication in the Official Journal of the EU, and MS have 6 months to comply.
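As a quick way to see how these periods stack up, here is a small sketch that derives the key dates from a publication date. It assumes that both the 6 months to comply and the two year limit on exemptions are counted from the day the act starts to apply, which is my reading and not something the draft spells out here. The publication date in the example is a placeholder.

```python
import calendar
from datetime import date, timedelta
from typing import Dict


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def hvd_timeline(publication: date) -> Dict[str, date]:
    """Key dates under the assumption that all periods start when the act applies."""
    applies_from = publication + timedelta(days=20)
    return {
        "applies_from": applies_from,
        "comply_by": add_months(applies_from, 6),
        "exemptions_end": add_months(applies_from, 24),
    }


# Placeholder publication date, purely for illustration.
for name, day in hvd_timeline(date(2023, 1, 1)).items():
    print(name, day.isoformat())
```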
The Data Sets Per Theme
In this first batch of mandatory open data, 6 themes are covered (PDF). Some brief remarks on all of them.
Mobility
This is, contrary to what you’d expect, the smallest theme of the six covered. That is because everything already covered by the Intelligent Transport Systems (ITS) Directive is out of scope, which is almost everything concerning land based mobility. What remains for the High Value Data list is data on transport networks contained in the INSPIRE Annex I theme Transport Networks, static and dynamic data about inland waterways, and the electronic navigational charts (ENC) for inland waterways. This is much in line with the 2020 study report. National hydrographical services had some concern that ENCs for seas would be included (which would make it harder to force sea going vessels to use the latest version), but my reassurances that this was unlikely held true.
Geospatial data
Geospatial data is, I would say, the ‘original’ high value government data, and has been for centuries. The data sets from the four INSPIRE Annex I themes Administrative Units, Geographical Names, Addresses and Cadastral Parcels, as well as Buildings, are within scope. Additionally, reference parcels and agricultural parcels as described in Regulations 1306/2013 and 1307/2013 on the Common Agricultural Policy (CAP) are on the list.
Earth Observation and Environment
This was a theme I was responsible for in the 2020 study. It is an extremely broad category, covering a very wide spectrum of types of data. It was basically impossible to choose something from this list, not least because re-use value usually comes from combinations of data, not from any single source. Therefore my proposed solution was not to choose, and to advise treating it as a coherent whole needed to address the EU goals concerning environment/nature, climate adaptation, and pollution. The High Value Data list adopts this approach and puts 19 INSPIRE themes within scope. These are:
- Annex I: Hydrography, and Protected Sites
- Annex II in full: Elevation, Geology, Land Cover, and Ortho-imagery
- Annex III: Area management, Bio-geographical regions, Energy resources, Environmental monitoring facilities, Habitats and biotopes, Land use, Mineral resources, Natural risk zones, Oceanographic geographical features, Production and industrial facilities, Sea regions, Soil, and Species distribution
Additionally, all environmental information as covered by Directive 2003/4/EC on public access to such information is added to the list, as is all data originating in the context of a wide range of EU Regulations and Directives on air, climate, emissions, nature preservation and biodiversity, noise, waste and water. I miss soil in this environmental list, but perhaps the Annex III INSPIRE theme is seen as sufficiently covering it. I still need to follow up on the precise formulations w.r.t. data in the 31 additionally referenced regulations and directives.
What surprises me in the phrasing is that earth observation is defined here as including satellite based data. That is not surprising in terms of earth observation itself, but satellite data was specifically excluded from the scope of our 2020 study. First, because the EU level satellite data is already open. Second, because this list deals with data from MS, and not many MS have their own satellite data. When they do, it is usually the result of public-private collaborative investment, and such private investment may dry up if temporary exclusive access arrangements are no longer possible. Including it would therefore have led to considerable political objections. Perhaps the addition of space based data collection is for now sufficiently watered down by defining the INSPIRE themes as its scope, while at the same time future proofing the definition for when satellite data does become part of INSPIRE themes.
Together these first three, mobility, geospatial, and EO/environment, place a full 24 out of 34 INSPIRE themes on the list for mandatory open data. This basically amounts to adding an open data requirement to INSPIRE. It puts the focus of attention firmly on MS’ INSPIRE compliance, which is now often limited, and further positions INSPIRE as a key building block in the coming Green Deal dataspace. It will be of high interest to see what the coming new version of the INSPIRE directive, currently under review, makes of all that.
Statistics
This topic is covered more widely in the High Value Data list than it was in the 2020 study, both in the types of statistics included and in the demands made of those types of statistics. Still, there are lots of statistics that MS hold that aren’t included here (while some MS already publish most of their statistics, by the way): the selection is based on European reporting obligations that follow from a list of various European laws.
Topics for which statistics must be published as open data in a specified way:
- Industrial production
- Industrial producer price index, by activity
- Volume of sales by activity
- EU international trade in goods
- Tourism flows in Europe
- Harmonised consumer price indices
- National accounts: GDP, key indicators on corporations and households
- Government expenditure and revenue, government gross debt
- Population, fertility, mortality
- Current healthcare expenditure
- Poverty
- Inequality
- Employment, unemployment, potential labour force
Data for these reporting obligations should be available from the moment the law creating them came into force. That means for instance that healthcare expenditure should be available from at least 2008, whereas employment statistics must be available from at least 2019, because of the different years in which these laws were enacted.
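A tiny sketch of what that means in practice for a data holder checking their own coverage. The two start years are the ones mentioned above; the dictionary, its keys and the function name are just my illustration, and other topics would need their start years looked up in the annex.

```python
from typing import List

# Start years from the two examples in the text; other topics are left out on purpose.
FIRST_REQUIRED_YEAR = {
    "Current healthcare expenditure": 2008,
    "Employment, unemployment, potential labour force": 2019,
}


def required_years(topic: str, latest_year: int) -> List[int]:
    """Years for which statistics on this topic should be available as open data."""
    start = FIRST_REQUIRED_YEAR[topic]
    return list(range(start, latest_year + 1))


print(required_years("Employment, unemployment, potential labour force", 2022))
```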
Company information
Company information has from the start been the most controversial theme of the six covered by this implementation act. I assume this theme has also been the prime political reason for the long delay in the proposal being published. In my perception that is because this is the only data set that might actually end up challenging the status quo in society (as it involves ownership and power structures, and touches on tax evasion). In the 2020 study four aspects were considered: basic company information, company documents and accounts, ownership information, and insolvency status. Two ended up in the draft law: basic company information and company documents. Opening up ownership information, and that did not even mean ultimate beneficial ownership (UBO) information, drew vehement objections from the start (including from the Dutch government). Many stakeholders (including the NGO I chair) are disappointed with the current outcome. (Here’s an old blogpost where I explain UBO, and here’s SF writer Brin on what transparent UBO might mean to our societies.) The data that will become open data may still be 2 years in the future: the Open Data Directive allows a 2 year exemption, and I think this is the data where that exemption will be used.
That said, mandatory open company data and documents, even with the delay through exemptions, is already a step forward that puts an end to literally decades of court cases, obstruction, and lobbying for more openness. The very first PSI Directive in 2003 was already an expression of a broad demand for this data; now, 20 years on, it finally becomes mandatory across the EU. Some people I know have been after this for their entire professional careers and have already retired. It’s easy to lose sight of that win when we only focus on not having (ultimate) ownership data included.
Meteorological data
This is the other theme I was responsible for in the 2020 study. Like with company information this is an area where the discussion about making it available for re-use is decades old and precedes digitisation becoming ubiquitous. When I started my open data work in 2008, most of the existing documentation and argumentation for the value of and need for open data concerned meteorological data. A range of EU countries already have this as open data, others not at all. While progress has been made in the past decades, the High Value Data list provides a blanket obligation for all EU MS, a result that would otherwise still be a very long time away if entirely voluntary for the MS involved.
The data covered here includes all weather observation data, validated observations / climate data, radar data (useful for things like cloud heights, precipitation and wind), and numerical weather prediction data (these are the outputs of the combined models used for predictions).
The implementation act is up for public feedback until 21 June, but likely will retain its current form. I think it’s a pretty good result, and I am happy that I have been able to contribute to it.
A good summary of the EU HVD list. Thanks for that Ton! #OpenData
This week started with me feeling very uncomfortable and restless. Mostly because I had no overview of everything that is going on, while realising that I will be away for a week soon, and the week after that trip has multiple commitments that require focused preparation. Monday I used to get on top of everything again, and once that was done I felt much better. This week I:
Got to grips with everything going on again
Planned our summer holiday and booked a camp ground
Arranged lodging for a brief visit to Bremen, Germany, to attend the wedding of friends this summer
Sent in a proposal to talk about IndieWeb in WordPress at the Netherlands WordPress Camp in September
Sent in the speaker info for a conference on government information management mid June, where I’ll talk about the significance of the EU’s incoming data / digital legal framework
Finalised the retirement program for our employees, pending their agreement in a meeting mid June
Watched/attended a number of online sessions on personal knowledge management. Most I found not containing anything new to me but still took away a few interesting nudges / ideas. It was in itself a good exercise in processing talks for interesting things
Wrote a briefing for an illustrator to turn an existing image that I use to explain the coming EU dataspace in presentations into a standalone self-explanatory illustration
Saw the EU’s High Value Data list published, 20 months after we submitted our research underpinning that list. Did a first close reading to write up for clients, Dutch government dataholders, and a quick overview for my blog.
Prepared with a client team and their client a quarterly meet-up of all government geodata holders for mid June
Prepared a workshop on data use cases in the context of the Green Deal and EU dataspace, to be held during the week I’m away.
Did the monthly salary payments for our team
Discussed improving our client contract management
Had a very good conversation with Peter, the first of my ‘UnOffice’ conversations. Two other such conversations are planned.
Played a bit with the Hypothesis API, and ended up installing an existing Obsidian plugin to grab annotations. I still want to create something to grab annotations of my own blogposts.
Thursday was a national holiday, which we used for a day trip. Y had seen an image at school of a house standing on its roof, and by coincidence I came across a venue that has a house like that, along with other stuff on mind tricks and optical illusions. So we visited, and later enjoyed lunch outside on the shore of a small lake.
Friday school was closed, so we and Y used it to prepare for her birthday party on Saturday (and partly on Sunday)
Took Y to her swimming lesson, while E took our elderly cat to the vet. The swimming lesson was cut short as our cat had to be put down, so we hurried to the vet to say goodbye. This is the first time in over 20 years there’s no cat in our home.
Retrieved our car, once more repaired.
Had our family visiting for Y’s sixth birthday, which made for a fun weekend.
Unwrapping gifts with her grandparents, around the dining table. Notice the inline skates Y’s wearing: she unwrapped those at 7am, immediately put them on and kept them on, rolling around the living room all day, until she went to bed in the evening.
Thank you @ton_zylstra Nice overview. It seems that application developers, governments and citizens of the #EuropeanUnion can look forward to high value #OpenData (e.g. Address data) also from MS who have until now been rather reluctant. Only sorry that #uk will not benefit.
@ton forwarded it to the KNMI data platform team 😉
I’m posting this a week late, as I started travel last Sunday, before my usual writing time for these week notes.
Because of that travel it was a busy week, preparing material for several commitments the week after, because there won’t be time to prepare after my return.
This week I
Took a deep dive into the proposal for mandatory open environmental data within the EU High Value Data list, following up from last week’s first read through. It’s a broad list, and it does include (real time) measurements, not just static material. The list is, I think, neatly anchored in different existing reporting obligations countries have, not in a limiting way but stating that anything originating in the context of what those reporting obligations cover is in scope. It’s a better anchoring, I think, than what came out of the study I did for this in 2020, where it was connected to a list called the ‘priority data sets for e-reporting’. That is similar, but the current proposal refers back to the underlying law texts, which makes for a much sounder foundation.
Participated in a workshop for the European Commission to help define new themes that will be covered by the EU High Value Data obligations
Informed the Dutch government data holders involved in the EU High Value Data list of my interpretations and questions after reading the proposal.
Finally added the last apps to my new phone, finding the headspace for it as the need to transfer them was rising with travel on the horizon
Did a workshop on risk management for a project I’m involved in, and for the client the project is for.
Annotated a list of potential data use cases tied to the Green Deal, in preparation for a workshop during my absence
With E hosted Y’s party for her 6th birthday, ending with pancakes in the nearby pancake restaurant.
Wrote and submitted a proposal to a new client to explore the open data and data sharing aspects of a national database in which very diverse stakeholders, both public and private, pool data, and where part of that data is sensitive.
Did the monthly invoicing for my company
Had a discussion with a client about the strategic choices before them w.r.t. taking on work and roles in an EU context
On Friday morning booked a new plane ticket for my travel on Sunday, as I originally planned to travel with E’s brother S. He had to cancel, but as he booked both our tickets, cancelling his flight meant cancelling mine as well. Luckily, though not cheaply, I could still get a ticket.
Took Y to her swimming lesson
Flew to Dubrovnik in Croatia, and then traveled onwards by coach to the mountains of Montenegro, for the start of a trip with 70 others, celebrating the 30th anniversary of a student club I was a member of. As Schiphol airport had seen enormous waiting lines and times in the past days, meaning people missed their flights, I reactivated (having deactivated it at the start of the pandemic) my Amsterdam airport speed-through-all-the-controls-within-minutes membership card (it also allows you to bypass all passport controls through iris recognition), and took a very early taxi to the airport. I was early enough to breeze through everything and spent the remainder of my waiting time reading and making notes in a lounge. Just before dark I found myself above a skiing area in the north east of Montenegro.
My Sunday morning flight to Dubrovnik this week.
I wrote up a description when the proposal was first published: zylstra.org/blog/2022/05/a… Also there will be a list of newly proposed themes soon to extend the high value data list. Hope that helps. Will try to write an update now it’s final.
Thanks, @ton_zylstra
Starting in 2010 I have posted an annual ‘Tadaa’ list, a list of things that made me feel I had accomplished something that year. I started doing it in 2010 because I tend to forget things I did after completion. Like last year I didn’t feel much like writing this. It seemed a greyish year, passing in the shadow of the war that Russia wages on Ukraine. A year where Covid is still very much around us, yet things sort-of returned to normal. But for a different value of normal, a somewhat twisted normal, a parallel one. An appearance and pretense of normal perhaps more than an actual normal. An intransitive year almost, taking me from 2021 to 2023, but without object. Or maybe it’s because the last few months were extremely busy, pushing through more than being in the here and now, which sapped the colour from the months preceding it. Which is as good a reason as any to try and list the things that did bring a sense of accomplishment. I do have my day logs from the entire year, as well as kept up posting week notes here, so I can look back at what went on these past 12 months.
So here goes, in no particular order:
The European High Value Data list has become law in December. Two years ago I had a defining influence on the data it lists for earth observation, environment and meteorology. Even if the implementation period is 16 months and some datasets may get a temporary exemption for another two years, and even if it doesn’t go far enough (mostly on company information) to the taste of many, it is an important milestone. It draws the line under discussions about paywalls and exclusive access rights that were already old when I got involved in open data in 2009, in favor of mandatory pro-active publication for all to use freely. I’m glad I could translate my experience in this field into something now enshrined much more solidly in EU law.
We took regular breaks as a family. We started the year in Luzern, spent a week in Limburg in April, and spent three weeks in Bourgogne doing mostly nothing. We had weekend trips, to various museums for instance. One of the things E and I decided, while hanging out in front of our tent in the Bourgogne last summer, was to mark all school holidays in our own calendar in the coming year, to either take them off ourselves, or to keep them free of work appointments. I think it should be possible without impacting my output, but it will require careful planning.
I’ve kept an up-to-date guide on the incoming EU data legislation in Dutch for a client. It gets generated automatically and directly from my own working notes in Obsidian, which appeals to me in terms of nerdy workflow. It is heavily used by Dutch government data holders and regularly mentioned as a very useful resource, which speaks to its utility.
I enjoyed homecooking a few software tools. Early in the year I adapted my OPML booklists so they are generated directly from my own book notes. (Although the negative side effect has been that I did not blog about my reading at all, which I intend to change soon.) I particularly enjoyed enabling myself to post through Micropub to my various websites. Through it I can post from various sources bypassing the WordPress back-end, including directly from my local notes in Obsidian, and from my feedreader. Every time it feels like magic, despite the fact I wrote the scripts myself. I think that sense of magic stems from the reduction of friction it affords.
I helped the foundation I chair through an inconvenient period of administrative issues. Nothing serious in itself, but right at a moment where it did have consequences for the team, which I was able to cushion. We also extended the number of board members, laying a better foundation for the coming years.
The influx of many new users into the Fediverse spurred my involvement in the use and governance of Mastodon. I helped plan a governance structure for the largest Dutch instance, and intend to help out in the coming year as well. We’re building a non-profit legal entity around it, and secured initial funding for that from a source in line with that non-profit status. I enjoyed also kicking off some discussion within the Dutch forum for standards that prescribes the mandatory standards for the Dutch public sector.
I keynoted at BeGeo, the Belgian annual conference of the geo-information sector, at the invitation of the Belgian National Geographic Institute. It was fun to create the story line for it, and I enjoyed the sense of traveling and meeting with a professional community I’m normally not part of. It’s the type of thing I often did for years, and I noticed I miss it. Something to look out for in the coming months.
My company had a great year, apart from a hiccup after the summer, to the occasion of which the team rose fantastically. We grew despite that hiccup, adding two new team members in May and September, and signed an additional new hire in December. As of February we will be ten people. The work we’re doing is highly interesting, mostly around digital ethics and data governance, engaging new clients frequently. Our team is a great group of people, and I think we all take good care of each other. We completed the 11th year of my company, which I think is already an amazing run. For next year our portfolio is already mostly filled.
During the pandemic lock-downs in 2021 we hired cabins for all team members at a holiday park to work and hang out together for a week while maintaining social distancing advice. We realised we wanted to do that yearly regardless of pandemics, and did so again in 2022. It’s an important thing for both the social and professional dimensions of our company.
I took my homecooked projects as the starting point for a presentation at WordCamp Netherlands to plead for more general adoption of IndieWeb principles, specifically webmention and microformats in WordPress, which met with good responses and helped spur on at least one coder to finish and publish a plugin. I’m mostly a boundary spanner in these settings, at the edge of communities, in this case the WordPress community, and being able to bring a story and suggestions for change into a community from another context and see it getting a response is something I enjoy.
Seeing Y grow and thrive, in school, socially, reading, swimming, skating.
Decided to join my old fraternity on their 30th anniversary trip to Montenegro, and am glad I did. Montenegro is a beautiful and rugged country.
I’ve been writing in this space continuously for twenty years now. Even if my writing here in the past few months has been less frequent, an expression of how busy it was in other aspects of my life, blogging has been a constant and a key to creating new conversations, connections, ideas and experiments.
I explored new tools to integrate in my personal workflow, like annotating with Hypothes.is, using machine translation (DeepL) and AI text and image generators. I see this as a starting point for turning them into personal software tools in future months.
We spent some days around New Year in Switzerland, visiting dear friends. As years go by, such things become more important, never less. The simple fact of time passing means old friendships carry ever more context and meaning.
Ever onwards! (After having the first week of January off and spending it with the three of us that is.)
E and Y discussing artworks in the Rijksmuseum Twenthe. A great way to spend time together.