Last year I spent a large amount of time participating in the study that provided advice on which government data sets to include in the mandatory list that is part of the Open Data Directive.

The Open Data Directive, which should have been transposed by EU Member States into national law by last July, but still mostly isn’t, provides a role to the EC to maintain a list of ‘high value data sets’ that all EU countries must make freely available for re-use through APIs and bulk download. This is the first time that it becomes mandatory to pro-actively publish certain data as open government data. Until now, there were mandatory ways to provide open data upon request, but the pro-active publication of such open data has always been voluntary (with various countries making a wide variety of voluntary efforts btw). Also the availability of government data builds on the national freedom of information framework, so the actual availability of a certain data set depends on different legal considerations in different places. The high value data list is the first pan-EU legal requirement that is equal in all EU Member States.

I was part of a team that provided a study into which data sets should appear on that high value data list. The first iteration of this list (to be extended and amended periodically) by the EC covers six thematic areas: geographic data, statistics, mobility data, company information, earth observation and environment, and meteorology. I was responsible for the sections on earth observation and environment, and meteorology, and I’m eager to see how it has been translated into the implementation act, as for both those thematic areas it would mean a very significant jump in open data availability if the study results get adopted. We submitted our final report by September 2020, and in the year since then we have all been waiting to see how the implementation act for the high value data will turn out. Our study is only one part of that, as it is itself an input for the EC’s impact assessment for different choices and options, which in turn forms the basis for a negotiation process that includes all Member States.

Originally the implementation act was expected to be published last December together with other EC proposals, such as the Data Governance Act, because the EU High Value Data list is part of a wider newly emerging EU legal framework on digitisation and data. But nothing much has happened until now. First the expectation was Q1, then by the summer, then shortly after the summer, and now the latest I hear from within the EC is ‘hopefully by the end of the year’.

It all depends on political will at this stage, it seems, to move the dossier forward. The obstacle to getting the implementing act done apparently is what to do with company data (and ultimate beneficial ownership data). Opening up company registers has clear socio-economic benefits, outweighing the costs of opening them up. There are privacy aspects to consider, which can be dealt with well enough I think, but were not part of our study as it only considered socio-economic impacts and expected transition costs, and the demarcation between the Open Data Directive and the GDPR was placed outside our scope.

There apparently is significant political pressure to limit the openness of such company registers. There must be similarly significant political pressure to move to more openness, or the discussion would already have been resolved. It sounds to me that the Netherlands is one of those politically blocking progress towards more openness. Even before our study commenced I heard rumours that certain wealthy families had the ear of the Dutch prime minister to prevent general free access to this data, and that the PM seemed to agree up to the point of being willing to risk infringement proceedings for not transposing the Open Data Directive completely. As it stands, the transposition of the Open Data Directive into national law mostly hasn’t happened, and the implementing act for high value data hasn’t been proposed at all yet.

Access Info, the Madrid based European NGO promoting the right to access to information, has in June requested documents concerning the high value data list from the EC, including the study report. The report hasn’t been published by the EC themselves yet because it needs to be published together with the EC’s impact assessment for which the study is an input, and alongside the implementation act itself.
Access Info has received documents, and has published the 400+ page study report (PDF) that was submitted a year ago, alongside a critical take on the company register data issue.

I am pleased the study is now out there informally at least, so I can point people to sections of it in my current work discussions on how to accommodate the new EU legal framework w.r.t. data, of which the high value open data is a part. Previously there were only publicly available slides from the last workshop that was part of the study, which by necessity held only general information about the study results. Next to company registers, which I am assuming is the roadblock, there is much in the study that also is of importance and now equally suffering under the delays. I hope the formal publication of the report will follow soon. The publication of the implementing act is a key step for European open data given its first ever EU-wide mandates.

A 2014 workshop, Virk Data Dag, at the Danish Business Authority discussing use cases for the open Danish company register, where I presented and participated.

Finally, a declaration of interests is in order for this posting I think:

  • My company was part of the consortium that did the mentioned study. I led the efforts on earth observation and environmental data, and on meteorological data.
  • In current work for my company, the implementation act for high value data, and other recent EC legal proposals are of importance, as I am helping translate their impact and potential to Dutch national (open) data infrastructure and facilitating data re-use for public issues.
  • I am a voluntary board member of the NGO Open State Foundation. OSF advocates full openness of company registers, and co-signed the critical take Access Info published. The board has no influence on day to day actions, which are the responsibility of the NGO’s director and team.
  • I am personally in favor of opening up company registers as open data.
    • I think that privacy issues can be readily addressed (something that is directly relevant to me as a sole trader business, as co-owner of an incorporated business, and as a board member of an association for which my home address is currently visible in the company register)
    • I think that being visible as a business owner or decision maker is part of the social contract I entered into in exchange for being personally shielded from the business risks I am exposed to. Society is partially shielding me from those risks, as it allows social benefits to emerge (such as creating an income for our team), but in turn people need to be able to see who they’re dealing with.
    • I think there is never a business case where charging fees for access to a monopolistic government database such as company registries makes sense. Such fees merely limit access to those able to afford it, causing inequality of access, and power and information asymmetries. Data collection for public tasks is a sunk cost by definition, and access fees are always less over time than the additional tax revenue and social value resulting from re-use of freely available data. The only relevant financial aspect to address is that provision costs accrue with the dataholder and benefits with the treasury, for which general budget financing is the remedy.
    • I think that already open company registers in Europe and elsewhere provide ample evidence that many formulated fears w.r.t. such openness don’t become reality.

8 reactions on “EU High Value Data Study Informally Published”

  1. This week felt a bit sluggish, but was otherwise ok. The sluggishness was caused by participating in a client’s group event on Monday afternoon and evening and Tuesday all day. With the pandemic measures relaxed as of last week, this event was face to face. This turned out to be remarkably intensive for me, clearly I had to get used again to having a room full of people and especially their conversations around me. After Tuesday I much needed time to rest which left limited time for other things.
    This week I

    spent two days on a client’s all hands meeting
    learned that I will be a speaker at the FOSS for Geo Netherlands conference in a few weeks
    worked on a proposal for a client to submit to their client
    had planned to attend an in-person event of the Medinge Group, of which I used to be a member for a decade, but in the end decided against it based on a slightly runny nose and the intensity of the event at the start of the week
    got lots of positive feedback on my work on the EU legal framework for data, including an additional small assignment to match the new framework against a specific pilot project on data architecture
    had the usual weekly client meetings
    got asked to participate in a podcast session on note taking tools
    checked with the EC on the progress w.r.t. the EU High Value Data list, which turned out to be currently stalled
    had an interesting conversation about knowledge management with a client that saw significant growth in their team. This led me to resurfacing some projects I did 2000-2010 which I feel still have meaningful lessons to be taken from them, and writing notes about them
    talked about non-fiction book writing with E, which she suggested might be a useful thing for both herself and me. I find I feel a lot of resistance against the notion. Upon closer inspection this is based on a) mistrusting the (PR-)objectives of a lot of authors of non-fiction books, b) the notion that many such books are anecdote padded ideas that themselves would fit on a few pages, ideas usually more intuited than researched, and c) not seeing a topic that shouldn’t also belong on the heap of books that needn’t be written. At the same time I buy loads of non-fiction books, so there’s some paradox there to further explore.
    worked on a series of 4 sessions and workshops for a Dutch Ministry later this month
    had a conversation with colleagues at the World Bank about a potential assignment for digital transformation work in Jordan
    worked on the 2020 bookkeeping in preparation for doing the 2020 tax returns for both E and me
    did some preparation for a session on (open) data for anti-corruption efforts, which I will be presenting next week
    did some work in the garden, removing summer items such as parasols, and preparing for the next season
    gave most of our apple harvest away to four neighbours, keeping enough for baking an apple pie for ourselves during the weekend

  2. Today I learned that on 30 September the EC has initiated infringement procedures against 19 EU Member States because of failing to provide complete information on the transposition in national law of the new Open Data Directive.
    Just one day before that announcement I wrote that transposition was far from complete, but I hadn’t realised that this applied to 19 countries, or about 70% of Member States.
    One of those countries, unsurprisingly, is the Netherlands. Unsurprising because the work on the transposition only started in earnest last February. A year delayed because of shifting priorities due to the pandemic, and much too late to ensure timely compliance, for which the deadline was last July.
    My contact on this within the responsible ministry however told me that progress has been made. An internet consultation on the new law should open shortly after New Year, meaning the text of the proposed transposition will be publicly available then. I am eager to read it.

  3. On 22 December the District Court of Midden-Nederland (Rechtbank Midden-Nederland) handed down an important ruling. The Dutch Chamber of Commerce (Kamer van Koophandel, KvK) cannot invoke database rights when setting conditions for users of data from the Handelsregister, the Dutch company register.
    As of 1 January 2021 the KvK had changed its terms of use, claiming database rights for itself in the process. This was a rather astonishing step, which I interpreted as a deliberately obstructive act in anticipation of a possible open data obligation under the new Re-use Directive / Open Data Directive. That new Directive, in Article 1.6, prohibits public bodies (including autonomous administrative bodies, ZBOs, such as the KvK) from using database rights to impose additional access restrictions beyond what the Directive itself allows. The KvK has of course known this since 2019 too, so trying to create a new status quo beforehand (so that you can defend it against the ‘excessive’ demands of new rules) was an act of obstruction in my eyes.
    That obligation, expected in 2021, does not exist yet by the way. On the one hand because the Netherlands has failed to amend the Wet Hergebruik Overheidsinformatie (the Dutch law on the re-use of government information), which should have been done by last July (and has been put in the penalty box by the EU for it). On the other hand because the implementing act on mandatory open data, which should have been known a year ago, still hasn’t been published (probably because of political disagreement between Member States about precisely those company registers).
    Rightly, a number of commercial re-users of information from the Handelsregister started a court case about the new terms of use. That case has now been decided.
    The ruling saws the legs off the KvK’s chair. Where the Re-use Directive says that holding database rights could limit the effect of the Directive, but that public bodies may not invoke them, the court says that the KvK has no database rights at all.
    Back in 2009 we already had the Landmark case against the Municipality of Amsterdam, in which the court ruled that the Municipality is not a producer within the meaning of database law, because it did not meet the investment requirement for being granted such rights.
    In the current ruling the court acknowledges that the KvK does invest considerable effort in building the Handelsregister. But because there is no financial risk whatsoever attached to that investment (it is covered by government by law), and because it concerns the execution of a statutory task that makes the investment necessary, there is no economic justification for database rights. The KvK is, according to the ruling, not a producer within the meaning of the Databankenwet (the Dutch Database Act). That strikes out the obstructive elements in the KvK’s terms of use that were introduced at the start of this year.
    In this context the recent letter (10 December) from the Minister of Economic Affairs and Climate Policy (EZK) to the Tweede Kamer (the Dutch House of Representatives) about the data vision for the Handelsregister is also interesting. Among other things it discusses problems experienced with the Handelsregister. These concern both too much access and re-use, and too little access and re-use. The letter positions this as a conflict between privacy and transparency, but that is a false dilemma. You can perfectly well maximise both.
    EZK rightly states that having a paywall cannot count as privacy protection (at most it means that only people with a bit more to spend violate your privacy), and also rightly puts the public interest of open information about legal entities centre stage (I wrote about this before: in exchange for making visible who I am as an entrepreneur, so that others can check who they are dealing with, I as an entrepreneur also receive certain benefits and legal protection that can encourage my entrepreneurship).
    What is incorrect in EZK’s letter is the claim that proponents of open data only look at the socio-economic benefits, and EZK additionally tries to argue that the socio-economic benefits achieved with this data elsewhere may have had other causes. This apparently also refers to the advice on mandatory open data, which has not been officially released yet but has been published nonetheless. That advice indeed says almost nothing about the protection of personal data, because the European Commission explicitly placed that outside the scope of the research and advice. The GDPR is simply a given.
    I don’t think anyone denies the privacy problems surrounding the Handelsregister. For example that two thirds of all legal entities are registered at someone’s home address (as mine are).
    The line of reasoning with regard to open data in fact starts with the protection of personal data:
    The GDPR (AVG) protects personal data and limits the Wob (Woo), the Dutch freedom of information law.
    What is public is governed by the Wob (Woo), with personal data as a ground for exception, and by a number of specific laws (e.g. the Handelsregisterwet and the Kadasterwet). Those latter two, the Handelsregisterwet and Kadasterwet, make a targeted trade-off between the public interest of openness of certain personal data in economic transactions and the protection of personal data.
    The Re-use Directive states that what is public must be re-usable.
    The implementing act on High Value Data under the Re-use Directive makes it mandatory to pro-actively publish, for re-use, some of the data that must be re-usable (probably including the Handelsregister).
    The GDPR and the Re-use Directive with the implementing act on EU High Value Data are simultaneously of great public importance to everyone, and they do not conflict with each other. Not for the Handelsregister either. The moving parts are in the Handelsregisterwet and in the KvK’s practical information management. That is where the problems need to be solved. Not through attempts to arrange things so that you as KvK mostly don’t have to do anything yourself, by changing your terms of use. It is a recognisable pattern that we already saw with the Woo. The VNG (the association of Dutch municipalities) and the municipalities first found complying with the Woo too expensive, were then given extra time by the Tweede Kamer to arrange things, and subsequently said in advance that it would never work. The EU open data obligation for the Handelsregister is no more the problem than the introduction of the Woo is for municipalities. The new laws only make very visible that the KvK, and with regard to the Woo the municipalities, do not have their information management in order. The efforts needed to become compliant are not the implementation costs of those laws, but the sum of overdue maintenance on your information management, legacy systems, and a failure as an organisation to look ahead in your own information profession, starting from the positive impact you aim to have on your environment. That sum of organisational shortcomings is now merely becoming visible to everyone, including the KvK itself, because of those new laws, as you now finally have to do something about it.
    With the court ruling the KvK’s attempt at obstruction has not merely been set aside, for instance by saying that database rights may not be used to limit re-use. More than that, the KvK’s argument has been entirely undermined: the KvK has no database rights at all.
    That is a nice Christmas present for everyone.

  4. In June 2019 the European Union adopted a new Open Data Directive. It then had to be transposed into national law in all Member States within 2 years. In many countries, including the Netherlands, that did not happen in time. Among other reasons, attention went to other things because of the pandemic. The Ministry of the Interior (Ministerie van Binnenlandse Zaken) picked up the thread again a while ago, and that results in a small Christmas present: the internet consultation for the new Wet Hergebruik Overheidsinformatie opened on 24 December. Until 6 February anyone can share their views on the draft text.
    The new Open Data Directive expands the scope of organisations that fall under the Directive, explicitly adds dynamic data (previously governments got away with not sharing real-time measurement data because at the moment of the request that data did not exist yet; only a lawyer who starts from ‘no’ comes up with something like that), and adds a list of mandatory open data publications (those rules still have to be published in a separate act, which the European Commission has not done yet, although it should have been finalised a year ago).
    In addition, further steps are taken in restricting exclusive agreements and charging for the provision of data, as previous editions of this Directive also did each time.
    I won’t unwrap this Christmas present just yet, but in January I will take ample time to read the text and comment on it where needed.
    (Full disclosure: in 2017/2018 I contributed to the evaluation and impact assessment of the previous Re-use Directive. Its results helped shape the content of the new Open Data Directive in 2019.)

  5. Starting in 2010 I have posted an annual ‘Tadaa’ list, a list of things that made me feel I had accomplished something.
    This is the first time in 11 years I did not feel like making this list. This second pandemic year was again a year where our lives had a small and local scope mostly, where most days just carried over into the next. Additionally as I’ve been keeping day logs since April 2020, and have been posting week notes for three years now, maybe there’s less of an internal need of looking back annually, as unlike a decade ago I’ve been doing it weekly and daily for myself as well. Mostly I think it’s the pandemic, where nothing much happens during a year of staying home almost exclusively. As E mentioned this week, you miss out on so much coincidental inspiration, ideas and associative thoughts that you’d normally get from just being out in the world.
    Yet, maybe that means I really should be making the effort of writing the annual list. So here goes, in no particular order.

    Made sure that Y got to fully enjoy playing in the snow, and skating on the ice, for the few days in February that both were possible. Important memories to make with her.
    E and I made it work well at home, despite irregular school closures, a quarantine, and having Covid breach our household. I appreciated our house a lot, allowing us space as it does to both have our own home office, being able to sit in the garden under the apple tree or at the water’s edge watching the swans, ducks and coots. We complemented each other well, and E even completed a half year training program on data and AI on top of all of it.
    Went away when we could, e.g. to Zeeland over the easter weekend, enjoyed some lunches in town, visited a few museums.
    We spent two weeks in Copenhagen in the summer in a beautiful house we rented. Cycling through the city, just hanging out, meeting up with friends and having a nice place to return to or stay at and relax for a day was a great break. I am very glad that I booked the rental early in the spring, when it wasn’t at all clear that it would be even possible to travel across inner-EU borders. Just the act of having booked it was valuable as it put something on the horizon a few months out.
    A week in Versailles and Paris at the end of summer was an unplanned but huge pleasure. We enjoyed camping out in a forest area on the edge of Versailles, while having Paris within 30 mins by train and the railway station a 10 minute walk away. We got to be outside a lot, played around with Y in the camp ground’s swimming pool, while also exploring Paris (which Y loved), taking in (a small section of) the Louvre, and having lunch and coffee any place we liked. Paris wasn’t very busy, but not empty either, the perfect setting to roam as we pleased in a city that was lively enough to feel its pulse. It was a very energising week, and the best spur of the moment decision we made this year.
    Volunteered to speak at the FOSS4G Netherlands conference this fall, that fell in the brief period where such events could take place face to face.
    My company had a good year, again well above the pre-pandemic 2019. Our team I think grew tighter, and we managed to have a lot of fun despite the pandemic measures taking a mental toll on all of us at times. That financially things went well helped as stabilising factor, reducing uncertainty in uncertain times. Renting cabins in a holiday park in June, so we could work together for a week while each having our own cabin, is something to do during regular years as well. Last month it was a decade ago that we started our company, and in fact I feel these past years, despite the pandemic, were the best ones as a group and for me personally had most meaning.
    I got to work this year on a topic that I really enjoy, learning to work with and within the coming EU digital and data legal framework. The work evolved from a study I did last year, advising the European Commission on the planned open data obligations for EU countries. This important wave of 6 pieces of legislation is the biggest influence on data governance in Europe since the original PSI Directive and INSPIRE Directive 10-15 years ago. It goes much deeper and is much wider in scope than what came before though. There’s a renewed elan, and I feel the type of energy that my work 10 years ago generated around European open data efforts. This new wave will be key to any data work for at least five years, if not for the rest of the decade.
    For next year, I’ve already signed a contract with a client to keep track of those European developments, help Dutch dataholders and users to leverage their potential, and build bridges to initiatives elsewhere in Europe. It provides me with even more time to do that, which allows me to organise it more as a program of continuous work, not like one project out of several. I hope and intend to use this opportunity to help drive the momentum from this new batch of data legislation in 2022.
    I’ve been writing my blog here for 19 years now. Again this year it was an important instrument in having and generating conversations with a wide variety of people. In these stay at home times having a way of connecting to people all over the world is very valuable, and doing it all from my own domain is a source of agency. Thank you to all I had the opportunity to interact with this year, to all who dropped by in my inbox.
    Last year I started making a notes system (in Obsidian) having revamped my personal KM system. Last year I made some 800 conceptual notes mostly gleaned from existing blogposts and presentations I wrote the past 20 years. That number hasn’t grown very fast this year, to about 1050, plus about 200 more factual notes. Together with an ideas collection, and book notes they make some 1650 notes, or about a third of the total number of 5000 notes in my PKM system. Other notes are work related notes, day logs and an annotated library of things that caught my eye this year. I am happy it felt effortless to keep the note making going this year, even if I feel I had too little time to actually sit down and think and write, growing the conceptual part of it all. I’ve also done little non-fiction reading, an annual complaint I have, though it was more than in previous years. Such reading provides input that could let my notes grow. Having dusted off my PKM system last year has really helped me this year in keeping track of my work, and being able to keep building on little things I started earlier and then had to leave alone for a while. What pleases me no end, in terms of reducing friction and the sense of ‘magic’ that I got it to work, is that I now run two client websites, where I publish information for them directly from my notes collection. It allows me to work in my own notes on my own laptop, and in the background GitHub ensures that those notes get published as a website.
    I’m what is called the ‘programming equivalent of a home cook’. Making small adaptations to my laptop’s working environment, and little pieces of code to help me do some tasks is gratifying (if sometimes frustrating during the process of creation), and lets me incrementally reduce friction in my workflows. This year I enjoyed rummaging around the back-end of my feed reader, and experimenting with what I call federated bookshelves, and a few other small things. The federated bookshelves stuff will be a topic of discussion and, I hope, making during a tentatively planned online IndieWeb meet-up in February on distributed libraries.

    In terms of work hours, I mostly worked about 3 days per week in the first six months, using the rest to balance the logistics of a household in times of pandemic and find some space for myself. The rest of the year I worked more or less fulltime.
    As we’ve been home mostly I had ample time to read, just over 70 books, of which a handful non-fiction. Fiction reading is something I worked into my day well in the past years (at least 30 mins before sleeping, an easy to arrange habit). The non-fiction reading is still something I want to find a working flow and rhythm for (and have been for years). It requires making time in a way that is less easy (reading, noting, thinking) than it is for fiction. On the plus side, the non-fiction I did read I also much more actively made notes on.
    We will spend some days around New Year in Switzerland, visiting dear friends. A tradition we couldn’t adhere to last year, but can do this year (if we test negative before leaving).
    Ever onwards! (After having the first week of January off that is)
    2021 wasn’t a piece of cake, but like the one pictured, despite its imperfections and cracks, it still held beauty. I enjoyed this raspberry and chocolate confection towards the end of a joyful day with E and Y in Tivoli Gardens in Copenhagen last August.

  6. It’s finally here, published today: the proposal for the EU High Value Data list. The list for the first time makes open data publication mandatory for government concerning (for now) 6 themes (geographic information, meteorology, mobility, statistics, earth observation and environment, and company information). Already in September 2020 an impact assessment and advice on policy options was delivered to the European Commission. I was part of that assessment team, and responsible for the themes Meteorology, Earth Observation and environment. Now we get to see what has been proposed to be implemented in law. I haven’t read it yet, will do that tomorrow first thing, but wanted to share the link here. There’s a window for feedback on the proposal open until 21 June 2022.

  7. This week the draft implementation act (PDF) and annex listing the first batch of European High Value Data sets (PDF) have finally been published. In the first half of 2020 I was involved in preparatory research to advise on what data, spread across six predetermined themes, should be put on this mandatory list. It’s the first time open data policy makes the publication of certain data mandatory through an API. Until now European open data policy built upon the freedom of information measures of each EU Member State (MS), and added mandatory conditions to what MS published voluntarily, and to how to respond to public data re-use requests. This new law arranges for the pro-active publication of certain data sets.
    In the 2020 research I was responsible for the sections about earth observation, environmental, and meteorological data. We submitted our final report in September 2020, and since then there had been total silence w.r.t. the progress in negotiating the list with the MS, and putting together the implementation act. I knew that at least the earth observation and environmental data would largely be included the way I suggested, when last summer I got a sneak preview of the adaptation of the INSPIRE portal where such data is made available.
    The Implementation Act
    In the Open Data Directive there’s a provision that the European Commission can, through a separate implementation act, set mandatory open data requirements for data belonging to themes listed in the Directive’s Annex. At launch in 2019, 6 such themes were listed: Geo-information, statistics, mobility, company information, earth observation / environment, and meteorology.
    The list of themes can also be amended, through another separate implementation act, and a process to determine the second set of themes is currently underway.
    The draft implementation law (PDF) states that government-held datasets mentioned in its Annex must be published through APIs, under an open license such as Creative Commons Zero (CC0), Creative Commons Attribution (CC BY), or an equivalent or less restrictive license. Governments must publish the terms of use for such APIs, and these terms may not be used to discourage re-use. APIs must also be fully publicly documented, and a point of contact must be provided.
    MS can temporarily exempt some of the high value datasets. Such a decision must be made public, and the exemption is limited to two years after entry into force of this implementation act. Additional usage restrictions are allowed for personal data within the data sets concerned, but only to the extent needed to protect personal data of individuals (so not as an excuse to disallow re-use and access to the data as a whole).
    MS must report on their implementation actions every two years, listing the actual data sets opened, the links to licenses, APIs and documentation, and the exemptions still in place. The implementation act is immediately binding for all MS (no need to first transpose it into national law to be enforceable), will apply 20 days after publication in the Official Journal of the EU, and MS have 6 months to comply.
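    To make those deadlines concrete, here is a minimal sketch of the date arithmetic. It assumes that the 6 months to comply and the two year exemption limit are both counted from the entry into force, which is my reading and not something the draft text above settles. The publication date used is purely hypothetical, and the function names `add_months` and `hvd_timeline` are made up for this illustration.

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def hvd_timeline(publication: date) -> dict:
    """Derive the key dates from a publication date in the Official Journal.

    Assumption: compliance and exemption periods start at entry into force.
    """
    entry_into_force = publication + timedelta(days=20)
    return {
        "entry_into_force": entry_into_force,
        "compliance_deadline": add_months(entry_into_force, 6),
        "latest_exemption_end": add_months(entry_into_force, 24),
    }


if __name__ == "__main__":
    # Hypothetical publication date, for illustration only.
    for name, day in hvd_timeline(date(2023, 1, 1)).items():
        print(f"{name}: {day.isoformat()}")
```

    Swapping in the real publication date, once known, gives the dates against which the two-yearly reporting and any exemptions can be checked.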
    The Data Sets Per Theme
    In this first batch of mandatory open data, 6 themes are covered (PDF). Some brief remarks on all of them.
    Mobility
    This is, contrary to what you’d expect, the smallest theme of the six covered. That is because everything already covered by the Intelligent Transport Systems (ITS) Directive is out of scope, and that is most of what concerns land based mobility. What remains for the High Value Data list is data on transport networks contained in the INSPIRE Annex I theme Transport Networks, static and dynamic data about inland waterways, and the electronic navigational charts (ENC) for inland waterways. This is much in line with the 2020 study report. There was some concern among national hydrographical services that ENCs for seas would be included (making it harder to force sea going vessels to use the latest version), but my reassurances that this was unlikely held true.
    Geospatial data
    Geospatial data is, I would say, the ‘original’ high value government data, and has been for centuries. The data sets from the four INSPIRE Annex I themes Administrative Units, Geographical Names, Addresses, Buildings and Cadastral Parcels are within scope. Additionally, reference parcels and agricultural parcels as described in Regulations 1306/2013 and 1307/2013 on the Common Agricultural Policy (CAP) are on the list.
    Earth Observation and Environment
    This was a theme I was responsible for in the 2020 study. It is an extremely broad category, covering a very wide spectrum of types of data. It was basically impossible to pick individual data sets from it, not least because re-use value usually comes from combinations of data, not from any single source. Therefore my proposed solution was to not choose, and to advise treating it as a coherent whole needed in addressing the EU goals concerning environment/nature, climate adaptation, and pollution. The High Value Data list adopts this approach and puts 19 INSPIRE themes within scope. These are:

    Annex I: Hydrography, and Protected Sites
    Annex II in full: Elevation, Geology, Land Cover, and Ortho-imagery
    Annex III: Area management, Bio-geographical regions, Energy resources, Environmental monitoring facilities, Habitats and biotopes, Land use, Mineral resources, Natural risk zones, Oceanographic geographical features, Production and industrial facilities, Sea regions, Soil, and Species distribution
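
    For those who want to work with this scope programmatically, the list above can be written down as a small lookup table. This is only a sketch that transcribes the theme names as given here. The names `EO_ENVIRONMENT_THEMES`, `theme_count` and `annex_of` are my own for this illustration, and are not identifiers from INSPIRE or from the implementation act.

```python
# INSPIRE themes listed above as in scope for earth observation and environment.
EO_ENVIRONMENT_THEMES = {
    "Annex I": ["Hydrography", "Protected Sites"],
    "Annex II": ["Elevation", "Geology", "Land Cover", "Ortho-imagery"],
    "Annex III": [
        "Area management",
        "Bio-geographical regions",
        "Energy resources",
        "Environmental monitoring facilities",
        "Habitats and biotopes",
        "Land use",
        "Mineral resources",
        "Natural risk zones",
        "Oceanographic geographical features",
        "Production and industrial facilities",
        "Sea regions",
        "Soil",
        "Species distribution",
    ],
}


def theme_count(themes_by_annex):
    """Count all themes across the annexes."""
    return sum(len(themes) for themes in themes_by_annex.values())


def annex_of(theme):
    """Return the annex a theme is listed under here, or None if it is not listed."""
    wanted = theme.strip().lower()
    for annex, themes in EO_ENVIRONMENT_THEMES.items():
        if wanted in (name.lower() for name in themes):
            return annex
    return None


if __name__ == "__main__":
    print(theme_count(EO_ENVIRONMENT_THEMES))
    print(annex_of("Soil"))
```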

    Additionally, all environmental information as covered by Directive 2003/4/EC on public access to such information is added to the list, as is all data originating in the context of a wide range of EU Regulations and Directives on air, climate, emissions, nature preservation and biodiversity, noise, waste and water. I miss soil in this environmental list, but perhaps the Annex III INSPIRE theme is seen as sufficiently covering it. I still need to follow up on the precise formulations w.r.t. data in the 31 additionally referenced regulations and directives.
    What surprises me in the phrasing is that earth observation is defined here as including satellite based data. That is not surprising in terms of earth observation itself, but satellite data was specifically excluded from the scope of our 2020 study. First, because the EU level satellite data is already open. Second, because this list deals with data from MS, and not many MS have their own satellite data. When they do, it is usually the result of public-private collaborative investment, and such private investment may dry up if temporary exclusive access arrangements are no longer possible, which would have resulted in considerable political objections. Perhaps the addition of space based data collection is sufficiently watered down for now by defining the INSPIRE themes as its scope, while at the same time future proofing the definition for when satellite data does become part of INSPIRE themes.
    Together these first three themes, mobility, geospatial, and EO/environment, place a full 24 out of 34 INSPIRE themes on the list for mandatory open data. This basically amounts to adding an open data requirement to INSPIRE. It puts MS’ INSPIRE compliance, which at present is often limited, very much in the focus of attention, and further positions INSPIRE as a key building block in the coming Green Deal dataspace. It will be of high interest to see what the coming new version of the INSPIRE directive, currently under review, makes of all that.
    Statistics
    This topic is covered more widely in the High Value Data list than it was in the 2020 study, both in the types of statistics included and in the demands made of those types of statistics. Still, there are lots of statistics that MS hold that aren’t included here (while some MS already publish most of their statistics btw): the selection is based on European reporting obligations that follow from a list of various European laws.
    Topics for which statistics must be published as open data in a specified way:

    Industrial production
    Industrial producer price index, by activity
    Volume of sales by activity
    EU international trade in goods
    Tourism flows in Europe
    Harmonised consumer prices indices
    National accounts: GDP, key indicators on corporations and households
    Government expenditure and revenue, government gross debt
    Population, fertility, mortality
    Current healthcare expenditure
    Poverty
    Inequality
    Employment, unemployment, potential labour force

    Data for these reporting obligations should be available from the moment the law creating them entered into force. That means for instance that healthcare expenditure should be available from at least 2008, whereas employment statistics must be available from at least 2019, because of the different years in which these laws were enacted.
    Company information
    Company information has from the start been the most controversial theme of the six covered by this implementation act. I assume this theme has also been the prime political reason for the long delay in the proposal being published. In my perception that is because this is the only data set that actually might end up challenging the status quo in society (as it involves ownership and power structures, and touches tax evasion). In the 2020 study four aspects were considered: basic company information, company documents and accounts, ownership information, and insolvency status. Two ended up in the draft law: basic company information and company documents. Opening ownership information, and that isn’t even the ultimate beneficial ownership (UBO) information, drew vehement objections from the start (including from the Dutch government). Many stakeholders (including the NGO I chair) are disappointed with the current outcome. (Here’s an old blogpost where I explain UBO, and here’s SF writer Brin on what transparent UBO might mean to our societies.) The data that will become open data may still be 2 years in the future: the Open Data Directive allows a 2 year exemption, and I think this is the data where that exemption will be used.
    That said, mandatory open company data and documents, even with the delay through exemptions, is already a step forward that puts an end to literally decades of court cases, obstruction, and lobbying for more openness. The very first PSI Directive in 2003 was already an expression of a broad demand for this data. Now, 20 years on, it finally becomes mandatory across the EU. Some people I know have been after this for their entire professional careers and have already retired. It’s easy to lose sight of that win when we only focus on not having (ultimate) ownership data included.
    Meteorological data
    This is the other theme I was responsible for in the 2020 study. As with company information, this is an area where the discussion about making data available for re-use is decades old and precedes digitisation becoming ubiquitous. When I started my open data work in 2008, most of the existing documentation and argumentation for the value of and need for open data concerned meteorological data. A range of EU countries already publish this as open data, others not at all. While progress has been made in the past decades, the High Value Data list provides a blanket obligation for all EU MS, a result that would otherwise still be a very long time away if left entirely voluntary for the MS involved.
    The data covered here includes all weather observation data, validated observations / climate data, radar data (useful for things like cloud heights, precipitation and wind), and numerical weather prediction data (these are the outputs of the combined models used for predictions).
    The implementation act is up for public feedback until 21 June, but likely will retain its current form. I think it’s a pretty good result, and I am happy that I have been able to contribute to it.
