The European Commission term that has just finished delivered an ambitious and coherent legal framework for both the single digital market and the single market for data, based on the digital and data strategies the EU formulated. Those laws, such as the Data Governance Act, Data Act, High Value Data implementing regulation and the AI Act, are all finished and in force (if not always fully in application). This means efforts are now switching to implementation. The detailed programme of the next European Commission, now being formed, isn’t known yet. Big new legislative efforts in this area, however, are not expected.

This summer Ursula von der Leyen, the incoming chairperson of the Commission, presented the political guidelines. In them you can find what the EC will pay attention to in the coming years in the field of data and digitisation.

Data and digital are geopolitical in nature
The guidelines underline the geopolitical nature of both digitisation and data. The EU will therefore seek to modernise and strengthen international institutions and processes. It is noted that outside influence in regular policy domains has become a more common instrument in geopolitics. Data and transparency are likely tools to keep a level headed view of what’s going on for real. Data also is crucial in driving several technology developments, such as in AI and digital twins.

European Climate Adaptation Plan Built on Data
The EU will increase its focus on mapping risks and preparedness w.r.t. natural disasters and their impact on infrastructure, energy, food security, water, land use both in cities and in rural areas, as well as early warning systems. This is sure to contain a large data component, a role for the Green Deal Data Space (for which the implementation phase will start soon, now the preparatory phase has been completed) and the climate change digital twin of the earth (DestinE, for which the first phase has been delivered). Climate and environment are the areas where the EC had already emphasised the close connection between digitisation and data on the one hand, and the ability to achieve European climate and environmental goals on the other.

AI trained with data
Garbage in, garbage out: access to enough high quality data is crucial to all AI development, and therefore data will play a role in all AI plans from the Commission.

An Apply AI Strategy was announced, aimed at sectoral AI applications (in industry, public services or healthcare e.g.). The direction here is towards smaller models, squarely aimed at specific questions or tasks, in the context of specific sectors. This requires the availability and responsible access to data in these sectors, in which the European common data spaces will play a key role.

In the first half of 2025 an AI Factories Initiative will be launched. This is meant to provide SMEs and newly starting companies with access to the computing power of the European supercomputing network, for AI applications.

There will also be a European AI Research Council, dubbed a ‘CERN for AI’, in which knowledge, resources, money, people, and data are pooled.

Focus on implementing data regulations
To make the above possible a coherent and consistent implementation of the existing data rules from the previous Commission period is crucial. Useful explanations and translations of the rules for companies and public sector bodies are needed, to allow for seamless data usage across Europe and at scale. This within the rules for data protection and information security that equally apply. The directorate within the Commission that is responsible for data, DG Connect, sees its task for the coming years as mainly being ensuring the consistent implementation of the new laws from the last few years. The implementation of the GDPR until 2018 is seen as an example where such consistency was lacking.

European Data Union
The political guidelines announce a strategy for a European Data Union. Aimed at better and more detailed explanations of the existing regulations, and above all the actual availability and usage of data, it reinforces the measure of success the data strategy already used: the socio-economic impact of data usage. This means involving SMEs at a much larger volume, and in this context also the difference between such SMEs and large data users outside of the EU is specifically mentioned. This Data Union is a new label and a new emphasis on what the European Data Strategy already seeks to do, the creation of a single market for data, meaning a freedom of movement for people, goods, capital and data. That Data Strategy forms a consistent whole with the digital strategy of which the Digital Markets Act, Digital Services Act and AI Act are part. That coherence will be maintained.

My work: ensuring that implementation and normalisation are informed by good practice
In 2020 I helped write what is now the High Value Data implementing regulation, and in the past years my role has been tracking and explaining the many EU digital and data regulations initiatives on behalf of the main Dutch government holders of geo-data. Not just in terms of new requirements, but with an accent on the new instruments and affordances those rules create. The new instruments allow new agency of different stakeholder groups, and new opportunities for societal impact come from them.
The phase shift from regulation to implementation provides an opportunity to influence how the new rules get applied in practice, for instance in the common European data spaces. Which compelling cases of data use can have an impact on the implementation process, can help set the tone or even have a normalisation effect? I’m certain practice can play a role like this, but it takes bringing those practical experiences to a wider European network. Good examples help keep the actual goal of socio-economic impact in sight, and mean you can argue from tangible experience in your interactions.

My work for Geonovum the coming time is aimed at this phase shift. I already helped them take on a role in the coming implementation of the Green Deal Data Space, and I’m now exploring other related efforts. I’m also assisting the Ministry for the Interior in formulating guidance for public sector bodies and data users on how to deal with the chapter of the Data Governance Act that allows for the use (but not the sharing) of protected data held by the public sector. Personally I’m also seeking ways to increase the involvement of civil society organisations in this area.

June is a good month for open data this year.

First, last week on Tuesday 4 June the Dutch Senate (Eerste Kamer) approved the law that implements the European Open Data Directive in the Dutch Wet Hergebruik Overheidsinformatie (the law on the re-use of government information). Although the law has not been published yet and thus is not yet in force, this ends three years of delay. The law should have taken effect by July 2021. The European directive entered into force in July 2019 and gave Member States two years to transpose it into national law.

Second, last Sunday 9 June the obligation took effect for public sector bodies to actively publish important data on six themes through APIs. That European regulation was adopted at the end of 2022, entered into force in early February 2023, and gave public bodies 16 months, i.e. until Sunday, to comply. Member States must deliver their first report on the implementation in February 2025, so I assume many countries will still use that period to meet the obligations. But a start has been made. In the Netherlands the impact of this High Value Data regulation is relatively small, as most of the data it covers was already open here. At the same time that was not always the case in other EU countries. So now you can assemble datasets with Europe-wide coverage.

ODRL, the Open Digital Rights Language, popped up twice this week for me and I don’t think I’ve been aware of it before. Some notes for me to start exploring.

Rights Expression Languages

Rights Expression Languages, RELs, provide a machine readable way to convey or transfer usage conditions, rights, restraints, granularly w.r.t. both actions and actors. This can then be added as metadata to something. ODRL is a rights expression language, and seems to be a de facto standard.

ODRL has been a W3C recommendation since 2018, and is thus part of the open web standards. ODRL has its roots in the ’00s and Digital Rights Management (DRM): the abhorred protections media companies added to music and movies, and now e-books, in ways that restrain what people can do with media they bought to well below the level of what was possible before and commonly thought part of having bought something.

ODRL can be expressed in JSON or RDF and XML. A basic example from Wikipedia looks like this:


{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "uid": "http://example.com/policy:001",
  "permission": [{
    "target": "http://example.com/mysong.mp3",
    "assignee": "John Doe",
    "action": "play"
  }]
}

In this JSON example a policy describes that example.com grants John permission to play mysong.
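To make concrete what reading such a policy by machine could look like, here is a minimal Python sketch. It only does a naive exact match on assignee, action and target. The helper name `is_permitted` is hypothetical and mine, not part of ODRL or of any library, and a real evaluator would also have to deal with constraints, duties, prohibitions and the rest of the information model.

```python
import json

# The policy from the example above, as a JSON string.
POLICY_JSON = """
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "uid": "http://example.com/policy:001",
  "permission": [{
    "target": "http://example.com/mysong.mp3",
    "assignee": "John Doe",
    "action": "play"
  }]
}
"""


def is_permitted(policy: dict, assignee: str, action: str, target: str) -> bool:
    """Hypothetical helper: True if some permission rule matches exactly.

    Assumption: plain string comparison is enough. This ignores constraints,
    duties, prohibitions and anything else a real ODRL evaluator would check.
    """
    for rule in policy.get("permission", []):
        if (
            rule.get("assignee") == assignee
            and rule.get("action") == action
            and rule.get("target") == target
        ):
            return True
    return False


policy = json.loads(POLICY_JSON)
song = "http://example.com/mysong.mp3"

print(is_permitted(policy, "John Doe", "play", song))        # True
print(is_permitted(policy, "John Doe", "distribute", song))  # False
```

Anything not explicitly permitted is treated as not allowed here, which is a design choice of this sketch rather than something I have verified against the standard.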

ODRL in the EU Data Space

In the shaping of the EU common market for data, aka the European common data space, it is important to be able to trace provenance and usage conditions for not just data sets, but singular pieces of data, as it flows through use cases, through applications and their output back into the data space.
This week I participated in a webinar by the EU Data Space Support Center (DSSC) about their first blueprint of data space building blocks, and for federation of such data spaces.

They propose ODRL as the standard to describe usage conditions throughout data spaces.

The question of enactment

It wasn’t the first time I talked about ODRL this week. I had a conversation with Pieter Colpaert. I reached out to get some input on his current view of the landscape of civic organisations active around the EU data spaces. We also touched upon his current work at the University of Gent. His research interest is on ODRL currently, specifically on enactment. ODRL is a REL, a rights expression language. Describing rights is one thing, enacting them in practice, in technology, processes etc. is a different thing. Next to that, how do you demonstrate that you adhere to the conditions expressed and that you qualify for using the things described?

For the EU data space(s) this part sounds key to me, as none of the data involved is merely part of a single clear interaction like in the song example above. It’s part of a variety of flows in which actors likely don’t directly interact, where many different data elements come together. This includes flows through applications that tap into a data space for inputs and outputs but are otherwise outside of it. Such applications are also digital twins, federated systems of digital twins even, meaning a confluence of many different data and conditions across multiple domains (and thus data spaces). All this removes a piece of data lightyears from the neat situation where two actors share it between them in a clearly described transaction within a single-faceted use case.

Expressing the commons

It’s one thing to express restrictions or usage conditions. The DSSC in their webinar talked a lot about business models around use cases, and ODRL as a means for a data source to stay in control throughout a piece of data’s life cycle. Luckily they stopped using the phrase ‘data ownership’ as they realised it’s not meaningful (and confusing on top of it), and focused on control and maintaining having a say by an actor.
An open question for me is how you would express openness and the commons in ODRL. A shallow search surfaces some examples of trying to express Creative Commons or other licenses this way, but none recent.

Openness can mean an absence of certain conditions, although there may be some (like adding the same absence of conditions to re-shared material or derivative works), which is not the same as setting explicit permissions. If I e.g. dedicate something to the public domain, an image for instance, then there are no permissions for me to grant, as I’ve removed myself from that role of being able to give permission. Yet, you still want to express it to ensure that it is clear for all that that is what happened, and especially that it remains that way.
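As a first exploratory sketch of the 'explicit permissions' route, this is roughly what an attribution-style open policy could look like when built in Python and serialised to JSON. The function `open_policy` and the example URLs are hypothetical. The action names (`reproduce`, `distribute`, `derive`, `attribute`) are what I understand the ODRL vocabulary to offer, and leaving out the assignee to mean 'anyone' is my assumption, not something I have checked against the specification. It also does nothing for the public domain case, where there is no one left to grant permission.

```python
import json


def open_policy(uid: str, target: str) -> dict:
    """Hypothetical helper: an attribution-style policy for one asset.

    Assumptions: omitting "assignee" is read as "anyone", and each
    permission carries a duty to attribute. Neither is verified here.
    """
    return {
        "@context": "http://www.w3.org/ns/odrl.jsonld",
        "@type": "Set",
        "uid": uid,
        "permission": [
            {
                "target": target,
                "action": action,
                "duty": [{"action": "attribute"}],
            }
            for action in ("reproduce", "distribute", "derive")
        ],
    }


policy = open_policy(
    "http://example.com/policy:open-001",  # made-up identifier
    "http://example.com/image.png",        # made-up asset
)
print(json.dumps(policy, indent=2))
```

What this cannot show is the share-alike condition or the 'it remains that way' part, which is exactly where the open question sits.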

Part of that question is about the overlap and distinction between rights expressed in ODRL and authorship rights. You can obviously have many conditions outside of copyright, and can have copyright elements that may be outside of what can be expressed in RELs. I wonder how for instance moral authorship rights (that an author in some (all) European jurisdictions cannot do away with) can be expressed after an author has transferred/sold the copyrights to something? Or maybe, expressing authorship rights / copyrights is not what RELs are primarily for, as those are generic and RELs may be meant for expressing conditions around a specific asset in a specific transaction. There have been various attempts to map all kinds of licenses to RELs though, so I need to explore more.

This is relevant for the EU common data spaces as my government clients will be actors in them and bringing in both open data and closed and unsharable but re-usable data, and several different shades in between. A range of new obligations and possibilities w.r.t. data use for government are created in the EU data strategy laws and the data space is where those become actualised. Meaning it should be possible to express the corresponding usage conditions in ODRL.

ODRL gaps?

Are there gaps in the ODRL standard w.r.t. what it can cover? Or things that are hard to express in it?
I came across one paper ‘A critical reflection on ODRL’ (PDF Kebede, Sileno, Van Engers 2020), that I have yet to read, that describes some of those potential weaknesses, based on use cases in healthcare and logistics. Looking forward to digging out their specific critique.

I’ve been involved in open data for about 15 years. Back then we had a vibrant European wide network of activists and civic organisations around open data, partially triggered by the first PSI Directive that was the European legal fundament for our call for more open government data.

Since 2020 a much wider and fundamental legal framework than the PSI Directive ever was is taking shape, with the Data Governance Act, Data Act, AI Regulation, Open Data Directive, High Value Data implementing regulation as building blocks. Together they create the EU single market for data, adding data as fourth element to the list of freedom of movement for people, products and capital within the EU. This will all take shape as the European common dataspace(s), built from a range of sectoral dataspaces.

In the past years I’ve been actively involved in these developments, currently helping large government data holders in the Netherlands interpret the new obligations and above all new opportunities for public service that result from all this.

Now that the dataspaces are slowly taking shape, what I find missing from most discussions and events is the voice of civic organisations and activists. It’s mostly IT companies and research institutions that are involved. While for the Commission social impact (climate, health, energy and agricultural transitions e.g.) is a key element in why they seek to implement these new laws, for most parties involved in the dataspaces that is less of a consideration, and economic and technological factors are more important. Not even government data holders themselves are represented much in how the European data space will turn out. Even though every single one of us and every public entity by default is a part of this common market.

I would like to strengthen the voice of civil society and activists in this area, to together influence the shape these dataspaces are taking. So that they are of use and value to us too. To use the new (legal) tools to strengthen the commons, to increase our agency.

Most of the old European open data network however has dissolved over time, as we all got involved in national level practical projects and the European network as a source of a sense of belonging and of strengthening each other’s commitment became less important. And we’ve moved on a good number of years, so many new people have come on to the scene, unconnected to that history, with new perspectives and new capabilities.

So the question is: who is active on these topics, from a civil society perspective, as activists? Who should be involved? What are the organisations, the events, that are relevant regionally, nationally, EU wide? Can we connect those existing dots: to share experiences, examples, join our voices, pool our efforts?

Currently I’m doing a first scan of who is involved in which EU country, what type of events are visible, organisations that are active etc. Starting from my old network of a decade ago. I will share lists of what I find at Our Common Data Space.

Let me know if you count yourself as part of this European network. Let me know the relevant efforts you are aware of. Let me know which events you think bring together people likely to want to be involved.

I look forward to finding out about you!


Open Government Data Camp in Warsaw 2011. An example of the vibrancy of the European open data network, I called it the community’s ‘family christmas party’, at the time. Above the schedule of sessions created collectively by the participants, with many local initiatives and examples shared with the EU wide network. Below one of those sessions, on local policy making and open data.

Some good movement on EU data legislation this month! I’ve been keeping track of EU data and digital legislation in the past three years. In 2020 I helped determine the content of what has become the High Value Data implementing regulation (my focus was on earth observation, environmental and meteorological data), and since then for the Dutch government I’ve been involved in translating the incoming legislation to implementing steps and opportunities for Dutch government geo-data holders.

AI Act

The AI Act stipulates what types of algorithmic applications are allowed on the European market under which conditions. A few things are banned, the rest of the provisions are tied to a risk assessment. Higher risk applications carry heavier responsibilities and obligations for market entry. It’s a CE marking for these applications, with responsibilities for producers, distributors, users, and users of output of usage.
The Commission proposed the AI Act in April 2021, the Council responded with its version in December 2022.

Two weeks ago the European Parliament approved in plenary its version of the AI Act.
In my reading the EP both strengthens and weakens the original proposal. It strengthens it by restricting certain types of uses further than the original proposal, and by adding foundation models to its scope.
It also adds a definition of what is considered AI in the context of this law. This in itself is logical, as the original proposal did not try to define that other than by listing technologies in an annex that were deemed in scope. However, while adding that definition, they removed the annex. That, I think, weakens the AI Act and will make future enforcement much slower and harder. Because now everything will depend on the interpretation of the definition, meaning it will be a key point of contention before the courts (‘my product is out of scope!’). Whereas by having both the definition and the annex, the legislator specifically states which things it considers in scope of the definition at the very least. As the Annex would be periodically updated, it would also remain future proof.

With the stated positions of the Council and Parliament the trilogue can now start to negotiate the final text which then needs to be approved by both Council and Parliament again.

All in all this looks like the AI Act will be finished and in force before the end of the year, and will be applied by 2025.

Data Act

The Data Act is one of the building blocks of the EU Data Strategy (the others being the Data Governance Act, applied from September, the Open Data Directive, in force since mid 2021, and the implementing regulation High Value Data which the public sector must comply with by spring 2024). The Data Act contains several interesting proposals. One is requiring connected devices to not only allow users access to the (real time) data they create (think thermostats, solar panel transformers, sensors etc.), but also to allow users to share that data with third parties. You can think of this as ‘PSD2-for-everything’. PSD2 says that banks must enable you to share your banking data with third parties (meaning you can manage your account at Bank A with the mobile app of Bank B, can connect your book keeping software etc.). The Data Act extends this to ‘everything’ that is connected. Another interesting component is that it allows public sector bodies in case of emergencies (floods e.g.) to require certain data from private sector parties, across borders. The Dutch government heavily opposed this so I am interested in seeing what the final formulation of this part is in the Act. Other provisions make it easier for people to switch platform services (e.g. cloud providers), and create space for the European Commission to set, let develop, adopt or mandate certain data standards across sectors. That last element is of relevance to the shaping of the single market for data, aka the European common data space(s), and here too I look forward to reading the final formulation.

With the Council of the European Union and the European Parliament having reached a common text, what remains is final approval by both bodies. This should be concluded under the Spanish presidency that starts this weekend, and the Data Act will then enter into force sometime this fall, with a grace period of some 18 months or so until sometime in 2025.

There’s more this month: ITS Directive

The Intelligent Transport Systems Directive (ITS Directive) was originally created in 2010, to ensure data availability about traffic conditions etc. for e.g. (multi-modal) planning purposes. In the Netherlands for instance real-time information about traffic intensity is available in this context. The Commission proposed to revise the ITS Directive late 2021 to take into account technological developments and things like automated mobility and on-demand mobility systems. This month the Council and European Parliament agreed a common text on the new ITS Directive. I look forward to close reading the final text, also on its connections to the Data Act above, and its potential in the context of the European mobility data space. Between the Data Act and the ITS Directive I’m also interested in the position of in-car data. Our cars increasingly are mobile sensor platforms, to which the owner/driver has little to no access, which should change imo.

This week it was 15 years ago that I became involved in open government data. In this post I look back on how my open data work evolved, and if it brought any lasting results.

I was at a BarCamp in Graz on political communication the last days of May 2008 and ended up in a conversation with Keith Andrews in a session about his wish for more government held data to use for his data visualisation research. I continued that conversation a week later with others at NL GovCamp on 7 June 2008 in Amsterdam, an event that I helped organise with James Burke and Peter Robinnet. There, on the rotting carpets of the derelict office building that had been the Volkskrant offices until 2007, several of us discussed how to bring about open data in the Netherlands:

My major take-away … was that a small group found itself around the task of making inventory of what datasets are actually held within Dutch government agencies. … I think this is an important thing to do, and am curious how it will develop and what I can contribute.
Me, 10 June 2008

Fifteen years on, what came of that ‘important thing to do’ and seeing ‘what I can contribute’?

At first it was mostly talk, ‘wouldn’t it be nice if ..’, but importantly part of that talk was with the Ministry responsible for government transparency who were present at NL GovCamp. Initially we weren’t allowed to meet at the Ministry itself, inviting ‘hackers’ in was seen as too sensitive, and over the course of 6 months several conversations with civil servants took place in a pub in Utrecht, before being formally invited to come talk. That however did result in a first assignment from January 2009, which I did with James and with Alper (who also had participated in NL GovCamp).

With some tangible results in hand from that project, I hosted a conversation at Reboot 11 in 2009 in Copenhagen about open data, leading to an extension of my European network on the topic. There I also encountered the Danish IT/open government team. Cathrine of that team invited me to host a panel at an event early 2010 where also the responsible official at the European Commission for open data was presenting. He invited me to Luxembourg to meet the PSI Group of national representatives in June 2010, and it landed me an invitation as a guest blogger that same month for an open data event hosted by the Spanish government and the ePSIplatform team, a European website on re-using government information.

There I also met Marc, a Dutch lawyer in open government. Having met various European data portal teams in Madrid, I then did some research for the Dutch government on the governance and costs of a Dutch open data portal in the summer of 2010, through which I met Paul who took on a role in further shaping the Dutch portal. Stimulated by the Commission with Marc I submitted a proposal to run the ePSIplatform, a public tender we won. The launching workshop of our work on the ePSIplatform in January 2011 in Berlin is where I met Frank. In the fall of 2011 I attended the Warsaw open government data camp, where Marc, Frank, Paul and I all had roles. I also met Oleg from the World Bank there. In November 2011 Frank, Paul, Marc and I founded The Green Land, and I have worked on over 40 open data projects since then under that label. Early 2012 I was invited to the World Bank in the US to provide some training, and later that year worked in Moldova for them. From 2014 I worked in Kazakhstan, Kyrgyzstan, Serbia and Malaysia for the World Bank until 2019, before the pandemic ended it for now.

What stands out to me in this history of a decade and a half is:

  • How crucial chance encounters were/are and how those occurred around small tangible things to do. From those encounters the bigger things grew. Those chance encounters could happen because I helped organise small events, went to events by others, and even if they were nominally about something else, had conversations there about open data with likeminded people. Being in it for real, spending effort to strengthen the community of practitioners around this topic, created a track record quickly. This is something I recently mentioned when speaking about my work to students as well: making time for side interests is important, I’ve come to trust it as a source of new activities.
  • The small practical steps I took, a first exploratory project, creating a small collection of open data examples out of my own interest, writing the first version of an open data handbook with four others during a weekend in Berlin served as material for those conversations and were the scaffolding for bigger things.
  • I was at the right time, not too early, not late. There already was a certain general conversation on open data going on. In 2003 the EC had legislated for government data re-use, which had entered into force in May 2008, just 3 weeks before I picked the topic up. Thus, there was an implemented legal basis for open data in place in the EU, which however hadn’t been used by anyone as a new instrument yet. In late 2008 Barack Obama was elected to the US presidency on a platform that included government transparency, which on the day after his inauguration in January 2009 resulted in a Memorandum to kick-start open government plans across the public sector. This meant there was global attention to the topic. So the circumstances were right, there was general momentum, just not very many people yet trying to do something practical.
  • Open data took several years to really materialise as professional activity for me. During those years most time was spent on explaining the topic, weaving the network of people involved across Europe and beyond. I have so many open data slide decks from 2009 and 2010 in my archive. In 2008, 2009 and 2010, I was active in the field but my main professional activities were still elsewhere. In 2009 after my first open data project I wondered out loud if this was a topic I could and wanted to continue in professionally. From early 2011 most of my income came from open data, while the need for building out the network of people involved was still strong. Later, from 2014 or so open data became more local, more regular, shifted to being part of data governance, and now data ethics. The pan-European network evaporated. Nevertheless helping improve European open data legislation has been a crucial element until now, to keep providing a fundament beneath the work.

From those 15 years, what stands out as meaningful results? What did it bring?
This is a hard and easy question at the same time. Hard because ‘meaningful’ can have many definitions. If we take achieving permanent or even institutionalised results as yardstick, two things stand out. One at the beginning and one at the end of the 15 years.

  • My 2010 report for the Ministry for the Interior on the governance and financing of a national open data portal and facilitating a public consultation on what it would need to do, helped launch the Dutch open government data portal data.overheid.nl in 2011. A dozen years on, it is a key building block of the Dutch government’s public data infrastructure, and on the verge of taking on a bigger role with the implementation of the European data strategy.
  • At the other end of the timeline is the publication of the EU Implementing Regulation on High Value Data last December, for which I did preparatory research (PDF report), and which compels the entire public sector in Europe to publish a growing list of datasets through APIs for free re-use. Things I wrote about earth observation, environmental and meteorological data are in the law’s Annexes which every public body must comply with by next spring. What’s in that law about geographic data, company data and meteorological data ends more than three decades worth of discussion and court proceedings w.r.t. access to such data.

Talking about meaningful results is also an easy question, especially when not looking for institutional change:

  • Practically, it means I and my now 10 colleagues have an income, which is meaningful within the scope of our personal everyday lives. The director of a company I worked at 25 years ago once said to me when I remarked on the low profits of the company that year ‘well, over 40 families had an income meanwhile, so that’s something.’ I never forgot it. That’s certainly something.
  • There’s the NGO Open State Foundation that directly emerged from the event James, Peter and I organised in 2008. The next event in 2009 was named ‘Hack the Government’ and organised by James and several others who had attended in 2008. It was registered as a non-profit and from 2011 became the Open State Foundation, now a team of eight people still doing impactful work on making Dutch government more transparent. I’ve been the chair of their board for the last 5 years, which is a privilege.
  • Yet the most meaningful results concern people, changes they’ve made, and the shift in attitude they bring to public sector organisations. When you see a light go on in the eyes of someone during a presentation or conversation. Mostly you never learn what happens next. Sometimes you do. Handing out a few free beers (‘Data Drinks’) in Copenhagen making someone say ‘you’re doing more for Danish open data in a month by bringing everyone together than we did in the past years’. An Eastern European national expert seconded to the EC on open data telling me he ultimately came to this job because as a student he heard me speak once at his university and decided he wanted to be involved in the topic. An Irish civil servant who asked me in 2012 about examples I presented of collaboratively making public services with citizens, and at the end of 2019 messaged me it had led to the crowd sourced mapping of Lesotho in Open Street Map over five years to assist the Lesotho Land Registry and Planning Authority in getting good quality maps (embed of paywalled paper on LinkedIn). Someone picking up the phone in support, because I similarly picked up the phone 9 years earlier. None of that is directly a result of my work, it is fully the result of the work of those people themselves. Nothing is ever just one person, it’s always a network. One’s influence is in sustaining and sharing with that network. I happened to be there at some point, in a conversation, in a chance encounter, from which someone took some inspiration. Just as I took some inspiration from a chance encounter in 2008 myself. To me it’s the very best kind of impact when it comes to achieving change.

I’ve plotted most of the things mentioned above in this image, as part of trying to map the evolution of my work, inspired by another type of chance encounter with a mind map on the wall of a museum.


The evolution of my open data (net)work.