This week at the EU Open Data Days in Luxembourg, Davide Taibi, a senior researcher at the Institute for Educational Technology of the National Research Council of Italy, talked about his research into a possible European curriculum for data literacy.

He mentioned how, in the highly multilingual context of Europe, data literacy is an unclear term. In German, data literacy translates to data competence, while literacy itself translates to alphabetisation. Other terms, like information literacy and data science, are used more commonly across countries.

On one of his slides he wrote:

The term data literacy isn’t well known in most of the countries analysed. The most widely used terms are ‘digital literacy’, ‘information literacy’, ‘data competence’, ‘media literacy’, ‘statistical literacy’, ‘computer/IT literacy’, among others. In most countries it is closely related to digital skills.

I usually use Howard Rheingold’s shorthand for literacy as skills plus community. Skills benefit individuals, but for some skills, when you add the context of a community or network of skilled people in which that skill gets deployed, the value of usage sees a nonlinear effect, basically a kind of network effect. That communal aspect, and the jump in usage value, is connected to my notion of networked agency. It works as a multiplier.

Looping back to the lack of clarity around data literacy as a term, I wonder.
Is it because we haven’t yet described clearly enough which _skills_ we mean when talking about data literacy?
Or is it because we don’t really know which communities would see which nonlinear use value when deploying the data skills concerned?

The period of the European Commission that has just finished delivered an ambitious and coherent legal framework for both the digital single market and the single market for data, based on the digital and data strategies the EU formulated. Those laws, such as the Data Governance Act, Data Act, High Value Data implementing regulation and the AI Act, are all finished and in force (if not always fully in application). This means efforts are now switching to implementation. The detailed programme of the next European Commission, now being formed, isn’t known yet. Big new legislative efforts in this area are, however, not expected.

This summer Ursula von der Leyen, the incoming president of the Commission, presented the political guidelines. In them you can find what the EC will pay attention to in the coming years in the field of data and digitisation.

Data and digital are geopolitical in nature
The guidelines underline the geopolitical nature of both digitisation and data. The EU will therefore seek to modernise and strengthen international institutions and processes. It is noted that outside influence in regular policy domains has become a more common instrument in geopolitics. Data and transparency are likely tools to keep a level-headed view of what’s really going on. Data is also crucial in driving several technology developments, such as AI and digital twins.

European Climate Adaptation Plan Built on Data
The EU will increase its focus on mapping risks and preparedness w.r.t. natural disasters and their impact on infrastructure, energy, food security, water and land use, both in cities and in rural areas, as well as on early warning systems. This is sure to contain a large data component, a role for the Green Deal Data Space (for which the implementation phase will start soon, now that the preparatory phase has been completed) and the climate change digital twin of the earth (DestinE, for which the first phase has been delivered). Climate and environment are the areas where the EC had already emphasised the close connection between digitisation and data and the ability to achieve European climate and environmental goals.

AI trained with data
Garbage in, garbage out: access to enough high quality data is crucial to all AI development, and therefore data will play a role in all AI plans from the Commission.

An Apply AI Strategy was announced, aimed at sectoral AI applications (in industry, public services or healthcare e.g.). The direction here is towards smaller models, squarely aimed at specific questions or tasks, in the context of specific sectors. This requires the availability of, and responsible access to, data in these sectors, in which the European common data spaces will play a key role.

In the first half of 2025 an AI Factories Initiative will be launched. This is meant to provide SMEs and start-ups with access to the computing power of the European supercomputing network, for AI applications.

There will also be a European AI Research Council, dubbed a ‘CERN for AI’, in which knowledge, resources, money, people, and data will be pooled.

Focus on implementing data regulations
To make the above possible, a coherent and consistent implementation of the existing data rules from the previous Commission period is crucial. Useful explanations and translations of the rules for companies and public sector bodies are needed, to allow for seamless data usage across Europe and at scale. All this within the rules for data protection and information security that equally apply. The directorate within the Commission that is responsible for data, DG Connect, sees its task for the coming years as mainly being to ensure the consistent implementation of the new laws from the last few years. The implementation of the GDPR until 2018 is seen as an example where such consistency was lacking.

European Data Union
The political guidelines announce a strategy for a European Data Union. Aimed at better and more detailed explanations of the existing regulations, and above all the actual availability and usage of data, it reinforces the measure of success the data strategy already used: the socio-economic impact of data usage. This means involving SMEs in much larger numbers, and in this context the difference between such SMEs and large data users outside of the EU is also specifically mentioned. This Data Union is a new label for, and a new emphasis on, what the European Data Strategy already seeks to do: the creation of a single market for data, meaning freedom of movement for people, goods, capital and data. That Data Strategy forms a consistent whole with the digital strategy of which the Digital Markets Act, Digital Services Act and AI Act are part. That coherence will be maintained.

My work: ensuring that implementation and normalisation is informed by good practice
In 2020 I helped write what is now the High Value Data implementing regulation, and in the past years my role has been tracking and explaining the many EU digital and data regulatory initiatives on behalf of the main Dutch government holders of geo-data. Not just in terms of new requirements, but with an accent on the new instruments and affordances those rules create. The new instruments allow new agency for different stakeholder groups, and new opportunities for societal impact come from them.
The phase shift from regulation to implementation provides an opportunity to influence how the new rules get applied in practice, for instance in the common European data spaces. Which compelling cases of data use can have an impact on the implementation process, can help set the tone or even have a normalisation effect? I’m certain practice can play a role like this, but it takes bringing those practical experiences to a wider European network. Good examples help keep the actual goal of socio-economic impact in sight, and mean you can argue from tangible experience in your interactions.

My work for Geonovum in the coming period is aimed at this phase shift. I already helped them take on a role in the coming implementation of the Green Deal Data Space, and I’m now exploring other related efforts. I’m also assisting the Ministry of the Interior in formulating guidance for public sector bodies and data users on how to deal with the chapter of the Data Governance Act that allows for the use (but not the sharing) of protected data held by the public sector. Personally I’m also seeking ways to increase the involvement of civil society organisations in this area.

June is a good month for open data this year.

First, last week, on Tuesday 4 June, the Dutch Senate (Eerste Kamer) approved the law that implements the European Open Data Directive in the Dutch Wet Hergebruik Overheidsinformatie (the law on the re-use of government information). Although the law has not yet been published and is therefore not yet in force, this brings an end to three years of delay. The law should have taken effect in July 2021: the European directive entered into force in July 2019 and gave Member States two years to transpose it into national law.

Second, last Sunday, 9 June, the obligation took effect for public sector bodies to actively publish, through APIs, important data on six themes. That European regulation was adopted at the end of 2022, entered into force in early February 2023, and gave public sector bodies 16 months, i.e. until Sunday, to comply. Member States must deliver their first report on implementation in February 2025, so I assume many countries will still use that period to meet the obligations. But a start has been made. In the Netherlands the impact of this High Value Data regulation is relatively small, because most of the data it covers was already open here. At the same time, that was not always the case in other EU countries. So now you can compile datasets with Europe-wide coverage.

ODRL, the Open Digital Rights Language, popped up twice this week for me and I don’t think I was aware of it before. Some notes for me to start exploring.

Rights Expression Languages

Rights Expression Languages, RELs, provide a machine readable way to convey or transfer usage conditions, rights, restraints, granularly w.r.t. both actions and actors. This can then be added as metadata to something. ODRL is a rights expression language, and seems to be a de facto standard.

ODRL has been a W3C Recommendation since 2018, and is thus part of the open web standards. ODRL has its roots in the ’00s and Digital Rights Management (DRM): the abhorred protections media companies added to music and movies, and now e-books, in ways that restrain what people can do with media they bought to well below the level of what was possible before and commonly thought part of having bought something.

ODRL can be expressed in JSON or RDF and XML. A basic example from Wikipedia looks like this:


{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "uid": "http://example.com/policy:001",
  "permission": [{
    "target": "http://example.com/mysong.mp3",
    "assignee": "John Doe",
    "action": "play"
  }]
}

In this JSON example a policy describes that example.com grants John permission to play mysong.
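
As a note to self, a minimal sketch of how such a policy could be read programmatically. It treats the policy as plain JSON and only looks at the three keys used in the example above; the helper name `describe_permissions` is my own invention, not part of any ODRL library, and a real implementation would presumably process the JSON-LD context rather than ignore it.

```python
import json

# The example policy from above, embedded as a string so the sketch is self-contained.
POLICY_JSON = """
{
  "@context": "http://www.w3.org/ns/odrl.jsonld",
  "uid": "http://example.com/policy:001",
  "permission": [{
    "target": "http://example.com/mysong.mp3",
    "assignee": "John Doe",
    "action": "play"
  }]
}
"""


def describe_permissions(policy: dict) -> list[str]:
    """Return one readable sentence per permission rule (hypothetical helper)."""
    sentences = []
    for rule in policy.get("permission", []):
        # Assumes each rule carries exactly the keys used in the example.
        sentences.append(f"{rule['assignee']} may {rule['action']} {rule['target']}")
    return sentences


policy = json.loads(POLICY_JSON)
summary = describe_permissions(policy)
print(summary)
```

This only restates what the policy says; it does nothing to enforce it.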

ODRL in the EU Data Space

In the shaping of the EU common market for data, aka the European common data space, it is important to be able to trace provenance and usage conditions for not just data sets but individual pieces of data, as they flow through use cases, through applications and their output back into the data space.
This week I participated in a webinar by the EU Data Space Support Center (DSSC) about their first blueprint of data space building blocks, and for federation of such data spaces.

They propose ODRL as the standard to describe usage conditions throughout data spaces.

The question of enactment

It wasn’t the first time I talked about ODRL this week. I had a conversation with Pieter Colpaert. I reached out to get some input on his current view of the landscape of civic organisations active around the EU data spaces. We also touched upon his current work at Ghent University. His current research interest is ODRL, specifically enactment. ODRL is a REL, a rights expression language. Describing rights is one thing; enacting them in practice, in technology, processes etc., is a different thing. Next to that, how do you demonstrate that you adhere to the conditions expressed and that you qualify for using the things described?
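
To make the gap between expressing and enacting concrete for myself, here is a toy sketch of the simplest possible enactment step: checking a request against the permissions in a policy. Everything in it is an assumption on my part: the `Request` type, the `is_permitted` function and the deny-by-default behaviour are invented for illustration, and it ignores constraints, duties, prohibitions and everything else a real ODRL evaluator would need to handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A hypothetical request: who wants to do what with which asset."""
    assignee: str
    action: str
    target: str


# Same shape as the song example policy, written as a Python dict.
POLICY = {
    "@context": "http://www.w3.org/ns/odrl.jsonld",
    "uid": "http://example.com/policy:001",
    "permission": [{
        "target": "http://example.com/mysong.mp3",
        "assignee": "John Doe",
        "action": "play",
    }],
}


def is_permitted(policy: dict, request: Request) -> bool:
    """Return True only if some permission rule matches the request exactly.

    Assumption of this sketch: anything not explicitly permitted is denied.
    """
    for rule in policy.get("permission", []):
        if (
            rule.get("assignee") == request.assignee
            and rule.get("action") == request.action
            and rule.get("target") == request.target
        ):
            return True
    return False


song = "http://example.com/mysong.mp3"
print(is_permitted(POLICY, Request("John Doe", "play", song)))
print(is_permitted(POLICY, Request("John Doe", "copy", song)))
```

Even this toy version shows where the hard part sits: the check has to run somewhere, by someone, and the requesting party has to be identified reliably, none of which the policy itself provides.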

For the EU data space(s) this part sounds key to me, as none of the data involved is merely part of a single clear interaction like in the song example above. It’s part of a variety of flows in which actors likely don’t directly interact, where many different data elements come together. This includes flows through applications that tap into a data space for inputs and outputs but are otherwise outside of it. Such applications include digital twins, even federated systems of digital twins, meaning a confluence of many different data and conditions across multiple domains (and thus data spaces). All this removes a piece of data light years from the neat situation where two actors share it between them in a clearly described transaction within a single-faceted use case.

Expressing the commons

It’s one thing to express restrictions or usage conditions. The DSSC in their webinar talked a lot about business models around use cases, and ODRL as a means for a data source to stay in control throughout a piece of data’s life cycle. Luckily they stopped using the phrase ‘data ownership’ as they realised it’s not meaningful (and confusing on top of it), and focused instead on an actor keeping control and maintaining a say.
An open question for me is how you would express openness and the commons in ODRL. A shallow search surfaces some examples of trying to express Creative Commons or other licenses this way, but none recent.

Openness can mean an absence of certain conditions, although there may be some (like adding the same absence of conditions to re-shared material or derivative works), which is not the same as setting explicit permissions. If I e.g. dedicate something to the public domain, an image for instance, then there are no permissions for me to grant, as I’ve removed myself from that role of being able to give permission. Yet, you still want to express it to ensure that it is clear for all that that is what happened, and especially that it remains that way.

Part of that question is about the overlap and distinction between rights expressed in ODRL and authorship rights. You can obviously have many conditions outside of copyright, and can have copyright elements that may be outside of what can be expressed in RELs. I wonder how for instance moral authorship rights (that an author in some (all?) European jurisdictions cannot do away with) can be expressed after an author has transferred/sold the copyrights to something? Or maybe expressing authorship rights / copyrights is not what RELs are primarily for, as those are generic and RELs may be meant for expressing conditions around a specific asset in a specific transaction. There have been various attempts to map all kinds of licenses to RELs though, so I need to explore more.

This is relevant for the EU common data spaces as my government clients will be actors in them and will bring in open data, closed and unsharable but re-usable data, and several different shades in between. A range of new obligations and possibilities w.r.t. data use for government are created in the EU data strategy laws, and the data space is where those become actualised. This means it should be possible to express the corresponding usage conditions in ODRL.

ODRL gaps?

Are there gaps in the ODRL standard w.r.t. what it can cover? Or things that are hard to express in it?
I came across one paper, ‘A critical reflection on ODRL’ (PDF; Kebede, Sileno, Van Engers 2020), that I have yet to read and that describes some of those potential weaknesses, based on use cases in healthcare and logistics. Looking forward to digging out their specific critique.

I’ve been involved in open data for about 15 years. Back then we had a vibrant Europe-wide network of activists and civic organisations around open data, partially triggered by the first PSI Directive, which was the European legal foundation for our call for more open government data.

Since 2020 a much wider and more fundamental legal framework than the PSI Directive ever was is taking shape, with the Data Governance Act, Data Act, AI Regulation, Open Data Directive and High Value Data implementing regulation as building blocks. Together they create the EU single market for data, adding data as a fourth element to the list of freedom of movement for people, products and capital within the EU. This will all take shape as the European common dataspace(s), built from a range of sectoral dataspaces.

In the past years I’ve been actively involved in these developments, currently helping large government data holders in the Netherlands interpret the new obligations and above all new opportunities for public service that result from all this.

Now that the dataspaces are slowly taking shape, what I find missing from most discussions and events is the voice of civic organisations and activists. It’s mostly IT companies and research institutions that are involved. While for the Commission social impact (climate, health, energy and agricultural transitions e.g.) is a key element in why they seek to implement these new laws, for most parties involved in the dataspaces that is less of a consideration, and economic and technological factors are more important. Not even government data holders themselves are represented much in how the European data space will turn out. Even though every single one of us and every public entity by default is a part of this common market.

I would like to strengthen the voice of civil society and activists in this area, to together influence the shape these dataspaces are taking. So that they are of use and value to us too. To use the new (legal) tools to strengthen the commons, to increase our agency.

Most of the old European open data network has however dissolved over time, as we all got involved in national level practical projects and the European network as a source of a sense of belonging and of strengthening each other’s commitment became less important. And we’ve moved on a good number of years, so many new people have come on to the scene, unconnected to that history, with new perspectives and new capabilities.

So the question is: who is active on these topics, from a civil society perspective, as activists? Who should be involved? What are the organisations, the events, that are relevant regionally, nationally, EU wide? Can we connect those existing dots: to share experiences, examples, join our voices, pool our efforts?

Currently I’m doing a first scan of who is involved in which EU country, what type of events are visible, organisations that are active etc. Starting from my old network of a decade ago. I will share lists of what I find at Our Common Data Space.

Let me know if you count yourself as part of this European network. Let me know the relevant efforts you are aware of. Let me know which events you think bring together people likely to want to be involved.

I look forward to finding out about you!


Open Government Data Camp in Warsaw 2011. An example of the vibrancy of the European open data network; I called it the community’s ‘family Christmas party’ at the time. Above, the schedule of sessions created collectively by the participants, with many local initiatives and examples shared with the EU-wide network. Below, one of those sessions, on local policy making and open data.

Some good movement on EU data legislation this month! I’ve been keeping track of EU data and digital legislation in the past three years. In 2020 I helped determine the content of what has become the High Value Data implementing regulation (my focus was on earth observation, environmental and meteorological data), and since then for the Dutch government I’ve been involved in translating the incoming legislation to implementing steps and opportunities for Dutch government geo-data holders.

AI Act

The AI Act stipulates what types of algorithmic applications are allowed on the European market under which conditions. A few things are banned, the rest of the provisions are tied to a risk assessment. Higher risk applications carry heavier responsibilities and obligations for market entry. It’s a CE marking for these applications, with responsibilities for producers, distributors, users, and users of output of usage.
The Commission proposed the AI Act in April 2021; the Council responded with its version in December 2022.

Two weeks ago the European Parliament approved in plenary its version of the AI Act.
In my reading the EP both strengthens and weakens the original proposal. It strengthens it by restricting certain types of uses further than the original proposal, and by adding foundation models to its scope.
It also adds a definition of what is considered AI in the context of this law. This in itself is logical, as originally the proposal did not try to define that other than by listing technologies in an annex that were deemed in scope. However, while adding that definition, they removed the annex. That, I think, weakens the AI Act and will make future enforcement much slower and harder. Because now everything will depend on the interpretation of the definition, meaning it will be a key point of contention before the courts (‘my product is out of scope!’). Whereas by having both the definition and the annex, the legislator specifically states which things it considers in scope of the definition at the very least. As the annex would be periodically updated, it would also remain future proof.

With the stated positions of the Council and Parliament the trilogue can now start to negotiate the final text which then needs to be approved by both Council and Parliament again.

All in all this looks like the AI Act will be finished and in force before the end of the year, and will be applied by 2025.

Data Act

The Data Act is one of the building blocks of the EU Data Strategy (the others being the Data Governance Act, applied from September, the Open Data Directive, in force since mid 2021, and the implementing regulation High Value Data which the public sector must comply with by spring 2024). The Data Act contains several interesting proposals. One is requiring connected devices not only to allow users access to the (real time) data they create (think thermostats, solar panel transformers, sensors etc.), but also to allow users to share that data with third parties. You can think of this as ‘PSD2-for-everything’. PSD2 says that banks must enable you to share your banking data with third parties (meaning you can manage your account at Bank A with the mobile app of Bank B, can connect your book keeping software etc.). The Data Act extends this to ‘everything’ that is connected. Another interesting component is that it allows public sector bodies in case of emergencies (floods e.g.) to require certain data from private sector parties, across borders. The Dutch government heavily opposed this, so I am interested in seeing what the final formulation of this part is in the Act. Other provisions make it easier for people to switch platform services (e.g. cloud providers), and create space for the European Commission to set, let develop, adopt or mandate certain data standards across sectors. That last element is of relevance to the shaping of the single market for data, aka the European common data space(s), and here too I look forward to reading the final formulation.

With the Council of the European Union and the European Parliament having reached a common text, what remains is final approval by both bodies. This should be concluded under the Spanish presidency that starts this weekend, and the Data Act will then enter into force sometime this fall, with a grace period of some 18 months or so until sometime in 2025.

There’s more this month: ITS Directive

The Intelligent Transport Systems Directive (ITS Directive) was originally created in 2010, to ensure data availability about traffic conditions etc. for e.g. (multi-modal) planning purposes. In the Netherlands for instance real-time information about traffic intensity is available in this context. The Commission proposed to revise the ITS Directive late 2021 to take into account technological developments and things like automated mobility and on-demand mobility systems. This month the Council and European Parliament agreed a common text on the new ITS Directive. I look forward to close reading the final text, also on its connections to the Data Act above, and its potential in the context of the European mobility data space. Between the Data Act and the ITS Directive I’m also interested in the position of in-car data. Our cars are increasingly mobile sensor platforms, to which the owner/driver has little to no access, which should change imo.