Data Sovereignty as Prerequisite for Open Data Agency

In our networked world, government bodies increasingly carry out their tasks in collaboration with networks of other stakeholders. This also happens when data is collected, provided or worked with as part of public tasks. One potentially detrimental side effect is that it quickly becomes unclear who can decide to open such data up, or whether a government entity that wants to publish data as part of a policy intervention still feels able to do so. This ability to decide over your own data is what I call data sovereignty. I think that, without proper attention, the data sovereignty of public institutions comes under pressure in collaborative situations, and that this erosion threatens the freedom of public entities to decide and act on their own open data efforts. This is especially problematic where the lack of data sovereignty hinders public entities in deploying open data as a policy instrument.

I have just completed an inventory of the data sets that a Dutch province holds and the visible erosion of data sovereignty was the main unexpected outcome for me.
This erosion takes different shapes. Here are a few examples of it, encountered in the Province I mentioned:

  • Data on business locations and the number of people they employ (collected to track employment per municipality per sector) is being pooled by all provinces (as a national level data set is more useful). The pooling takes place in a separate legal entity. It is unclear if this entity still falls under FOIA and re-use regulations. This entity also exploits the data by selling it. Logical at the organisational level perhaps, but illogical in comparison with the provincial public task (and maybe not even legal under the Re-Use law). Opening up the data needs to be done through that new entity, which means convincing not just yourself, but also all other provinces as well as the entity itself, which has a commercial interest in not being convinced. The slowest will thus set the speed.
  • Data on traffic flows, collected by the Province, is stored directly in a national data warehouse (NDW). Again, pooling data makes it more useful, but the Province cannot store cleaned data there (anomalies filtered out, pattern changes explained etc.), so it always needs to redo that cleaning and filtering whenever it wants to work with or access its own data. Although the publicly owned NDW now publishes open data, until recently it saw itself as a commercial outfit, averse to the notion of open data.
  • Data on bicycle traffic, collected by the Province, is stored in the online database of a French service provider active in the entire EU. Ownership of the data is unclear. The Province only accesses the data through the French website. If a FOIA request came in, it would be unclear whether providing the data would run counter to any rights the service provider is claiming.
  • Data on the prevalence of bird species is being collected in collaboration with nature preservation groups and large numbers of volunteers. The Province pays for the data collection, but the nature preservation groups claim their volunteers (by virtue of their voluntary efforts) are the rightful owners of the data. As long as no internal legal advice is sought, the discussion remains unresolved and stalls.

None of these situations is unsolvable; all of them can get a definitive answer. The issue, however, is that nobody is clearly in a position, or has the explicit role, to make sure such a definitive answer gets formulated. Because of that, uncertainties remain, which easily lead to inaction. If and when the Province wants to act to open data up, it therefore easily runs into all kinds of questions that will slow action down, or ensure action does not get taken.

It is entirely logical that public entities collaborate in networks with other public entities and domain-specific stakeholders for the collection, dissemination and use of data. It is also certain, given our networked society and the drive for efficiency, that the number of situations where such collaboration takes place will only rise. However, for the drive towards more openness it is detrimental when ownership of public data becomes unclear, gets transferred to an entity that potentially falls outside the scope of FOIA, or falls under the rights of a private entity, just because nobody sought to clarify such matters at the outset.

Public entities should learn to guard their data sovereignty strongly if they want to maintain their own agency in using open data as a policy instrument. Moving to open by design as a default for the public sector requires stopping the erosion of data sovereignty.

3 thoughts on “Data Sovereignty as Prerequisite for Open Data Agency”

  1. bart rosseau

    Very interesting read. “Erosion of data Sovereignty” is a great way of describing it.

    I can see a lot of similarities with my experience on a local level.
    I think that data governance between government levels/cocreation platforms and PPP constructions will be THE challenge for the coming years…

  2. Bruce

    Per your request to share my comment from twitter:

    @ton_zylstra for sure! Incl. vendors refusing to provide even us our own data, or tech company saying our data in their system is theirs :(— Bruce Haupt (@brucehaupt) June 15, 2016

    It’s definitely a critical issue, and not just in terms of open data. Governments themselves often have trouble getting access to their own data once others are contracted/delegated the role of managing it on their behalf.

  3. Baden Appleyard

    I’m glad you blogged about this, Ton. I have been wondering if this problem / issue is prevalent in Europe, because I don’t often hear it mentioned. It’s an enormous problem in Australia, where government procurement of data is prolific, and procurement contracts are diverse. I also have a wry smile when I talk to academics in research projects that are doing “a collaborative project”, where data is sourced by and from each project member, only to find that their collaboration agreement says nothing, or something unworkable, about project output data, who owns it, who has open licensing responsibility for it, if copy or other rights in the nested data have been resolved, etc etc. I often find that the “collaboration” is not, at least on their agreement, a collaboration at all.
