As we are living in a networked world, government bodies increasingly execute their tasks in collaboration with networks of various other stakeholders. This also happens when it comes to collecting, providing or working with data as part of public tasks. One of the potential detrimental side effects is that it quickly becomes unclear who can decide to open such data up, or whether a government entity that wants to publish data as part of a policy intervention still feels able to do so. This ability to decide over your own data, I call data sovereignty. I think that without proper attention, the data sovereignty of public institutions comes under pressure in collaborative situations, and that this erosion is a threat to the freedom of public entities to decide and act on their own open data efforts. This is especially problematic where the lack of data sovereignty hinders public entities in deploying open data as a policy instrument.
I have just completed an inventory of the data sets that a Dutch province holds, and the visible erosion of data sovereignty was the main unexpected outcome for me.
This erosion takes different shapes. Here are a few examples of it, encountered in the Province I mentioned:
- Data collection on business locations and the number of people they employ (to track employment per municipality per sector) is being pooled by all provinces (as a national level data set is more useful). The pooling takes place in a separate legal entity. It is unclear if this entity still falls under FOIA and re-use regulations. This entity also exploits the data by selling it. That is logical at the organisational level perhaps, but illogical in comparison with the provincial public task (and maybe not even legal under the Re-Use law). Opening up the data needs to be done through that new entity, meaning not just convincing yourself, but all other provinces as well as the entity, which has a commercial interest in not being convinced. The slowest will thus set the speed.
- Data on traffic flows, collected by the Province, is stored directly in a national data warehouse (NDW). Again pooling data makes it more useful, but the Province cannot store cleaned data there (anomalies filtered out, pattern changes explained etc.), so it always needs to redo that cleaning and filtering whenever it wants to work with or access its own data. Although the publicly owned NDW now publishes open data, until recently they saw themselves as a commercial outfit, averse to the notion of open data.
- Data on bicycle traffic, collected by the Province, is stored in the online database of a French service provider active in the entire EU. Ownership of the data is unclear. The Province only accesses the data through the French website. If a FOIA request came in, it would be unclear whether providing the data runs counter to any rights the service provider is claiming.
- Data on the prevalence of bird species is being collected in collaboration with nature preservation groups and large numbers of volunteers. The Province pays for the data collection, but the nature preservation groups claim their volunteers (by virtue of their voluntary efforts) are the rightful owners of the data. Without anyone seeking internal legal advice, the discussion remains unresolved and stalls.
None of these situations is unsolvable; all of them can get a definitive answer. The issue however is that nobody is clearly in a position, or has the explicit role, to make sure such a definitive answer gets formulated. Because of that, uncertainties remain, which easily lead to inaction. If and when the Province wants to act to open data up, it therefore easily runs into all kinds of questions that will slow action down, or ensure action does not get taken.
It is entirely logical that public entities are collaborating in networks with other public entities and domain-specific stakeholders for the collection, dissemination and use of data. It is also certain, given our networked society and the drive for efficiency, that the number of situations where such collaboration takes place will only rise. However, for the drive towards more openness it is detrimental when ownership of public data becomes unclear, gets transferred to an entity that potentially falls outside the scope of FOIA, or falls under the rights of a private entity, just because nobody sought to clarify such matters at the outset.
Public entities should learn to strongly guard their data sovereignty if they want to maintain their own agency in using open data as a policy instrument. Moving to open by design as a default for the public sector requires stopping the erosion of data sovereignty.