Last week I presented to a provincial procurement team about how to better support open data efforts. Below is what I presented and discussed.

Open data as policy instrument and the legal framework demands better procurement

Publishing open data creates new activity. It does so in two ways. It allows existing stakeholders to do more themselves or do things differently. It also allows people who could not participate before become active as well. We’ve seen for instance how opening up provincial and national geographic data increases the independent usage of that data by local governments. We’ve also seen how for instance the Dutch hiking association started using national geographic data to create and better document routes. To the surprise of the Cadastre a whole new area of usage appeared as well, by cultural organisations who before had never requested such data. So open data is an enabler for agency.

If as a government data holder you know this effect takes place, you can also try and achieve it deliberately. For policy domains and groups of stakeholders where you would like to see more activity, publishing data then is an instrument in for instance achieving your own policy goals. Next to regulation and financing, publishing open data is a new third policy instrument. It also happens to be the cheapest of those three to deploy.

Open data in the EU has a legal framework where over time more things are mandated. There is a right to re-use. Upon request dataholders must be able to provide machine readable data formats. In the Netherlands open standards are compulsory for government entities since 2008. Exclusive access to government data for re-use is, except for a few very strictly regulated situations, illegal.

To be able to comply with the legal framework, and to be able to actively use open data as a policy instrument, public sector bodies must pay more attention to how they acquire data, and as a consequence must pay more attention to what happens during procurement processes. If you don’t the government entity’s data sovereignty is strongly diminished, which carries costs.

Procurement awareness needed on multiple levels

The goal is to ensure full data sovereignty. This means paying real attention to various things on different levels of abstraction around procurement.

  • Ensuring data is received in open standards and regular domain specific standards
  • Ensure when reports are received that the data used, such as for graphs and tables, are also received
  • Ensure when information products are received (maps, visualisations) the data used for them are also received
  • Ensure procurement and collaboration contracts do not preclude sharing data with third parties, apart from on grounds already mentioned as exceptions in the law on freedom of information and re-use
  • Ensure that when raw data is provided to service providers, that data is still available to the government entity
  • Ensure that when data is collected by external entities who in turn outsource the collection, all parties involved know the data falls under the decision making power of the government entity
  • Ensure in collaborations you do not sign away decision power over the data you contribute, you have rights to the data you collectively create, and have as little restriction as possible on the data others contribute.

What could go wrong?

Unless you always pay attention to these points, you run the risk of losing your data sovereignty. This can lead to situations where a government entity is no longer able to comply with its own legal obligations concerning data provision and transparency.

A few existing examples from what can go wrong.

  • A province is counting bicycle traffic through a network of sensors they deployed themselves. The data is directly transmitted to a service provider in a different country. The province can see dashboards and download reports, but has no access to the sensor data itself, and cannot download the sensor data. While any citizen requesting the data could not be provided with that data, the service provider itself does base commercial services on that and other data it receives, having de facto exclusive access to it.
  • Another province is outsourcing bird inventory counting to nature preservation organisations, who in turn rely on volunteers to do the bird watching. The province pays for the effort. When it comes to sharing the data publicly, the nature preservation organisations say their volunteers actually own the data, so nothing can be publicly shared. This is untrue for multiple reasons (database rights do not apply, it is a paid for effort so procurement terms that unequivocally transfer such rights should they exist to the province etc), but as the province doesn’t want to waste time on this, nor wants to get into a fight, it leaves it be, resulting in the data not being made available.
  • An energy network provider pools a lot of different data sources concerning energy usage in their service area from a network of collaborating entities, both private and public. They also publish a lot of open data already. As part of the national effort towards energy transition they receive many data requests from local governments, housing associations and other entities. They would like to provide data, as they see it as a way of contributing to an essential public task (energy transition), but still say no to data requests in 60% of all cases. Because they can’t figure out which contractual obligations apply to which parts of the data, or cannot reconcile conflicting or ambiguous contract clauses concerning the data.
  • All provinces pool data concerning economic activity and the labor market in a private foundation in which also private entities participate. That foundation sells data subscriptions. Currently they also publish some open data, but if any of the provinces would like to do more, they would have to wait for full agreement. The slowest in the group would determine the actual level of transparency.
  • A province has outsourced the creation of a ‘heat transition atlas’, in which the potential for moving away from natural gas burning heating systems in homes using various alternatives is mapped. The resulting interactive website contains different data layers, but those data layers are themselves unavailable. Although there is a general list of which data sources have been used, it is not precisely stating its sources and not providing details on how the data has been transformed for the website.

In all cases the public sector data holder has put itself in a position that could have been prevented had they paid more attention at the time of procurement or at the time of entering into collaboration. All these situations can be fixed later on, but they require additional effort, time and costs to arrange, which are unnecessary if dealt with during procurement.

But we have procurement regulations already!

What about procurement regulations. We have those, so don’t they cover all this? Mostly not it turns out.

Terms of procurement talk about rights transfer of all deliverables, but in many cases the data involved isn’t listed as a deliverable, so not covered by those terms.
The terms talk about transfer of database rights, but those hardly ever apply as usually the scale of data collection and structuring into a database is limited.
Concerning research there is some talk about also transferring the data concerned, but a lot of reports aren’t research but consultancy services.

In the general regulations that apply to provincial procurement, the word data only is used in the context of personal data protection, as the dutch plural for date, and in the context of data carriers (hard drives etc). The word standards never occurs, nor does it contain references to data formats (even though legal obligations exist for government entities concerning standards and data formats)

The procurement terms are neither broad enough, nor detailed enough.

How to improve the situation

So what needs to be arranged to ensure government entities arrange their data needs correctly during procurement? How to plug the holes? A few things at the very least:

  • Likely, when it comes to standards and formats (which may differ per domain), the only viable place is in the mandatory technical requirements in a call for tender / request for proposals.
  • To get the data behind graphs, tables, info products and reports, including a list of resources and transformations applied, it needs to be specified in the list of deliverables.
  • Collaboration contracts entered into should always have articles on sharing the data you contribute, being able to share the data resulting from the collaboration, and rules about data that others contribute.

It is important to realise that you cannot through contracts do away with any mandatory transparency, open data, or data governance aspects. Any resulting issues will mean time consuming and likely costly repair activities.

Who needs to be involved

In order to prevent the costs of repair or mitigation of consequences, there are a number of questions concerning who should be doing what, inside a government entity.

  • What needs to be arranged at the point of tender, who will check it?
  • What needs to be part of all project starts (e.g. Checklists, data paragraphs), is the project manager aware of this, and who will check it?
  • Who at the writing and signing of any contract will check data aspects?
  • Who at the time of delivery will check if data requirements are met?
  • What part of this is more about awareness and operatios, what needs to be done through regulation?

Our work in the next steps

We intend to assist the province involved in making sure procurement better enables data sharing from now on. Steps we are currently taking to move this forward are:

  • We’ve put data sovereignty into the organisations strategy document, and tied it into overall data governance improvement.
  • With the information management department we’ll visit all main procurers to discuss and propose actions
  • We’ll likely build one or more checklists for different aspects
  • We’ll work with a 3 person team from the procurement department to more deeply embed data awareness and amend procurement processes

All this is basically a preventative step to ensure the province has its house in order concerning data.

Today I was at a session at the Ministry for Interior Affairs in The Hague on the GDPR, organised by the center of expertise on open government.
It made me realise how I actually approach the GDPR, and how I see all the overblown reactions to it, like sending all of us a heap of mail to re-request consent where none’s needed, or taking your website or personal blog even offline. I find I approach the GDPR like I approach a quality assurance (QA) system.

One key change with the GDPR is that organisations can now be audited concerning their preventive data protection measures, which of course already mimics QA. (Next to that the GDPR is mostly an incremental change to the previous law, except for the people described by your data having articulated rights that apply globally, and having a new set of teeth in the form of substantial penalties.)

AVG mindmap
My colleague Paul facilitated the session and showed this mindmap of GDPR aspects. I think it misses the more future oriented parts.

The session today had three brief presentations.

In one a student showed some results from his thesis research on the implementation of the GDPR, in which he had spoken with a lot of data protection officers or DPO’s. These are mandatory roles for all public sector bodies, and also mandatory for some specific types of data processing companies. One of the surprising outcomes is that some of these DPO’s saw themselves, and were seen as, ‘outposts’ of the data protection authority, in other words seen as enforcers or even potentially as moles. This is not conducive to a DPO fulfilling the part of its role in raising awareness of and sensitivity to data protection issues. This strongly reminded me of when 20 years ago I was involved in creating a QA system from scratch for my then employer. Some of my colleagues saw the role of the quality assurance manager as policing their work. It took effort to show how we were not building a straightjacket around them that kept them within strict boundaries, but providing a solid skeleton to grow on, and move faster. Where audits are not hunts for breaches of compliance but a way to make emergent changes in the way people worked visible, and incorporate professionally justified ones in that skeleton.

In another presentation a civil servant of the Ministry involved in creating a register of all person related data being processed. What stood out most for me was the (rightly) pragmatic approach they took with describing current practices and data collections inside the organisation. This is a key element of QA as well. You work from descriptions of what happens, and not at what ’should’ happen or ‘ideally’ happens. QA is a practice rooted in pragmatism, where once that practice is described and agreed it will be audited.
Of course in the case of the Ministry it helps that they only have tasks mandated by law, and therefore the grounds for processing are clear by default, and if not the data should not be collected. This reduces the range of potential grey areas. Similarly for security measures, they already need to adhere to national security guidelines (called the national baseline information security), which likewise helps with avoiding new measures, proves compliance for them, and provides an auditable security requirement to go with it. This no doubt helped them to be able to take that pragmatic approach. Pragmatism is at the core of QA as well, it takes its cues from what is really happening in the organisation, what the professionals are really doing.

A third one dealt with open standards for both processes and technologies by the national Forum for Standardisation. Since 2008 a growing list of currently some 40 or so standards is mandatory for Dutch public sector bodies. In this list of standards you find a range of elements that are ready made to help with GDPR compliance. In terms of support for the rights of those described by the data, such as the right to export and portability for instance, or in terms of preventive technological security measures, and ‘by design’ data protection measures. Some of these are ISO norms themselves, or, as the mentioned national baseline information security, a compliant derivative of such ISO norms.

These elements, the ‘police’ vs ‘counsel’ perspective on the rol of a DPO, the pragmatism that needs to underpin actions, and the building blocks readily to be found elsewhere in your own practice already based on QA principles, made me realise and better articulate how I’ve been viewing the GDPR all along. As a quality assurance system for data protection.

With a quality assurance system you can still famously produce concrete swimming vests, but it will be at least done consistently. Likewise with GDPR you will still be able to do all kinds of things with data. Big Data and developing machine learning systems are hard but hopefully worthwile to do. With GDPR it will just be hard in a slightly different way, but it will also be helped by establishing some baselines and testing core assumptions. While making your purposes and ways of working available for scrutiny. Introducing QA upon its introduction does not change the way an organisation works, unless it really doesn’t have its house in order. Likewise the GDPR won’t change your organisation much if you have your house in order either.

From the QA perspective on GDPR, it is perfectly clear why it has a moving baseline (through its ‘by design’ and ‘state of the art’ requirements). From the QA perspective on GDPR it is perfectly clear what the connection is to how Europe is positioning itself geopolitically in the race concerning AI. The policing perspective after all only leads to a luddite stance concerning AI, which is not what the EU is doing, far from it. From that it is clear how the legislator intends the thrust of GDPR. As QA really.

Funny how #datagovernance companies publishing #gdpr compliance guides aren’t compliant themselves when asking personal data for downloads: no explicit opt-ins, hidden opt-ins (such as hitting download also subscribes you to their newsletter), no specific explanations on what data will be used how, asking more personal information than necessary.