I expect that for whatever she finds to be missing in UK public sector data, similar or matching examples can be found in other countries, such as here in the Netherlands.
One such Dutch example is the election results per candidate per polling station. The election council (Kiesraad) that certifies election results only needs the aggregated results per municipality, and that is what it keeps track of. Local governments of course have the detailed data immediately after counting the votes, but after providing the aggregates to the Kiesraad their role is finished.
The Open State Foundation (disclosure: I’m its current chairman of the board) has in recent years worked towards ensuring that results per polling station are available as open data. In the recent provincial and water authority elections the Minister for the Interior called upon municipalities to publish these results as machine-readable data. About 25% complied; the Open State Foundation, in collaboration with national media, requested the remaining data files to arrive at a complete data set. This way, for the first time, this data now exists as a national data set, available to the public.
I’ve worked on opening data, mainly with governments worldwide, for the past decade. For the past two years I’ve been living in Amersfoort, and since then I’ve been a participant in the Measure Your City network, with a sensor kit. I also run a LoRaWAN gateway to provide additional infrastructure to people wanting to collect sensor data. Today I’d like to talk to you about using open data: what it is, what exists, where to find it, and how to get it. Because I think it can be a useful resource in citizen science.
What is open data? It is data that is published by whoever collected it in such a way that anyone is permitted to use it, without any legal, technical or financial barriers.
This means an open license, such as Creative Commons 0, open standards, and machine readable formats.
Anyone can publish open data, simply by making it available on the internet. And plenty of people, academics, and companies do. But mostly, open data means we’re looking at government for data.
That’s because we all have a claim on our government; we are all stakeholders. We have already paid for the data as well, so it’s all sunk costs, while making it available to all as infrastructure adds little extra cost. And above all: governments have many different tasks, and therefore lots of different data, usually spanning many years and of relatively good quality.
The legal framework for open data consists of two parts. The national access to information rules, in NL the WOB, which says everything government has is public, unless it is not.
And the EU initiated regulation on re-using, not just accessing, government material. That says everything that is public can be re-used, unless it can’t. Both these elements are passive: you need to request material.
A new law, the WOO, makes publication mandatory for more things. (For some parts publication is already mandated in laws, like in the WOB, the Cadastre law, and the Company Register)
Next to that there are other elements that play a role. Environmental data must be public (Aarhus Convention), and INSPIRE makes it mandatory for all EU members to publish certain geographic data. A new EU directive is in the works, making it mandatory for more organisations to publish data, and for some key data sets to be free of charge (like the company register and meteo data).
Next to the legal framework there are active Dutch policies towards more open data: the Data Agenda and the Open Government action plan.
The reason open data is important is because it allows people to do new things, and more importantly it allows new people, who did not have that access before, to do new things. It democratises data sources, that were previously only available to a select few, often those big enough to be able to pay for access. This has now been a growing movement for 10-15 years.
That new agency has visible effects, economically and socially. In fact you probably already use open data on a daily basis without noticing. When you came here today by bike, you probably checked Buienradar, which is based on the open data of the KNMI. Whenever in Wikipedia you find additional facts in the right hand column, that information doesn’t come from Wikipedia but is often taken directly from government databases. The same is true for a lot of the images in Wikipedia, of monuments, historic events etc. They usually come from the open collections of national archives and similar institutions.
When Google presents you with traffic density, like here the queues in front of the traffic lights on my way here, it’s not Google’s data. It’s government data, provided in near real-time from all the sensors in the roads. Google just taps into it, and anyone could do the same. You could do the same.
There are many big and small data sets that can be used for a new specific purpose. Like when you go to get gas for the car. You may have noticed at manned stations it takes a few seconds for the gas pump to start? That’s because they check your license plate against the make of the car, in the RDW’s open database. Or for small practical issues. Like when looking for a new house, how much sunshine does the garden get. Or can I wear shorts today (No!).
But more importantly for today’s discussion, it can be a powerful tool for citizen scientists as well. Such as in the public discussion about the Groningen earthquakes: open seismological data from the KNMI allowed citizens to show that their intuition, that the strength and frequency of quakes were increasing, was real. Or you can use it to explore the impact of certain things or policies, like analysing the usage statistics of the Utrecht bicycle parking locations. A key role open data can play is to provide context for your own questions. Core registers serve as infrastructure; key datasets on policy domains can be the source for your analysis, or just a context or reference.
Here is a range of examples. The AHN gives you heights of everything, buildings, landscape etc.
But it also allows you to track the growth of trees, or estimate if your roof is suitable for solar panels. This, in combination with the BAG and the TOP10NL, makes the 3D image I started with possible: it is not a photograph but an image constructed from multiple data sources.
The Sentinel satellites provide you with free high resolution data. Useful for icebreakers at sea, precision agriculture, forest management globally, flooding prevention, health of plants, and even to see if grasslands have been damaged by feeding geese or mice. Gas mains maintainer Stedin uses this to plan preventative maintenance on the grid, by looking for soil subsidence. Same is true for dams, dikes and railroads. And that goes for many other subjects. The data is all there. Use it to your advantage. To map your measurements, to provide additional proof or context, to formulate better questions or hypotheses.
It can be used to build tools that create more insight. Here decision making documents are tied to locations: 38 Amersfoort council issues are tied to De Koppel, the area we are in now.
Maybe the data you need isn’t public yet. But it might be. So request it. It’s your right. Think about what data you need or might be useful to you.
Be public about your data requests. Maybe we can form a Koppelting Data Team. Working with data can be hard and disappointing; doing it together goes some way to mitigate that.
[This post was created using a small hack to export the speaking notes from my slidedeck. Strangely enough, Keynote itself does not have such an option. Copying by hand takes time, by script it is just a single click. It took less than 10 minutes to clean up my notes a little bit, and then post the entire thing.]
The European Commission proposed a new PSI Directive, which describes when and how publicly held data can be re-used by anyone (aka open government data). The proposal contains several highly interesting elements: it extends the scope to public undertakings (mostly utilities and transport) and research data, it limits the ways in which government can charge for data, introduces a high value data list which must be freely and openly available, mandates APIs, and makes de-facto exclusive arrangements transparent. It also calls for delegated powers for the EC to change practical details of the Directive in future, which opens interesting possibilities. In the coming months (years) it remains to be seen what the Member States and the European Parliament will do to weaken or strengthen this proposal.
Changes in the PSI Directive announced
On 25 April, the European Commission announced new measures to stimulate the European data economy, said to be building on the GDPR, as well as detailing the European framework for the free flow of non-personal data. The EC announced new guidelines for the sharing of scientific data, and for how businesses exchange data. It announced an action plan that increases safeguards on personal data related to health care and seeks to stimulate European cooperation on using this data. The EC also proposes to change the PSI Directive, which governs the re-use of public sector information, commonly known as Open Government Data. In previous months the PSI Directive was evaluated (see an evaluation report here, in which my colleague Marc and I were involved).
This post takes a closer look at what the EC proposes for the PSI Directive. (I did the same thing when the last version was published in 2013)
This is of course a first proposal from the EC, and it may change significantly as a result of discussions with the Member States and the European Parliament before it becomes finalised and enters into law. Taking a look at the proposed new directive is of interest to see what’s new, what from an open data perspective is missing, and where debate with the Member States is most likely.
The Open Data yardstick
The original PSI Directive was adopted in 2003 and a revised version implemented in 2015. Where the original PSI Directive stems from well before the emergence of the Open Data movement, and was written with mostly ‘traditional’ and existing re-users of government information in mind, the 2015 revision already adopted some elements bringing it closer to the Open Definition. With this new proposal, again the yardstick is how it increases openness and sets minimum requirements that align with the open definition, and how much of it will be mandatory for Member States. So, scope and access rights, redress, charging and licensing, standards and formats are important. There are also some general context elements that stand out from the proposal.
A floor for the data-based society
In the recital for the proposal what jumps out is a small change in wording concerning the necessity of the PSI Directive. Where it used to say “information and knowledge” it now says “the evolution towards a data-based society influences the life of every citizen”. Towards the end of the proposal it describes the Directive as a means to improve the proper functioning of the European data economy, where it used to read ‘content industry’. The proposed directive lists minimum requirements for governments to provide data in ways that enable citizens and economic activity, but suggests Member States can and should do more, and not just stick with the floor this proposal puts in place.
Novel elements: delegated acts, public undertakings, dynamic data, high value data
There are a few novel elements spread out through the proposal that are of interest, because they seem intended to make the PSI Directive more flexible with an eye to the future.
The EC proposal adds the ability to create delegated acts. This would allow practical changes without the need to revise the PSI Directive and have it transposed into national law by each Member State. While this delegated power cannot be used to change the principles in the directive, it can be used to tweak it. Concerning charging, scope, licenses and formats this would provide the EC with more elbow room than the existing ability to merely provide guidance. The article is added to be able to maintain a list of ‘high value data sets’, see below.
Public undertakings are defined and mentioned in parallel to public sector bodies in each provision. Public undertakings are all those that are (in)directly owned by government bodies, significantly financed by them, or controlled by them through regulation or decision making powers. Previously only the public sector was covered, basically allowing governments to withdraw data from the scope of the Directive by putting it at a distance in a private entity under government control. While the scope is enlarged to include public undertakings in specific sectors only, the rest of the proposal refers to public undertakings in general. This is significant I think, given the delegated powers the EC also seeks.
Dynamic and real-time data is brought firmly in scope of the Directive. There have been court cases where data provision was refused on the grounds that the data did not exist when the request was made. That will no longer be possible with this proposal.
The EC wants to make a list of ‘high value datasets’ for which more things are mandatory (machine readable, API, free of charge, open standard license). It will create the list through the mentioned delegated powers. In my experience deciding on high value data sets is problematic (what value, how high, to whom?) and reinforces a supply-side perspective over a demand-driven approach. The Commission defines high value as “being associated with important socio-economic benefits” due to their suitability for creating services, and “the number of potential beneficiaries” of those services based on these data sets.
Access rights and scope
Public undertakings in specific sectors are declared within scope. These sectors are water, gas/heat, electricity, ports and airports, postal services, water transport and air transport. These public undertakings are only within scope in the sense that requests for re-use can be submitted to them. They are under no obligation to release data.
Research data from publicly funded research that are already made available e.g. through institution repositories are within scope. Member States shall adopt national policies to make more research data available.
A previous scope extension (museums, archives, libraries and university libraries) is maintained. For educational institutions a clarification is added that it only concerns tertiary education.
The proposed directive builds as before on existing access regimes, and only deals with the re-use of accessible data. This maintains existing differences between Member States concerning right to information.
Public sector bodies, although they retain any database rights they may have, cannot use those database rights to prevent or limit re-use.
Asking for documents to re-use, and redress mechanisms if denied
The way in which citizens can ask for data, and the way government bodies can respond, has not changed.
The redress mechanisms haven’t changed, and public undertakings, educational institutes, research organisations and research funding organisations do not need to provide one.
The proposal now explicitly mentions free of charge data provision as the first option. Fees are otherwise limited to at most ‘marginal costs’.
The marginal costs are redefined to include the costs of anonymising data and protecting commercially confidential material. The full definition now reads “marginal costs incurred for their reproduction, provision and dissemination and where applicable anonymisation of personal data and measures to protect commercially confidential information.” While this likely helps in making more data available, in contrast to a blanket refusal, it also looks like externalising onto the re-user the costs of what is essentially badly implemented internal data governance. Data holders should already be able to do this quickly and effectively for internal reporting and democratic control. Marginal costing is an important principle, as in the case of digital material it would normally mean no charges apply, but this addition seems to open up the definition to much wider interpretation.
The ‘marginal costs at most’ principle only applies to the public sector. Public undertakings and museums, archives etc. are excepted.
As before public sector bodies that are required (by law) to generate revenue to cover the costs of their public task performance are excepted from the marginal costs principle. However a previous exception for other public sector bodies having requirements to charge for the re-use of specific documents is deleted.
The total revenue from allowed charges may not exceed the total actual cost of producing and disseminating the data plus a reasonable return on investment. This is unchanged, but the ‘reasonable return on investment’ is now defined as at most 5 percentage points above the ECB fixed interest rate.
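To make the cap concrete, here is a small illustrative calculation. The function and the ECB rate used are my own invention for illustration; the rule itself (revenue at most total cost plus a return capped at the ECB fixed interest rate plus 5 percentage points) is from the proposal as summarised above.

```python
# Sketch of the proposed charging cap: total revenue from charges may not
# exceed total cost plus a 'reasonable return on investment', where that
# return is capped at the ECB fixed interest rate plus 5 percentage points.
# The ECB rate below is a made-up example value, not the actual rate.

def max_allowed_revenue(total_cost: float, ecb_fixed_rate: float) -> float:
    """Upper bound on charging revenue under the proposed rule."""
    max_roi_rate = ecb_fixed_rate + 0.05  # at most 5 pp above the ECB rate
    return total_cost * (1 + max_roi_rate)

# Example: EUR 1,000,000 in production and dissemination costs,
# with a hypothetical ECB fixed rate of 0.5%.
cap = max_allowed_revenue(1_000_000, 0.005)
print(round(cap))  # 1055000
```

In other words, at near-zero interest rates the allowed margin is essentially the 5 percentage points, which is why the precise definition matters.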
Re-use of research data and the high value data sets must be free of charge. In practice various data sets that are currently charged for are also likely high value datasets (cadastral records and business registers, for instance). Here the views of Member States are most likely to clash with those of the EC.
The proposal contains no explicit move towards open licenses, and retains the existing rules that standard licenses should be available, and that those should not unnecessarily restrict re-use, nor restrict competition. The only addition is that Member States shall encourage not only public sector bodies but all data holders to use such standard licenses.
High value data sets must have a license compatible with open standard licenses.
Non-discrimination and Exclusive agreements
Non-discrimination rules in how conditions for re-use are applied, including for commercial activities by the public sector itself, are continued
Exclusive arrangements are not allowed for public undertakings, as before for the public sector, with the same existing exceptions.
Where new exclusive rights are granted, the arrangements now need to be made public at least two months before coming into force, and the final terms of the arrangement need to be transparent and public as well.
Important is that any agreement or practical arrangement with third parties that in practice results in restricted availability for re-use of data, other than for those third parties, must also be published two months in advance, with the final terms also made transparent and public. This concerns data sharing agreements and other collaborations where a few third parties have de facto exclusive access to data. With all the developments around smart cities, where companies e.g. have access to sensor data others don’t, this is a very welcome step.
Formats and standards
Public undertakings will need to adhere to the same rules as the public sector already does: open standards and machine readable formats should be used for both documents and their metadata, where easily possible, but otherwise any pre-existing format and language is acceptable.
Both public sector bodies and public undertakings should provide APIs to dynamic data, either in real time or, if that is too costly, within a timeframe that does not unduly impair the re-use potential.
High value data sets must be machine readable and available through an API.
Let’s see how the EC takes this proposal forward, and what the reactions of the Member States and the European Parliament will be.
The quality of information management in local governments is often lacking.
Things like security, openness and privacy are safeguarded by putting a separate fence for each around the organisation, but those safeguards lack detailed insight into data structures and effective corresponding processes. As archiving, security, openness and privacy in a digitised environment are basically inseparable, doing ‘everything by design’ is the only option, and the only effective way is doing everything at the level of the data itself. Fences are inefficient and ineffective, and the GDPR, through its obligations, will show how the privacy fence fails, forcing organisations to act. Doing data governance only for privacy is senseless; doing it for openness, security and archiving at the same time is logical. Having good detailed inventories of your data holdings is a useful instrument to start asking the hard questions and have meaningful conversations. It additionally allows local government to deploy open or shared data as a policy instrument, and releasing the inventory itself will help articulate civic demand for data. We’ve done a range of these inventories with local governments.
1: High time for mature data governance in local and regional government
Digitisation changes how we look at things like openness, privacy, security and archiving, as it creates new affordances now that the content and its medium have become decoupled. It creates new forms of usage, and new needs to manage those. As a result of that e.g. archivists find they now need to be involved at the very start of digital information processes, whereas earlier their work would basically start when the boxes of papers were delivered to them.
The reality is that local and regional governments have barely begun to fully embrace and leverage the affordances that digitisation provides them with. It shows in how most of them deal with information security, openness and privacy: by building three fences.
Security is mostly interpreted as keeping other people out, so a fence is put between the organisation and the outside world. Inside it nothing much is changed. Similarly a second fence is put in place for determining openness. What is open can reach the outside world, and the fence is there to do the filtering. Finally privacy is also dealt with by a fence, either around the entire organisation or a specific system, keeping unwanted eyes out. All fences are a barrier between outside and in, and within the organisation usually no further measures are taken. All three fences exist separately from each other, as stand alone fixes for their singular purpose.
The first fence: security
In the Netherlands a ‘baseline information security’ standard applies to local governments, and it determines what information should be regarded as business critical. Something is business critical if its downtime will stop public service delivery, or if its lack of quality has immediate negative consequences for decision making (e.g. decisions on benefits impacting citizens). Uptime and downtime are mostly about IT infrastructure, dependencies and service level agreements, and those fit the fence tactic quite well. Quality in the context of security is about ensuring data is tamper free, doing audits, input checks, and knowing sources. That requires a data-centric approach, and it doesn’t fit the fence-around-the-organisation tactic.
The second fence: openness
Openness of local government information is mostly at request, or at best as a process separate from regular operational routines. Yet the stated end game is that everything should be actively open by design, meaning everything that can be made public will be published the moment it is publishable. We also see that open data is becoming infrastructure in some domains. The implementation of the digitisation of the law on public spaces, requires all involved stakeholders to have the same (access to) information. Many public sector bodies, both local ones and central ones like the cadastral office, have concluded that doing that through open data is the most viable way. For both the desired end game and using open data as infrastructure the fence tactic is however very inefficient.
At the same time the data sovereignty of local governments is under threat. They increasingly collaborate in networks or outsource part of their processes. In most contracts there is no attention paid to data, other than in generic terms in the general procurement conditions. We’ve come across a variety of examples where this results 1) in governments not being able to provide data to citizens, even though by law they should be able to 2) governments not being able to access their own data, only resulting graphs and reports, or 3) the slowest partner in a network determining the speed of disclosure. In short, the fence tactic is also ineffective. A more data-centric approach is needed.
The third fence: personal data protection
Mostly privacy is dealt with by identifying privacy sensitive material (but not what, where and when), and locking it down behind the third fence. The new EU privacy regulation, the GDPR, which will be enforced from May this year, is seen as a source of uncertainty by local governments. It too is responded to in the accustomed way: reinforcing the fence by making a ‘better’ list of what personal data is used within the organisation, while still not paying much attention to processes, nor to the shape and form of the personal data.
However, in the case of the GDPR, if it is indeed really enforced, this will not be enough.
GDPR an opportunity for ‘everything by design’
The GDPR confers rights on the people described by the data, like the right to review, to portability, and to be forgotten. It also demands that compliance is done ‘by design’, and ‘state of the art’. This can only be done by design if you are able to turn the rights of the GDPR into queries on your data, and have (automated) processes in place to deal with requests. It cannot be done with a ‘better’ fence. In the case of the GDPR, the first data related law that takes the affordances of digitisation as a given, the fence tactic is set to fail spectacularly. This makes the GDPR a great opportunity to move to a data focus, not just for privacy by design, but to do openness, archiving and information security (in terms of quality) by design at the same time, as they are converging aspects of the same thing and can no longer be meaningfully separated. Detailed knowledge about your data structures is then needed.
Local governments inadvertently admit fence-tactic is failing
Governments already clearly yet indirectly admit that the fences don’t really work as a tactic.
Local governments have been loudly complaining for years about the feared costs of compliance, concerning both openness and privacy. Drilling down into those complaints reveals that the feared costs concern the time and effort involved in e.g. dealing with requests. Because there’s only a fence, and usually no processes or detailed knowledge of the data they hold, every request becomes an expedition for answers. If local governments had detailed insight into the data structures, data content, and systems in use, the cost of compliance would be zero, or at least indistinguishable from the rest of operations. Dealing with a request would be nothing more than running a query against their systems.
Complaints about compliance costs are essentially an admission that governments do not have their house in order when it comes to data.
The interviews I did with various stakeholders as part of the evaluation of the PSI Directive confirm this: the biggest obstacle stakeholders perceive to being more open and to realising impact with open data is the low quality of information systems and processes. It blocks fully leveraging the affordances digitisation brings.
Towards mature data governance, by making inventory
Changing tactics, doing away with the three fences, and focusing on having detailed knowledge of their data is needed. Combining what now are separate and disconnected activities (information security, openness, archiving and personal data protection), into ‘everything by design’. Basically it means turning all you know about your data into metadata that becomes part of your data. So that it will be easy to see which parts of a specific data set contain what type of person related data, which data fields are public, which subset is business critical, the records that have third party rights attached, or which records need to be deleted after a specific amount of time. Don’t man the fences where every check is always extra work, but let the data be able to tell exactly what is or is(n’t) possible, allowed, meant or needed. Getting there starts with making an inventory of what data a local or regional government currently holds, and describing the data in detailed operational, legal and technological terms.
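As a minimal sketch of that idea, with invented field names and deliberately simplified rules: governance metadata travels with the data itself, so that openness, privacy and retention questions become queries rather than fence checks.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Hypothetical per-field governance metadata: every data field carries its
# own openness, privacy and retention properties, so processes can ask the
# data what is allowed instead of manning a fence at the organisation's edge.

@dataclass
class FieldMeta:
    name: str
    personal_data: bool = False             # GDPR-relevant?
    public: bool = True                     # publishable as open data?
    business_critical: bool = False
    retention_until: Optional[date] = None  # delete after this date
    third_party_rights: bool = False

@dataclass
class DatasetMeta:
    title: str
    fields: List[FieldMeta] = field(default_factory=list)

    def open_fields(self) -> List[str]:
        """Fields publishable without further checks."""
        return [f.name for f in self.fields
                if f.public and not f.personal_data
                and not f.third_party_rights]

    def gdpr_fields(self) -> List[str]:
        """Fields a GDPR access or erasure request must cover."""
        return [f.name for f in self.fields if f.personal_data]

permits = DatasetMeta("Parking permits", [
    FieldMeta("permit_id"),
    FieldMeta("holder_name", personal_data=True, public=False),
    FieldMeta("zone"),
])
print(permits.open_fields())  # ['permit_id', 'zone']
print(permits.gdpr_fields())  # ['holder_name']
```

The point is not this particular structure, but that once such metadata exists, openness, archiving, security and privacy decisions all read from the same source.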
Mature digital data governance: all aspects about the data are part of the data, allowing all processes and decisions access to all relevant material in determining what’s possible.
2: Ways local government data inventories are useful
Inventories are a key first step in doing away with the ineffective fences and towards mature data governance. Inventories are also useful as an instrument for several other purposes.
Local is where you are, but not the data pros
There’s a clear reason why local governments don’t have their house in order when it comes to data.
Most of our lives are local. The streets we live on, the shopping center we frequent, the schools we attend, the spaces we park in, the quality of life in our neighbourhood, the parks we walk our dogs in, the public transport we use for our commutes. All those acts are local.
Local governments have a wide variety of tasks, reflecting the variety of our acts. They hold a corresponding variety of data, connected to all those different tasks. Yet local governments are not data professionals. Unlike singular-task, data heavy national government bodies, like the Cadastre, the Meteo institute or the department for motor vehicles, local governments usually don’t have the capacity or capability. As a result local governments mostly don’t know their own data, and don’t have established effective processes that build on that data knowledge. Inventories are a first step. Inventories point to where contracts, procurement and collaboration leads to loss of needed data sovereignty. Inventories also allow determining what, from a technology perspective, is a smooth transition path to the actively open by design end-game local governments envision.
Open data as a policy instrument
Where local governments want to use the data they have as a way to enable others to act differently or in support of policy goals, they need to know in detail which data they hold and what can be done with it. Using open data as policy instrument means creating new connections between stakeholders around a policy issue, by putting the data into play. To be able to see which data could be published to engage certain stakeholders it takes knowing what you have, what it contains, and in which shape you have it first.
Better articulated citizen demands for data
Making public a list of what you have is also important here, as it invites new demand for your data. It allows people to be aware of what data exists, and contemplate if they have a use case for it. If a data set hasn’t been published yet, its existence is discoverable, so they can request it. It also enables local government to extend the data they publish based on actual demand, not assumed demand or blindly. This increases the likelihood data will be used, and increases the socio-economic impact.
More and more new data is emerging, from sensor networks in public and private spaces. This way new stakeholders and citizens are becoming agents in the public space, where they meet up with local governments. New relationships, and new choices result. For instance the sensor in my garden measuring temperature and humidity is part of the citizen-initiated Measure your city network, but also an element in the local governments climate change adaptation policies. For local governments as regulators, as guardian of public space, as data collector, and as source of transparency, this is a rebalancing of their position. It again takes knowing what data you own and how it relates to and complements what others collect and own. Only then is a local government able to weave a network with those stakeholders that connects data into valuable agency for all involved. (We’ve built a guidance tool, in Dutch, for the role of local government with regard to sensors in public spaces)
Detailed data inventories are a way for local governments to start having the right conversations on all these points.
3: Getting to inventories
To create useful and detailed inventories, as I and my colleagues did for half a dozen local governments, a few elements are key in my view. We looked at structured data collections only, and disregarded the thousands of individual one-off spreadsheets. They are not irrelevant, but they obscure the wood for the trees. We then scored all those data sets on up to 80(!) different facets, covering policy domain, internal usage, current availability, technical details, legal aspects, concerns, etc. A key element in doing this is not making any assumptions:
don’t assume your list of applications will tell you what data you have: not all your listed apps will be in use, others won’t be on the list, and none of it tells you in detail what data is actually processed in them, just a generic pointer
don’t assume information management knows it all, as shadow information processes will exist outside of their view
don’t assume people know when you ask them how they do their work, as their description and rationalisation of what they do won’t match up with reality; have them also show you
don’t assume people know the details of the data they work with; sit down with them and look at it together
don’t assume what it says on the tin is correct, as you’ll find things that don’t belong there (we’ve found, for example, domestic abuse data in a data set on litter in public spaces)
Doing an inventory well means:
diving deeply into which applications are actually used,
talking to every unit in the organisation about their actual work and seeing it being done,
looking closely at data structures and real data content,
looking closely at current metadata and its quality
separately looking at large projects and programs as they tend to have their own information systems,
going through external communications as it may refer to internally held data not listed elsewhere,
looking at (procurement and collaboration) contracts to determine what claims others might have on data,
and then cross-referencing it all, and bringing it together in one giant list, scored on up to 80 facets.
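As a rough sketch of what such an inventory record and a first-pass screening could look like in code: the facet names, dataset names, and the `can_be_open` rule below are illustrative assumptions of mine, not the actual ~80 facets or scoring logic used in these projects.

```python
from dataclasses import dataclass, field

# Hypothetical inventory record; the facets shown are illustrative
# examples, not the real facet list used in the inventory projects.
@dataclass
class DatasetRecord:
    name: str
    policy_domain: str          # e.g. "public space", "social services"
    source_application: str     # the application actually processing the data
    contains_personal_data: bool
    legal_basis: str            # e.g. "statutory task", "contract", "restricted"
    current_availability: str   # "internal", "public", or "open data"
    machine_readable: bool
    notes: list[str] = field(default_factory=list)

def can_be_open(record: DatasetRecord) -> bool:
    """Naive first-pass screen: no personal data, no legal restriction."""
    return not record.contains_personal_data and record.legal_basis != "restricted"

# Two made-up entries, echoing examples from the text.
inventory = [
    DatasetRecord("litter reports", "public space", "CityWorks", False,
                  "statutory task", "internal", True),
    DatasetRecord("social benefits", "social services", "SocSys", True,
                  "statutory task", "internal", False),
]

openable = [r.name for r in inventory if can_be_open(r)]
print(openable)  # only the litter reports pass this naive screen
```

In practice the screening is far more nuanced than one boolean rule, which is exactly why each data set gets scored on dozens of facets rather than two.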
Another essential part, especially to ensure the resulting inventory will actually be used as an instrument, is securing from the start the involvement and buy-in of the various parts of local government that are usually islands (IT, IM, legal, policy departments, archivists, domain experts, data experts), so that the inventory becomes something the organisation uses to ask a variety of detailed questions.
We’ve followed various paths to do inventories: sometimes on our own as an external team, sometimes in close cooperation with a client team, sometimes as a guide for a client team while their operational colleagues do the actual work. All three yield very useful results, but there’s a balance to strike between consistency and accuracy, the amount of feasible buy-in, and how the hand-over is planned, so that the inventory becomes an instrument in future data discussions.
The raw numbers that come out are often counter-intuitive to local governments themselves. Some 98% of the data typically held by Dutch provinces can be public, although usually only some 20% is made public (15% as open data, usually geo-data). At the local level the numbers differ somewhat, as local governments hold much more person-related data (concerning social benefits, chronic care, and the persons register, for instance). About 67% of local data could be public, but usually only some 5% is. This means there’s still a huge gap between what can be open and what actually is open. That gap is essentially invisible if a local government deploys the three fences, and as a consequence it runs on assumptions and overestimates the amount of data that needs the heaviest protection. The gap only becomes visible by looking in depth at data on all pertinent aspects, by doing an inventory.
(Interested in doing an inventory of the data your organisations holds? Do get in touch.)
Both reports contain interesting insights and conclusions.
Both reports are also useless.
Because the data underneath the reports has not been published. Without explanation.
That is of course rather surprising because the subject of the reports is open data. At least when the topic is openness, all the related material should be open. That is why, when we built the EU PSI Scoreboard in 2011, we published all the underlying data right alongside the scoreboard. As does the Open Data Barometer. As does the Open Data Index. As does the Digital Agenda Scoreboard. But not the European Open Data Portal project. I would have expected the data under both reports by the European Open Data Portal to actually be available in the European Open Data Portal.
Missing data destroys the report’s value
Not having the data renders the report on open data maturity useless:
it makes interpretation of the conclusions impossible, as there is no way to see whether the assertions chime with the collected data, nor whether that data chimes with one’s own experience in the field
it makes any meaningful discussion about the merits of the report impossible, even where it gives rise to questions (such as, what makes Bulgaria an open data trendsetter?)
it makes formulating actions aimed at improvement impossible, as the data needed to determine what improvements can be made is not available
Thus after reading the report nobody is, nor can they be, any the wiser as to how to move forward.
I approached the European Commission, and through them the authors, to request the data. After a few messages back and forth, the reason the data is not published became clear: the national representatives involved in the project, such as the members of the EU PSI Group, have withheld publication of the data. I assume out of cold feet and a dread of actual comparison between countries. Not publishing the data, however, even if not intended as such, sends a clear message: “we’re not serious about openness.” The verdict when it comes to European open data maturity therefore is likely “not very mature”.
Requesting data per country needed
Very few countries may pro-actively publish the data about themselves; most will not. To obtain the data used for the open data maturity report, one now needs to approach all the national government representatives involved and request the data from them.
Which I intend to do. Help is welcome. [UPDATE: I have approached most of the governments involved, to ask for the information that could make the maturity report actually useful.]
Current status of open spending
Let me first give you a general overview of open spending in the Netherlands. As you can see in the Open Data Census, open spending data is the single biggest missing chunk of data in the Netherlands. The national budget has been available as open data since 2012, thanks to the work of the Dutch national audit office, but only at an aggregated level. The Ministry for Foreign Affairs has been publishing transaction-level data on international aid since 2012 as part of IATI, and is the only Dutch public sector body doing so. At the local level some aggregated spending data is available through the Open State Foundation’s project openspending.nl. In the past months I have gathered local spending data from 25 local councils and provided it to this project to make comparisons across local governments possible. In a current project with the Province of North-Holland we are, in collaboration with 10 local governments, aiming to open up the spending data of 50+ local councils. Unlike in the UK, there is no requirement for government bodies to publish open spending data.
The session took place in the old plenary meeting room of the Parliament
National Audit Authority: Forwards with open spending!
President of the National Audit Authority Saskia Stuiveling had the clearest message during the parliamentary committee meeting, in terms of general outlook as well as leading by example. Even for the audit authority it is often hard to get the right data to properly audit government spending. Opening up spending data by default will help them concentrate on those parts of public policy where it matters most, e.g. health care spending. To lead by example, the audit authority opened up its own spending data this spring. It also published a ‘Trend Report Open Data’ tracking the open data efforts of all Ministries, urging them to do more. Opening up data is becoming a standard advice given in all their audit reports. In other words, they are building up pressure on Ministries to do more. (disclosure: I worked with the audit authority on the trend report open data)
Foreign Affairs: Open spending is useful instrument
The Ministry for Foreign Affairs presented itself as a proponent of more financial transparency. Having started publishing open spending data on international development in 2012, it will launch a (Tableau-based) viewer for that data on June 11th, which includes the possibility to drill down to project-level information and can link to external sources such as project descriptions published by NGOs. A viewer like this replaces yearly paper-based reporting, is a step towards visualizing impact and not just spending, and is a means to motivate more NGOs towards greater spending transparency.
Finance Ministry: following Audit Authority’s lead
The Finance Ministry has until now done little towards open spending, but during the session in Parliament it showed how the work by the audit authority mentioned above has prodded it into action as well. Triggered by the open data trend report last March, it has now opened up aggregated spending for the first time (update from Rense Posthumus in the comments: the data is located at opendata.rijksbegroting.nl). The Finance Ministry also announced that subsidies data and basic financial data of independent government agencies are available in a viewer as a sneak preview, though no URL was given yet, nor was it indicated when this would be made publicly available. (UPDATE: see comment by Rense Posthumus) The plan to publish departmental spending for all ministries by 2016 was announced, but made dependent on ‘creating a standard reporting method’ first. That met with resistance in the audience: if the data is good enough for the Finance Ministry to work with, why isn’t it good enough to publish? That argument did seem to resonate with the Ministry director present.
Interior Affairs: very disappointing
A very disappointing contribution was made by the Ministry for the Interior’s deputy director-general. This Ministry is nominally responsible for the government’s open government and open data efforts, and in the lead on reforming the Freedom of Information Act in light of the new EU Directive on the re-use of public sector information, but in this session it showed a shocking lack of vision and no will to act. In 20 minutes nothing was said about open government at all, leaving the attending Members of Parliament confused. Even the actions the Ministry has taken, such as the launch of the national data portal in 2011 and joining the Open Government Partnership (albeit with an Action Plan that adroitly avoids formulating action), weren’t mentioned. From this presentation one can only conclude that not much can be expected from this Ministry in the near future. This means other public sector bodies are largely left to their own devices, which is a shame, as much time will be lost clearing up confusion and raising the general level of knowledge on how to do open government data well. The Ministry for the Interior, being in charge of the open government dossier, is the only one inside government that could claim the much-needed role of lighthouse and beacon for established good practice, but it is not on the ball, nor does it seem to aim to be.
FOIA readiness and process assessment
Now that I have sent out 24 identical FOIA requests for spending data, with the original one as a benchmark, I have a good opportunity to compare how municipalities deal with FOIA requests. That is the second purpose of this exercise.
I will track the progress of my 24 FOIA requests and document the results. Thus far 5 out of 24 have let me know their digital communication channel is closed for FOIA, so I have posted letters to those. One (1) municipality quickly confirmed my request, properly recognizing it as a FOIA request and stating it had been forwarded to the right person internally; a handful of others automatically confirmed receipt of my e-mail.
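The tracking itself can be as simple as a status label per request. A minimal sketch of how I could tally the statuses described above: the municipality names and status labels here are placeholders, not the actual municipalities involved.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical tracker for the 24 identical FOIA requests;
# names and status labels are placeholders, not the real data.
@dataclass
class FoiaRequest:
    municipality: str
    status: str  # e.g. "sent", "confirmed", "paper_required", "answered"

requests = (
    [FoiaRequest(f"municipality_{i}", "sent") for i in range(18)]
    + [FoiaRequest(f"municipality_{18 + i}", "paper_required") for i in range(5)]
    + [FoiaRequest("municipality_23", "confirmed")]
)

# Tally requests per status to see how municipalities are responding.
counts = Counter(r.status for r in requests)
print(counts)  # e.g. Counter({'sent': 18, 'paper_required': 5, 'confirmed': 1})
```

As statuses change over time (confirmed, answered, refused, deadline missed), the same tally immediately shows where municipalities diverge in handling an identical request.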
Yesterday I presented at TEDxZwolle. For a general audience I presented the case for Open Data, and called upon them to get involved. Because of the potential, but mostly because it is necessary to understand and deal with the complexity of our societies and lives.
Otherwise we are just ants, with no clue of how the ant hill works, even though we help create it with our actions. In our networked society we need to understand the ant hill.
Don’t be an ant, understand our ant hill. Get involved. Use Open Data. Understand your world.
In the context of the collaborative production in eGovernment study (more information on www.ourservices.eu) that a consortium I am part of is carrying out for the European Commission, we have prepared an online survey that is focused on innovators – initiators and evangelists of collaborative online services delivery, people who are improving public services “from the outside”. By collaborative production we mean services that engage citizens/civic associations/businesses in the design, delivery and evaluation of public services, irrespective of the service provider (government, civil society or business).
We are very interested in your views on drivers, barriers and impact of collaborative production, and hope you are willing to take part in our survey.
We would also appreciate if you could spread the information about the survey in your networks.
At OurServices.eu I have been collecting examples of collaborative e-government services, and am still adding more. I will also publish descriptions there for each EU Member State concerning these services. You are most welcome to add your own examples; please use the form on the website for that.
Below is a map of the over 100 examples of collaborative e-gov services collected so far.
Last weekend the Cognitive Cities conference took place in Berlin. It was very well organized and an inspiring event. Over 300 participants looked at how our digital networked era and cities can co-evolve. One of the organizers, Igor Schwarzmann, approached me to speak there, and we settled on Open Government as a theme: how open government might help cities.
This posting is a write-up of my talk “Spice Up Your City: Just Add OpenGov”.
Cities are complex adaptive systems. That means there is no predictability as to how they evolve and take shape, but you can see, once things are there, how they came to be. We, as human beings, immediately recognize the patterns and structures that emerge in cities. So much so that if someone mimics those structures and patterns, for instance with pots, pans and other kitchen utensils, we instantly associate it with cityscapes. We also intuitively know on a deep level what cities do for us: they are serendipity hubs, a heady mix of ideas, people and resources that bounce into and off each other, making all kinds of new combinations possible. That intuition is what is worded in the REM quote. Cities, in short, are very exciting things.
Government on the other hand is mostly seen as much less exciting. And open government can be just as stale. Particularly so if you see open government as something you do for the sake of transparency. Either because you are a civil servant who thinks you need to do it for citizens. Or because you are an activist who thinks the concrete silos of government need to be cracked open so others can see what is going on inside. In both cases it is not for the sake of government or the people creating transparency itself, but for the imagined and assumed sake of unnamed ‘others’. I however hold a different view of open government, one that comes with a lot more excitement.
First, for government itself, open government is a ‘change or die’ issue. This is, as Chris Taggart says, the wave of digital disruption hitting government that previously hit the music and publishing industries. Government’s institutions and workflows are ‘business models’ from an era when the logistic costs of organizing and scaling were quite different. In the digital era, trust in government, as well as its ability to act, will only survive if government opens up and enters into a much more networked way of interacting with the public. If it doesn’t, we will all see there is no wizard behind the curtain, and simply route our actions around it, as is the norm in a network where some nodes fail.
I see open government as consisting of two components: participation + open government data. Now, participation in the ‘classic’ sense of being consulted at the start of some policy initiative is not what will make open government exciting for citizens. Participation is actually synonymous with life itself: being an active person in your own social environment. Urban farming is a great example of this. Inner-city Detroit has no shops that sell fresh vegetables anymore, and those without cars cannot drive out to shops outside the city that do. So urban farming emerged. Now that is participation! Open data, at the same time, is a rich untapped resource. Government holds enormous amounts of data about all aspects of society, in order to be able to execute its tasks. An EU legal framework is in place that, except where privacy and things like state security are concerned, allows citizens to get and re-use that data. Practice is not quite there yet, but ideally open data is shared in open standards, is machine readable, and comes with no legal strings attached.
Participation and open data need each other. Participation needs to be informed by data, and likewise the re-use of data lies in participation.
Together, forming open government, they make government as a platform possible, where government asks itself what data and information need to be released so that citizens and organizations can come up with answers to the questions politicians and policy makers ask. This contrasts with traditional government, where citizens and organizations ask, and politicians and civil servants are expected to come up with solutions. The place where this can be expressed best and most tangibly is right in our own living environments, our cities and neighborhoods.
That is where all the things happen that matter to us directly. So you get services where you can check whether a restaurant is safe and clean enough to eat at, and platforms where citizens can report issues or discuss what is going on in their neighborhood. This way you can inform yourself and your decisions.
Using single data sources can, however, lead to a pitfall: visualizations that are really meaningless, that do not inform at all.
Much more interesting is when multiple data sources are combined and lead to new insights. That is like us all becoming Dr Snow, who figured out the connection between cholera and water quality in London in the 19th century.
But why stop at simply informing ourselves, why not also use data to activate ourselves, so we can undertake things again? Like the Danish findtoilet.dk, which allows people with bladder problems to go out into the city again without having to fear not knowing where the nearest toilet is in case of need. Or alerts sent to you when air quality predictions cross a threshold you have set yourself.
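A personal alert like that boils down to comparing published predictions against your own threshold. A minimal sketch, where the prediction values, the PM10 pollutant, and the threshold are all made-up illustrations rather than a real data feed:

```python
# Minimal sketch of a personal air-quality alert: compare predicted
# values against a user-set threshold. All values here are made up.
def alerts(predictions: dict[str, float], threshold: float) -> list[str]:
    """Return the hours whose predicted value exceeds the threshold."""
    return [hour for hour, value in predictions.items() if value > threshold]

# Hypothetical hourly PM10 predictions in µg/m³.
predicted_pm10 = {"08:00": 28.0, "12:00": 55.5, "18:00": 41.2}

print(alerts(predicted_pm10, threshold=50.0))  # ['12:00']
```

The interesting part is not the comparison itself, but that the threshold is yours: the same open data feed serves different people with different sensitivities.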
And why not go even one step further: you can start augmenting government data with your own data, having your own sensors collect data and publish it, like the Dutch sound sensor net created by citizens, or people feeding data into Pachube.com. When government publishes data, it turns out that people and organizations are willing to release data too. This is happening in international aid, and is visible in, for instance, the food industry.
But you can go one step further still: building your own sensors, as well as actuators. Create data, and feed data from other sources into smart devices you build, so that these devices can take actions based on the data they receive. The means for building those devices are available to you in FabLabs.
In this stage, we are truly acting like we should in complex environments: data form probes, and measurement has become intervention. That way we can build much more resilient communities. Cities are the perfect platform for data in the context of action and participation. Open government is a key ingredient to spice up our cities.
It does assume one thing though: your knowledge of a problem is leading, and coding and data skills are the literacy you need and use. Not the other way around. You need intimate knowledge of the issue you are addressing.
So here’s my challenge and invitation to you, to bring open government into play in your city: Find an issue that matters to you, that you own emotionally. Think about what data you need to address the issue. Then go to government and get that data. But realize that ‘the government’ does not exist. It consists of a multitude of organizations and bodies, and all of those are filled with people. So you just need to find one single civil servant that is willing to help you. I found my single civil servant in my city government, the guy in the blue shirt in the picture, who has been working with me and others to release data. You need to go out and find your guy in the blue shirt.
Make it real, make it matter to you, make it count. All it takes is just a little shove, to open things up.
(the conference organizers plan to make videos of the talks available soon)