
How a Small Municipality Shows the Way with Open Data

In 2014/2015 my colleague Frank and I worked with the Province of North-Holland and 9 municipalities in that province to position open data as a policy instrument: around specific local issues we would publish data, and reach out to potential re-users. Part of this process was to make open data a normal part of everyday work on public tasks. Hollands Kroon, a rural municipality in the very north of the Province, was one of the participants that succeeded in bringing open data into line management.

Now they have launched a new municipal website, following the so-called ‘top tasks’ model. In this model the most prominent information shown is the information citizens most need or want. I have interacted with many municipalities that, because they were moving to a ‘top tasks’ website, refused to publish data or the answers to the FOIA requests they received. They said “we’re in the process of limiting the information on our sites to the most sought after, so we’re not going to publish any data etc.; that would be confusing.”

Not so in Hollands Kroon. This is how their new site looks, with open data a very prominent menu option.

HK Website

With this step, Hollands Kroon shows how they have embraced open data. Already after the program with the Province, called North-Holland Smarter, they had formed a data team, working to raise internal awareness of open data and data-driven work, and working to raise interest in re-use. Now they’ve gone a step further in making open data a significant part of their external communications.

To me this is all the more remarkable because, when we started in 2014, Hollands Kroon as a small rural municipality doubted whether open data could be a useful tool for them, and assumed it would only make sense in urban environments, such as Amsterdam, the biggest city in the Province of North-Holland. They then quickly realized there is potential for their own local context and policy issues as well, especially if you work together with neighbouring municipalities in the region, in collaboration with the Province.

FOSS4G Keynote: Open Data for Social Impact

Last week I had the pleasure to attend and to speak at the annual FOSS4G conference. This gathering of the community around free and open source software in the geo-sector took place in Bonn, in what used to be the German parliament. I’ve posted the outline, slides and video of my keynote already at my company’s website, but am now also crossposting it here.

Speaking in the former plenary room of the German Parliament. Photo by Bart van den Eijnden

In my talk I outlined that it is often hard to see the real impact of open data, and explored the reasons why. I ended with a call upon the FOSS4G community to be an active force in driving ethics by design in re-using data.

Impact is often hard to see, because measurement takes effort
Firstly, because it takes a lot of effort to map out all the network effects, for instance when doing micro-economic studies like we did for ESA or when you need to look for many small and varied impacts, both socially and economically. This is especially true if you take a ‘publish and it will happen’ approach. Spotting impact becomes much easier if you already know what type of impact you actually want to achieve and then publish data sets you think may enable other stakeholders to create such impact. Around real issues, in real contexts, it is much easier to spot real impact of publishing and re-using open data. It does require that the published data is serious, as serious as the issues. It also requires openness: that is what brings new stakeholders into play, and creates new perspectives towards agency so that impact results. Openness needs to be vigorously defended because of it. And the FOSS4G community is well suited to do that, as openness is part of their value set.

Impact is often hard to see, because of fragmentation in availability
Secondly, because impact often results from combinations of data sets, and the current reality is that data provision is mostly much too fragmented to allow interesting combinations. Some of the specific data sets, or the right timeframe or geographic scope might be missing, making interesting re-uses impossible.
Emerging national data infrastructures, such as those the Danish and the Dutch have been creating, are a good fix for this. They combine several core government data sets into a system and open it up as much as possible. Think of cadastral records, maps, persons, companies, addresses and buildings.
Geo data is at the heart of all this (maps, addresses, buildings, plots, objects), which turns it into the linking pin for many re-uses where otherwise diverse data sets are combined.

Geo is the linking pin, and its role is shifting: ethics by design needed
Because geo-data is the linking pin, the role of geo-data is shifting. First of all it puts geo-data at the very heart of every privacy discussion around open data. Combinations of data sets can quickly become privacy issues, with geo-data being the combinator. Privacy and other ethical questions arise even more now that geo-data is no longer about relatively static maps: sensors are turning many more objects, as well as human beings, into objects on the map in real time.
At the same time geo-data is becoming less visible in these combinations. ‘The map’ is not necessarily a significant part of the result of combining data sets, just a catalyst on the way to get there. Will geo-data be a neutral ingredient, or will it be an ingredient with a strong attitude? An attitude that aims to actively promulgate ethical choices, not just concerning privacy, but also concerning what are statistically responsible combinations, and what are and are not legal steps in getting to an in itself legal result again? As with defending openness itself, the FOSS4G community is in a good position to push the ethical questions forward in the geo community as well as find ways of incorporating them directly in the tools they build and use.

The video of the keynote has been published by the FOSS4G conference organisers.
Slides are available from Slideshare and embedded below:

Sunday Serendipity Reading Links

Every day I save a bunch of links from my explorations over the interwebs. Stuff that passes my radar, may become fodder for my writing at some point, but often gets piled and forgotten. I thought maybe it is good to share some of the unsought links I encounter, and some of the notions why I bookmarked them. Blogging of course used to be linklogging, sharing links to your blog neighbourhood, so let’s say it’s returning to a respected tradition. Here is a fistful of links from this week.

    Distributed web

  • IPFS, a distributed way of delivering webpages and files. Pointed out to me in the context of my postings on distributedness and agency. Napsterizing/torrenting everything. Also seems to want to preserve everything on the web better.
  • Steem is a blockchain-based social media platform. Aims to ‘pay’ you for contributing, and do the bookkeeping in a blockchain ledger. Not sure that will work, nor that permanent records of each social media utterance are desirable. Like with IPFS mentioned above, ‘not forgetting’ may not be a feature but a very concerning social bug. My friend Boris Mann is trying it out, looking forward to reading more of his reflections. I may not understand, I never understood the purpose of Medium either, which superficially seems to be the same thing but without the bookkeeping.
  • Anil Dash reflects on the lost infrastructure of social media. This resonates strongly with me in terms of what made blogging so exciting 10-15 years ago, as well as with my recent writings about agency. Part of the picture is weaving a tapestry of functionality across different services and tools that together are a potent mix. It needs plumbing like RSS, trackback and discoverability over the lines of conversations distributed over the individual blogs of the participants. My friend Lilia did her PhD on those distributed conversations. And as Hoder wrote after seeing the web again following six years in an Iranian prison: much of our web now, such as Facebook, is just TV, not coffee house interaction.

  • Free private cities. Sign up to live in one, so you have an ‘equal’ position based on contracted service provision. Because tinkering with democracy and the fact that others have different needs is bothersome, or such. Apparently the social contract isn’t good enough. This has strong overtones of the Burbclaves in Snow Crash, and of the micro-democracy states (100.000 people each, and with every election there is freedom of movement globally to pick the government (corporate, value or ethnicity based) of your choice) in the very entertaining near-future SF book Infomocracy by Malka Ann Older. These private city contracts don’t seem to account for the cost of leaving if you cancel your contract, as it is still territory bound, so finding a new service provider means physically moving, with all the social and monetary cost of doing that. It also seems to me that the Principality of Monaco (held up as a good practice example), incorporated US towns, or the City of London for that matter, provide ample demonstration of why this may not be the way forward to a more inclusive global society.

  • The Ribbon Farm, a blog by Venkatesh Rao, newly added to my feed-reader. His recent newsletter edition on premature synchronization as a cause of problems chimes with a lot of my experience. Converging too early (because there are just 10 minutes left in the meeting), or forcing convergence in a group, usually doesn’t help much. The leading example in the link being military reminds me of an anecdote I once heard about “the world championship of armies”, where the US military units were failing because they waited or tried to confirm orders continuously, and the Dutch fared better because upon receiving orders they did what seemed worth doing based on context and observation, not seeking further orders and disregarding the literal meaning of orders in the process. Desyncing as a practice seems valuable advice, and similar to making stuff distributed by design, or probe-based evolution. Seek out new perspectives and let yourself be challenged as part of your routines.

Data Sovereignty as Prerequisite for Open Data Agency

As we are living in a networked world, government bodies increasingly execute their tasks while collaborating in networks of various other stakeholders. This also happens when it comes to collecting, providing or working with data as part of public tasks. One of the potential detrimental side effects is that it quickly becomes unclear who can decide to open such data up. Or whether a government entity that wants to publish data as part of a policy intervention still feels able to do so. This ability to decide over your own data, I call data sovereignty. I think that without proper attention, the data sovereignty of public institutions comes under pressure in collaborative situations, and that this is a threat to the freedom of public entities to decide and act on their own open data efforts. This is especially problematic where the lack of data sovereignty hinders public entities in deploying open data as a policy instrument.

I have just completed an inventory of the data sets that a Dutch province holds and the visible erosion of data sovereignty was the main unexpected outcome for me.
This erosion takes different shapes. Here are a few examples of it, encountered in the Province I mentioned:

  • Data collection on business locations and the number of people they employ (to track employment per municipality per sector) is being pooled by all provinces (as a national level data set is more useful). The pooling takes place in a separate legal entity. It is unclear if this entity still falls under FOIA and re-use regulations. This entity also exploits the data by selling it. Logical at the organisational level perhaps, but illogical in comparison with the provincial public task (and maybe not even legal under the Re-Use law). Opening up the data needs to be done through that new entity, meaning not just convincing yourself, but all other provinces as well as the entity, which has a commercial interest in not being convinced. The slowest will thus set the speed.
  • Data collection on traffic flows, collected by the Province, is stored directly in a national data warehouse (NDW). Again pooling data makes it more useful, but the Province cannot store cleaned data there (anomalies filtered out, pattern changes explained etc.), so always needs to redo that cleaning and filtering whenever they want to work with or access their own data. Although the publicly owned NDW now publishes open data, until recently they saw themselves as a commercial outfit, averse to the notion of open data.
  • Data collection on bicycle traffic, done by the Province, is stored in the online database of a French service provider active in the entire EU. Ownership of the data is unclear. The Province only accesses the data through the French website. If a FOIA request came, it would be unclear if providing the data runs counter to any rights the service provider is claiming.
  • Data on the prevalence of bird species is being collected in collaboration with nature preservation groups and large numbers of volunteers. The Province pays for the data collection, but the nature preservation groups claim their volunteers (by virtue of their voluntary efforts) are the rightful owners of the data. Without seeking internal legal advice, the discussion remains unresolved and stalls.

None of these situations is unsolvable; all of them can get a definitive answer. The issue however is that nobody is clearly in a position, or has the explicit role, to make sure such a definitive answer gets formulated. Because of that, uncertainties remain, which easily leads to inaction. If and when the Province wants to act to open data up, it therefore easily runs into all kinds of questions that will slow action down, or ensure action does not get taken.

It is entirely logical that public entities are collaborating in networks with other public entities and domain-specific stakeholders for the collection, dissemination and use of data. It is also certain that, given our networked society and the drive for efficiency, the number of situations where such collaboration takes place will only rise. However, for the drive towards more openness it is detrimental when ownership of public data becomes unclear, gets transferred to an entity that potentially falls outside the scope of FOIA, or falls under the rights of a private entity, just because nobody sought to clarify such matters at the outset.

Public entities should learn to strongly guard their data sovereignty if they want to maintain their own agency in opening up data as a policy instrument. Moving to open by design as a default for the public sector requires stopping the erosion of data sovereignty.

Serbian Information Commissioner Now Publishing Open Data

Today a tweet from the Serbian office of the Commissioner for Information of Public Importance and Personal Data Protection thanked me and colleagues for promoting open data. As a result the Commissioner’s Office has launched an open data site today, on the data subdomain of their regular website. This is very good news, and a welcome consequence of the open data readiness assessment I did with the World Bank and the UNDP last year. In June I spoke with the Commissioner about their work, and his deputy already took an active role last December at the conference where we presented the results of the assessment.

In a press release (Serbian only), the Commissioner’s Office states that as further encouragement to the Serbian public administration, the Commissioner is opening up data concerning their own work. Thirteen data sets have been published, one of which I think is very important: the list of public institutions that fall under the freedom of information and data protection frameworks (over 11.000!). Other data published concerns the complaints about information requests that the office received and their status, as well as complaints and requests concerning data protection and privacy.

With the help of civil society organisation Edukacioni Centar (whom I also had the pleasure of meeting) the data comes with some visualizations, to improve the understanding of what data is now available. One allows navigating through the network of over eleven thousand institutions that fall within the scope of the Commissioner’s Office, another shows the status and subject of the various complaints received.

(Screenshot of over 11.000 public institutions)

Steps like these I find important, where institutions such as the Information Commissioner, or here in the Netherlands the Supreme Audit Institution, lead by example. By doing that they underline the importance of transparency also to the functioning of their own institutions.

Open Data Readiness Assessment Kyrgyzstan Published

The UNDP has published the open data readiness assessment for the Kyrgyz Republic. From November 2014 to June 2015 I visited Kyrgyzstan three times for a week on behalf of the World Bank. In collaboration with the Kyrgyz Government and the UNDP, as well as local companies, civil society organisations and the coding community, we looked for the right starting points for open data in Kyrgyzstan, and which steps to take to get going.

The UNDP has now published the resulting report, which is embedded below. Download link here.

Open Communities / Refugeehack Wuppertal

Last November I attended the yearly Open Communities North-Rhine Westphalia barcamp (OKNRW), which was combined with a hackday called #refugeehack. The latter focused on using open data to help refugees find their way in Germany.

I presented my experiences working with local governments to help them use open data as a policy instrument. We did a year-long project with 9 municipalities and 1 province in 2014-2015. The driving thought behind it was that releasing data can be a deliberate intervention in a policy field, as having data in their hands changes a stakeholder’s agency. Slides shown below.

Now a video showing how the OKNRW 2015 & Refugeehack played out has been released (in German).

Open Data Readiness in Serbia

Last June I spent time in Serbia doing an open data readiness assessment for the World Bank. Early this month I returned to present the findings, and to mentor a number of teams at the first Serbian open data hackathon. The report I wrote is now also available online through the UNDP website.

The printed ODRA report

The UNDP organized a conference to present the outcome of the readiness assessment and discuss next steps with stakeholders. At the conference I presented my findings to the Minister for Public Administration and Local Self Government (MPALSG), and a printed version was made available to all present.

(l) the minister (center, me left of her) on open data (photo Ministry PALSG), (r) discussing presented app (photo

At the conference the 11 teams that created open data applications at the hackathon the weekend before, called, were also presented. The hackathon took place in the recently opened StartIT Centar, a coworking space (which got funded through Kickstarter). I had the pleasure of being a mentor to the teams (together with Georges and Brett from Open Data Kosovo), to channel my experience with open data communities around Europe and open data app-building over the past 8 years. The quality of the results was, I think, impressive, and it was the first hackathon where I saw people trying to incorporate deep-learning tech. I aim to post separately on the different applications built.

Mentoring during the hackathon, with Milos and Nemanja. (photo

That the hackathon was about open data was possible because five public sector institutions (Ministry for Interior, Ministry of Education, Agency for Environmental Protection, Agency for Medicines and Medical Devices, and the Public Procurement Office) have been working constructively to publish data after our first visit in June. In the coming months I hope to return to Belgrade to provide further implementation support.

The report is also embedded below:

Serbia Open Data Readiness Assessment

Largest government spending is also the least transparent

Tuesday will see the presentation of the new Dutch national budget, during the traditional opening of the parliamentary year, ‘Prince’s Day’.

58% (153 out of 262 billion Euro) of the budget is allocated to social affairs (78 billion Euro), and healthcare (75 billion Euro). These two largest domains are also the two least transparent ones. For both domains little to no meaningful open data exists.

On the contrary, both domains are notorious for their opaqueness. In the social domain a massive decentralization has transferred billions to lower levels of government, making the way they are spent invisible to both Parliament and the High Court of Audit. In healthcare, insurers and hospitals are fighting the Minister tooth and nail over disclosing even basic numbers they are legally bound to make public, and freedom of information requests end up in court.

The budget shortfall for the next year comes in at some 12 billion Euro, or about 8% of our spending on social affairs and healthcare.

I bet increased transparency in both the social and healthcare domains can surface lots of potential savings, by exposing inefficiencies etc. Even if you shave off just a few percent of spending that way, it may actually fix the hole in the national budget completely.
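
The percentages in this post are easy to check. Here is a minimal sketch in Python that uses only the figures quoted above, in billions of Euro; the variable and function names are illustrative and not taken from any budget document.

```python
# Figures in billions of Euro, as quoted in this post.
TOTAL_BUDGET = 262
SOCIAL_AFFAIRS = 78
HEALTHCARE = 75
SHORTFALL = 12


def share(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Combined spending on the two largest domains.
combined = SOCIAL_AFFAIRS + HEALTHCARE

# Their share of the total national budget.
budget_share = share(combined, TOTAL_BUDGET)

# The saving on those two domains that would equal the shortfall.
required_saving = share(SHORTFALL, combined)

print(f"Combined spending: {combined} billion Euro")
print(f"Share of total budget: {budget_share:.0f}%")
print(f"Saving needed to cover the shortfall: {required_saving:.1f}%")
```

So the saving that would cover the shortfall is just under 8% of what is spent on social affairs and healthcare together.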

Planning for Impact with Open Data

On July 1st I gave a keynote at the OpenData.CH conference in Bern, Switzerland. I talked about using open data as a policy instrument, and looking at open data as a way to provide new agency to stakeholders around an issue, so they can create real impact. The conference organizers have published the video of the presentation, which you can see below, together with the slides I used.