Category Archives: Open Data

Largest government spending is also the least transparent

Tuesday will see the presentation of the new Dutch national budget, during the traditional opening of the parliamentary year, ‘Prince’s Day’.

58% (153 out of 262 billion Euro) of the budget is allocated to social affairs (78 billion Euro) and healthcare (75 billion Euro). These two largest domains are also the two least transparent ones. For both domains little to no meaningful open data exists.

On the contrary, both domains are notorious for their opaqueness. In the social domain a massive decentralization has transferred billions to lower levels of government, making the way they are spent invisible to both Parliament and the High Court of Audit. In healthcare, insurers and hospitals are fighting the Minister tooth and nail over disclosing even basic numbers they are legally bound to make public, and freedom of information requests end up in court.

The budget shortfall for the next year comes in at some 12 billion Euro, or about 8% of our spending on social affairs and healthcare.

I bet increased transparency in both the social and healthcare domains can surface lots of potential savings, by exposing inefficiencies etc. Even if you shave off just a few percent of spending that way, it may actually fix the hole in the national budget completely.
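
The arithmetic behind these figures is easy to check. Below is a minimal sketch that uses only the rounded amounts quoted in this post; the variable and function names are my own, chosen for illustration.

```python
# All amounts in billions of euro, rounded as quoted in the post.
TOTAL_BUDGET = 262
SOCIAL_AFFAIRS = 78
HEALTHCARE = 75
SHORTFALL = 12


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


def savings(percent_saved):
    """Billions saved if spending in the two domains drops by percent_saved."""
    return two_domains * percent_saved / 100


two_domains = SOCIAL_AFFAIRS + HEALTHCARE        # combined spending of both domains
budget_share = share(two_domains, TOTAL_BUDGET)  # their share of the total budget
shortfall_share = share(SHORTFALL, two_domains)  # shortfall relative to both domains

print(f"Two largest domains: {two_domains} billion ({budget_share:.1f}% of the budget)")
print(f"Shortfall as share of those domains: {shortfall_share:.1f}%")
for pct in (2, 4, 8):
    print(f"Saving {pct}% yields {savings(pct):.1f} billion")
```

On these rounded numbers each percent saved across the two domains is worth roughly 1.5 billion Euro, and closing the gap entirely would take just under 8%.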

Planning for Impact with Open Data

On July 1st I gave a keynote at the OpenData.CH conference in Bern, Switzerland. I talked about using open data as a policy instrument, and looking at open data as a way to provide new agency to stakeholders around an issue, so they can create real impact. The conference organizers have published the video of the presentation, which you can see below, together with the slides I used.

A Month In Lucca (and CH along the way): Week 1

We’ve packed up the household for a month in Lucca, Tuscany this July with a week in Switzerland before it, and a short stay in Switzerland after it.

More relaxation and sabbatical than working in a different environment this time, so in that sense different from previous month long moves to Copenhagen and Cambridge or other extended working stays in Berlin, Helsinki and Switzerland.

A lot has happened, and is happening, to us and our close relatives on both sides of the family, making it a challenging year. So some extended time to be together with the two of us is something I was looking forward to a lot. At the same time I hope to be able to do some reflection, research and writing as well, in the hours where it’s too hot to venture out anyway. Before heading out to explore and enjoy Tuscany more, as I’ve never visited this area.

Half-way stop: Switzerland
The first week we spent halfway to Lucca, in Switzerland. Staying with dear friends in their home on Lake Zug, Elmine took it easy, while I spent most of my time working.

View on Lake Zug, and welcoming bbq

Swiss open data conference
Monday was spent on creating two presentations, one on open data as an instrument for policy implementation, one on the economic and organizational rationale for a national data infrastructure of ‘core registers’ such as the Netherlands and Denmark have, and others are currently exploring. Tuesday afternoon I took a train to the Swiss capital Bern for an early bird and speaker’s dinner with the organizers of the Opendata.CH conference. A lovely dinner on the bank of the river Aare. We were just underneath the Swiss parliament building perched on the edge of the higher lying old inner city, in a bend of the river. People were swimming in the river, letting the stream transport them before walking back upriver to jump in again.

People swimming in the Aare, banner

The conference took place for the 4th time this year (I spoke there in 2012 as well), at the University of Bern. Over 200 people ignored the sweltering summer heat and sat in stuffy lecture halls to discuss opening Swiss government data together. In the morning I gave a keynote where I asked how come we are still meeting like this, to encourage and convince? Why is the visibility of impact so fragmented? I then proceeded with how starting from a (policy) goal and mobilizing stakeholders with open data leads to more easily visible impact. At the same time it creates intrinsic government motivation to keep publishing open data, as it becomes a valuable policy instrument. It seems the presentation went over well, getting a mention in the press.

The afternoon was given over to workshops. Together with my Swiss colleague André Golliez and with Alessia Neroni (Bern Univ for Applied Sciences) we hosted a workshop on building a national data infrastructure around core registers. I presented our experiences in Denmark (research done by colleague Marc) and the Netherlands, as well as touching upon France (link to an opinion piece I wrote) and other countries. The current Swiss situation was very well described by Alain Buogo (Deputy director at Swisstopo) and Bertrand Loison (board member of the Swiss statistical office). This was the first such discussion in Switzerland and one I hope to continue.

After the conference I returned to Walchwil by train, joining three board members of the Swiss open data community until Zurich.

Street art and shipping container shops in Hardbrücke

The next day I traveled to Zurich again to talk more with André Golliez, meeting at the Impact Hub, an internationally oriented co-working space in one of the spans of a railway viaduct, in the hipster dominated Hardbrücke area. We planned some next steps for our collaboration, which likely will see me return late next month for more meetings. Then we moved next door to pub and music podium Bogen F (viaduct span F), for the 60th birthday party of André, as well as the launch of his new open data consultancy. It was a good opportunity to meet some of his family, friends and professional peers. The relaxed bbq, and some wheat beers, made my German slip into a stronger Austrian accent (Austria being where I learned it as a kid), to the amusement of the Swiss.

At Kultur Viadukt Bogen F

Open Data Barometer
Friday was spent mostly in conference calls while gazing out over Lake Zug. In the morning I worked with Aleksandar in Belgrade on the Serbian open data readiness assessment (see recent posting), and in the afternoon I took a deep dive into the methodology behind the Web Foundation’s Open Data Barometer. The research for the 2015 edition is starting now, and my colleague Frank and I are doing the research for six countries (Austria, Switzerland, Ireland, Belgium and the Netherlands). In the evening we had a leisurely dinner at the lakeside, in restaurant Engel.

Off to Lucca, but first…
We had originally planned to drive to Lucca on Saturday but traffic and weather predictions suggested otherwise. So instead we met up with our dear friends Hans and Mirjam, who moved to Switzerland 18 months ago, for a nice summer bbq. Much better to spend time in conversation than standing in a traffic jam in tropical temperatures. Sunday we then left relatively early at 8:30, cutting through the Gotthard Tunnel with ease and cruising along mostly empty Italian motorways (except for near Milano), to our destination Lucca, arriving early afternoon.

Here in Lucca, originally an Etruscan city, we were met by our kind host Enrico, who guided us to our apartment located right within the old city walls and gave us some useful tips to help us find our way around. In a renovated former nunnery we now enjoy a quiet home looking out over a garden towards the city wall, with the busiest shopping street Via Fillungo (dating from Roman times), with coffee, wine, shoes, and Italian food right in front of our doorstep. A nice basic meal at Gigi, after unpacking, finished up this first week.

The gate on Via Fillungo to the inner courtyard leading to our apartment

Flemish Open Data Day 2015

Today I am in Brussels, as a guest of the Flemish government. For the fourth time the ‘open data day’ is held in Flanders, bringing together public and private sector to explore possibilities for open data. I gave the opening keynote this morning, on building public services with #opendata in collaboration with other stakeholders.

At the request of Noël van Herreweghe, the organizer and Flanders’ open data program manager, I focussed on public service delivery with open data. My main message was to start from where you want to see impact, and then mobilize data and people around it in such a way that the data change the opportunities stakeholders have to act.

In my examples I showed how it does take a different perspective on public service, with the citizen at the center of the design, not the internal processes. And that with open data you bring many more new stakeholders to the table, which makes collaborative services possible that become better as more people use them. In practice we see that in many cases civil society organizations or businesses create front-ends to what essentially are public services. At the same time, data collection can also be collaborative (such as BANO in France).
To turn government into a platform, a system of connected core reference data sets is a fundamental element. Denmark and the Netherlands have such systems, which are largely open data as well. France and others are discussing this, and Belgium and Flanders have identified some of what they call ‘authentic data sources’. This allows others to build on this foundation, creating value that way. The end game for government itself is to be open by default and by design, as well as providing performance data on dashboards generated from live open data streams. This allows the public to simultaneously see, interact with and use the data for service provision and provide feedback.

Slides are online:

Open Data Readiness Assessment in Serbia

The week before last I worked on an Open Data Readiness Assessment (ODRA) for Serbia during a week long mission to Belgrade. It is part of my work for the World Bank and done in close collaboration with the local UNDP team, at the request of the Serbian directorate for e-government (part of the ministry for administration (reform) and local authorities).

Next to me visiting a wide range of agencies with local colleagues Irena and Aleksandar, my colleague Rayna did a roundtable with civil society organisations, and my colleague Laura a roundtable and a number of conversations with the business community. We also had a session with UN representatives, and WB project managers, to mainstream open data in their project portfolio.

the unfinished orthodox Saint Sava church, and the brutalist ‘western gate’ Genex tower

Throughout the week we invited everyone we met inside government who seemed to be interested or have energy/enthusiasm for open data for a meeting on the last day of the mission. There we presented our first results, but also made sure that everyone could see who the other change agents across government are, as a first step of building connections between them.

The final day we also had a session with various donor organisations, chaired by the UNDP representative, to explain the potential of open data and present the first ODRA results.

In the coming few weeks the remaining desk research (such as on the legal framework) will be done, and the draft ODRA report and action plan will be prepared. A delivery mission is foreseen for September. In the meantime I will aim to also spend time helping to strengthen local community building around open data.

Ministry of Finance, and the Ministry of Defence building that was bombed in 1999 by NATO

In Serbia, the dissolution of Yugoslavia and ensuing wars (Bosnia, Croatia), the Milosevic era, international sanctions, and NATO bombardments during the Kosovo conflict (1999), have left deep marks on the structures and functioning of government and other institutions (as elsewhere in the region).

I had always more or less assumed that in the early nineties the former Yugoslavian federal institutions had morphed into what are now the Serbian national institutions. Instead these federal structures largely dissolved, leaving gaps in terms of competencies and structures, which are not helped by (legacies of) corruption and political cronyism. Serbia is a candidate for EU Membership, meaning a path of slow convergence to EU policies and regulations.

16 Months of Local Open Data

The culmination of over a year of work
Last month we concluded a project that started in November 2013. For the Province of North-Holland we worked with 9 municipalities to help them bring publishing open data into their normal routines.
We celebrated the end of the program with an afternoon conference of 125 participants in Amsterdam, sharing the experiences, the good, the bad, the splendid, the ugly.

The program
The program we designed is based on the notion that making open data part of normal operations requires learning by doing, and learning from others who are doing the same thing. Taking time for it, with us helping out on the work floor, also allows more colleagues to get involved and lets new knowledge settle.
To make sure open data is not something ‘extra’ a government does for others (‘nice to have’) we seek to position open data as a policy tool (‘need to have’), that helps governments to engage with stakeholders and impact their own policy goals.

Four phases
The preparation phase was aimed at finding a number of municipalities to participate. They had to allocate people and time to it, and there needed to be some current local policy issues that provided a possible angle for open data. We talked to about 16 local governments, and in the end 9 joined the program.

Three implementation phases were part of the plan: 1) find internal support, raise awareness, and find a suitable policy topic as context. 2) select and publish data, find initial external stakeholders, 3) engage with stakeholders, help them use the data, and make the publishing process permanent.
In practice these three phases weren’t discrete, but overlapped and never ‘finished’.

the plan

the reality

A year long execution phase
The execution phase of the program started in March 2014, with a high-energy kick-off event where some 60 civil servants and political functionaries participated, and the 9 participating municipalities and the province signed the ‘North-Holland Smarter’ Manifesto. (The manifesto was a beautiful side project by my colleague Frank and our artist in residence Ate: wooden panels with different data visualizations of the region, and laser-cut pixelated shapes of the participating local governments to place signatures on. An afternoon well spent in the local FabLab Protospace.)

the manifesto

The participating municipalities gathered for collective working days 4 times, and in between worked on their own. We helped out by providing guidance and examples, and by facilitating sessions and workshops, both internally with civil servants and externally with citizens, businesses and organizations. Each worked around a locally relevant theme, ranging from flash floods to new entrepreneurs in socially disadvantaged neighborhoods, from regional public transport for the elderly and schools, to financial transparency.

In the end most local governments started publishing data (not a small feat for non-urban local governments I’d say), and some moved it towards their line management successfully. A few first examples of seeing the data used exist, and most don’t see the end of the program as the end of their efforts but as the beginning.

Overall we included a few hundred civil servants and several dozen external stakeholders. Some of that work is still ongoing with our involvement, until May.

The final event

participants working together at final event

We ended the program with a final event to present our learnings. Some 125 people from across the Netherlands, representing local, regional and national government entities came. We shared what we experienced roughly in the same way the program phases were designed, providing the participating municipalities with ample space to discuss what they had done and learned, what worked and what didn’t work at all. Some presented their data publishing platform, others their next steps, some recounted how they helped colleagues learn to use data better themselves, others how settling on a policy theme early didn’t work for them. Entrepreneurs and data re-users talked about how they work with data.

It was a good and informal way to convey the actual work involved and how some things can take more time than thought (usually the social aspects), while others turn out to be much easier than anticipated (usually the technical aspects).

the municipalities in North-Holland that are publishing open data (image Ruud Smith)

Getting past the hype
Plans and reality never match, and I think our program created the space and time for that to be ok and part of the journey. Our client with the Province said that to her the project was to help people move beyond the hype towards where open data is part of the normal way of doing things, and softening the ‘trough of disillusionment’ in the hype cycle. Judging by the quotes we collected from participants we succeeded in doing that.

getting past the hype (image Isabel Brouwer)

Students’ Six Big Data Lessons

Students from a ‘big data’ minor at the local university of applied sciences presented their projects a few weeks ago. As I had done a session with them on open data as a guest lecturer, I was invited to the final presentations. Taking those presentations together, several things stood out for me. I later repeated them to a different group of students at the Leeuwarden university of applied sciences, at the beginning of their week of working on local open data projects, as pitfalls for them to avoid. I thought I’d share them here too.

The projects students created
First of all let me quickly go through the presented projects. They were varied in types of data used, and types of issues to address:

  • A platform consulting Lithuanian businesses to target other EU markets, using migration patterns and socio-economic and market data
  • A route planner comparing car and train trips
  • A map combining buildings and address data with income per neighborhood from the statistics office to base investment decisions on
  • A project data mining Riot Games online game servers to help live-tweak game environments
  • A project combining retail data from Schiphol Airport with various other data streams (weather, delays, road traffic, social media traffic) to find patterns and interventions to increase sales
  • A project using the IMDb movie database and ratings to predict whether a given team and genre have a chance of success

Patterns across the projects
Some of these projects were much better presented than others, others were more savvy in their data use. Several things stood out:

1) If you make an ‘easy’ decision on your data source it will hurt you further down your development path.

2) If you want to do ‘big data’ be really prepared to struggle with it to understand the potential and limitations.

To illustrate both those points:
The Dutch national building and address database is large and complicated, so a team had opted to use the ‘easier’ processed data set released by a geodata company. Later they realized that the ‘easier’ dataset was updated only twice per year (the actual source being updated monthly), and that they needed a different coordinates system (present in the source, not in the processed data) to combine it with the data from the statistical office.

Similarly the route planner shied away from using the open realtime database on motorway traffic density and speed, opting for a derivative data source on traffic jams and then complaining that it came in a format they couldn’t really re-use and did not cover all the roads they wanted to cover.
That same project used Google Maps, which is a closed data source, whereas a more detailed and fully open map is available. Google Maps comes with neat pre-configured options and services but in this case they were a hindrance, because they do not allow anything outside of them.

3) You must articulate and test your own assumptions

4) Correlation is not causation (duh!)

The output you get from working with your data is colored by the assumptions you build into your queries. Yes average neighbourhood income can likely be a predictor for certain investment decisions, but is there any indication that is the case for your type of investment, in this country? Is entering the Swedish market different for a Lithuanian company from let’s say a Greek one? What does it say about the usefulness of your datasource?

Data will tell you what happened, but not why. If airport sales of alcohol spike whenever a flight to Russia arrives or leaves (actual data pattern), can that really be attributed to the 200-300 people on that plane, or are other factors at work that may not be part of your data (intercontinental flights for instance that have roughly the same flight schedule but are not in the data set)?

Are you playing around enough with the timeline of your data, to detect e.g. seasonal patterns (like we see in big city crime), zooming out and zooming in enough to notice that what seems a trend maybe isn’t?
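
To make the zooming point concrete, here is a minimal sketch with invented monthly counts (not real crime or sales figures). The pattern repeats identically every year, so there is no trend by construction, yet a window of a few months looks like a steady rise. All names and numbers are hypothetical.

```python
# Hypothetical monthly counts: a purely seasonal pattern that repeats every
# year, with no underlying trend. The numbers are invented for illustration.
SEASONAL_PATTERN = [80, 82, 90, 100, 112, 125, 135, 130, 115, 100, 88, 83]
YEARS = [2012, 2013, 2014]
series = {(year, month): SEASONAL_PATTERN[month - 1]
          for year in YEARS for month in range(1, 13)}


def window(data, year, first_month, last_month):
    """Zoom in: the values for a range of months within one year."""
    return [data[(year, m)] for m in range(first_month, last_month + 1)]


def yearly_totals(data):
    """Zoom out: one total per year."""
    totals = {}
    for (year, _month), value in data.items():
        totals[year] = totals.get(year, 0) + value
    return totals


def is_strictly_rising(values):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


zoomed_in = window(series, 2014, 1, 7)   # January to July 2014
zoomed_out = yearly_totals(series)       # totals for 2012, 2013, 2014

print("Jan-Jul 2014:", zoomed_in, "rising:", is_strictly_rising(zoomed_in))
print("Yearly totals:", zoomed_out)
```

Seen from January to July the numbers rise every month, while the yearly totals are identical: the apparent trend is only the season.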

5) Test your predictions, use your big data on yourself

The ‘big’ part of big data is that you are not dealing with a snapshot or a small subset (N = a few) but with a complete timeline of the full data set (N = all). This means you can and need to test your model / algorithm / great idea on your own big data. If you think you can predict the potential of a movie, given genre and team, then test it with a movie from 2014 where you know the results (as they’re in your own dataset) on the database from before 2014 and see if your algorithm works. Did Lithuanian companies that already have entered the Swedish market fail or flourish in line with your data set? Did known past interventions into the retail experience have the impact your data patterns suggest they should?
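
A minimal sketch of such a test on your own data, using the movie example. The records, the success threshold and the ‘model’ (a plain average of past ratings) are all invented stand-ins, not what the students built; the point is only the split: predict 2014 from everything before 2014, then compare with what actually happened.

```python
from statistics import mean

# Invented records standing in for a movie database: (year, team, genre, rating).
MOVIES = [
    (2011, "team_a", "comedy", 7.4),
    (2012, "team_a", "comedy", 7.0),
    (2013, "team_a", "comedy", 7.8),
    (2012, "team_b", "horror", 4.9),
    (2013, "team_b", "horror", 5.5),
    (2014, "team_a", "comedy", 7.1),
    (2014, "team_b", "horror", 6.8),
]
SUCCESS_THRESHOLD = 6.5  # assumed cut-off for calling a movie a success


def split_by_year(records, cutoff_year):
    """History is everything before cutoff_year; the holdout is cutoff_year itself."""
    history = [r for r in records if r[0] < cutoff_year]
    holdout = [r for r in records if r[0] == cutoff_year]
    return history, holdout


def predict_success(history, team, genre):
    """Toy model: success if the pair's past average rating clears the threshold."""
    past = [rating for _, t, g, rating in history if (t, g) == (team, genre)]
    if not past:
        return None  # no history for this pair, so no prediction
    return mean(past) >= SUCCESS_THRESHOLD


def backtest(records, cutoff_year):
    """Compare predictions made from history with the known holdout outcomes."""
    history, holdout = split_by_year(records, cutoff_year)
    results = []
    for _, team, genre, rating in holdout:
        predicted = predict_success(history, team, genre)
        actual = rating >= SUCCESS_THRESHOLD
        results.append((team, genre, predicted, actual))
    return results


for team, genre, predicted, actual in backtest(MOVIES, 2014):
    print(f"{team}/{genre}: predicted={predicted}, actual={actual}")
```

In this toy data the prediction holds for one team and fails for the other, which is exactly the kind of result you only see when you hold part of your own data back.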

6) Your data may be big, but does it contain what you need?

One thing I notice with government data is that most data is about what government knows (number of x, maps, locations of things, environmental measurements etc), and much less about what government does (decisions made, permits given, interventions made in any policy area). Often those are not available at all in data form but hidden somewhere in wordy meeting minutes or project plans. Financial data on spending and procurement is what comes closest to this.

Does your big data contain the things that tell what various actors around the problem you try to solve did to cause the patterns you spot in the data? The actual transactions of liquor stores connected to Russian flights’ boarding passes? The marketing decisions and their reasons for the Schiphol liquor stores? The actions of Lithuanian companies that tried different EU markets and failed or succeeded?

Issue-driven, not data-driven, and willing to do the hard bits
It was fun to work with these students, and a range of other things come into play: technical savviness, statistical skills, a real understanding of what problem you are trying to solve. It’s tempting to be data-driven rather than issue-driven, even though being issue-driven brings more value in the end. With the former the data you have is always the right data, but with the latter you must acknowledge the limitations of your data and your own understanding.

Like I mentioned I used these lessons in a session for a different group of students in a different city, Leeuwarden. There a group worked for a week on data-related projects to support the city’s role as cultural capital of Europe in 2018. The two winning teams there both stood out because they had focussed very much on specific groups of people (international students in Leeuwarden, and elderly visitors to the city), and really tried to design solutions starting with the intended user at the center. That user-centered thinking really turned out to be the hardest part. Especially if you already have a list of available data sets in front of you. Most of the teacher’s time was spent on getting the students to match the datasets to use cases, and not the other way around.

A week in Kyrgyzstan on Open Data

I spent a bit more than a week in Kyrgyzstan, at the invitation of the Kyrgyz prime minister and on behalf of the World Bank, to start an open data readiness assessment and present and facilitate at the Kyrgyz Open Data Days.
Kyrgyzstan is a lower middle income country, with a parliamentary democracy. The people I met are frank, straightforward, and action oriented. Anything longer than 6 months seems to be perceived as long term. This meant that with the right introduction it was possible to arrange meetings with high level officials at short notice. Like arranging a meeting with a deputy minister during lunch for later that afternoon. I did not get to see anything really from the city or the country, except from what I could see from the car that brought me from one office building to another, and from hotel to conference center.

Press coverage of Prime Minister opening the Kyrgyzstan Open Data Days, and my name tag in cyrillic

Towards the end of my stay the Open Data Days took place, for which many other open data people from Moldova, Georgia, Russia, USA, UK, Germany and France came on behalf of the World Bank. It was good fun to meet them, and together we pulled off a good program to kick start open data (also see World Economic Forum blog) in Kyrgyzstan. The Kyrgyz government adopted an e-governance strategy only last week, and open data is part of that new strategy. Our visit was therefore very timely. The first morning was spent explaining open data and sharing experiences with the Kyrgyz prime minister and full cabinet attending, followed by good discussions in the afternoon when we zoomed in on a slightly more practical level. There was quite a bit of press interest, and I had the opportunity to get misquoted in the Kyrgyz press. The second day we did workshops with civil society organizations, and the business community, followed by a developer meet-up in the evening. Two more meetings on the last day completed my program, before the 10 hour flight back home.

Mountainview on a clear morning from my hotel room, and a group photo at the end of the Open Data Days

Bishkek is only a short distance from the mountains (the country’s highest peak is over 7000m), which on clear mornings form a great backdrop for the city. It was a snowy day when I left, so no views of the mountains as the plane took off. Instead I made triple selfies with Victoria from Moldova, and Vitaly from St. Petersburg in departures, as we were on the same flight back. Odd, spending a week 6000km away from home, and more or less no idea where you’ve been. I may return however in January and late spring, both to complete the assessment and to provide early implementation support.

Audit Authorities and Open Data

Last Friday I participated in a study day of the Dutch and Belgian audit authorities (the Algemene Rekenkamer and the Rekenhof). Topic of discussion was how open data can play a role in audit work.

Noël van Herreweghe, the open data program manager of the Flemish government, first sketched the situation of open data in Flanders. Afterwards I talked about the current status of open data in the Netherlands, and the lessons learned about doing open data well from the past years. (see my slides embedded below)

A few elements that I think are relevant in the context of the work of audit authorities are:

  • current open data is mostly about what government knows, not about what government does. The latter is what matters to auditors however. More transactional data is needed, maybe from the back-end of e-government services.
  • open data can be a pre-hypothesis tool, showing patterns that generate questions, give direction to audits, or help focus them on areas where it matters most.
  • open data can be used to assess impact of policies, also/specifically/even when the data is not directly describing a certain policy area, but serves as a proxy from further down the chain of causality.
  • And then there is the many-eyes aspect of open data of course: if there is a ‘scandal’ hiding in the data, it may be found more easily through increased eyeballs (although there might be more false positives/noise as well).

We split up in groups and rotated through three short workshops exploring these notions. One session where we tried to connect specific audit questions to open data sources that could contain pointers, and to the stakeholders involved. One session showing how free open source online tools can help clean up and explore data and show first patterns. One session with a quick routine to brainstorm indicators that can be proxies for a certain question. In this case we looked at proxy indicators for the quality of school buildings. The Dutch court of audit is currently doing a pilot involving the collection of opinions as well as pictures as part of an audit, concerning the quality of school buildings.

Open Data Roundtable in Kazakhstan

After arriving in Kazakhstan at 4AM, and a bit of rest, my first item on the schedule was key-noting at a roundtable of CIO and CTO level representatives of about a dozen CIS countries. The session was hosted at NITEC in the House of Ministries. The aim was to convey how open government data can be of value, and to provide a few starting points on which the participants see possibilities to act.

Dashboard of e-government metrics, in the hall of the House of Ministries

My World Bank colleagues Oleg Petrov and Mikhail Bunchuk presented the World Bank work, and the ways and instruments with which it can support open data efforts of the nations present.

Tair Sabyrgaliyev and Cornelia Amihalachioae presented the open data program of Kazakhstan and the impressive e-government and open data work of Moldova (which I had the opportunity to work on and experience first hand in 2012).

My own contribution was basically a compressed Open Data course, addressing the what, why and how. My slides are embedded below in both English and Russian. (During the session I used Russian slides.)

Cornelia Amihalachioae’s slides, which are well worth a read, are also shown below.