Category Archives: Open Data

16 Months of Local Open Data

The culmination of over a year of work
Last month we concluded a project that started in November 2013. For the Province of North-Holland we worked with 9 municipalities to help them bring publishing open data into their normal routines.
We celebrated the end of the program with an afternoon conference of 125 participants in Amsterdam, sharing the experiences, the good, the bad, the splendid, the ugly.

The program
The program we designed is based on the notion that making open data part of normal operations requires learning by doing, and learning from others who are doing the same thing. Taking time for it, with us helping out on the work floor, also allows more colleagues to get involved and gives new knowledge time to settle.
To make sure open data is not something ‘extra’ a government does for others (‘nice to have’) we seek to position open data as a policy tool (‘need to have’), that helps governments to engage with stakeholders and impact their own policy goals.

Four phases
The preparation phase was aimed at finding a number of municipalities to participate. They had to allocate people and time to it, and there needed to be some current local policy issues that provided a possible angle for open data. We talked to about 16 local governments, and in the end 9 joined the program.

Three implementation phases were part of the plan: 1) find internal support, raise awareness, and find a suitable policy topic as context. 2) select and publish data, find initial external stakeholders, 3) engage with stakeholders, help them use the data, and make the publishing process permanent.
In practice these three phases weren’t discrete, but overlapped and never ‘finished’.

the plan

the reality

A year long execution phase
The execution phase of the program started in March 2014, with a high-energy kick-off event where some 60 civil servants and political functionaries participated, and the 9 participating municipalities and the province signed the ‘North-Holland Smarter’ Manifesto. (The manifesto was a beautiful side project by my colleague Frank and our artist in residence Ate: wooden panels with different data visualizations of the region, and laser-cut pixelated shapes of the participating local governments to place signatures on. An afternoon well spent in the local FabLab Protospace.)

the manifesto

The participating municipalities gathered for collective working days 4 times, and in between worked on their own. We helped out by providing guidance and examples, and by facilitating sessions and workshops, both internally with civil servants and externally with citizens, businesses and organizations. Each worked around a locally relevant theme, ranging from flash floods to new entrepreneurs in socially disadvantaged neighborhoods, from regional public transport for the elderly and schools, to financial transparency.

In the end most local governments started publishing data (not a small feat for non-urban local governments I’d say), and some moved it towards their line management successfully. A few first examples of seeing the data used exist, and most don’t see the end of the program as the end of their efforts but as the beginning.

Overall we included a few hundred civil servants and several dozen external stakeholders. Some of that work is still ongoing with our involvement, until May.

The final event

participants working together at final event

We ended the program with a final event to present our learnings. Some 125 people from across the Netherlands, representing local, regional and national government entities, came. We shared what we experienced roughly in the same way the program phases were designed, providing the participating municipalities with ample space to discuss what they had done and learned, what worked and what didn’t work at all. Some presented their data publishing platform, others their next steps, some recounted how they helped colleagues learn to use data better themselves, others how settling on a policy theme early didn’t work for them. Entrepreneurs and data re-users talked about how they work with data.

It was a good and informal way to convey the actual work involved and how some things can take more time than thought (usually the social aspects), while others turn out to be much easier than anticipated (usually the technical aspects).

the municipalities in North-Holland that are publishing open data (image Ruud Smith)

Getting past the hype
Plans and reality never match, and I think our program created the space and time for that to be ok and part of the journey. Our client with the Province said that to her the project was to help people move beyond the hype towards where open data is part of the normal way of doing things, and softening the ‘trough of disillusionment’ in the hype cycle. Judging by the quotes we collected from participants, we succeeded in doing that.

getting past the hype (image Isabel Brouwer)

Students’ Six Big Data Lessons

Students from a ‘big data’ minor at the local university of applied sciences presented their projects a few weeks ago. As I had done a session with them on open data as a guest lecturer, I was invited to the final presentations. Taken together, several things in those presentations stood out for me. I later repeated them to a different group of students at the Leeuwarden university of applied sciences, at the beginning of their week of working on local open data projects, as pitfalls for them to avoid. I thought I’d share them here too.

The projects students created
First of all let me quickly go through the presented projects. They were varied in types of data used, and types of issues to address:

  • A platform consulting Lithuanian businesses to target other EU markets, using migration patterns and socio-economic and market data
  • A route planner comparing car and train trips
  • A map combining buildings and address data with income per neighborhood from the statistics office to base investment decisions on
  • A project data mining Riot Games online game servers to help live-tweak game environments
  • A project combining retail data from Schiphol Airport with various other data streams (weather, delays, road traffic, social media traffic) to find patterns and interventions to increase sales
  • A project using the IMDb movie database and ratings to predict whether a given team and genre have a chance of success

Patterns across the projects
Some of these projects were much better presented than others, others were more savvy in their data use. Several things stood out:

1) If you make an ‘easy’ decision on your data source it will hurt you further down your development path.

2) If you want to do ‘big data’ be really prepared to struggle with it to understand the potential and limitations

To illustrate both those points:
The Dutch national building and address database is large and complicated, so a team had opted to use the ‘easier’ processed data set released by a geodata company. Later they realized that the ‘easier’ dataset was updated only twice per year (the actual source being updated monthly), and that they needed a different coordinates system (present in the source, not in the processed data) to combine it with the data from the statistical office.

Similarly the route planner team shied away from using the open realtime database on motorway traffic density and speed, opting for a derivative data source on traffic jams and then complaining that it came in a format they couldn’t really re-use and did not cover all the roads they wanted to cover.
That same project used Google Maps, which is a closed data source, whereas a more detailed and fully open map is available. Google Maps comes with neat pre-configured options and services, but in this case they were a hindrance, because they do not allow anything outside of what they offer.

3) You must articulate and test your own assumptions

4) Correlation is not causation (duh!)

The output you get from working with your data is colored by the assumptions you build into your queries. Yes, average neighbourhood income can likely be a predictor for certain investment decisions, but is there any indication that is the case for your type of investment, in this country? Is entering the Swedish market different for a Lithuanian company than for, let’s say, a Greek one? What does that say about the usefulness of your data source?

Data will tell you what happened, but not why. If airport sales of alcohol spike whenever a flight to Russia arrives or leaves (actual data pattern), can that really be attributed to the 200-300 people on that plane, or are other factors at work that may not be part of your data (intercontinental flights, for instance, that have roughly the same flight schedule but are not in the data set)?

Are you playing around enough with the timeline of your data to detect e.g. seasonal patterns (like we see in big city crime), zooming out and zooming in enough to notice that what seems a trend maybe isn’t?
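To make the zooming in and out concrete, here is a minimal sketch in Python. The monthly counts are invented for illustration and are not taken from any real crime data set; the function names are my own. It shows how a rise within one year can look like a trend while the yearly totals stay flat.

```python
from collections import defaultdict

# Hypothetical monthly incident counts, keyed by (year, month).
# Invented numbers: both years share the same summer peak.
def make_counts():
    seasonal = [80, 82, 90, 100, 115, 130, 140, 135, 115, 100, 88, 80]
    return {(year, month + 1): value
            for year in (2012, 2013)
            for month, value in enumerate(seasonal)}

def yearly_totals(counts):
    """Zoomed-out view: one total per year."""
    totals = defaultdict(int)
    for (year, _month), value in counts.items():
        totals[year] += value
    return dict(totals)

def window_change(counts, year, first_month, last_month):
    """Zoomed-in view: change between two months of one year."""
    return counts[(year, last_month)] - counts[(year, first_month)]

counts = make_counts()
january_to_july = window_change(counts, 2013, 1, 7)  # looks like a steep rise
totals = yearly_totals(counts)                       # identical year over year
print(january_to_july, totals)
```

Looking only at January to July suggests growth, while comparing whole years shows the same pattern simply repeating.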

5) Test your predictions, use your big data on yourself

The ‘big’ part of big data is that you are not dealing with a snapshot or a small subset (N= is a few) but with a complete timeline of the full data set (N = all). This means you can and need to test your model / algorithm / great idea on your own big data. If you think you can predict the potential of a movie, given genre and team, then test it with a movie from 2014 where you know the results (as they’re in your own dataset) on the database from before 2014 and see if your algorithm works. Did Lithuanian companies that already have entered the Swedish market fail or flourish in line with your data set? Did known past interventions into the retail experience have the impact your data patterns suggest they should?
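As a minimal sketch of what such a test could look like, assuming a tiny invented list of films (not real IMDb data) and a deliberately naive ‘model’ of my own: fit on everything before 2014, then check the predictions against the 2014 outcomes you already know.

```python
from collections import defaultdict

# Hypothetical records: (year, genre, was_success). Invented for illustration.
MOVIES = [
    (2011, "drama", True), (2011, "horror", False),
    (2012, "drama", True), (2012, "horror", False),
    (2012, "horror", False), (2013, "drama", False),
    (2013, "drama", True), (2013, "horror", True),
    (2014, "drama", True), (2014, "horror", False),
    (2014, "drama", False),
]

def fit_genre_rates(rows):
    """Naive model: share of successful films per genre in the training rows."""
    wins, totals = defaultdict(int), defaultdict(int)
    for _year, genre, success in rows:
        totals[genre] += 1
        wins[genre] += int(success)
    return {genre: wins[genre] / totals[genre] for genre in totals}

def predict(rates, genre):
    # Predict success when more than half of earlier films in the genre succeeded.
    return rates.get(genre, 0.0) > 0.5

def backtest(rows, holdout_year):
    """Train on years before holdout_year, score predictions on holdout_year."""
    train = [r for r in rows if r[0] < holdout_year]
    test = [r for r in rows if r[0] == holdout_year]
    rates = fit_genre_rates(train)
    hits = sum(predict(rates, genre) == success for _y, genre, success in test)
    return hits / len(test)

accuracy = backtest(MOVIES, 2014)
print(f"share of 2014 films predicted correctly: {accuracy:.2f}")
```

The point is the split by time, not the model: whatever algorithm you use, it should only see data from before the period you test it on.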

6) Your data may be big, but does it contain what you need?

One thing I notice with government data is that most data is about what government knows (number of x, maps, locations of things, environmental measurements etc), and much less about what government does (decisions made, permits given, interventions made in any policy area). Often those are not available at all in data form but hidden somewhere in wordy meeting minutes or project plans. Financial data on spending and procurement is what comes closest to this.

Does your big data contain the things that tell what various actors around the problem you try to solve did to cause the patterns you spot in the data? The actual transactions of liquor stores connected to Russian flights’ boarding passes? The marketing decisions and their reasons for the Schiphol liquor stores? The actions of Lithuanian companies that tried different EU markets and failed or succeeded?

Issue-driven, not data-driven, and willing to do the hard bits
It was fun to work with these students, and there is a range of other things that come into play: technical savviness, statistical skills, a real understanding of what problem you are trying to solve. It’s tempting to be data-driven rather than issue-driven, even though being issue-driven brings more value in the end. With the former the data you have is always the right data, but with the latter you must acknowledge the limitations of your data and your own understanding.

Like I mentioned I used these lessons in a session for a different group of students in a different city, Leeuwarden. There a group worked for a week on data-related projects to support the city’s role as cultural capital of Europe in 2018. The two winning teams there both stood out because they had focussed very much on specific groups of people (international students in Leeuwarden, and elderly visitors to the city), and really tried to design solutions starting with the intended user at the center. That user-centered thinking really turned out to be the hardest part. Especially if you already have a list of available data sets in front of you. Most of the teacher’s time was spent on getting the students to match the datasets to use cases, and not the other way around.

A week in Kyrgyzstan on Open Data

I spent a bit more than a week in Kyrgyzstan, at the invitation of the Kyrgyz prime minister and on behalf of the World Bank, to start an open data readiness assessment and present and facilitate at the Kyrgyz Open Data Days.
Kyrgyzstan is a lower middle income country, with a parliamentary democracy. The people I met are frank, straightforward, and action oriented. Anything longer than 6 months seems to be perceived as long term. This meant that with the right introduction it was possible to arrange meetings with high level officials at short notice. Like arranging a meeting with a deputy minister during lunch for later that afternoon. I did not get to see anything really from the city or the country, except from what I could see from the car that brought me from one office building to another, and from hotel to conference center.

Press coverage of Prime Minister opening the Kyrgyzstan Open Data Days, and my name tag in cyrillic

Towards the end of my stay the Open Data Days took place, for which many other open data people from Moldova, Georgia, Russia, USA, UK, Germany and France came on behalf of the World Bank. It was good fun to meet them, and together we pulled off a good program to kick start open data (also see World Economic Forum blog) in Kyrgyzstan. The Kyrgyz government adopted an e-governance strategy only last week, and open data is part of that new strategy. Our visit was therefore very timely. The first morning was spent explaining open data and sharing experiences with the Kyrgyz prime minister and full cabinet attending, followed by good discussions in the afternoon when we zoomed in on a slightly more practical level. There was quite a bit of press interest, and I had the opportunity to get misquoted in the Kyrgyz press. The second day we did workshops with civil society organizations, and the business community, followed by a developer meet-up in the evening. Two more meetings on the last day completed my program, before the 10 hour flight back home.

Mountainview on a clear morning from my hotel room, and a group photo at the end of the Open Data Days

Bishkek is only a short distance from the mountains (the country’s highest peak is over 7000m), which on clear mornings form a great backdrop for the city. It was a snowy day when I left, so no views of the mountains as the plane took off. Instead I made triple selfies with Victoria from Moldova, and Vitaly from St. Petersburg in departures, as we were on the same flight back. Odd, spending a week 6000km away from home, and having more or less no idea where you’ve been. I may return however in January and late spring, both for completing the assessment and for providing early implementation support.

Audit Authorities and Open Data

Last Friday I participated in a study day of the Dutch and Belgian audit authorities (the Algemene Rekenkamer and the Rekenhof). Topic of discussion was how open data can play a role in audit work.

Noël van Herreweghe, the open data program manager of the Flemish government, first sketched the situation of open data in Flanders. Afterwards I talked about the current status of open data in the Netherlands, and the lessons learned about doing open data well from the past years. (see my slides embedded below)

A few elements that I think are relevant in the context of the work of audit authorities are:

  • current open data is mostly about what government knows, not about what government does. The latter is what matters to auditors however. More transactional data is needed, maybe from the back-end of e-government services.
  • open data can be a pre-hypothesis tool, showing patterns that generate questions or give direction to/ help focus audits on areas where it matters most.
  • open data can be used to assess impact of policies, also/specifically/even when the data is not directly describing a certain policy area, but serves as a proxy from further down the chain of causality.

  • And then there is the many-eyes aspect of open data of course: if there is a ‘scandal’ hiding in the data, it may be found more easily through increased eyeballs (although there might be more false positives/noise as well).

    We split up in groups and rotated through three short workshops exploring these notions. In one session we tried to connect specific audit questions to open data sources that could contain pointers, and to the stakeholders involved. One session showed how free open source online tools can help clean up and explore data and show first patterns. One session offered a quick routine to brainstorm indicators that can be proxies for a certain question. In this case we looked at proxy indicators for the quality of school buildings. The Dutch court of audit is currently doing a pilot involving the collection of opinions as well as pictures as part of an audit concerning the quality of school buildings.

    Open Data Roundtable in Kazakhstan

    After arriving in Kazakhstan at 4AM, and a bit of rest, my first item on the schedule was key-noting at a roundtable of CIO and CTO level representatives of about a dozen CIS countries. The session was hosted at NITEC in the House of Ministries. The aim was to convey how open government data can be of value, and to provide a few starting points that the participants could see possibilities to act on.

    Dashboard of e-government metrics, in the hall of the House of Ministries

    My World Bank colleagues Oleg Petrov and Mikhail Bunchuk presented the World Bank work, and the ways and instruments with which it can support open data efforts of the nations present.

    Tair Sabyrgaliyev and Cornelia Amihalachioae presented the open data program of Kazakhstan and the impressive e-government and open data work of Moldova (which I had the opportunity to work on and experience first hand in 2012).

    My own contribution was basically a compressed Open Data course, addressing the what, why and how. My slides are embedded below in both English and Russian. (During the session I used Russian slides.)

    Cornelia Amihalachioae’s slides, which are well worth a read, are also shown below.

    At the Global e-Gov Forum with the World Bank, in Kazakhstan

    These past days I was in Astana, Kazakhstan. Next to enjoying the tremendous hospitality of the Kazakhs, and being impressed with their sense of pride and urge to succeed, I spent my time sharing my open government data experiences of the past 6 years.

    The World Bank asked me to keynote at a roundtable with CIO and CTO level officials of a dozen or so CIS countries, at the Kazakh national information technology unit (NITEC) in the House of Ministries (a gigantic building).

    The Kazakh holodeck
    At the CIO/CTO of CIS countries roundtable in what seemed the Star Trek Enterprise command deck

    In the two days after that, at the invitation of the Kazakh government and the ICT Development Fund, I contributed to the Global e-Gov Forum 2014. It is the third event of its kind, the first two having taken place in Korea (the next ones will be in Kazakhstan and Singapore). At the conference I contributed to two workshops. One was for UNDESA, presenting how our current project with the Province of North Holland on open data is a catalyst for civic engagement and (e-)participation. The key message was that publishing data is an intervention in your policy area, one that not only addresses the information asymmetry between me and my government, but also provides me with a tool to act differently on my own behalf. Both these elements ultimately impact government policy goals, which is a basis for a government’s intrinsic motivation to do open data well. Transparency builds trust, and putting data on the table enables frank conversations that would otherwise not be possible. The other was basically the same message, this time in a panel discussion that also included the CIO of the Dutch Ministry for Interior Affairs, the department in charge of e-government and open government. That made for a nice combination with both overlap and contrasts, juxtaposing national policy with the perspective of individual civil servants trying to do things in practice.

    Discussing the operational aspects and impact of using open data for civic engagement

    I also chaired the final panel discussion on open government and open data, with contributions from the UN, the French and Kazakh national open data units (ETALAB & NITEC), and a research firm. With a thousand people from almost 80 countries this was a great event to exchange experiences, and I heard a range of great stories from Uruguay to Kenya, from Barbados to Bangladesh, from Estonia to Vietnam.

    In panel discussion, and the UNDESA workshop room

    Being a VIP guest of the Kazakh government took a bit of getting used to, as I am usually one to arrange my own things. Having a personal assistant plus a dedicated driver with a very luxurious car at my disposal for three days full time does however have its advantages. As does being whisked through airport and border security in under 5 minutes both ways. It meant being able to fully focus on delivering value to the various sessions and audiences, and engage in meaningful conversations, without having to worry about any of the logistics.

    At the conf, my asst Ilyas on left Expert on open data
    Registration desk, and my front row seat

    Financial Transparency in the Netherlands, an overview

    Recently I participated in a session of the Dutch permanent parliamentary commission on national spending, discussing open financial data. A good reason to give a quick update on open government spending data in the Netherlands.

    Current status of open spending
    Let me give you a general overview of open spending in the Netherlands first. As you can see in the Open Data Census, open spending data is the single biggest missing chunk of data in the Netherlands. The national budget has been available as open data since 2012, thanks to the work of the Dutch national audit office, but only on an aggregated level. The Ministry for Foreign Affairs has been publishing transaction level data on international aid since 2012 as part of IATI, and is the only Dutch public sector body doing this. On a local level some aggregated spending data is available through the Open State Foundation’s project openspending.nl. In the past months I have gathered local spending data from 25 local councils, and provided it to this project to make comparisons across local governments possible. In a current project with the Province of North-Holland, we are, in collaboration with 10 local governments, aiming to open up the spending data of 50+ local councils. There is no requirement, unlike in the UK, for government bodies to publish open spending data.

    The session took place in the old plenary meeting room of the Parliament

    National Audit Authority: Forwards with open spending!
    President of the National Audit Authority Saskia Stuiveling had the clearest message during the parliamentary committee meeting, in terms of general outlook as well as leading by example. Even for the audit authority it is often hard to get the right data to properly audit government spending. Opening up spending data by default will help them to concentrate on those parts of public policy where it matters most, e.g. health care spending. To lead by example the audit authority has opened up their own spending data this spring. They also published a ‘Trend Report Open Data‘ tracking the open data efforts of all Ministries, and urging them to do more. Opening up data is becoming a standard advice given in all their audit reports. In other words they are building up pressure for Ministries to do more. (disclosure: I worked with the audit authority on the trend report open data)

    Foreign Affairs: Open spending is useful instrument
    The Ministry for Foreign Affairs presented itself as a proponent of more financial transparency. Having started publishing open spending data on international development in 2012, they will be launching a Tableau-based viewer for that data on June 11th, which includes the possibility to drill down to project-level information and can link to external sources such as project descriptions published by NGOs. A viewer like this replaces yearly paper-based reporting, is a step towards visualizing impact and not just spending, and is a means to motivate more NGOs towards greater spending transparency.

    Finance Ministry: following Audit Authority’s lead
    The Finance Ministry until now has done little towards open spending, but during the session in the Parliament they showed how the work done by the audit authority mentioned above has prodded them into action as well. Triggered by the open data trend report last March, they have now opened up aggregated spending for the first time (update from Rense Posthumus in the comments: data is located at opendata.rijksbegroting.nl). Also the Finance Ministry announced that subsidies data and basic financial data of independent government agencies is available in a viewer in sneak preview, though no URL was given yet. It wasn’t indicated when this would be made publicly available. (UPDATE: see comment by Rense Posthumus) The plan to publish departmental spending for all ministries by 2016 was announced, but made dependent on ‘creating a standard reporting method’ first. That met with resistance in the audience: if the data is good enough for the Finance Ministry to work with, why isn’t it good enough to publish? That argument did seem to resonate with the Ministry director present.

    Interior Affairs: very disappointing
    A very disappointing contribution was made by the Ministry for the Interior’s deputy director-general. This Ministry is nominally responsible for the open government and open data efforts of the government, as well as in the lead to reform the Freedom of Information Act in light of the new EU Directive on the re-use of public sector information, but in this session showed a shocking lack of vision and no will to act. In 20 minutes nothing was said about open government at all, leaving the attending Members of Parliament confused. Even the actions the Ministry has taken, such as the launch of the national data portal in 2011, and joining the Open Government Partnership (albeit with an Action Plan that adroitly avoids formulating action), weren’t mentioned. From this presentation one can only conclude that nothing much can be expected from this Ministry in the near future. This means other public sector bodies are left largely to their own devices, which is a shame as it means lots of time will be lost clearing up confusion and raising the general level of knowledge on how to do open government data well. The Ministry for the Interior, being in charge of the open government dossier, is the only one inside government that could claim the much needed role of ‘lighthouse’ and beacon for established good practice, but they’re not on the ball, nor do they seem to aim to be.

    OECD Regional Well-Being Index

    At Re:Publica in a session on data visualization to make sense of globalization, the release of a very cool dataviz project was announced for next week: The OECD Regional Well-Being Index. ‘Truth and beauty operator’ Moritz Stefaner, who contributed to the visual aspects, made this announcement during the session and gave a sneak preview.

    It is a follow-up of the OECD Better Life Index (also very cool), and a new incarnation of the statistical regional explorer.

    What it allows you to do is explore regional data, on the basis of what you deem relevant, and then find out which regions in other OECD countries have similar profiles. This is important, as until now OECD data was mostly presented on national level, but the more profound differences are usually found within a country, or when comparing regions, not countries.

    If you do such a comparison for Berlin, as shown in the pictures, you find out why Peter Rukavina likes Berlin so much: it is statistically similar to his home Prince Edward Island, just more urban and with a wider variety of things on offer.

    Berlin, with Prince Edward Island mentioned as similar region

    PEI, statistically similar to Berlin

    The existing OECD Better Life Index is already a great and beautiful project. It moves away from ranking countries, as that has no real meaning (in the sense of scope of interventions or policy consequences). You can create your own set of important indicators, and your choice as well as those of other visitors is used again as data to improve the visualization of the project itself. The top layer of the index is playful, and doesn’t throw all of the statistics in your face at the beginning. If you want you can dig much deeper and get much richer detailed numbers.

    For more OECD data visualizations see their Data Lab. Also check out the dataviz portfolio of Moritz Stefaner, who created the key elements of the OECD visualizations.

    Update on Local Spending Data FOIA Requests

    Four weeks ago I asked all 25 municipalities in my Province for their spending data, as reported in so called IV3 files to the Dutch national statistics office. As all municipalities use the same format, this makes it possible to compare spending and budgets across communities, for instance as is done at openspending.nl

    Because I asked 25 government bodies the same question at the same time, it also makes for interesting comparisons on how each of them deals with requests for information, and how that compares to the legal obligations in place in the Freedom of Information Act (WOB, FOIA).

    Today is day 28, which is the end of the initial period within which, by law, government bodies have to respond to requests. So how did the 25 municipalities do?

    As of today I have received 15 out of 25 requested data sets (60%). The shortest response time was 4 days, and the last week, as the deadline was approaching, saw most activity.
    Just over half (14 out of 25, 56%) turn out to only accept FOIA requests on paper, and not through e-mail. This is a mostly unnecessary obstacle, meant to reduce the number of citizen requests received, and especially to prevent requests being overlooked and thus incurring penalties.

    Five municipalities have announced postponing their answer with (the legally defined) additional 4 weeks. Four have a few days of the first 4 weeks remaining (the days used for me responding on paper where the original e-mail wasn’t accepted). One municipality is now officially late.
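For anyone who wants to keep a similar tally, here is a small sketch in Python using only the counts mentioned in this post. The dictionary labels and the `share` helper are my own naming, not part of any official FOIA tooling.

```python
# Counts as reported in this post on day 28 of the requests.
TOTAL = 25
status = {
    "data received": 15,
    "postponed by 4 weeks": 5,
    "days remaining (request re-sent on paper)": 4,
    "late": 1,
}
paper_only = 14  # municipalities that only accept FOIA requests on paper

def share(part, whole):
    """Percentage of the whole, rounded to a whole number."""
    return round(100 * part / whole)

# Every municipality should fall into exactly one status.
assert sum(status.values()) == TOTAL

summary = {label: share(count, TOTAL) for label, count in status.items()}
paper_share = share(paper_only, TOTAL)

for label, percentage in summary.items():
    print(f"{label}: {status[label]} of {TOTAL} ({percentage}%)")
print(f"paper only: {paper_only} of {TOTAL} ({paper_share}%)")
```

The check that the statuses add up to 25 is the useful part: it catches a municipality being counted twice or forgotten.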

    All in all a pretty good result thus far, in my opinion.

    An Exercise In Freedom of Information: Local Spending Data

    I have approached all 25 municipalities in my province with a freedom of information (FOIA) request for local spending data. This is a little side project that serves two purposes:

  • Bringing together spending data for the entire region
  • Establishing the FOIA readiness and processes of municipalities


    Where does my money go? The first financial transparency open data project.

    OpenSpending: getting local spending data
    The main trigger for this is the OpenSpending project which exists as a global project, but also has a separate national Dutch clone at openspending.nl by the Open State Foundation. All Dutch municipalities report their spending and revenue in a fixed format, called IV3, to the Dutch Statistics Office CBS on a quarterly basis. If this data would be available for all municipalities, it would enable great comparison opportunities. Right now, only the data for the city of Amsterdam is available.

    So last October I did a FOIA request in my home town Enschede, to get the spending data, and promptly received it within a week. That data is now findable through the Enschede city data portal. Now that openspending.nl announced it is ready for more data, I decided to try and get some for my entire region. Last Monday I sent out 24 FOIA requests to municipalities in my province for their IV3 files.

    FOIA readiness and process assessment
    Now that I have sent out 24 identical FOIA requests for spending data, and have the original one as a benchmark, this provides a good opportunity to compare the way municipalities deal with FOIA requests. That is the second purpose of this exercise.

    I will track the progress of my 24 FOIA requests, and document the results. Thus far 5 out of 24 have let me know their digital communication path is closed for FOIA, so I have posted letters to those. One (1) municipality quickly confirmed my request, properly recognizing it as a FOIA request and stating it had been forwarded to the right person internally; a handful of others automatically confirmed receipt of my e-mail.