Today I gave a short presentation at the Citizen Science Koppelting conference in Amersfoort. Below are the transcript and the slide deck.

Using open data for citizen science, by Ton Zijlstra at Koppelting Mee Je Stad

I’ve worked on opening data, mainly with governments worldwide, for the past decade. I’ve been living in Amersfoort for two years, and since moving here I’ve been a participant in the Measure Your City network, with a sensor kit. I also run a LoRaWAN gateway to provide additional infrastructure to people wanting to collect sensor data. Today I’d like to talk to you about using open data: what it is, what exists, where to find it, and how to get it. I think it can be a useful resource in citizen science.

What is open data? It is data published by whoever collected it in such a way that anyone is permitted to use it, without any legal, technical or financial barriers.

This means an open license, such as Creative Commons 0, open standards, and machine readable formats.
Anyone can publish open data, simply by making it available on the internet. And plenty of people, academics, and companies do. But mostly open data means we’re looking at government for data.

That’s because we all have a claim on our government, we are all stakeholders. We already paid for the data as well, so it’s all sunk costs, while making it available to all as infrastructure does not increase the costs a lot. And above all: governments have many different tasks, and therefore lots of different data. Usually over many years and at relatively good quality.

The legal framework for open data consists of two parts. The national access to information rules, in NL the WOB, which says everything government has is public, unless it is not.
And the EU-initiated regulation on re-using, not just accessing, government material. That says everything that is public can be re-used, unless it can’t. Both these elements are passive: you need to request material.

A new law, the WOO, makes publication mandatory for more things. (For some parts publication is already mandated in laws, like in the WOB, the Cadastre law, and the Company Register.)

Next to that there are other elements that play a role. Environmental data must be public (Aarhus Convention), and INSPIRE makes it mandatory for all EU members to publish certain geographic data. A new EU directive is in the works, making it mandatory for more organisations to publish data, and for some key data sets to be free of charge (like the company register and meteo data).

Next to the legal framework there are active Dutch policies towards more open data: the Data Agenda and the Open Government action plan.

Open data is important because it allows people to do new things, and more importantly it allows new people, who did not have that access before, to do new things. It democratises data sources that were previously only available to a select few, often those big enough to be able to pay for access. This has now been a growing movement for 10-15 years.

That new agency has visible effects, economically and socially. In fact you probably already use open data on a daily basis without noticing. When you came here today by bike, you probably checked Buienradar, which is based on the open data of the KNMI. Whenever in Wikipedia you find additional facts in the right hand column, that information doesn’t come from Wikipedia but is often directly taken from government databases. The same is true for a lot of the images in Wikipedia, of monuments, historic events etc. They usually come from the open collections of national archives, etc.

When Google presents you with traffic density, like here the queues in front of the traffic lights on my way here, it’s not Google’s data. It’s government data, that is provided in near real-time from all the sensors in the roads. Google just taps into it, and anyone could do the same. You could do the same.

There are many big and small data sets that can be used for a new specific purpose. Like when you go to get gas for the car. You may have noticed at manned stations it takes a few seconds for the gas pump to start? That’s because they check your license plate against the make of the car, in the RDW’s open database. Or for small practical issues. Like when looking for a new house, how much sunshine does the garden get. Or can I wear shorts today (No!).

But more importantly for today’s discussion, it can be a powerful tool for citizen scientists as well. Take the public discussion about the Groningen earthquakes. Open seismological data from the KNMI allowed citizens to show that their intuition was right: the strength and frequency of quakes really was increasing. Or you can use it to explore the impact of certain things or policies, like analysing the usage statistics of the Utrecht bicycle parking locations. A key role open data can play is to provide context for your own questions. Core registers serve as infrastructure, key datasets on policy domains can be the source for your analysis. Or just a context or reference.
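To make the Groningen example a bit more concrete, here is a minimal sketch of that kind of check: count the quakes per year above a magnitude threshold and look at their average strength. The function name, the threshold and the sample values are my own assumptions for illustration. The numbers are made up and are not real KNMI measurements; in practice you would load the events from the published catalogue.

```python
from collections import defaultdict
from statistics import mean


def yearly_summary(events, min_magnitude=1.5):
    """Group (year, magnitude) pairs by year.

    Returns {year: (number of quakes, mean magnitude)} for all quakes
    at or above min_magnitude (the threshold is an assumed example value).
    """
    by_year = defaultdict(list)
    for year, magnitude in events:
        if magnitude >= min_magnitude:
            by_year[year].append(magnitude)
    return {
        year: (len(mags), round(mean(mags), 2))
        for year, mags in sorted(by_year.items())
    }


# Made-up sample values, purely illustrative (NOT real KNMI data).
sample_events = [
    (2001, 1.6), (2001, 1.2),
    (2002, 1.8), (2002, 2.0),
    (2003, 1.7), (2003, 2.4), (2003, 2.1),
]

summary = yearly_summary(sample_events)
for year, (count, avg) in summary.items():
    print(year, count, avg)
```

If both the count and the average go up year over year in the real data, you have turned an intuition into something you can show to others.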

Here is a range of examples. The AHN gives you heights of everything, buildings, landscape etc.
But it also allows you to track growth of trees etc. Or estimate if your roof is suitable for solar panels. This in combination with the BAG and the TOP10NL makes the 3D image I started with possible. It is constructed from multiple data sources: it is not a photograph but a constructed image.

The Sentinel satellites provide you with free high resolution data. Useful for icebreakers at sea, precision agriculture, forest management globally, flooding prevention, health of plants, and even to see if grasslands have been damaged by feeding geese or mice. Gas mains maintainer Stedin uses this to plan preventative maintenance on the grid, by looking for soil subsidence. Same is true for dams, dikes and railroads. And that goes for many other subjects. The data is all there. Use it to your advantage. To map your measurements, to provide additional proof or context, to formulate better questions or hypotheses.
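Using open data as context for your own measurements can be as simple as lining the two up. Below is a small sketch, assuming you have readings from your own sensor kit and readings from an official reference station, both keyed by the same time labels. The function name, the time labels and the values are hypothetical and only there to show the idea.

```python
def compare_with_reference(own, reference):
    """Return {time: own reading minus reference reading}.

    Only times that occur in both data sets are compared.
    """
    shared_times = sorted(own.keys() & reference.keys())
    return {t: round(own[t] - reference[t], 1) for t in shared_times}


# Hypothetical temperature readings in degrees Celsius (illustrative only).
own_readings = {"10:00": 21.4, "11:00": 22.9, "12:00": 24.1}
reference_readings = {"10:00": 20.8, "11:00": 22.0, "13:00": 23.5}

differences = compare_with_reference(own_readings, reference_readings)
for time_label, diff in differences.items():
    print(time_label, diff)
```

A consistent difference tells you something about your sensor or its location, and that is a better starting point for a question or hypothesis than your own readings alone.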

It can be used to build tools that create more insight. Here decision making docs are tied to locations. 38 Amersfoort council issues are tied to De Koppel, the area we are in now.

Maybe the data you need isn’t public yet. But it might be. So request it. It’s your right. Think about what data you need or that might be useful to you.
Be public about your data requests. Maybe we can form a Koppelting Data Team. Working with data can be hard and disappointing, doing it together goes some way to mitigate that.

[This post was created using a small hack to export the speaking notes from my slidedeck. Strangely enough, Keynote itself does not have such an option. Copying by hand takes time, by script it is just a single click. It took less than 10 minutes to clean up my notes a little bit, and then post the entire thing.]

This weekend the grassroots FabLab conference ‘Koppelting’ is taking place in Amersfoort, Netherlands. Together with Dirk van Vreeswijk I’ll be doing a session this morning on how to leave Gmail and other walled gardens.

In this session I try to summarize how I constructed my path out of Gmail, in such a way that it becomes a guide that may enable others to act for themselves. The talk explains why I wanted to leave Gmail, how I finally found a way, and which replacement solution(s) I now use. It ends with a ‘recipe’, based on how I found a way out of Gmail, to help you think about what keeps you in your own walled gardens, so it becomes easier to explore alternatives.

Outline and slides
Setting the scene:

  • Using Gmail since July 2004
  • 250,000 conversations, across 770,000 messages. 21 GB total.
  • For 12 years the central hub for all my personal and work e-mail

Why I wanted to leave

  • In part: everything was on US servers
  • In part: because Google, with my Gmail and all my other data, has a very extensive profile of me
  • But most of all: Gmail was a single point of failure. Losing access would mean losing everything concerning mail communications

How I left Gmail

  • From early 2014 started seriously considering it
  • Getting to action was hard as it is extremely easy to use what you have, to stick in your routine. Ease of use keeps you locked in
  • Finding “The Alternative” seemed impossible. Until I thought about the specific aspects that made Gmail so easy for me
    • Multiple addresses into 1 inbox
    • Cross device availability
    • Great filtering and tagging
    • A generic mail address as a throwaway address
    • Spam filtering
    • Large free storage space
    • Great search
  • Two core things stood out after making the list
    1. Gmail makes it easy to be lazy (piling not filing). I needed to treat myself to a better process: spend a few seconds now (delete, file, delegate), to save more time on search later
    2. What made Gmail great to me in 2004, is now widely available functionality and technology

What I have now
This is described in more detail in my earlier posting that triggered this session. For each item that made Gmail attractive to me I searched for an alternative. Recombining them into a new workflow is a viable alternative for my Gmail usage as a whole. Apart from the technology replacements, a key part is up-front contemplation and more continuous reflection on my working process. I’m a piler, not a filer, but adding a few seconds during e-mail triage to at least decide to put a message in a pile that is not my Inbox, makes all the difference.

The slides are available in PDF on this site, and will be embedded below (currently upload is failing).

Leaving a walled garden planning aid
Although the path for me leaving Gmail took quite a bit of time, I think the journey can be abstracted into a recipe to make it easier to spot your own path out of a walled garden (Gmail, Dropbox, etc.)
The basic steps are:

  1. Pick the walled garden you want to leave
  2. List all the things that make it so convenient for you
  3. Reflect on what that list tells you, about your process and your tools
  4. Find replacements for each element, then recombine them into your new workflow
  5. Share what you found and did, so it is easier for others to follow in your footsteps

The outline and collaborative notes from the session are online on one of the etherpads of the Koppelting conference.

About Koppelting
Koppel is an old Dutch word for communal fields, Ting a Germanic word for a meeting of the free. Organized by the Amersfoort FabLab, a fully opensourced bootstrapped FabLab, Koppelting is the annual grassroots festival about peer production and free/libre alternatives for society.

Germanic Ting, after the Marcus Aurelius column in Rome, public domain