Brakman poem
A poem by Willem Brakman on the university’s steps: philosophy makes sense, science explains. But art shows, shows what it can’t say.

I facilitated two unconferences this week, on Monday and Thursday. The Industrial Design professorate at Saxion University of Applied Sciences in Enschede celebrates its 15th anniversary this year. Karin van Beurden, who has led the professorate from the start, wanted a celebratory event: not to look back, but to look forward to the next 15 years. She also wanted to do it in a slightly unconventional way. Karin had participated in one of our birthday unconferences, and asked me to help her shape the event. Over the past two months, Karin, her colleague Nienke and I collaborated on this. The format was unconventional in the eyes of the university’s board, as well as for the network Karin invited, so we had some explaining and managing of expectations to do in the run-up to the event.

When the professorate started, the theme of Karin’s inaugural speech was how “oysters turn their irritants into pearls”. Now, after 15 years, it was time to look not just at the pearls created in that period, but mostly at what the pearls of the future would be, and thus at the issues of today. Under this broad theme some 50 people participated in the unconference, and it was a pleasure to facilitate the process.

After opening up the space, making everyone feel at ease and explaining the process, we created a program for the afternoon in BarCamp style: 15 sessions across four spaces, in a two-hour schedule.

the program on a whiteboard

What followed (the way I experienced it) was a carousel of amazing stories, ranging from financing challenges for research projects and discussions on enabling alternative energy provision, to the psychological impact of reframing breast prostheses from a medical issue into a fashion issue, and the use of 3D printing to reduce the time needed in operating rooms. Afterwards we had a pleasant bbq and further conversations nearby, and during the train ride back I had another good conversation with one of the participants. It was a pleasant day to be back in Enschede.

FabLab Session
A session in the FabLab Enschede space

Discussing the energy grid
Using pluggable hexagons to discuss energy grid issues

Medical 3D prints
3D-printed elements for bone reconstruction

What stood out for me was how various participants, encountering the format for the first time, immediately realised its potential for their own work. The university’s chair mentioned she would like to do this with her board, to more freely explore issues and options for the university. A professor remarked it might be a good way to have better, more varied project evaluation sessions with students in his courses. Also, judging by the conversations I had, we apparently succeeded in creating a space and set-up that felt safe enough for a range of very personal stories and details to be shared.

As I had a few minutes before my train left, I got to visit our favourite ice cream parlor in Enschede, our home town until 2 years ago. We haven’t found a comparably good ice cream vendor in Amersfoort.

(At CaL earlier this month in Canada, someone asked me if I did unconference facilitation as work. I said no, but then realised I had two events lined up this week putting the lie to that ‘no’. This week E suggested we might start offering training on how to host and facilitate an unconference.)

Students from a ‘big data’ minor at the local university of applied sciences presented their projects a few weeks ago. As I had done a guest lecture session with them on open data, I was invited to the final presentations. Taken together, several things stood out for me from those presentations. I later repeated them, as pitfalls to avoid, to a different group of students at the Leeuwarden university of applied sciences at the beginning of their week of working on local open data projects. I thought I’d share them here too.

The projects students created
First, let me quickly go through the presented projects. They varied in the types of data used and the types of issues addressed:

  • A platform advising Lithuanian businesses on targeting other EU markets, using migration patterns plus socio-economic and market data
  • A route planner comparing car and train trips
  • A map combining buildings and address data with income per neighborhood from the statistics office to base investment decisions on
  • A project data mining Riot Games online game servers to help live-tweak game environments
  • A project combining retail data from Schiphol Airport with various other data streams (weather, delays, road traffic, social media traffic) to find patterns and interventions to increase sales
  • A project using the IMDb movie database and its ratings to predict whether a given team and genre have a chance of success

Patterns across the projects
Some of these projects were much better presented than others, and some were more savvy in their data use. Several things stood out:

1) If you make an ‘easy’ decision on your data source it will hurt you further down your development path.

2) If you want to do ‘big data’, be prepared to really struggle with it to understand its potential and limitations.

To illustrate both those points:
The Dutch national building and address database is large and complicated, so one team had opted to use an ‘easier’ processed data set released by a geodata company. Later they realised that the ‘easier’ dataset was updated only twice per year (while the actual source is updated monthly), and that they needed a different coordinate system (present in the source, but not in the processed data) to combine it with the data from the statistical office.
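That coordinate-system trap is easy to stumble into: Dutch geodata typically comes in RD coordinates (metres), while other sources use WGS84 longitude/latitude (degrees), and joining the two raw produces nonsense. A minimal sketch of a guard against this, with illustrative value-range heuristics and made-up example points:

```python
# Hypothetical sanity check before a spatial join: guess whether points look
# like Dutch RD (Rijksdriehoek, EPSG:28992, metres) or WGS84 (EPSG:4326,
# degrees). The thresholds and example coordinates are illustrative only.

def guess_crs(x: float, y: float) -> str:
    """Heuristic: WGS84 lon/lat stay within -180..180 / -90..90 degrees,
    while RD x runs roughly 0-300,000 m and y roughly 300,000-650,000 m."""
    if -180 <= x <= 180 and -90 <= y <= 90:
        return "wgs84"
    if 0 <= x <= 300_000 and 300_000 <= y <= 650_000:
        return "rd"
    return "unknown"

def check_join_compatibility(points_a, points_b) -> bool:
    """Refuse to spatially join two point sets in different coordinate systems."""
    crs_a = {guess_crs(x, y) for x, y in points_a}
    crs_b = {guess_crs(x, y) for x, y in points_b}
    return crs_a == crs_b and "unknown" not in crs_a

# Address points in RD metres vs. neighbourhood centroids in WGS84 degrees:
addresses = [(155000.0, 463000.0)]   # a point in RD metres
neighbourhoods = [(5.38, 52.15)]     # roughly the same spot in lon/lat degrees
print(check_join_compatibility(addresses, neighbourhoods))  # False: transform first
```

In a real project you would transform one set into the other’s coordinate system (e.g. with a projection library) rather than just refuse the join, but detecting the mismatch early is the point here.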

Similarly, the route planner shied away from using the open realtime database on motorway traffic density and speed, opting for a derivative data source on traffic jams, and then complained that it came in a format they couldn’t really re-use and didn’t cover all the roads they wanted to cover.
That same project used Google Maps, which is a closed data source, whereas a more detailed and fully open map is available. Google Maps comes with neat pre-configured options and services, but in this case those were a hindrance, because they don’t allow anything outside of them.

3) You must articulate and test your own assumptions

4) Correlation is not causation (duh!)

The output you get from working with your data is coloured by the assumptions you build into your queries. Yes, average neighbourhood income may well be a predictor for certain investment decisions, but is there any indication that that is the case for your type of investment, in this country? Is entering the Swedish market different for a Lithuanian company than for, say, a Greek one? And what does that say about the usefulness of your data source?

Data will tell you what happened, but not why. If airport sales of alcohol spike whenever a flight to Russia arrives or leaves (an actual data pattern), can that really be attributed to the 200–300 people on that plane, or are other factors at work that may not be part of your data (intercontinental flights, for instance, that have roughly the same flight schedule but are not in the data set)?
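The classic shape of this trap is a hidden third factor driving both series. A deliberately synthetic sketch, with all numbers made up, showing two variables that correlate perfectly even though neither causes the other:

```python
# "Correlation is not causation" in miniature: two series that are both
# driven by a hidden third factor (temperature) correlate strongly even
# though neither causes the other. All figures are invented.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hidden driver: daily temperature over a week.
temperature = [5, 10, 15, 20, 25, 30, 35]
# Both depend on temperature, not on each other:
ice_cream_sales = [t * 3 + 20 for t in temperature]
people_at_beach = [t * 8 + 5 for t in temperature]

r = pearson(ice_cream_sales, people_at_beach)
print(round(r, 3))  # 1.0: perfectly correlated, yet neither causes the other
```

If your data set contains the confounder (here: temperature), you can at least control for it; if it doesn’t, as with the intercontinental flights above, the spurious pattern is all you see.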

Are you playing around enough with the timeline of your data, to detect e.g. seasonal patterns (like we see in big city crime)? Are you zooming out and zooming in enough to notice that what seems like a trend maybe isn’t?
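Zooming out can be as simple as aggregating the timeline at a coarser grain. A minimal synthetic sketch, with invented figures, where monthly averages over several years reveal a summer peak that any short snapshot would miss:

```python
# Zooming out on a timeline: monthly averages across years reveal a
# seasonal pattern that a short snapshot would miss. Figures are invented.
from collections import defaultdict

# (year, month, incidents) over three years, with summer peaks baked in
# and a little year-to-year noise.
records = [
    (year, month, 100 + (30 if month in (6, 7, 8) else 0) + year % 3)
    for year in (2015, 2016, 2017)
    for month in range(1, 13)
]

by_month = defaultdict(list)
for _, month, value in records:
    by_month[month].append(value)

monthly_mean = {m: sum(v) / len(v) for m, v in by_month.items()}
overall_mean = sum(v for _, _, v in records) / len(records)

# Months whose multi-year average clearly exceeds the overall average:
seasonal_months = [m for m, avg in monthly_mean.items() if avg > overall_mean + 10]
print(sorted(seasonal_months))  # [6, 7, 8]: the summer peak shows once you zoom out
```

The same aggregation at a different grain (per week, per year) is how you check whether an apparent trend survives zooming in and out.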

5) Test your predictions, use your big data on yourself

The ‘big’ part of big data is that you are not dealing with a snapshot or a small subset (N = a few) but with a complete timeline of the full data set (N = all). This means you can, and need to, test your model / algorithm / great idea on your own big data. If you think you can predict the potential of a movie given its genre and team, then test it: take a movie from 2014 whose results you already know (they’re in your own dataset), run your algorithm against the database as it was before 2014, and see whether it predicts those results. Did Lithuanian companies that already entered the Swedish market fail or flourish in line with your data set? Did known past interventions in the retail experience have the impact your data patterns suggest they should?
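A minimal backtesting sketch along those lines: hold out the most recent year, "train" a naive predictor on everything before it, then score the held-out year. The movies, genres and ratings below are entirely made up, and the mean-rating-per-genre "model" is just a stand-in for whatever the real predictor would be:

```python
# Backtest a predictor on your own historical data: train on pre-cutoff
# records, score on the held-out year. All movies and ratings are made up.

movies = [
    {"year": 2010, "genre": "drama",  "rating": 7.0},
    {"year": 2011, "genre": "drama",  "rating": 7.4},
    {"year": 2012, "genre": "action", "rating": 6.0},
    {"year": 2013, "genre": "action", "rating": 6.4},
    {"year": 2014, "genre": "drama",  "rating": 7.1},
    {"year": 2014, "genre": "action", "rating": 6.1},
]

cutoff = 2014
train = [m for m in movies if m["year"] < cutoff]
held_out = [m for m in movies if m["year"] >= cutoff]

# Naive "model": average rating per genre in the training period.
ratings_by_genre = {}
for m in train:
    ratings_by_genre.setdefault(m["genre"], []).append(m["rating"])
genre_mean = {g: sum(r) / len(r) for g, r in ratings_by_genre.items()}

# Score the held-out year with mean absolute error.
errors = [abs(genre_mean[m["genre"]] - m["rating"]) for m in held_out]
mae = sum(errors) / len(errors)
print(round(mae, 2))  # 0.1: small here only because the toy data is well-behaved
```

The point is the split, not the model: because N = all, the ground truth for past years is already in your own data, so there is no excuse for shipping an untested prediction.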

6) Your data may be big, but does it contain what you need?

One thing I notice with government data is that most of it is about what government knows (numbers of x, maps, locations of things, environmental measurements, etc.), and much less about what government does (decisions made, permits given, interventions made in any policy area). Often those aren’t available in data form at all, but hidden somewhere in wordy meeting minutes or project plans. Financial data on spending and procurement is what comes closest.

Does your big data contain the things that tell you what the various actors around the problem you’re trying to solve did to cause the patterns you spot in the data? The actual transactions of the liquor stores, connected to boarding passes for the Russian flights? The marketing decisions, and their reasons, of the Schiphol liquor stores? The actions of Lithuanian companies that tried different EU markets and failed or succeeded?

Issue-driven, not data-driven, and willing to do the hard bits
It was fun to work with these students, and a range of other things come into play: technical savviness, statistical skills, a real understanding of what problem you are trying to solve. It’s tempting to be data-driven rather than issue-driven, even though the latter ultimately brings more value. With the former, the data you have is always the right data; with the latter, you must acknowledge the limitations of your data and of your own understanding.

As I mentioned, I used these lessons in a session with a different group of students in a different city, Leeuwarden. There, a group worked for a week on data-related projects to support the city’s role as European Capital of Culture in 2018. The two winning teams both stood out because they focussed very much on specific groups of people (international students in Leeuwarden, and elderly visitors to the city), and really tried to design solutions with the intended user at the center. That user-centered thinking turned out to be the hardest part, especially if you already have a list of available data sets in front of you. Most of the teachers’ time was spent getting the students to match the data sets to use cases, and not the other way around.