To me there seems to be something fundamentally wrong with plans I come across where companies would pay people for access to their personal data. This is not a well articulated thing, it just feels like the entire framing of the issue is off, so the next paragraphs are a first attempt to jot down a few notions.

To me it looks very much like a projection by companies onto people of what companies themselves would do: treating data as an asset you own outright and then charging for access, so that those companies can keep doing what they were already doing with data about you. It doesn’t strike me as taking the person behind that data as the starting point, nor their interests. The starting point of any line of reasoning needs to be the person the data is about, not the entity intending to use the data.

Those plans make data release, or consent for using it, fully transactional. There are several things intuitively wrong with this.

One thing it does is put everything in the context of single transactions between individuals like you and me, and the company wanting to use data about you. That seems to be an active attempt to distract from the notion that there’s power in numbers. Reducing it to me dealing with a company, and you dealing with them separately, makes it less likely groups of people will act in concert. It also distracts from the huge power difference between me selling some data attributes to some corp on one side, and that corp amassing those attributes over wide swaths of the population on the other.

Another thing is it implies that the value is in the data you likely think of as yours: your date of birth, residence, some conscious preferences, the type of car you drive, health care issues, finances etc. But a lot of value is in data you don’t actually have about yourself but create all the time: your behaviour over time, clicks on a site, reading speed and pauses in an e-book, minutes watched in a movie, engagement with online videos, the cell towers your phone pinged, your car computer’s logs about your driving style, likes etc. It’s not that the data you think of as your own is without value, but it feels like the magician wants you to focus on the flower in his left hand, so you don’t notice what he does with his right hand.
On top of that it also means that whatever they offer to pay you will be too cheap: your data is never worth much in itself, only in aggregate. Offering to pay on an individual transaction basis is an escape for companies, not an emancipation of citizens.

One more element is the suggestion that once such a transaction has taken place everything is ok, all rights have been transferred (even if limited to a specific context and use case) and all obligations have been met. It strikes me as extremely reductionist. When it comes to copyright, authors can transfer some rights, but usually not the moral rights to their work. I feel something similar is at play here: moral rights attached to data that describes a person, rights that can’t be transferred when the data is transacted. Is it ok to manipulate you into a specific bubble and influence how you vote, if they paid you first for the type of stuff they needed to be able to do that to you? I think the EU GDPR takes a similar approach, taking moral rights into account. It’s not about ownership of data per se, but the rights I have if your data describes me, regardless of whether it was collected with consent.

The whole ownership notion is difficult to me in itself. As stated above, a lot of data about me is not necessarily data I am aware of creating or ‘having’, and likely don’t see a need to collect about myself. Unless paying me is meant as an incentive to start collecting stuff about me for the sole purpose of selling it to a company, which then doesn’t need my consent nor has to make the effort to collect it about me itself. There are other instances where me being the only one able to decide whether to share or withhold some data would mean risks or negative impact for others. It’s why cadastral records and company beneficial ownership records are public: so you can verify that the house or company I’m trying to sell you is mine to sell, and who else has a stake or claim on the same asset, and to what amount. Similar cases might be made for new and closely guarded data, such as DNA profiles. Is it your sole individual right to keep those data closed, or does society have a reasonable claim to them, for instance in the search for the cure for cancer? All that to say that seeing data as a mere commodity is a very limited take, and that ownership of data isn’t a clear cut thing. Because of its content, as well as its provenance. And because it is digital data, meaning it has non-rivalrous and non-excludable characteristics, making it akin to a public good. There is definitely a communal and network side to holding, sharing and processing data, currently conveniently ignored in discussions about data ownership.

In short, talking about paying for personal data and data lockers under my control seems to be a framing that presents data issues as straightforward, but it doesn’t resolve any of the ethical issues around data; it just pretends they’re taken care of, so that things may continue as usual. And that’s even before looking into the potential unintended consequences of such payments.

Some links I thought worth reading the past few days

  • Peter Rukavina pointed me to this excellent posting on voting, in the context of violence as a state monopoly and how that vote contributes to violence. It’s this type of long form blogging that I often find so valuable, as it shows you the detailed reasoning of the author. Where on FB or Twitter would you find such argumentation, and how would it ever surface in an algorithmic timeline? Added Edward Hasbrouck to my feedreader : The Practical Nomad blog: To vote, or not to vote?
  • This quote is very interesting. Earlier in the conversation Stephen Downes mentions “networks are grown, not constructed” (true for communities too). Tanya Dorey adds how from the perspective of indigenous or other marginalised groups ‘facts’ may be different, and that arriving at truth therefore is a process: “For me, “truth growing” needs to involve systems, opportunities, communities, networks, etc. that cause critical engagement with ideas, beliefs and ways of thinking that are foreign, perhaps even contrary to our own. And not just on the content level, but embedded within the fabric of the system et al itself.“: A conversation on truth, data, networks and graphs.
  • This article has a ‘but’ title, but actually is a ‘yes, and’. Saying ethics isn’t enough because we also need “A society-wide debate on values and on how we want to live in the digital age” is saying the same thing. The real money quote though is “political parties should be able to review technology through the lens of their specific world-views and formulate political positions accordingly. A party that has no position on how their values relate to digital technology or the environment cannot be expected to develop any useful agenda for the challenges we are facing in the 21st century.” : Gartner calls Digital Ethics a strategic trend for 2019 – but ethics are not enough
  • A Dutch essay on post-truth. It says it’s not the end of truth that’s at issue, but rather that everyone claims truth for themselves. Pits Foucault’s parrhesia, speaking truth to power, against the populists : Waarheidsspreken in tijden van ‘post-truth’: Foucault, ‘parrèsia’ en populisme
  • When talking about networked agency, and specifically resilience, addressing infrastructure dependencies becomes increasingly important. When you run decentralised tools so that your instance is still useful when others are down, then all of a sudden your ISP and energy supplier are a potential risk too: a disaster-resilient communications network powered by the sun
  • On the amplification of hate speech. It’s not about the speech to me, but about the amplification, the societal acceptability it signals, and the illusion of being mainstream it creates: Opinion | I Thought the Web Would Stop Hate, Not Spread It
  • One of the essential elements of the EU GDPR is that it applies to anyone holding data about EU citizens. As such it can set a de facto standard globally. As with environmental standards, market players will tend to use one standard for their products, not multiple, and so the most stringent one is top of the list. It’s an element in how data is of geopolitical importance these days. This link is an example of how GDPR is being adopted in South Africa : Four essential pillars of GDPR compliance
  • A great story how open source tools played a key role in dealing with the Sierra Leone Ebola crisis a few years ago: How Open Source Software Helped End Ebola – iDT Labs – Medium
  • This seems like a platform of groups working towards their own networked agency, solving issues for their own context and then pushing them into the network: GIG – we are what we create together
  • An article on the limits on current AI, and the elusiveness of meaning: Opinion | Artificial Intelligence Hits the Barrier of Meaning


Ethics by design is adding ethical choices and values to a design process as non-functional requirements, which are then turned into functional specifications.

E.g. when you want to count the size of a group of people by taking a picture of them, adding the value of safeguarding privacy to the requirements might mean the picture is intentionally made grainy by the camera. A grainier picture still allows you to count the number of people in the photo, but their actual faces are never captured or stored.
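To make that concrete, here is a minimal sketch of my own (not an actual camera implementation) of how such graininess could work as a functional specification: average the pixels in each block of the image, so the detail inside each block, a face for instance, is discarded at capture time, while coarse shapes remain countable.

```python
def pixelate(image, block=8):
    """Downsample a grayscale image (a list of rows of pixel values)
    by averaging block x block tiles. The detail inside each tile,
    such as facial features, is discarded and never stored."""
    h, w = len(image), len(image[0])
    out = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            tile = [image[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            row.append(sum(tile) / len(tile))
        out.append(row)
    return out
```

With block=8 a 640×480 frame becomes 80×60: enough resolution to count silhouettes, far too coarse to recognise faces. The key design choice is that the averaging happens before anything is stored, so the privacy requirement is enforced by the device, not by policy afterwards.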

When it comes to data governance and machine learning Europe’s stance towards safeguarding civic rights and enlightenment values is a unique perspective to take in a geopolitical context. Data is a very valuable resource. In the US large corporations and intelligence services have created enormous data lakes, without much restraints, resulting in a tremendous power asymmetry, and an objectification of the individual. This is surveillance capitalism.
China, and others like Russia, have created or are creating large national data spaces in which the individual is made fully transparent and described by connecting most if not all data sources and make them accessible to government, and where resulting data patterns have direct consequences for citizens. This is data driven authoritarian rule.
Europe cannot compete with either of those two models, but it can provide a competing perspective on data usage by creating a path of responsible innovation, in which data is as much combined and connected as elsewhere in the world, yet with values and ethical boundaries designed into its core. With the GDPR the EU is already setting a new de facto global standard. Doing more along similar lines, not just in terms of regulation but also in terms of infrastructure (Estonia’s X-Road for instance), is the opportunity Europe has.

Some pointers:
  • My blogpost Ethics by Design
  • A naive exploration of ethics around networked agency
  • A paper (PDF) on Value Sensitive Design
  • The French report For a Meaningful Artificial Intelligence (PDF), which drove France’s 1.5 billion investment in value based AI

Last week I had the pleasure to attend and to speak at the annual FOSS4G conference. This gathering of the community around free and open source software in the geo-sector took place in Bonn, in what used to be the German parliament. I’ve posted the outline, slides and video of my keynote already at my company’s website, but am now also crossposting it here.

Speaking in the former plenary room of the German Parliament. Photo by Bart van den Eijnden

In my talk I outlined that it is often hard to see the real impact of open data, and explored the reasons why. I ended with a call upon the FOSS4G community to be an active force in driving ethics by design in re-using data.

Impact is often hard to see, because measurement takes effort
Firstly, because it takes a lot of effort to map out all the network effects, for instance when doing micro-economic studies like we did for ESA, or when you need to look for many small and varied impacts, both social and economic. This is especially true if you take a ‘publish and it will happen’ approach. Spotting impact becomes much easier if you already know what type of impact you want to achieve, and then publish data sets you think may enable other stakeholders to create such impact. Around real issues, in real contexts, it is much easier to spot real impact of publishing and re-using open data. It does require that the published data is serious, as serious as the issues. It also requires openness: that is what brings new stakeholders into play, and creates new perspectives towards agency so that impact results. Openness needs to be vigorously defended because of this, and the FOSS4G community is well suited to do that, as openness is part of its value set.

Impact is often hard to see, because of fragmentation in availability
Secondly, because impact often results from combinations of data sets, and the current reality is that data provision is mostly much too fragmented to allow interesting combinations. Some of the specific data sets, or the right timeframe or geographic scope might be missing, making interesting re-uses impossible.
Emerging national data infrastructures, such as those the Danish and the Dutch have been creating, are a good fix for this. They combine several core government data sets into a system and open it up as much as possible. Think of cadastral records, maps, persons, companies, addresses and buildings.
Geo data is at the heart of all this (maps, addresses, buildings, plots, objects), which turns it into the linking pin for many re-uses in which otherwise diverse data sets are combined.
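To make the ‘linking pin’ role concrete, a toy sketch with entirely made-up records: two data sets that say little on their own, merged through a shared address key. The address (or any geo reference) is what makes the combination possible.

```python
# Hypothetical, made-up records purely for illustration.
cadastral = {"Main St 1": {"parcel": "A-1001", "owner_type": "private"}}
energy    = {"Main St 1": {"label": "C", "gas_use_m3": 1400}}

def join_on_address(*datasets):
    """Merge records from several data sets that share
    the same address key -- the geo data acts as linking pin."""
    merged = {}
    for ds in datasets:
        for addr, attrs in ds.items():
            merged.setdefault(addr, {}).update(attrs)
    return merged
```

Calling `join_on_address(cadastral, energy)` yields one combined record per address. The same mechanism is what makes geo-data the combinator in privacy discussions: each data set may be harmless alone, while the joined record is not.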

Geo is the linking pin, and its role is shifting: ethics by design needed
Because of geo-data being the linking pin, the role of geo-data is shifting. First of all it puts geo-data in the very heart of every privacy discussion around open data. Combinations of data sets quickly can become privacy issues, with geo-data being the combinator. Privacy and other ethical questions arise even more now that geo-data is no longer about relatively static maps, but where sensors are making many more objects as well as human beings objects on the map in real time.
At the same time geo-data is becoming less visible in these combinations. ‘The map’ is not necessarily a significant part of the result of combining data sets, just a catalyst on the way to get there. Will geo-data be a neutral ingredient, or will it be an ingredient with a strong attitude? An attitude that aims to actively promulgate ethical choices, not just concerning privacy, but also concerning what are statistically responsible combinations, and which steps towards an in itself legal result are and are not legal themselves? As with defending openness itself, the FOSS4G community is in a good position to push these ethical questions forward in the geo community, as well as to find ways of incorporating them directly in the tools it builds and uses.

The video of the keynote has been published by the FOSS4G conference organisers.
Slides are available from Slideshare and embedded below: