Favorited I’m banned for life from advertising on Meta. Because I teach Python. by Reuven Lerner

The Python programming language is over 30 yrs old, the Pandas data analysis library for it is 15 yrs old. It’s not unlikely Meta’s advert checking AI was created using both somewhere in the process. But the programming context of both words was definitely not in the training set for it.

Provide Python and Pandas training, advertise on FB. Get blocked because Meta’s AI spits out a high probability it is about illegal animal trade. Appeal the decision. Have the same AI, not a person, look at it again and conclude the same thing. Get blocked for all time. Have insiders check and conclude this can’t be reversed.

Computer says no…‘ and Kafka had a child and it’s Meta’s AI. And Meta has no human operated steering wheel that is connected to anything meaningful.

Via Ben Werdmuller

I’m a full-time instructor in Python and Pandas, teaching in-person courses at companies around the world … Meta’s AI system noticed that I was talking about Python and Pandas, assumed that I was talking about the animals […], and banned me. The appeal that I asked for wasn’t reviewed by a human, but was reviewed by another bot, which (not surprisingly) made a similar assessment.

Reuven Lerner

ODRL, Open Digital Rights Language popped up twice this week for me and I don’t think I’ve been aware of it before. Some notes for me to start exploring.

Rights Expression Languages

Rights Expression Languages, RELs, provide a machine readable way to convey or transfer usage conditions, rights, restraints, granularly w.r.t. both actions and actors. This can then be added as metadata to something. ODRL is a rights expression language, and seems to be a de facto standard.

ODRL is a W3C recommendation since 2018, and thus part of the open web standards. ODRL has its roots in the ’00s and Digital Rights Management (DRM): the abhorred protections media companies added to music and movies, and now e-books, in ways that restrains what people can do with media they bought to well below the level of what was possible before and commonly thought part of having bought something.

ODRL can be expressed in JSON or RDF and XML. A basic example from Wikipedia looks like this:


{
"@context": "http://www.w3.org/ns/odrl.jsonld",
"uid": "http://example.com/policy:001",
"permission": [{
"target": "http://example.com/mysong.mp3",
"assignee": "John Doe",
"action": "play"
}]
}

In this JSON example a policy describes that example.com grants John permission to play mysong.

ODRL in the EU Data Space

In the shaping of the EU common market for data, aka the European common data space, it is important to be able to trace provenance and usage conditions for not just data sets, but singular pieces of data, as it flows through use cases, through applications and their output back into the data space.
This week I participated in a webinar by the EU Data Space Support Center (DSSC) about their first blueprint of data space building blocks, and for federation of such data spaces.

They propose ODRL as the standard to describe usage conditions throughout data spaces.

The question of enactment

It wasn’t the first time I talked about ODRL this week. I had a conversation with Pieter Colpaert. I reached out to get some input on his current view of the landscape of civic organisations active around the EU data spaces. We also touched upon his current work at the University of Gent. His research interest is on ODRL currently, specifically on enactment. ODRL is a REL, a rights expression language. Describing rights is one thing, enacting them in practice, in technology, processes etc. is a different thing. Next to that, how do you demonstrate that you adhere to the conditions expressed and that you qualify for using the things described?

For the EU data space(s) this part sounds key to me, as none of the data involved is merely part of a single clear interaction like in the song example above. It’s part of a variety of flows in which actors likely don’t directly interact, where many different data elements come together. This includes flows through applications that tap into a data space for inputs and outputs but are otherwise outside of it. Such applications are also digital twins, federated systems of digital twins even, meaning a confluence of many different data and conditions across multiple domains (and thus data spaces). All this removes a piece of data lightyears from the neat situation where two actors share it between them in a clearly described transaction within a single-faceted use case.

Expressing the commons

It’s one thing to express restrictions or usage conditions. The DSSC in their webinar talked a lot about business models around use cases, and ODRL as a means for a data source to stay in control throughout a piece of data’s life cycle. Luckily they stopped using the phrase ‘data ownership’ as they realised it’s not meaningful (and confusing on top of it), and focused on control and maintaining having a say by an actor.
An open question for me is how you would express openness and the commons in ODRL. A shallow search surfaces some examples of trying to express Creative Commons or other licenses this way, but none recent.

Openness, can mean an absence of certain conditions, although there may be some (like adding the same absence of conditions to re-shared material or derivative works), which is not the same as setting explicit permissions. If I e.g. dedicate something to the public domain, an image for instance, then there are no permissions for me to grant, as I’ve removed myself from that role of being able to give permission. Yet, you still want to express it to ensure that it is clear for all that that is what happened, and especially that it remains that way.

Part of that question is about the overlap and distinction between rights expressed in ODRL and authorship rights. You can obviously have many conditions outside of copyright, and can have copyright elements that may be outside of what can be expressed in RELs. I wonder how for instance moral authorship rights (that an author in some (all) European jurisdictions cannot do away with) can be expressed after an author has transferred/sold the copyrights to something? Or maybe, expressing authorship rights / copyrights is not what RELs are primarily for, as it those are generic and RELs may be meant for expressing conditions around a specific asset in a specific transaction. There have been various attempts to map all kinds of licenses to RELs though, so I need to explore more.

This is relevant for the EU common data spaces as my government clients will be actors in them and bringing in both open data and closed and unsharable but re-usable data, and several different shades in between. A range of new obligations and possibilities w.r.t. data use for government are created in the EU data strategy laws and the data space is where those become actualised. Meaning it should be possible to express the corresponding usage conditions in ODRL.

ODRL gaps?

Are there gaps in the ODRL standard w.r.t. what it can cover? Or things that are hard to express in it?
I came across one paper ‘A critical reflection on ODRL’ (PDF Kebede, Sileno, Van Engers 2020), that I have yet to read, that describes some of those potential weaknesses, based on use cases in healthcare and logistics. Looking forward to digging out their specific critique.

Bookmarked Mechanisms of Techno-Moral Change: A Taxonomy and Overview (by John Danaher and Henrik Skaug Sætra)

Via Stephen Downes. Overview of how, through what mechanisms, technology changes work moral changes. At first glance seems to me a sort of detailing of Smits’ 2002 PhD thesis Monster theory, looking at how tech changes can challenge cultural categories, and diving into the specific part where cultural categories are adapted to fit new tech in. The citations don’t mention Smits or the anthropological work of Mary Douglas it is connected to. It does cite references by Peter-Paul Verbeek and Marianne Boenink (all three from the PSTS department I studied at), so no wonder I sense a parallel here.

The first example mentioned in the table explaining the six identified mechanisms points in this direction of a parallel too: the 70s redefinition of death as brain death was a redefinition of cultural concepts to assimilate tech change was also used as example in Smits’ work. The third example is a direct parallel to my 2008 post on empathy as shifting cultural category because of digital infrastructure, and how I talked about hyperconnected individuals and the impact on empathy in 2010 when talking about the changes bringing forth MakerHouseholds.

Where Monster theory was meant as a tool to understand and diagnose discussions of new tech, wherein the assmilation part (both cultural categories and technology get adapted) is the pragmatic route (the mediation theory of Peter Paul Verbeek is located there too), it doesn’t as such provide ways to act / intervene. Does this taxonomy provide options to act?
Or is this another descriptive way to locate where moral effects might take place, and the various types of responses to Monsters still determine the potential moral effect?

The paper is directly available, added it to my Zotero library for further exploration.

Many people study the phenomenon of techno-moral change but, to some extent, the existing literature is fragmented and heterogeneous – lots of case studies and examples but not enough theoretical unity. The goal of this paper is to bring some order to existing discussions by proposing a taxonomy of mechanisms of techno-moral change. We argue that there are six primary mechanisms..

John Danaher

Bookmarked 1.2 billion euro fine for Facebook as a result of EDPB binding decision (by European Data Protection Board)

Finally a complaint against Facebook w.r.t. the GDPR has been judged by the Irish Data Protection Authority. This after the EDPB instructed the Irish DPA to do so in a binding decision (PDF) in April. The Irish DPA has been extremely slow in cases against big tech companies, to the point where they became co-opted by Facebook in trying to convince the other European DPA’s to fundamentally undermine the GDPR. The fine is still mild compared to what was possible, but still the largest in the GDPR’s history at 1.2 billion Euro. Facebook is also instructed to bring their operations in line with the GDPR, e.g. by ensuring data from EU based users is only stored and processed in the EU. This as there is no current way of ensuring GDPR compliance if any data gets transferred to the USA in the absence of an adequacy agreement between the EU and the US government.

A predictable response by FB is a threat to withdraw from the EU market. This would be welcome imo in cleaning up public discourse and battling disinformation, but is very unlikely to happen. The EU is Meta’s biggest market after their home market the US. I’d rather see FB finally realise that their current adtech models are not possible under the GDPR and find a way of using the GDPR like it is meant to: a quality assurance tool, under which you can do almost anything, provided you arrange what needs to be arranged up front and during your business operation.

This fine … was imposed for Meta’s transfers of personal data to the U.S. on the basis of standard contractual clauses (SCCs) since 16 July 2020. Furthermore, Meta has been ordered to bring its data transfers into compliance with the GDPR.

EDPB

Bookmarked Project Tailwind by Steven Johnson

Author Steven Johnson has been working with Google and developed a prototype for Tailwind. Tailwind, an ‘AI first notebook’, is intended to bring an LLM to your own source material, and then you can use it to ask questions of the sources you give it. You point it to a set of resources in your Google Drive and what Tailwind generates will be based just on those resources. It shows you the specific source of the things it generates as well. Johnson explicitly places it in the Tools for Thought category. You can join a waiting list if you’re in the USA, and a beta should be available in the summer. Is the USA limit intended to reduce the number of applicants I wonder, or a sign that they’re still figuring things like GDPR for this tool? Tailwind is prototyped on PaLM API though, which is now generally available.

This, from its description, gets to where it becomes much more interesting to use LLM and GPT tools. A localised (not local though, it lives in your Google footprint) tool, where the user defines the corpus of sources used, and traceable results. As the quote below suggests a personal research assistant. Not just for my entire corpus of notes as I describe in that linked blogpost, but also on a subset of notes for a single topic or project. I think there will be more tools like these coming in the next months, some of which likely will be truly local and personal.

On the Tailwind team we’ve been referring to our general approach as source-grounded AI. Tailwind allows you to define a set of documents as trusted sources …, shaping all of the model’s interactions with you. … other types of sources as well, such as your research materials for a book or blog post. The idea here is to craft a role for the LLM that is … something closer to an efficient research assistant, helping you explore the information that matters most to you.

Steven Johnson

Bookmarked Will A.I. Become the New McKinsey? by Ted Chiang in the New Yorker

Ted Chiang realises that corporates are best positioned to leverage the affordances of algorithmic applications, and that that is where the risk of the ‘runaway AIs’ resides. I agree that they are best positioned, because corporations are AI’s non-digital twin, and have been recognised as such for a decade.

Brewster Kahle said (in 2014) that corporations should be seen as the 1st generation AIs, and Charlie Stross reinforced it (in 2017) by dubbing corporations ‘Slow AI’ as corporations are context blind, single purpose algorithms. That single purpose being shareholder value. Jeremy Lent (in 2017) made the same point when he dubbed corporations ‘socio-paths with global reach’ and said that the fear of runaway AI was focusing on the wrong thing because “humans have already created a force that is well on its way to devouring both humanity and the earth in just the way they fear. It’s called the Corporation“. Basically our AI overlords are already here: they likely employ you. Of course existing Slow AI is best positioned to adopt its faster young, digital algorithms. It as such can be seen as the first step of the feared iterative path of run-away AI.

The doomsday scenario is … A.I.-supercharged corporations destroying the environment and the working class in their pursuit of shareholder value.

Ted Chiang

I’ll repeat the image I used in my 2019 blogpost linked above:

Your Slow AI overlords looking down on you, photo Simone Brunozzi, CC-BY-SA