In reply to Creating a custom GPT to learn about my blog (and about myself) by Peter Rukavina

It’s not surprising that GPT-4 doesn’t work like a search engine and has a hard time surfacing factual statements from source texts. Like one of the commenters, I wonder what that means for the data analysis you also asked for. Perhaps those results too are merely plausible, not actually analysed. Especially the day-of-the-week thing, as that wasn’t in the data, and I wouldn’t expect GPT to determine the weekday of every post in the process of answering your prompt.

I am interested in doing what you did, but with 25 years of notes and annotations, and with a different model that has fewer ethical issues attached. To have a chat about my interests and the links between things. Unlike the fact-based questions you’ve asked the tool, that doesn’t necessarily require the output to be correct, just plausible enough to surface associations. Such associations might prompt my own thinking and my own searches, working with the same material.

It also makes me wonder whether what Wolfram Alpha is doing these days could play a role in your own use of GPT+, as they are all about interpreting questions and then giving the answer directly. There’s a difference between things that face the general public and things that are internal or even personal tools, like yours.

Have you asked it things based more on association yet? For example: “based on the posts ingested, what would be likely new interests for Peter to explore?” Can you use it to create new associations, help you generate new ideas in line with your writing/interests/activities shown in the posts?
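Surfacing such associations doesn’t necessarily need a GPT at all. Below is a minimal sketch of the kind of thing I have in mind for my own notes, assuming a folder of markdown files and a locally run embedding model; the notes folder, the all-MiniLM-L6-v2 model and the associations helper are illustrative choices of mine, not anything from Peter’s setup. The results only need to be plausible enough to serve as starting points for one’s own thinking and searching.

```python
# A minimal association-surfacing sketch over a folder of markdown notes.
# Assumptions (mine, not from Peter's post): notes live as .md files under
# ./notes, and sentence-transformers provides a locally run embedding model.
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

notes = {p: p.read_text(encoding="utf-8") for p in Path("notes").glob("**/*.md")}
paths = list(notes)

model = SentenceTransformer("all-MiniLM-L6-v2")  # runs locally, no API calls
embeddings = model.encode([notes[p] for p in paths], normalize_embeddings=True)

def associations(query: str, top_n: int = 5) -> list[tuple[Path, float]]:
    """Return the notes most similar to a free-text query or interest."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = embeddings @ q  # cosine similarity, as vectors are normalised
    best = np.argsort(scores)[::-1][:top_n]
    return [(paths[i], float(scores[i])) for i in best]

# Surface plausible-but-unverified associations to prompt further thinking:
for path, score in associations("empathy and digital infrastructure"):
    print(f"{score:.2f}  {path}")
```

Nothing here verifies anything; it merely ranks notes by semantic nearness, which is exactly the ‘plausible enough to surface associations’ quality I’m after.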

So my early experiments show me that as a data analysis copilot, a custom GPT is a very helpful guide… In terms of the GPT’s ability to “understand” me from my blog, though, I stand unimpressed.

Peter Rukavina

Favorited EDPB Urgent Binding Decision on processing of personal data for behavioural advertising by Meta by EDPB

This is very good news. The European Data Protection Board, at the request of the Norwegian DPA, has issued a binding decision instructing the Irish DPA to ban Meta’s processing of personal data for behavioural targeting. The ban must be imposed within two weeks. Norway already concluded a few years ago that adtech is mostly illegal, but European cases based on the 2018 GDPR moved through the system at a glacial pace, in part because of a co-opted and dysfunctional Irish Data Protection Commission. Meta’s ‘pay for privacy’ ploy is also torpedoed by this decision. This is grounds for celebration, even if it will likely lead to legal challenges first. And it is grounds for congratulations to NOYB and Max Schrems, whose complaints, filed the first minute GDPR enforcement started in 2018, kicked off the process of which this is a result.

…take, within two weeks, final measures regarding Meta Ireland Limited (Meta IE) and to impose a ban on the processing of personal data for behavioural advertising on the legal bases of contract and legitimate interest across the entire European Economic Area (EEA).

European Data Protection Board

In 1967 the French literary critic Roland Barthes declared the death of the author (in English, no less). An author’s intentions and biography are not the means to definitively explain what a text (of fiction) means. It’s the reader who determines meaning.

Barthes reduces the author to a mere scriptor, a scribe, who doesn’t exist other than for their role of penning the text. It positions the work as fully separate from its maker.

I don’t disagree with the notion that readers glean meaning in layers from a text, far beyond what an author might have intended. But thinking about the author’s intent, in light of their biography or not, is one of those layers for readers to interpret. It doesn’t make the author the sole decider on meaning, but the author’s perspective can be used by any reader to create meaning. Separating the author from their work entirely cuts you off from one source of potential meaning. Even when reduced to the role of scribe, such meaning will leak forth: the monks of old tagged the transcripts they made and turned those tags into indexes, still a common way of interpreting which topics a text touches on or emphasises. So despite Barthes’ pronouncement, I never accepted the brain death of the author, yet I also didn’t much care specifically about their existence in order to find meaning in texts.

With the advent of texts made by generative AI, however, I think bringing the author and their intentions into the scope of creating meaning is necessary. It is a necessity as proof of human creation. Being able to perceive the author behind a text, the entanglement of its creation with their life, is the now very much needed reverse Turing test. With algorithmic text generation there is indeed only a scriptor, one incapable of conveying meaning themselves.
To determine the human origin of a text, the author’s own meaning, intention and existence must shine through in the text, or be made explicit as its context. Because our default assumption must be that it was generated.

The author is being resurrected. Because we now have fully automated scriptors. Long live the author!

In discussions about data usage and sharing, and about who has a measure of control over which data gets used and shared how, we easily say ‘my data’, or get told what we can do with ‘your data’ on a platform.

‘My data’.

While it sounds clear enough, I think it is a very imprecise thing to say. It distracts from a range of issues about control over data, and causes confusion in public discourse and in addressing those issues. Such distraction is often deliberate.

Which one of these is ‘my data’?

  • Data that I purposefully collected (e.g. temperature readings from my garden), but that isn’t about me.
  • Data that I purposefully collected (e.g. daily scale readings, quantified self), that is about me.
  • Data that is present on a device I own or on an external storage service, that isn’t about me but about my work, my learning, my chores, people I know.
  • Data that describes me, but was government created and always rests in government databases (e.g. birth/marriage registry, diplomas, university grades, criminal records, real estate ownership), parts of which I often reproduce/share in other contexts while not being the authoritative source (anniversaries, home address, CV).
  • Data that describes me, but was private sector created and always rests in private sector databases (e.g. credit ratings, mortgage history, insurance and coverage used, pension, phone location and usage, hotel stays, flights boarded).
  • Data that describes me, that I entered into my profiles on online platforms.
  • Data that I created, ‘user generated content’, and shared through platforms.
  • Data that I caused to be through my behaviour, collected by devices or platforms I use (clicks through sites, time spent on a page, how I drive my car, my e-reading habits, any IoT device I used/interacted with, my social graphs), none of which is ever within my span of control, likely not accessible to me, and I may not even be aware it exists.
  • Data that was inferred about me from patterns in data that I caused to be through my behaviour, none of which is ever within my span of control, and which I mostly don’t know about or even suspect exists. It may say things I don’t know about myself (moods, mental health) or that I haven’t made explicit anywhere (political or religious orientation, sexual orientation, medical conditions, pregnancy, etc.).

Most of the data that holds details about me wasn’t created by me, and wasn’t within my span of control at any time.
Most of the data I purposefully created, or have or had within my span of control, isn’t about me but about my environment: other people near me, things external and of interest to me.

They’re all ‘my data’. Yet whenever someone says ‘my data’, and definitely when someone says ‘your data’, that entire scope isn’t what is indicated. ‘My data’ as a label easily hides the complicated variety of data we are talking about. And regularly, specifically when someone says ‘your data’, hiding parts of the list is deliberate.
The last two bullets, the data we create through our behaviour and what is inferred about us, are what the big social media platforms always keep out of sight when they say ‘your data’. Because that’s the data their business models run on. It’s never part of the package when you click ‘export my data’ on a platform.

The core issues aren’t about whether it is ‘my data’ in terms of control or provenance. The core issues are about what others can and cannot, will and won’t, do with any data that describes me or is circumstantial to me. Regardless of whose span of control such data resides in, or where it came from.

There are also two problematic suggestions packed into the phrase ‘my data’.
One is that by saying ‘my data’ you are also made individually responsible for the data involved. While this is partly true (mostly in the sense of not carelessly leaving stuff all over webforms and accounts), almost all responsibility for the data about you resides with those using it. It’s others’ actions with data that concerns you that require responsibility and accountability, and that should require your voice to be taken into account. "Nothing about us, without us" holds true for data too.
The other is that ‘my data’ is easily interpreted and positioned as ownership. That is a sleight of hand. Property claims and citizen rights are very different things and different areas of law. If ‘your data’ is your property, all that is left is to haggle about price, and every context is framed as merely transactional. It’s not in my own interest to see my data or myself as a commodity. It’s not a level playing field when I’m left to negotiate my price with a global online platform. That’s so asymmetric that there’s only one possible outcome. Which is the point of suggesting ownership, as opposed to framing it as human rights. Contracts are the preferred tool of the biggest party, rights that of the individual.

Saying ‘my data’ and ‘your data’ is too imprecise. Be precise, don’t let others determine the framing.

Bookmarked Mechanisms of Techno-Moral Change: A Taxonomy and Overview (by John Danaher and Henrik Skaug Sætra)

Via Stephen Downes. An overview of how, through which mechanisms, technological change works moral change. At first glance it seems to me a sort of detailing of Smits’ 2002 PhD thesis on Monster theory, looking at how technological changes can challenge cultural categories, and diving into the specific part where cultural categories are adapted to fit new tech in. The citations don’t mention Smits or the anthropological work of Mary Douglas it is connected to. It does cite references by Peter-Paul Verbeek and Marianne Boenink (all three from the PSTS department I studied at), so no wonder I sense a parallel here.

The first example in the table explaining the six identified mechanisms points in the direction of a parallel too: the 1970s redefinition of death as brain death, a redefinition of cultural concepts to assimilate technological change, was also used as an example in Smits’ work. The third example is a direct parallel to my 2008 post on empathy as a shifting cultural category because of digital infrastructure, and to how I talked about hyperconnected individuals and the impact on empathy in 2010 when discussing the changes bringing forth MakerHouseholds.

Where Monster theory was meant as a tool to understand and diagnose discussions of new technology, in which the assimilation response (both cultural categories and the technology get adapted) is the pragmatic route (the mediation theory of Peter-Paul Verbeek is located there too), it doesn’t as such provide ways to act or intervene. Does this taxonomy provide options to act?
Or is this another descriptive way to locate where moral effects might take place, while the various types of responses to Monsters still determine the potential moral effect?

The paper is directly available; I’ve added it to my Zotero library for further exploration.

Many people study the phenomenon of techno-moral change but, to some extent, the existing literature is fragmented and heterogeneous – lots of case studies and examples but not enough theoretical unity. The goal of this paper is to bring some order to existing discussions by proposing a taxonomy of mechanisms of techno-moral change. We argue that there are six primary mechanisms…

John Danaher

Bookmarked Disinformation and its effects on social capital networks (Google Doc) by Dave Troy

This document by US journalist Dave Troy positions resistance against disinformation not as a matter of fact-checking and technology, but as one of reshaping social capital and cultural network topologies. I plan to read this; especially the premises part looks interesting. Some upfront associations: Valdis Krebs’ work on the US Democratic/conservative party divide, which he visualised based on cultural artefacts, i.e. the books people bought (2003-2008), to show spheres and overlaps; and the Finnish work on increasing civic skills, which to me seems a mix of critical crap detection skills woven into a social/societal framework. Networks around a belief or a piece of disinformation also point back, for me, to what I mentioned earlier about generated (and thus fake) texts: attempts to detect such fakes usually center on the artefact, not on the richer tapestry of information connections around it (last 2 bullet points and final paragraph). I recently called this provenance and entanglement as indicators of authenticity, entanglement being the multiple ways something is part of a wider network fabric. And there’s the more general notion of Connectivism, where learning and knowledge are situated in networks too.
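As a toy illustration of what I mean by entanglement, and explicitly my own framing rather than anything from Troy’s document: in graph terms, an artefact that is genuinely woven into a network fabric has many connections whose neighbours also connect to each other, while a freshly generated fake tends to hang off a single node. All node names below are hypothetical.

```python
# Toy sketch (my framing, not Dave Troy's method): 'entanglement' of an
# artefact read as how embedded it is in a wider network of references.
import networkx as nx

g = nx.Graph()
# Hypothetical link graph: nodes are texts/accounts, edges are references.
g.add_edges_from([
    ("post", "author"), ("post", "reply_a"), ("post", "reply_b"),
    ("post", "older_post"), ("reply_a", "author"), ("reply_a", "reply_b"),
    ("older_post", "author"),
    ("fake", "burner_account"),  # the fake connects only to its poster
])

for node in ("post", "fake"):
    degree = g.degree(node)              # number of direct connections
    clustering = nx.clustering(g, node)  # do its neighbours know each other?
    print(f"{node}: degree={degree}, clustering={clustering:.2f}")
```

The well-entangled post scores high on both measures, the isolated fake on neither; the artefact itself is never inspected, only its position in the fabric.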

The related problems of disinformation, misinformation, and radicalization have been popularly misunderstood as technology or fact-checking problems, but this ignores the mechanism of action, which is the reconfiguration of social capital. By recasting these problems as one problem rooted in the reconfiguration of social capital and network topology, we can consider solutions that might maximize public health and favor democracy over fascism …

Dave Troy