Every first Sunday of the month is Doei! day, for saying a big ‘doei’ (Dutch for ‘bye’) to Big Tech.

There are three big intrinsic problems with BigTech:
1) Big tech companies are data grabbers. Rather than providing digital services, big tech companies are above all about collecting and trading data. Data about you. To sell advertising with it and to influence your behaviour with it. In defiance of European privacy legislation.
2) Big tech companies are silos and de facto monopolists. They keep you in their silo by systematically breaking down, and making impossible, any remaining connections with other digital services, platforms or the open web (vertical integration, and walled gardens). Their de facto monopoly then lets them strip down their services or aim them even more at data grabbing, because you have nowhere else to go anyway, and lets them squeeze other companies that sell something through them. The enshittification of digital services.
3) Big tech companies are usually under the influence of foreign powers over which you have no democratic control. That turns them into a means of pressure at the expense of your freedom to act. US companies must give their government access, and must also allow that government an off switch. (First under the Patriot Act of 2001 and now under the Cloud Act of 2018). More recently they also like to co-opt European regulators and interfere in domestic democratic processes. Chinese tech companies are an integral part of the Chinese surveillance and repression system, and a component of its foreign influence policy.

The first two reasons are sufficient cause for every individual and company to forgo BigTech ‘services’; the third also makes it urgent for all European governments. That is, if you hold dear digital autonomy (I determine how digital tools work for me and which ones I use) and digital sovereignty (I determine what I do, and nobody can use my digital tools as leverage over me).
The solution, by the way, is not to have similar big companies in Europe. Those would be just as problematic. For the same intrinsic reasons as above, and because of the equally considerable problematic externalities that BigTech also causes with regard to climate, exploitation, hate and violence.
So you ‘simply’ need to get rid of it entirely, and not go looking for a clone with an EU flag on it. You need to switch to alternatives, and something is only an alternative if it really is different. In size, financing, governance, structure, functionality and principles.

In our household we started saying Doei! to BigTech in 2014. Mainly because of the first reason (although some in my network were already very outspoken about the third reason back then). So it is no surprise that E is one of the people who, following the German Digital Independence Day (which started in November last year), decided at the FOSDEM conference in January that a Dutch counterpart is needed: Doei! day.

Saying doei is technically easier said than psychologically done. You are used to how things are, to how you do it now. And when we change software, platform or even just the look of something, we quickly find that very hard. So some help and encouragement is welcome. Doei! day provides that with recipes. Each recipe is a step-by-step plan to replace BigTech, bit by bit, with digital tools that serve only you. For example, starting to use Signal (even though it is an American non-profit, partly on American cloud infrastructure from Amazon) instead of WhatsApp (from Meta).

Every step away from BigTech is better than waiting for a perfect transition. Of course not ‘everyone’ is on Signal, while it feels as if everyone is on WhatsApp. But some are, and with every switch the wall around WhatsApp becomes more porous. Think of Hyves, which shortly after 2010 emptied out within a few months, while ‘everyone’ was ‘on’ it. Waiting ‘a little longer’ for the perfect alternative and saying ‘but I do know I really should use something else’ is how your brain tries to avoid the effort that every change takes. Your brain is really saying ‘I only want to change when it is no longer a change’. In other words, you postpone rescuing your digital autonomy until someone else arranges it for you. Not very autonomous, digitally or otherwise. You claim your digital autonomy by taking small steps yourself, and by continuing to do so.

My experience is that almost every step looks harder beforehand than it turns out to have been right after you have taken it.

In 2014, for example, I spent quite some time puzzling over how I could leave Gmail, and gave presentations about it too, and right after I had started doing my e-mail differently I wondered why I hadn’t done it much sooner.

Early last year I decided I no longer want to buy e-books from Amazon. It took some searching: where do I buy those books instead, and how do I keep an overview (which I did not have at Amazon either, I only thought I did). But in the end it was simple, and I have now not even visited their site for well over a year. When I look for new books I also don’t end up on Amazon’s site, and in search results I automatically ignore Amazon and Goodreads links. I have kept reading a lot, and my reading is now more varied than before. On top of that I now enjoy browsing online for new books more; it is much more an extension of how it can feel in a good bookshop.

On my laptop (Apple! But I gave my daughter a Linux laptop...) all the software I use daily comes from outside BigTech, and I have left all BigTech social media. I have largely set up my company the same way. My phone is still largely dependent on BigTech (a Fairphone that still runs standard Android from Google, although I gave a colleague a de-Googled phone), but hopefully later this year I will receive a phone from Jolla running Linux (and then we will see how well e.g. banking apps run on it).

It is not just an individual matter either, although ‘voting with your feet’ really does help. Things also need to change in organisations, at system level, and in the procurement processes of governments and companies. That is why in my work, for instance, I am now also involved in how European and international standards for data transactions in data spaces and cloud, and for sovereignty in cloud platforms, are formulated. Because only those who take part there have influence on those standards (while everyone will have to use them). BigTech is always there; they have (very capable, by the way) people for that and know that this way they can influence the entire market. That is why I am there too, because pushback is needed on the starting points and assumptions BigTech uses.

So it is not, and never will be, perfect, neither systemically nor individually. I am still on LinkedIn (owned by Microsoft), for example, although I switched off the timeline there for myself when its content started to turn into a kind of Facebook clone. But the fact that I am now slowly starting to worry more about the hardware I use, rather than only the software and online services, is a sign that I and our household have already taken quite a few steps.

Will you step along?

Doei!

In the past week I’ve started three personal experiments that use AI (in this case Claude Code). For each, the experiment lies in automating steps in my cognitive work that are useful or necessary but not the actual cognitive work itself. They’re helper activities, supporting the main task. For two of the three that is the clear focus, the third is slightly different.

The three experiments are:

  • Filtering on interests in my feed reader, let’s call it ‘Weak-tAIs’.
  • ‘Slopsidian’, lifting concepts and argumentation from papers into Obsidian notes, and linking them iteratively.
  • Explore questions with pre-existing ‘recipes’ that take a specific philosophical perspective. Perhaps I should dub this type of language game ‘WittgenstAIn III’.

It started from an automation task, which I mentioned here: manipulating non-fiction e-books. I have a script that I can point to an e-book in my Calibre collection, and that then populates a note with elements from the book: foreword, index and literature list, content overview, all if present, and for each chapter of the book the first and last few paragraphs. This is what I look at and skim whenever I want to gain a first impression and understanding of what a book is about, and what questions it addresses or what it proposes. All very Mortimer Adler. From it I can then decide which parts of a book to read more closely, and which parts likely contain things I am already familiar with or fall outside my current interest in the book. From those skims I jot down things in my note for the book. This quickly turned out to be useful to me, because it removed the wall between the e-book and my notes by temporarily bringing parts of the e-book into my notes, where I could more quickly go through them in preparation for ‘proper’ reading (although in fact it is part of reading).
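The chapter-skimming part of such a script could look roughly like the sketch below. It is not the actual script: it assumes the chapters have already been extracted from the e-book as plain text with blank lines between paragraphs, and the names `Chapter`, `skim_chapter` and `skim_book` are made up for illustration. Reading from Calibre and pulling out the foreword, index and literature list are left out.

```python
from dataclasses import dataclass


@dataclass
class Chapter:
    """Hypothetical container: one chapter, already extracted as plain text."""
    title: str
    text: str  # paragraphs separated by blank lines (an assumption)


def skim_chapter(chapter: Chapter, n: int = 3) -> str:
    """Return the chapter title plus its first and last n paragraphs."""
    paragraphs = [p.strip() for p in chapter.text.split("\n\n") if p.strip()]
    if len(paragraphs) <= 2 * n:
        # Short chapter: keep everything rather than repeating paragraphs.
        selected = paragraphs
    else:
        selected = paragraphs[:n] + ["[...]"] + paragraphs[-n:]
    return f"## {chapter.title}\n\n" + "\n\n".join(selected)


def skim_book(title: str, chapters: list[Chapter], n: int = 3) -> str:
    """Assemble a skim note for a whole book, one section per chapter."""
    parts = [f"# {title}"] + [skim_chapter(c, n) for c in chapters]
    return "\n\n".join(parts) + "\n"
```

The resulting text can be written to a note file as is; the `[...]` marker just shows where the middle of a chapter was left out.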

It got me thinking what other helper activities in reading and filtering I could identify.
Helper activities are tasks that support a main task by making it easier or providing guard rails. Checklists are an example: they ensure that you don’t skip important steps. In most cases nothing will immediately go wrong if you don’t do the helper activity, but if you do, the main task gets a little easier to do well. A lot of helper tasks can be automated with regular scripts, like the e-book excerpt script above. Others less so, because they contain elements of processing actual texts, like the three experiments I describe here. There, perhaps, using a model like Claude Code can be of value (and hopefully soon, through local model deployment).

A brief description of the three experiments:

Weak-tAIs
I order my RSS feeds by social distance for reading. Part of the reasoning is that I want to be well informed about what close ties write, but I am aware that interesting information likely comes from a wider social distance. This practice has been in place for some two decades and has been enormously valuable all that time. The most interesting stuff usually comes from the third layer, a folder named ‘c150’ in my feedreader: close enough to know who the author is and engage in interaction if I want, disconnected enough for them to encounter things I am less likely to have already seen myself. That is The Strength of Weak Ties (1973), as Granovetter called it.

I also keep a list of current interests, a bit like Feynman’s dozen or so currently favourite problems. For each interest I have formulated a few aspects:

  • what is conceptually interesting to me in a topic (e.g. my interest in EU digital and data policy conceptually is that it forms a geopolitical proposition externally, while being a quality improvement instrument internally that takes rights and societal values as yardstick),
  • am I theoretically interested or more practically,
  • do I have a knowledge fundament for the topic or am I a newbie,
  • is there a link with any long term goals,
  • can it be put into a specific context or tied to a specific issue/question,
  • can I shape or create an enduring practice around it,
  • can I build a bridge to outputs, like blogposts, presentations, or client proposals

My feedreader tracks just under five hundred people writing on the open web. That can easily amount to two thousand postings in a week. I can have several intentions when I start reading; one of them is to find and read material relevant to my list of current interests. A reading intention does not do away with items; it’s not a filter to remove material. It’s essentially just a view on the entire set of incoming items in the feed reader, one that I usually construct in my mind. What if I can construct those views on my screen too?
The ‘c150’ social layer, the weak ties, what do they write about that connects to the fields of interest from my list? Such filtering does not lend itself to text search on fixed terms. I usually skim titles for first impressions, and click opportunistically through the postings. What if I can have a model weigh the postings and compare them to my list of current interests, to mark them for my attention? In aid of that one specific reading intention.

That’s what the first experiment does: label postings that seem to fit my interests, and express why. So that I can skim the folder of weak ties by interest, and read those items first if my intention is to explore those interests. I limited it to the c150 folder, as feeding all RSS feeds into the model consumes a lot of time and tokens, so I started with the part most likely to bring useful results.
The labeling works now as part of my feedreader. I am not yet convinced of the quality of it though. The motivation for the labels usually is along the lines of "it fits interest X but not in the way you’re looking for", which to me means it actually doesn’t really fit.
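As a rough illustration of the shape of this labeling, and not the code that runs in my feedreader, here is a minimal sketch. It assumes a `judge` function that stands in for the model call: it gets a posting’s text and the description of one interest, and returns a reason when it sees a fit, or nothing when it doesn’t. All names (`Posting`, `label_postings`, `view_by_interest`, `keyword_stub_judge`) are hypothetical, and the keyword stub is only there to make the sketch runnable, since fixed terms are exactly what this task does not lend itself to.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Posting:
    title: str
    text: str
    # interest name -> the reason given for the fit
    labels: dict[str, str] = field(default_factory=dict)


# A judge returns a reason string if the posting fits the interest, else None.
Judge = Callable[[str, str], Optional[str]]


def label_postings(postings: list[Posting], interests: dict[str, str],
                   judge: Judge) -> list[Posting]:
    """Label every posting per interest. Nothing is removed from the list."""
    for posting in postings:
        for name, description in interests.items():
            reason = judge(posting.text, description)
            if reason is not None:
                posting.labels[name] = reason
    return postings


def view_by_interest(postings: list[Posting], interest: str) -> list[Posting]:
    """A view on the full set: only the postings labeled with this interest."""
    return [p for p in postings if interest in p.labels]


def keyword_stub_judge(text: str, description: str) -> Optional[str]:
    """Placeholder for the model call, NOT how a model would weigh a posting."""
    first_word = description.split()[0].lower()
    if first_word in text.lower():
        return f"mentions '{first_word}'"
    return None
```

The point of keeping `label_postings` and `view_by_interest` separate is that a reading intention stays a view: the labels are added to the postings, and the full list remains available for any other intention.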

Slopsidian
This week I read an article about AI documenting its own actions and output in a wiki, and saw one or two similar efforts described. I applied that to a different helper task, which is preparing to read a paper and helping me decide whether to dig deeper. This is similar to skimming a non-fiction book, but more involved. Can AI reliably pull from a paper the concepts used and introduced, and the line of argumentation? Saving them both in a single note for the resource, and in separate notes for each of the concepts? Additionally, can it logically link concepts from different resources? This is what an ‘ingestion skill’ now does for me. I let it store the output it generates in a folder that I can also open as an Obsidian vault, hence the name Slopsidian. The papers come from my Zotero collection, meaning I previously saved them. That original step of curation also means I have a line or two about why I thought them interesting at the time. Feeding that curating decision and the paper into the ingestion skill allows a second order look at a paper. What are the concepts discussed, and, reading the output, do I think some of those are of interest to me? If so, I can look at the paper more closely and do my own note making and paraphrasing and placement in my actual Obsidian collection. Lifting out concepts works rather well. The linking is less useful in these first experiences (too obvious, not sparse enough) and can seem forced when you look at why some concepts get linked.

WittgenstAIn III
The third experiment is a bit more on the edge, I think. Here the probabilistic language games that LLMs are have more of a free rein. Part of the university courses on philosophy of science I did 25 years ago was using different philosophical schools of thought as lenses to approach a question. Not to answer the question, that is hardly ever the point after all, but to hold it, and hold it differently. Plato’s essentialism, Kant’s transcendence, dialectics (Hegel), phenomenology (Husserl), Wittgenstein II’s analytical method, hermeneutics (Heidegger), deconstruction (Derrida), and Rorty’s pragmatism. For each of these, for over two decades, I’ve had a recipe in my notes to apply to a question.
I put together a ‘language game’ in which I pose a question, which a ‘router’ prompt tries to match to one or more of the 8 recipes, or to a combination of recipes chained together (e.g. first look at a question from an analytical perspective and then feed the results into a deconstruction exercise).
My existing multi-step recipes are followed, and output is generated for each of those steps, into a resulting note.
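The mechanics of following and chaining recipes can be sketched as below. This is an illustration under assumptions, not the actual setup: each recipe step is a plain function from text to text (in the experiment a step would be a prompt sent to the model), each step is assumed to work on the output of the previous one, and the list of recipe names is assumed to come from the router. The names `run_recipe` and `run_chain` are hypothetical.

```python
from typing import Callable

# A step turns the running text into new text; a recipe is a list of steps.
Step = Callable[[str], str]
Recipe = list[Step]


def run_recipe(recipe: Recipe, question: str) -> list[str]:
    """Run the steps in order, each on the previous output; keep every output."""
    outputs = []
    current = question
    for step in recipe:
        current = step(current)
        outputs.append(current)
    return outputs


def run_chain(names: list[str], recipes: dict[str, Recipe], question: str) -> str:
    """Run the named recipes in order and collect all step outputs in one note.

    `names` is what a router would return for the question (an assumption).
    The final output of one recipe is the input of the next one in the chain.
    """
    note = [f"Question: {question}"]
    current = question
    for name in names:
        outputs = run_recipe(recipes[name], current)
        note.append(f"## {name}")
        note.extend(outputs)
        if outputs:
            current = outputs[-1]
    return "\n\n".join(note)
```

Keeping the output of every step in the note, rather than only the end result, matches how the notes get used: the intermediate steps are where something may catch the eye.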
I read those resulting notes, lift out what catches my eye or what resonates, and use that to flesh the question out more, so I can hold it still longer. Models are language games of a sort, hence the name WittgenstAIn III: a third iteration, extending the second Wittgenstein’s language games to and with AI.
The output here makes me more uncomfortable than that of the other two. Reasoning is being mimicked, with the usual overconfident wrongness we’ve come to expect from generative AI, and that works out in odd ways sometimes. Still, there is utility that can be lifted from the output. It is a good kickstart for exploring questions, to quickly see if a recipe might yield something or not, judging by my first attempts in this experiment. It does certainly lower the threshold, as a helper task, to engage with the recipes. I’ve used them more in the past days than in the past months. Part of that is the novelty of the experiment, and that may wear off quickly, but perhaps it carries the kernel of more habitual use.

My grandfather Klaas Zijlstra (1905-1993) was a farmer and cattle breeder. He grew up in Fryslân and always wanted to be a farmhand, it seems (his father was a housepainter). There was ambition too: after leaving school at 12 and moving out at 16, he sought out farmers to work for who had a reputation in cattle breeding. In his early twenties he had a choice between job offers to run a cattle farm in Argentina and to run a cattle farm in Twente, in the eastern part of the Netherlands. His mother wanted to be able to visit him by train, so the Argentina offer was refused. He worked on the farm Stepelerveld near Haaksbergen, Twente, from its founding in 1928. It was meant as a model farm, and had mechanised milking from the start, for instance. The farm’s owner, Ebs van Heek, son of textile barons, and my grandfather shared a strong interest in cattle breeding, trying to increase milk production per cow. Before the farm (now a national monument) was constructed in 1928, work had already been underway on a nearby farm to bring together and raise cattle for it. I don’t know when my grandfather was hired exactly; he may already have had some role before the farm’s construction. Cattle was my grandfather’s passion. After the farm was sold in 1963 and my grandparents retired to the nearby village Boekelo, there were photos of us grandchildren on the living room dresser, right next to similarly framed photos of prize-winning cows. Central on the mantelpiece was a photo of a bull. It remained there for over 30 years.

It may have been the same bull he took a train trip with.

The farm had a locally famous bull, named Adolf (this was the 1920s, so no stigma attached to that name yet). There was a cattle fair in The Hague, on the other side of the country. My grandfather walked the bull to the station, and joined it inside a cattle car, hired for the purpose, for the train ride to The Hague. When he arrived he sent a postcard to the farm, saying ‘gakz’, meaning ‘goed aangekomen, Klaas Zijlstra’, arrived well. Postage was based on the number of words. This kept it to half a cent. Then he spent three days at the cattle fair on the Malieveld (the largest field in The Hague, used for fairs and demonstrations for some 400 years), where he shared straw with the bull to sleep on in the open air. The bull won first prize. He walked back to the station, boarded a cattle car again with the bull for the trip home, and showed up on foot with the bull and a victory cup at the farm.

In the story, the station was sometimes Haaksbergen (the nearest, about an hour’s walk from the farm), sometimes Hengelo station (a three-hour walk). Although Haaksbergen connected to Hengelo, that line used a different station from the one on the line towards The Hague, so it may have been easier to go to Hengelo, as they would otherwise have needed two cattle cars, one for each line. Still, as the railroad company for the Haaksbergen-Hengelo connection was founded and owned by the same textile barons, to connect the factories, it may well have been Haaksbergen, or the also nearby Boekelo on the same line.

As a child I heard the story repeatedly but never really knew when that happened. Thanks to digitised archives I now have more details.

Earlier this week I came across a version of this story online, written by the farm owner’s daughter, and she placed it in 1929. With a year to go on, I then searched the digitised newspaper archives for cattle fairs in The Hague, and found it was actually 1928.
In 1928 the Netherlands hosted the Olympics in Amsterdam, from 28 July to 12 August. It was the first edition to be called ‘the Summer Olympics’. The national cattle fair and exhibition took place just before, from 23 to 25 July, and was dubbed the ‘Olympic cattle fair’ in the press. It was a big event (I found 230 newspaper articles across the country about it for that week). It was opened by two government ministers giving speeches, and visited on each day by members of the royal family, the queen mother and the prince consort, though not the queen herself. Prizes were awarded for many different categories of cows, horses, pigs and goats. A special mention in the press talks about a new ‘contraption to measure the pulling strength of a horse’ being demonstrated. Amidst all that was my grandfather, two months before his 23rd birthday, with bull Adolf on a leash. And he won first prize.

Which fact ended up in the papers with a photo:

Klaas Zijlstra and the bull, Malieveld 25 July 1928, published in the Utrecht Daily on 27 July 1928, photographer and copyright unknown.

Look at that enormous and muscled beast, coming to shoulder height of my grandfather. And then imagine traveling and sleeping next to it for 5 days!

Favorited AI Policy and Human.json by Claudine Chionh
Favorited Adding human.json to WordPress by Terence Eden

Claudine Chionh and Terence Eden both mention human.json, a data file that lists people and sites you know are written by humans, as opposed to generated by AI. A rekindling of FOAF?

In these days of needing to assume anything you encounter is machine generated unless proven to be human made, we continuously have to apply a Reverse Turing test: do I have enough indications to assume something was created by a human.

When I first wrote a Reverse Turing page I mentioned much the same things as Terence Eden does about vouching for other people to be human authors.

Not sure if having a machine readable file makes the right point here though, ironic as it is. Blogrolls, webrings come to mind too, because Long Live the Author.

One element I think we’d need to contemplate is to not just list, but also provide URIs to some supporting evidence. Expose the depth of a connection. Having only met at a vouching party to countersign your credentials is different in depth and quality from two decades of in-person and online encounters and proof thereof, and that may well impact how the Reverse Turing test turns out for others perusing your human.json file.

Favorited I used AI. It worked. I hated it. by Michael Taggart

An excellent post by Michael Taggart on how it felt to him to make a much needed bit of code with the help of Claude Code. The results worked, but he hated how it made him feel. He explores those opposing outcomes without trying to resolve the tension. Much in here that I recognise from my own experiences, as well as what I see others do and how they talk about it. Towards the end he talks about ‘the real monster’ here, and I think that is the right frame: we have created a technology monster once more, and Smits’ monster theory (2003) is a tool to bring to bear again. Where will we adapt the monster to our tastes? Where will we shift our cultural understanding of ourselves and the world to make room for the monster? Once we’re done embracing it until the bubble bursts, or rejecting it outright no matter what.

I hated writing software this way. Forget the output for a moment; the process was excruciating. Most of my time was spent reading proposed code changes and pressing the 1 key to accept the changes, which I almost always did. I was basically Homer’s drinking bird.

Michael Taggart

My website is now part of the web archive in the Dutch Royal Library. It took some experimenting to get it in there. Blogs will be blogs, and the number of links in mine choked the harvester, it seems.

Since 2007 the Royal Library has been archiving websites, and it now stores some 25,000 websites. My blog, even though it is one of the oldest still maintained in the Netherlands, never was part of that effort. Mostly because it’s not very visible as a Dutch blog, as it is mostly written in English and resides on a .org domain (when I registered zylstra.org, private persons could not yet register .nl domains, only companies could). At an Internet Archive event organised by the Royal Library in September last year I asked about archiving, and they told me how to suggest my website for archiving.

Late last January I received a message that my website would be included in their archives from now on.

What followed were several test runs with their harvester Heritrix, which is also used by the Internet Archive. I wondered how the harvester would deal with some of my website’s peculiarities. Not every posting is listed on my site, for instance, although each does have a direct URL. The years’ worth of weekly notes are not listed on this site. Also, many postings are never shown on the front page, and if you page through postings on the front page you will never encounter them. This is true for categories of posts like books, photos, and day to day topics. I discussed this with the web archivist, who ran some tests. My week notes seemed to be included, but the pagination of the day to day category stalled out at 180 pages, although there were more still.

To my surprise they also ran into volume limits. Apparently because of ‘bycatch’, things they archive from other sites because I reference them or embed them. In the past few years I have stopped embedding things, like photos, except for my slides, which are hosted on a separate domain I have registered. While it was normal that a site’s additional catch is larger than the site itself, for my site it was very different from what they were used to.
First they limited bycatch to 20GB in a test, and they ran out of space, then they set it at 40GB in a test, and still ran out of space. Raising the limits further did not help. In the end they decided to harvest just what is on my zylstra.org domain and not include any bycatch at all. Which is completely fine by me, precisely because I’ve made the effort to bring all kinds of external content ‘home’ to this domain.

Nevertheless it did surprise me that bycatch turned out to be a problem, as they are using a tool the Internet Archive itself uses too. I asked for some examples of the bycatch. They told me it wasn’t even possible to dump a URL list from the bycatch into a spreadsheet as it hit the maximum number of rows (around 65k iirc). I did get some of the URLs that contributed bigger volumes of bycatch. To my surprise I did not even recognise the links, except one.

One was obvious, 2800 attempts to harvest a page on live.staticflickr.com, as I link a lot to my Flickr hosted images, although I no longer embed them but have local versions on this domain.
Others were not obvious to me at all: theguardian.tv, vp.nyt.com and various content delivery networks. I link to none of them on this site. I do link to The Guardian, about 100 times, and to the NYT, about 40 times, and I suppose that if the harvester follows those links it will find additional material there that explains the bycatch more fully, if it harvests all the targets I link to too.

If that is the case, that it harvests everything I’ve linked to, then it is the long history of this blog that is the issue and makes the harvester hit its limits.

There are some 20,000 external links in this blog’s articles, as far as I can quickly estimate based on a full content export I made this week.
It basically means that if the harvester attempts to harvest all those links and what resources they include, it adds a number of pages to the archive, roughly equivalent to the current archive itself.
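A quick estimate like that can be made with a few lines of Python. The sketch below is not what I actually ran: it assumes the content export is available as HTML text, and the names `LinkCollector` and `external_link_hosts` are made up for the example. It counts links per external host, which also makes it easy to check how often a site like The Guardian is linked.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_link_hosts(html: str, own_domain: str) -> Counter:
    """Count links per host, skipping links to own_domain and its subdomains."""
    collector = LinkCollector()
    collector.feed(html)
    hosts = Counter()
    for href in collector.hrefs:
        host = urlparse(href).hostname
        if host is None:
            continue  # relative links, fragments, mailto: and the like
        if host == own_domain or host.endswith("." + own_domain):
            continue  # internal link
        hosts[host] += 1
    return hosts
```

Summing the counts, `sum(hosts.values())`, gives the total number of external links; `hosts.most_common(10)` shows which hosts are linked most.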

A weblog embraces what the world wide web is, a bunch of links to other websites. The name weblog says it. A web-log is a curation hub for web readers, pointing out other interesting stuff, and not trying to keep you here too long. Over 23 years of blogging yielded some 20,000 links to other websites. Given enough time, a blog becomes the web itself in terms of its links, as much as it becomes its author’s avatar in terms of its content.

From now on my site will be updated in the Royal Library’s archives every year on March 5th.


The facade of the Royal Library in The Hague, photo by Ferdi de Gier, license CC-BY-SA