Bookmarked 1.2 billion euro fine for Facebook as a result of EDPB binding decision (by European Data Protection Board)

Finally a complaint against Facebook w.r.t. the GDPR has been judged by the Irish Data Protection Authority. This after the EDPB instructed the Irish DPA to do so in a binding decision (PDF) in April. The Irish DPA has been extremely slow in cases against big tech companies, to the point where they became co-opted by Facebook in trying to convince the other European DPA’s to fundamentally undermine the GDPR. The fine is still mild compared to what was possible, but still the largest in the GDPR’s history at 1.2 billion Euro. Facebook is also instructed to bring their operations in line with the GDPR, e.g. by ensuring data from EU based users is only stored and processed in the EU. This as there is no current way of ensuring GDPR compliance if any data gets transferred to the USA in the absence of an adequacy agreement between the EU and the US government.

A predictable response by FB is a threat to withdraw from the EU market. This would be welcome imo in cleaning up public discourse and battling disinformation, but is very unlikely to happen. The EU is Meta’s biggest market after their home market the US. I’d rather see FB finally realise that their current adtech models are not possible under the GDPR and find a way of using the GDPR like it is meant to: a quality assurance tool, under which you can do almost anything, provided you arrange what needs to be arranged up front and during your business operation.

This fine … was imposed for Meta’s transfers of personal data to the U.S. on the basis of standard contractual clauses (SCCs) since 16 July 2020. Furthermore, Meta has been ordered to bring its data transfers into compliance with the GDPR.

EDPB

Bookmarked Project Tailwind by Steven Johnson

Author Steven Johnson has been working with Google and developed a prototype for Tailwind. Tailwind, an ‘AI first notebook’, is intended to bring an LLM to your own source material, and then you can use it to ask questions of the sources you give it. You point it to a set of resources in your Google Drive and what Tailwind generates will be based just on those resources. It shows you the specific source of the things it generates as well. Johnson explicitly places it in the Tools for Thought category. You can join a waiting list if you’re in the USA, and a beta should be available in the summer. Is the USA limit intended to reduce the number of applicants I wonder, or a sign that they’re still figuring things like GDPR for this tool? Tailwind is prototyped on PaLM API though, which is now generally available.

This, from its description, gets to where it becomes much more interesting to use LLM and GPT tools. A localised (not local though, it lives in your Google footprint) tool, where the user defines the corpus of sources used, and traceable results. As the quote below suggests a personal research assistant. Not just for my entire corpus of notes as I describe in that linked blogpost, but also on a subset of notes for a single topic or project. I think there will be more tools like these coming in the next months, some of which likely will be truly local and personal.

On the Tailwind team we’ve been referring to our general approach as source-grounded AI. Tailwind allows you to define a set of documents as trusted sources …, shaping all of the model’s interactions with you. … other types of sources as well, such as your research materials for a book or blog post. The idea here is to craft a role for the LLM that is … something closer to an efficient research assistant, helping you explore the information that matters most to you.

Steven Johnson

Bookmarked Will A.I. Become the New McKinsey? by Ted Chiang in the New Yorker

Ted Chiang realises that corporates are best positioned to leverage the affordances of algorithmic applications, and that that is where the risk of the ‘runaway AIs’ resides. I agree that they are best positioned, because corporations are AI’s non-digital twin, and have been recognised as such for a decade.

Brewster Kahle said (in 2014) that corporations should be seen as the 1st generation AIs, and Charlie Stross reinforced it (in 2017) by dubbing corporations ‘Slow AI’ as corporations are context blind, single purpose algorithms. That single purpose being shareholder value. Jeremy Lent (in 2017) made the same point when he dubbed corporations ‘socio-paths with global reach’ and said that the fear of runaway AI was focusing on the wrong thing because “humans have already created a force that is well on its way to devouring both humanity and the earth in just the way they fear. It’s called the Corporation“. Basically our AI overlords are already here: they likely employ you. Of course existing Slow AI is best positioned to adopt its faster young, digital algorithms. It as such can be seen as the first step of the feared iterative path of run-away AI.

The doomsday scenario is … A.I.-supercharged corporations destroying the environment and the working class in their pursuit of shareholder value.

Ted Chiang

I’ll repeat the image I used in my 2019 blogpost linked above:

Your Slow AI overlords looking down on you, photo Simone Brunozzi, CC-BY-SA

I can now share an article directly from my feed reader to my Hypothes.is account, annotated with a few remarks.

One of the things I often do when feed reading is opening some articles up in the browser with the purpose of possibly saving them to Hypothes.is for (later) annotation. You know how it goes with open tabs in browsers, hundreds will be opened up and then neglected, until you give up and quite the entire session.

My annotation of things I read starts with saving the article to Hypothes.is, and provide a single annotation for the entire page that includes a web archive link to the article and a brief motivation or some first thoughts about why I think it is of interest to me. Later I may go through the article in more detail and add more annotations, which end up in my notes. (I also do this outside of Hypothes.is, saving an entire article directly to my notes in markdown, when I don’t want to read the article in browser.)

Until now this forces me to leave my feed reader to store an article in Hypothes.is. However, in my personal feed reader I have already the opportunity to post directly from there to my websites or to my personal notes collection in Obsidian.
Hypothes.is has an API, which much like how I post to my sites from my feed reader can make it possible to directly share to Hypothes.is from inside my feed reader. This way I can continue to read, while leaving breadcrumbs in Hypothes.is (which always also end up in the inbox of my notes).

The Hypothes.is API is documented and expects JSON payloads. To read public material through the API is possible for anyone, to post you need an API key that is connected to your account (find it when logged in).

I use JSON payloads to post from my feedreader (and from inside my notes) to this site, so I copied and adapted the script to talk to the Hypotes.is API.
The result is an extremely basic and barebones script that can do only a single thing: post a page wide annotation (so no highlights, no updates etc). For now this is enough as it is precisely my usual starting point for annotation.

The script expects to receive 4 things: a URL, the title of the article, an array of tags, and my remarks. That is sent to the Hypothes.is API. In response I will get the information about the annotation I just made (ID etc.) but I disregard any response.

To the webform I use in my feedreader I added an option to send the information to Hypothes.is, rather than my websites through MicroPub, or my local notes through the filesystem. That option is what ensures the little script gets called with the right variables.

It now looks like this:


In my feed reader I have the usual form I use to post replies and bookmarks, now with an additional radio button to select ‘H.’ for Hypothes.is


Submitting the form above gets it posted to my Hypothes.is account

I have installed AutoGPT and started playing with it. AutoGPT is a locally installed and run piece of software (in a terminal window) that you theoretically can set a result to achieve and then let run to achieve it. It’s experimental so it is good advice to actually follow its steps along and approve individual actions it suggests doing.
It interacts with different generative AI tools (through your own API keys) and can initiate different actions, including online searches as well as spawning new interactions with LLM’s like GPT4 and using the results in its ongoing process. It chains these prompts and interactions together to get to a result (‘prompt chaining’).

I had to tweak some of the script a little bit (it calls python and pip but it needs to call python3 and pip3 on my machine) but then it works.

Initially I have it set up with OpenAI’s API, as the online guide I found were using that. However in the settings file I noticed I can also choose to use other LLM’s like the publicly available models through Huggingface, as well as image generating AIs.

I first attempted to let it write scripts to interact with the hypothes.is API. It ended up in a loop about needing to read the API documentation but not finding it. At that time I did not yet provide my own interventions (such as supplying the link to the API documentation). When I did so later it couldn’t come up with next steps, or not ingesting the full API documentation (only the first few lines) which also led to empty next steps.

Then I tried a simpler thing: give me a list of all email addresses of the people in my company.
It did a google search for my company’s website, and then looked at it. The site is in Dutch which it didn’t notice, and it concluded there wasn’t a page listing our team. I then provided it with the link to the team’s page, and it did parse that correctly ending up with a list of email addresses saved to file, while also neatly summarising what we do and what our expertise is.
While this second experiment was successfully concluded, it did require my own intervention, and the set task was relatively simple (scrape something from this here webpage). This was of limited usefulness, although it did require less time than me doing it myself. It points to the need of having a pretty clear picture of what one wants to achieve and how to achieve it, so you can provide feedback and input at the right steps in the process.

As with other generative AI tools, doing the right prompting is key, and the burden of learning effective prompting lies with the human tool user, the tool itself does not provide any guidance in this.

I appreciate it’s an early effort, but I can’t reproduce the enthusiastic results others claim. My first estimation is that those claims I’ve seen are based on hypothetical things used as prompts and then being enthusiastic about the plausible outcomes. Whereas if you try an actual issue where you know the desired result it easily falls flat. Similar to how ChatGPT can provide plausible texts except when the prompter knows what good quality output looks like for a given prompt.

It is tempting to play with this thing nevertheless, because of its positioning as a personal tool, as potential step to what I dubbed narrow band digital personal assistants earlier. I will continue to explore, first by latching onto the APIs of more open models for generative AI than OpenAI’s.

Bookmarked WordPress AI: Generative Content & Blocks (by Joe Hoyle, found via Chuck Grimmett)

As many others I am fascinated by what generative algorithms like ChatGPT for texts and Stable Diffusion for images can do. Particularly I find it fascinating to explore what it might do if embedded in my own workflows, or how it might change my workflows. So the link above showing an integration of ChatGPT in WordPress’ Gutenberg block editor drew my attention.

The accompanying video shows a mix of two features. First having ChatGPT generate some text, or actually a table with specific data, and having ChatGPT in ‘co-pilot’ style generate code for Gutenberg blocks. I think the latter might be actually useful, as I’ve seen generative AI put to good use in that area. The former, having ChatGPT write part of your posting is clearly not advisable. And the video shows it too, although the authors don’t point it out or haven’t reflected on the fact that ChatGPT is not a search engine but geared to coming up with plausible stuff without being aware of its actual information (the contrast with generating code is that code is much more highly structured in itself so probabilities collapse easier to the same outcome).

The blogpost in the video is made by generating a list of lunar missions, and then turning them into a table, adding their budgets and sorting them chronologically. This looks very cool in the vid, but some things jump out as not ok. Results jump around the table for instance: Apollo 13 moves from 1970 to 2013 and changes budget. See image below. None of the listed budgets for Apollo missions, nor their total, match up with the detailed costs overview of Apollo missions (GoogleDocs spreadsheet). The budget column being imaginary and the table rows jumping around makes the result entirely unfit for usage of course. It also isn’t a useful prompt: needing to fact check every table field is likely more effort and less motivating than researching the table yourself from actual online resources directly.

It looks incredibly cool ‘see me writing a blogpost by merely typing in my wishes, and the work being done instantly’, and there are definitely times I’d wish that to be possible. To translate a mere idea or thought into some output directly however means I’d skip confronting such an idea with reality, with counter arguments etc. Most of my ideas only look cool inside my head, and need serious change to be sensibly made manifest in the world outside my head. This video is a bit like that, an idea that looks cool in one’s head but is great rubbish in practice. ChatGPT is hallucinating factoids and can’t be trusted to create your output. Using it in the context of discovery (as opposed to the justification context of your output such as in this video) is possible and potentially useful. However this integration within the Gutenberg writing back-end of WordPress puts you in the output context directly so it leads you to believe the generated plausible rubbish is output and not just prompting fodder for your writing. ‘Human made’ is misleading you with this video, and I wouldn’t be surprised if they’re misleading themselves as well. A bit like staging the ‘saw someone in half and put them together again’ magician’s trick in an operating room and inviting surgeons to re-imagine their work.

Taking a native-first approach to integrating generative AI into WordPress, we’ve been experimenting with approaches to a “WordPress Copilot” that can “speak” Gutenberg / block-editor.

Copy-pasting paragraphs between ChatGPT and WordPress only goes so far, while having the tools directly embedded in the editor … open up a world of possibilities and productivity wins…

Joe Hoyle


An android robot is filling out a table listing Apollo missions on a whiteboard, generated image using Midjourney