Bookmarked The Expanding Dark Forest and Generative AI by Maggie Appleton

I very much enjoyed this talk that Maggie Appleton gave at Causal Islands in Toronto, Canada, 25-27 April 2023. It reminds me of the fun and insightful keynotes at Reboot conferences a long time ago, some of which shifted my perspectives long-term.

This talk is about how we will experience and use the web once generative algorithms create most of its content. Appleton explores the potential effects of that and the futures that might result. She puts human agency at the centre of how we choose our path forward in experimenting with and using ‘algogens’ on the web, and how we navigate an internet where nobody believes you’re human.

Appleton is a product designer with Ought, working on products that use language models to augment and extend human (cognitive) capabilities. Ought makes Elicit, a tool that surfaces (and summarises) potentially useful papers for your research questions. I use Elicit every now and then, and really should use it more often.

An exploration of the problems and possible futures of flooding the web with generative AI content

Maggie Appleton

Bookmarked Inside the secret list of websites that make AI like ChatGPT sound smart (by Kevin Schaul, Szu Yu Chen and Nitasha Tiku in the Washington Post)

The Washington Post takes a closer look at Google’s C4 dataset, which comprises the content of 15 million websites and has been used to train various LLMs. Perhaps it is also the one OpenAI used for e.g. ChatGPT, although it’s not known what OpenAI has been using as source material.

They include a search engine, which lets you submit a domain name and find out how many tokens it contributed to the dataset (a token is usually a word, or part of a word).
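The reason token counts don’t map one-to-one to word counts is that these models use subword tokenisation. A toy sketch in Python of greedy longest-match splitting — my own illustration with a made-up vocabulary, not the actual tokeniser behind C4 or ChatGPT:

```python
# Toy illustration of subword tokenisation: greedy longest-match
# splitting against a tiny, made-up vocabulary. Real tokenisers
# (e.g. byte-pair encoding) learn their vocabulary from data, but
# the effect is similar: common words stay whole, rarer words
# split into several pieces, so a site contributes more tokens
# than it has words.
TOY_VOCAB = {"blog", "post", "s", "web", "mention", "token"}

def toy_tokenise(word: str, vocab: set[str]) -> list[str]:
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first; fall back to a
        # single character if nothing matches the vocabulary.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

print(toy_tokenise("blogposts", TOY_VOCAB))   # ['blog', 'post', 's']
print(toy_tokenise("webmention", TOY_VOCAB))  # ['web', 'mention']
```

One word can cost several tokens, which is how even a modest blog racks up a six-figure token count.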

Obviously I looked at some of the domains I use. This blog is the 102,860th contributor to the dataset, with 200,000 tokens (1/10,000th of a percent of the total).


Screenshot of the Washington Post’s search tool, showing the result for this domain, zylstra.org.

Bookmarked WordPress AI: Generative Content & Blocks (by Joe Hoyle, found via Chuck Grimmett)

Like many others, I am fascinated by what generative algorithms like ChatGPT for text and Stable Diffusion for images can do. In particular I find it interesting to explore what they might do if embedded in my own workflows, or how they might change those workflows. So the link above, showing an integration of ChatGPT in WordPress’ Gutenberg block editor, drew my attention.

The accompanying video shows a mix of two features: having ChatGPT generate some text (actually a table with specific data), and having ChatGPT generate code for Gutenberg blocks in ‘co-pilot’ style. I think the latter might actually be useful, as I’ve seen generative AI put to good use in that area. The former, having ChatGPT write part of your posting, is clearly not advisable. The video shows it too, although the authors don’t point it out or haven’t reflected on the fact that ChatGPT is not a search engine: it is geared to producing plausible text without any awareness of the actual information it contains. (The contrast with generating code is that code is much more highly structured in itself, so probabilities collapse more easily to the same outcome.)

The blogpost in the video is made by generating a list of lunar missions, turning it into a table, adding their budgets and sorting them chronologically. This looks very cool in the vid, but some things jump out as not ok. Results jump around the table, for instance: Apollo 13 moves from 1970 to 2013 and changes budget. See the image below. None of the listed budgets for the Apollo missions, nor their total, match the detailed cost overview of Apollo missions (Google Docs spreadsheet). The budget column being imaginary and the table rows jumping around make the result entirely unfit for use, of course. It also isn’t a useful way of prompting: needing to fact-check every table field is likely more effort and less motivating than researching the table yourself from actual online resources directly.

It looks incredibly cool, ‘see me write a blogpost by merely typing in my wishes, and the work being done instantly’, and there are definitely times I’d wish that were possible. To translate a mere idea or thought into output directly, however, means I’d skip confronting that idea with reality, with counter-arguments etc. Most of my ideas only look cool inside my head, and need serious change to be sensibly made manifest in the world outside it. This video is a bit like that: an idea that looks cool in one’s head but is great rubbish in practice.

ChatGPT hallucinates factoids and can’t be trusted to create your output. Using it in the context of discovery (as opposed to the justification context of your output, such as in this video) is possible and potentially useful. However, this integration within the Gutenberg writing back-end of WordPress puts you in the output context directly, so it leads you to believe the generated plausible rubbish is output and not just prompting fodder for your writing. Human Made is misleading you with this video, and I wouldn’t be surprised if they’re misleading themselves as well. A bit like staging the ‘saw someone in half and put them together again’ magician’s trick in an operating room and inviting surgeons to re-imagine their work.

Taking a native-first approach to integrating generative AI into WordPress, we’ve been experimenting with approaches to a “WordPress Copilot” that can “speak” Gutenberg / block-editor.

Copy-pasting paragraphs between ChatGPT and WordPress only goes so far, while having the tools directly embedded in the editor … open up a world of possibilities and productivity wins…

Joe Hoyle


An android robot is filling out a table listing Apollo missions on a whiteboard, generated image using Midjourney

Bookmarked So long, Twitter API, and thanks for all the fish (by Ryan Barrett)

The Twitter API is moving to paid tiers for anything but the tiniest use cases by the end of the month. I’ve been using Brid.gy to get webmention notifications about interactions on Twitter with my blogposts here. Brid.gy obviously depends on the Twitter API, and the scale of its needs puts it above the free and affordable tiers, even if well below the more expensive tier above those. Therefore Brid.gy will stop supporting Twitter. Silos gonna silo, I suppose.
It does help remove an action from my backlog: changing the way I show such backfeed on my blog without going counter to Twitter users’ common expectations of where their interaction and avatar might end up.
Unless Musk changes his mind once again, or can’t find an employee capable of implementing the changes.

…assuming it sticks, Bridgy Twitter will stop working on April 29, if not before.

Ryan Barrett

Bookmarked Feedly launches strikebreaking as a service (by Molly White)

Molly White gives a good write-up of the extremely odd and botched launch by Feedly of a service to keep tabs on protests that might impact your brand, assets or people. Apparently in that order, too. When E first mentioned this to me I was confused: what’s the link with a feed reader, after all? Feedly’s subsequent excuse, ‘we didn’t consider abuse of this service’, sounds rather hollow, as their communications around it seem to present precisely that potential for abuse as the service being announced.

The question ‘how did Feedly end up here?’ kept revolving in my mind. Turns out the starting point is logged in my own blog:

Machines can have a big role in helping understand the information, so algorithms can be very useful, but for that they have to be transparent and the user has to feel in control. What’s missing today with the black-box algorithms is where they look over your shoulder, and don’t trust you to be able to tell what’s right.

Edwin Khodabakchian, cofounder and CEO of RSS reader Feedly, in Wired, March 2018

In a twisted way I can see the reflection of that quote in the service Feedly announced, specifically w.r.t. the first part, using algorithms to better understand information. The second part seems to have gone missing in the past five years though: the bit about transparency, avoiding black boxes, and putting users in control. Especially the ‘not trusting people to be able to tell what’s right’ grates. It seems to me Feedly users in the past days very much could tell what’s right, and Feedly hoped they wouldn’t.

I do still agree with the 2018 quote, but ‘algorithmic interpretation as a service’ isn’t what follows from it for me. That’s just a different way of commoditising your customers.
Algorithmic spotting of emergent patterns is relevant if I can define the context and the network of people (and perhaps media sources) whose feeds I follow. For that I need to be in control of the algorithm, and need to be the one who defines which specific emergent patterns I am interested in. That is on the list for my ideal feed reader. But this botched Feedly approach isn’t that.

Bookmarked The push to AI is meant to devalue the open web so we will move to web3 for compensation (by Mita Williams)

Adding this interesting perspective from Mita Williams to my notes on the effects of generative AI. She positions generative AI as bypassing the open web entirely (abstracted away into the models the AIs run on). Sharing is thus disincentivised, as it no longer brings traffic or conversation if it is only used as model fodder. I’m not at all sure that is indeed the case, but it has been a concern at least since Yahoo’s Flickr images database was used for AI model training, such as in IBM’s 2019 facial recognition efforts. That led to questions about whether existing (Creative Commons) licenses are still fit for purpose. Williams pointing not only to the impact on individual creators but also on the communities they form, are part of and interact in, strikes me as especially worth thinking more about. The erosion of (open source, maker, collaborative etc.) community structures is a whole other level of potential societal damage.

Mita Williams suggests the described erosion is not an effect but an actual aim of tech companies, part of a bait and switch: a re-siloing, an enclosure of the commons, where getting something in return for online sharing is again the lure. The open web may fall by the wayside and become even more niche than it already is.

…these new systems (Google’s Bard, the new Bing, ChatGPT) are designed to bypass creators work on the web entirely as users are presented extracted text with no source. As such, these systems disincentivize creators from sharing works on the internet as they will no longer receive traffic…

Those who are currently wrecking everything that we collectively built on the internet already have the answer waiting for us: web3.

…the decimation of the existing incentive models for internet creators and communities (as flawed as they are) is not a bug: it’s a feature.

Mita Williams