Swiss author and playwright, picked the book up in Zurich in 2024. I thoroughly enjoyed Die Erfindung des Ungehorsams (2021), The Invention of Disobedience, and read it in one sitting. Well told, with many beautiful sentences. Three women, in NYC, China and England, are followed as they try to understand the world. Their stories are interwoven through the emergence of AI-driven automatons grasping their true autonomy. One sees the future in Babbage’s machinery and works out how to program it; one makes sex dolls in China that get fitted with AI; one hosts Manhattan dinner parties where she tells, or invents?, a story while only the others eat. All three find a way to break their constraints and become disobedient to their surroundings. A multilayered work, as critic Daniela Janser wrote, a poetic homage to the oldest programming language of all: imagination. I will probably buy her more recent work Vor aller Augen, Before All Eyes, soon.
Bookmarked Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality by William Harding and Matthew Kloster
GitClear takes a look at how the use of Copilot is impacting coding projects on GitHub. They signal several trends that negatively impact overall code quality. Churn is increasing (though by the looks of it, that trend started earlier), meaning the amount of code that is corrected or discarded very soon after being written is rising. And more code is being added to projects, rather than updated or (re)moved, indicating a trend towards bloat (my words). The report I downloaded describes the latter as worsening the asymmetry between writing/generating code and the time needed for reading/reviewing it, which increases downward quality pressure on repositories. I use GitHub Copilot myself, and like GitHub itself reports, it helps me generate code much faster. My use case however is personal tools, not a professional coding practice. Given my relatively unskilled starting point, Copilot makes a big difference between not having and having such personal tools. In a professional setting, however, more code does not equate to better code. On first skim, the report highlights where the benefits of Copilot clash with the desired qualities of code production, quality and teamwork in professional settings.
Via Karl Voit
To investigate, GitClear collected 153 million changed lines of code, authored between January 2020 and December 2023… We find disconcerting trends for maintainability. Code churn — the percentage of lines that are reverted or updated less than two weeks after being authored — is projected to double in 2024 compared to its 2021, pre-AI baseline. We further find that the percentage of “added code” and “copy/pasted code” is increasing in proportion to “updated,” “deleted,” and “moved” code.
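To make the quoted churn metric concrete for myself: here is a minimal sketch of how that percentage could be computed from per-line authorship records. The data format and the computation are my own illustrative assumptions, not GitClear's actual methodology.

```python
from datetime import date

def churn_percentage(records, window_days=14):
    """Share of lines reverted or updated within `window_days` of authoring."""
    churned = sum(
        1 for authored, changed in records
        if changed is not None and (changed - authored).days < window_days
    )
    return 100.0 * churned / len(records)

# Hypothetical per-line records: (authored_on, next_changed_on or None).
lines = [
    (date(2023, 3, 1), date(2023, 3, 5)),   # changed after 4 days: churn
    (date(2023, 3, 1), date(2023, 6, 20)),  # changed much later: no churn
    (date(2023, 3, 2), None),               # never changed: no churn
]

print(f"{churn_percentage(lines):.1f}% churn")  # → 33.3% churn
```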
A final draft of the European AI Regulation is circulating (here’s an almost 900 page PDF). The coming days I will read it with curiosity.
With this the ambitious legal framework for everything digital and data that the European Commission set out to create in 2020 has been finished within this Commission period. That’s pretty impressive.
In 2020 there was no Digital Markets Act, Digital Services Act, AI Regulation, Data Governance Act, Data Act, nor an Open Data Directive/High Value Data implementing regulation.
Before the European elections coming spring, they are all in place. I’ve closely followed the process (and helped create a very small part of it), and I think the result is remarkably consistent and level-headed. DG CNECT has done well here, in my opinion. It’s a set of laws that are very useful in themselves and that simultaneously form a geo-political proposition.
The coming years will be dedicated to implementing these novel instruments.
I imported hundreds of Amazon e-book purchases into my book notes, using a script I wrote with the assistance of GitHub Copilot.
As a home-cooking coder, coding things often takes me a long time. I know how to cut things up in order to be able to code the pieces. I know a bit of my default coding language, PHP, and can read it to see what it does. But actual coding is a different thing. It’s more a passive fluency than an active one, mostly because I don’t do it often enough to become actively fluent, even though I have been coding my own things since the early 1980s. So coding for me means a lot of looking up how statements work, what terms they expect etc., which is time consuming.
Over time I’ve collected pieces of functionality I reuse in various other projects, and I have a collection of notes on how to do things and why (it’s not really a coding journal, but it could grow into one.) Yet it is usually very time consuming.
Earlier this year I took out a subscription to GitHub Copilot. I installed two plugins in Visual Studio Code, the text editor I use for coding: Copilot and Copilot Chat. I thought it might make it easier for me to create more personal tools.
It took until yesterday before I both had the urge and the time to test that assumption.
I am backfilling different types of information into my Obsidian notes. One type, done earlier, is my Google Calendar items from 2008-2020.
Another is ensuring I have a book note for every book I bought for Amazon Kindle. I’ve bought just over 800 books since December 2010 for Kindle (vs 50 physical books since 2008, as I usually use local bookshops for those). For a number of them I have book notes in Obsidian, for others I don’t. I wanted to add notes for all Kindle books I bought over the years.
And this gave me my first personal tool project to try Co-pilot on.
The issue is having a list of Amazon Kindle purchases (title, author, date) and a list of existing book notes, where a note’s title is usually shorter than the one on the Amazon list (e.g. lacking the subtitle). I set out to make a script that checks every existing book note against the Amazon list and writes the remaining Amazon purchases to a new list. In a next step that new list is used to create a book note, with a filled-out template, for each entry.
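The matching step is simple in essence. A minimal sketch of it could look like the following (in Python rather than my usual PHP; the field names, the sample data and the prefix-matching rule are illustrative assumptions, not my actual script):

```python
def normalise(title):
    """Lowercase and drop punctuation so shortened note titles still match."""
    return "".join(c for c in title.lower() if c.isalnum() or c.isspace()).strip()

def missing_purchases(purchases, note_titles):
    """Return Amazon purchases not covered by any existing book note.

    A note covers a purchase when the (shorter) note title is a prefix of
    the (longer) Amazon title, e.g. a note that lacks the subtitle.
    """
    notes = [normalise(t) for t in note_titles]
    return [
        p for p in purchases
        if not any(normalise(p["title"]).startswith(n) for n in notes)
    ]

# Hypothetical rows from the Amazon purchase list, and existing note titles.
purchases = [
    {"title": "Thinking, Fast and Slow: How the Mind Works",
     "author": "D. Kahneman", "date": "2012-01-03"},
    {"title": "The Shallows", "author": "N. Carr", "date": "2011-06-10"},
]
note_titles = ["Thinking, Fast and Slow"]

# Each remaining purchase would get a templated book note in the next step.
for p in missing_purchases(purchases, note_titles):
    print(p["title"])  # → The Shallows
```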
Using Copilot and especially the chat function made the coding quick and easy. It also helped me learn, as the chat provides reasons for its suggestions, and I could go back and forth with it to understand various elements better. A very useful effect was that having to write prompts for the chatbot, and following up on its answers, allowed me to clarify to myself much better what I was trying to do, and to come up with ideas for how to do it. So it sped up my thinking and creation process, next to providing helpful code suggestions that I only needed to tweak a bit for my use case (rather than finding various solutions on Stack Overflow that don’t really address my issue). It also helped me make useful notes for my coding journal and code snippet collection.
It was still time consuming, but not because of coding: data cleaning is always a big chore, and will remain so because it needs human inspection.
I now have a folder with some 475 automatically made book notes, in the right structure, derived from the 800 Kindle book purchases over 13 years using my existing book notes as filter.
Next projects to have a go at will be the physical book purchases through Amazon (50), and my old Calibre library of all books I owned before 2012 (over 1000; we did away with most of them back then, after I scanned all their barcodes into a database).
I am pleased with how helpful GitHub Copilot was for me in this. It energises me to think of more little coding projects for personal tools. And who knows, maybe it will increase my coding skills too, or have me branch out into programming languages I don’t know, like Python, or help me understand other people’s code, like in WordPress plugins I might want to tweak.
In reply to a post by Peter Rukavina
It’s not surprising that GPT-4 doesn’t work like a search engine and has a hard time surfacing factual statements from source texts. Like one of the commenters I wonder what that means for the data analysis you also asked for. Perhaps those too are merely plausible, but not actually analysed. Especially the day of the week thing, as that wasn’t in the data, and I wouldn’t expect GPT to determine all weekdays for posts in the process of answering your prompt.
I am interested in doing what you did, but then with 25 years of notes and annotations, and with a different model with fewer ethical issues attached. To have a chat about my interests and links between things. Unlike the fact-based questions you’ve asked the tool, that doesn’t necessarily need it to be correct, just plausible enough to surface associations. Such associations might prompt my own thinking and my own searches, working with the same material.
It also makes me wonder whether what Wolfram Alpha is doing these days gets a play in your own use of GPT+, as they are all about interpreting questions and then giving the answer directly. There’s a difference between things that face the general public, and things that are internal or even personal tools, like yours.
Have you asked it things based more on association yet? Like “based on the posts ingested what would be likely new interests for Peter to explore” e.g.? Can you use it to create new associations, help you generate new ideas in line with your writing/interests/activities shown in the posts?
So my early experiments show me that as a data analysis copilot, a custom GPT is a very helpful guide… In terms of the GPT’s ability to “understand” me from my blog, though, I stand unimpressed.
In reply to a post by Robin Rendle
This specific argument sounds way too much like the ‘every music download is a missed sale’ trope of the music industry, 2-3 decades ago. And that was about actual recordings, not generated ones.
There are many good reasons to not use algogens, or to opt for different models for such generation than the most popular public facing tools.
Missed income for illustrators by using them in blog posts as Rendle posits isn’t a likely one.
Like with music downloads, there’s a whole world of potential users underneath the Coasean floor. Several billion people, I assume.
My blog or presentations will never use bought illustrations.
I started making lots of digital photos for that reason way back in 2003, and have been using open Creative Commons licenses whenever I used something by another creator since they existed.
These days I may try to generate a few images, if it’s not too work-intensive. Which it quickly is: getting a good result is hard and takes more time than a Flickr search for an image. Then paying someone who can handle the tools better than I can (like now, with Adobe Illustrator) might be in order, except for my first point at the start of this paragraph. Let’s also not pretend that all current illustrators do everything by hand, or have done since the 1990s. There’s tons of artwork out there that is only possible because of digital tools. Algogens aren’t such a big jump that you get to treat them as a different species.
That is not to say that, outside the mentioned use case of blogs and other sites (the ones that are already now indistinguishable from generated texts and whose only purpose is generating ad eyeballs), the lower end of the existing market won’t get eroded, unless that lower end ups its game with these tools (they’re actually better positioned for it than me, in terms of skills and capabilities).
I bet that at the same time there will be a growing market for clearly human made artefacts as status symbol too. The Reverse Turing effect in play, as the default assumption will be, must be, that anything is generated. I’ve paid more for prints of artwork, both graphics and photos, made by or at the behest of the artist, than one printed after their death for instance. Such images adorn the walls at home rather than my blog though.
I always thought those AI-generated images in blog posts and elsewhere around the web were tacky and lame, but I hadn’t seriously considered the future economic impact on illustrators.
Image generated with Stable Diffusion using the prompt “A woman is sitting behind a computer making an image of a woman drawing, sketchbook style”. The result is an uncanny image. Public domain image.