Bookmarked Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality by William Harding and Matthew Kloster

GitClear takes a look at how the use of Copilot is impacting coding projects on GitHub. They signal several trends that affect overall code quality negatively. Churn is increasing (though by the looks of it, that trend started earlier), meaning the amount of code that is corrected or discarded very soon after being written is rising. And more code is being added to projects, rather than updated or (re)moved, indicating a trend towards bloat (my words). The report I downloaded mentions the latter as worsening the asymmetry between the time needed to write or generate code and the time needed to read and review it. This increases downward quality pressure on repositories. I use GitHub Copilot myself, and as GitHub itself reports, it helps me generate code much faster. My use case however is personal tools, not a professional coding practice. Given my relatively unskilled starting point, Copilot makes the difference between not having and having such personal tools. In a professional setting however, more code does not equate to better code. On a first skim, the report highlights where the benefits of Copilot clash with the desired qualities of code production, quality and teamwork in professional settings.
Via Karl Voit

To investigate, GitClear collected 153 million changed lines of code,
authored between January 2020 and December 2023….. We find disconcerting trends for maintainability. Code churn — the
percentage of lines that are reverted or updated less than two weeks after
being authored — is projected to double in 2024 compared to its 2021,
pre-AI baseline. We further find that the percentage of “added code” and
“copy/pasted code” is increasing in proportion to “updated,” “deleted,” and
“moved” code.

GitClear report
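The churn definition in the quote (lines reverted or updated less than two weeks after being authored) can be made concrete with a small sketch. This is not GitClear's methodology or tooling. The `AuthoredLine` record, its field names and the sample dates are all made up for illustration, and Python is used only as a neutral example language.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# "Less than two weeks", as in the quoted definition.
CHURN_WINDOW = timedelta(weeks=2)


@dataclass
class AuthoredLine:
    """Hypothetical record for one line of code (an assumption, not GitClear's schema)."""
    authored_at: datetime
    changed_at: Optional[datetime] = None  # first revert or update, None if untouched


def churn_percentage(lines):
    """Percentage of lines reverted or updated within CHURN_WINDOW of being authored."""
    if not lines:
        return 0.0
    churned = sum(
        1
        for line in lines
        if line.changed_at is not None
        and line.changed_at - line.authored_at < CHURN_WINDOW
    )
    return 100.0 * churned / len(lines)


# Invented sample data, purely illustrative.
sample = [
    AuthoredLine(datetime(2023, 3, 1), datetime(2023, 3, 5)),   # changed after 4 days: churn
    AuthoredLine(datetime(2023, 3, 1), datetime(2023, 4, 20)),  # changed after 50 days: not churn
    AuthoredLine(datetime(2023, 3, 1)),                         # never changed: not churn
    AuthoredLine(datetime(2023, 3, 2), datetime(2023, 3, 3)),   # changed after 1 day: churn
]

print(churn_percentage(sample))
```

Two of the four sample lines change within the window, so the sketch reports half of them as churn. Real measurement would first need to trace each line's history through the repository, which is the hard part and is left out here.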

I imported hundreds of Amazon e-book purchases into my book notes, using a script I wrote with the assistance of GitHub Copilot.

As a home-cooking coder, coding things often takes me a long time. I know how to cut a problem up into pieces that I can code. I know a bit of my default coding language PHP, and can read it to see what it does. But actual coding is a different thing. It’s more a passive fluency than an active one, mostly because I don’t do it often enough to become actively fluent, even though I have been coding my own things since the early 1980s. So coding for me means a lot of looking up how statements work, what arguments they expect etc., which is time consuming.
Over time I’ve collected pieces of functionality I reuse in various other projects, and I have a collection of notes on how to do things and why (it’s not really a coding journal, but it could grow into one.) Yet it is usually very time consuming.

Earlier this year I took out a subscription to GitHub Copilot. I installed two plugins in Visual Studio Code, the text editor I use for coding: Copilot and Copilot Chat. I thought it might make it easier for me to create more personal tools.
It took until yesterday before I had both the urge and the time to test that assumption.

I am backfilling different types of information into my Obsidian notes. One example is my Google Calendar items from 2008-2020, which I added earlier.
Another is ensuring I have a book note for every book I bought for Amazon Kindle. I’ve bought just over 800 books since December 2010 for Kindle (vs 50 physical books since 2008, as I usually use local bookshops for those). For a number of them I have book notes in Obsidian, for others I don’t. I wanted to add notes for all Kindle books I bought over the years.
And this gave me my first personal tool project to try Co-pilot on.

The issue is having a list of Amazon Kindle purchases (title, author, date) and a list of existing book notes, where the title is usually shorter than the one on the Amazon list (e.g. without the subtitle). I set out to make a script that checks every existing book note against the Amazon list, and writes the remaining Amazon purchases to a new list. In a next step that new list is used to create a book note from a filled-out template for each entry.
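A minimal sketch of those two steps, under stated assumptions. My actual script is not reproduced here, so everything in this block is hypothetical: the function names, the prefix-matching heuristic (a purchase counts as covered when its cleaned-up title starts with the cleaned-up title of an existing note), the note template and the sample books. It is written in Python purely as an illustration.

```python
import re
from string import Template


def normalize(title):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(cleaned.split())


def has_note(purchase_title, note_titles):
    """Assumed heuristic: the note title equals the purchase title or is a word-level prefix of it."""
    purchase = normalize(purchase_title)
    for note_title in note_titles:
        note = normalize(note_title)
        if note and (purchase == note or purchase.startswith(note + " ")):
            return True
    return False


def purchases_without_notes(purchases, note_titles):
    """Step 1: keep only the purchases that no existing note covers."""
    return [p for p in purchases if not has_note(p["title"], note_titles)]


# Hypothetical note template; a real one would follow the existing book note structure.
NOTE_TEMPLATE = Template("# $title\n\nAuthor: $author\nBought: $date\n")


def render_note(purchase):
    """Step 2: fill the template for one purchase."""
    return NOTE_TEMPLATE.substitute(purchase)


# Invented sample data, purely illustrative.
purchases = [
    {"title": "The Garden Shed: A Practical Guide to Small Spaces",
     "author": "A. Author", "date": "2012-05-01"},
    {"title": "Night Trains", "author": "B. Writer", "date": "2015-11-20"},
]
note_titles = ["The Garden Shed"]

remaining = purchases_without_notes(purchases, note_titles)
for purchase in remaining:
    print(render_note(purchase))
```

The prefix heuristic is deliberately crude. A short note title such as "Night" would also swallow "Night Trains", so the resulting list still needs checking by eye before any notes are generated from it.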

Using Copilot, and especially the chat function, made the coding quick and easy. It also helped me learn, as the chat provides reasons for its suggestions and I could go back and forth with it to understand various elements better. A very useful effect was that having to write prompts for the chat bot and follow up on its answers made me clarify to myself much better what I was trying to do, and helped me come up with ideas for how to do it. So it sped up my thinking and creation process, next to providing helpful code suggestions that I only needed to tweak a bit for my use case (rather than finding various solutions on Stack Overflow that don’t really address my issue). It also helped me make useful notes for my coding journal and code snippet collection.
It was still time consuming, but not because of coding: data cleaning is always a big chore, and will remain so because it needs human inspection.

I now have a folder with some 475 automatically made book notes, in the right structure, derived from the 800 Kindle book purchases over 13 years, using my existing book notes as a filter.
Next projects to have a go at will be the physical book purchases through Amazon (50), and my old Calibre library of all books I owned before 2012 (over 1000; that is when we did away with most of them, after I had scanned all their barcodes into a database).

I am pleased with how helpful GitHub Copilot was for me in this. It energises me to think of more little coding projects for personal tools. And who knows, maybe it will increase my coding skills too, or have me branch out into programming languages I don’t know, like Python, or help me understand other people’s code, such as WordPress plugins I might want to tweak.