Bookmarked Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality by William Harding and Matthew Kloster

GitClear takes a look at how the use of Copilot is impacting coding projects on GitHub. They signal several trends that negatively impact overall code quality. Churn is increasing (though by the looks of it, that trend started earlier), meaning the amount of code very quickly being corrected or discarded is rising. And more code is being added to projects, rather than updated or (re)moved, indicating a trend towards bloat (my words). The report I downloaded mentions the latter as worsening the asymmetry between writing/generating code and the time needed for reading/reviewing it. This increases the downward quality pressure on repositories. I use GitHub Copilot myself, and as GitHub itself reports, it helps me generate code much faster. My use case however is personal tools, not a professional coding practice. Given my relatively unskilled starting point, Copilot makes a big difference between not having and having such personal tools. In a professional setting, however, more code does not equate to better code. On a first skim, the report highlights where the benefits of Copilot clash with desired qualities of code production, quality and teamwork in professional settings.
Via Karl Voit

To investigate, GitClear collected 153 million changed lines of code,
authored between January 2020 and December 2023… We find disconcerting trends for maintainability. Code churn — the
percentage of lines that are reverted or updated less than two weeks after
being authored — is projected to double in 2024 compared to its 2021,
pre-AI baseline. We further find that the percentage of “added code” and
“copy/pasted code” is increasing in proportion to “updated,” “deleted,” and
“moved” code.

GitClear report
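The churn metric the report defines — the share of lines reverted or updated within two weeks of being authored — can be sketched in a few lines. This is a minimal illustration, assuming hypothetical per-line records; it is not GitClear's actual methodology, and the dates are made up.

```python
from datetime import date, timedelta

# Hypothetical per-line records: when a line was authored and, if ever,
# when it was next updated or reverted (None = still untouched).
lines = [
    {"authored": date(2023, 3, 1), "changed": date(2023, 3, 5)},   # churned
    {"authored": date(2023, 3, 1), "changed": date(2023, 5, 1)},   # stable edit
    {"authored": date(2023, 3, 2), "changed": None},               # untouched
    {"authored": date(2023, 3, 3), "changed": date(2023, 3, 10)},  # churned
]

CHURN_WINDOW = timedelta(weeks=2)

def churn_rate(records):
    """Share of lines reverted or updated within two weeks of authoring."""
    churned = sum(
        1 for r in records
        if r["changed"] is not None
        and r["changed"] - r["authored"] <= CHURN_WINDOW
    )
    return churned / len(records)

print(f"{churn_rate(lines):.0%}")  # 2 of 4 lines churned -> 50%
```

In practice one would derive such records from version control history (e.g. tracking lines across commits), which is where the real methodological work lies.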

Bookmarked A quick survey of academics, teachers, and researchers blogging about note taking practices and zettelkasten-based methods by Chris Aldrich

Chris Aldrich provides a nice who’s who around studying note taking practices. There are some names in here that I will add to my feeds. Also will need to go through the reading list, with an eye on practices that may fit with my way of working. Perhaps one or two names are relevant for the upcoming PKM summit in March too.

Chris actively collects historical examples of people using index card systems or other note taking practices for their personal learning and writing. Such as his recent find of Martin Luther King’s index of notes. If you’re interested in this, his Hypothes.is profile is a good place to follow for more examples and finds.

I thought I’d put together a quick list focusing on academic use-cases from my own notes

Chris Aldrich

Favorited EDPB Urgent Binding Decision on processing of personal data for behavioural advertising by Meta by EDPB

This is very good news. The European Data Protection Board, at the request of the Norwegian DPA, has issued a binding decision instructing the Irish DPA to ban the processing of personal data for behavioural targeting by Meta. Meta must cease processing data within two weeks. Norway already concluded a few years ago that adtech is mostly illegal, but European cases based on the 2018 GDPR moved through the system at a glacial pace, in part because of a co-opted and dysfunctional Irish data protection authority. Meta's 'pay for privacy' ploy is also torpedoed by this decision. This is grounds for celebration, even if it will likely lead to legal challenges first. And it is grounds for congratulations to NOYB and Max Schrems, whose complaints, filed the first minute GDPR enforcement started in 2018, kicked off the process of which this is a result.

…take, within two weeks, final measures regarding Meta Ireland Limited (Meta IE) and to impose a ban on the processing of personal data for behavioural advertising on the legal bases of contract and legitimate interest across the entire European Economic Area (EEA).

European Data Protection Board

Bookmarked The 100 Year Plan (by Automattic/WordPress)

WordPress is offering a century of managed hosting for 38.000 USD, I presume paid upfront.

In reply to I’d love to understand what prompted Automattic to offer a hosting plan for $38K. by Ben Werdmuller

I don’t think this is a serious proposition by Automattic / WordPress.

  1. Who is in a position to put 38.000 USD on the table right now that they couldn't use more usefully elsewhere? (even if, in terms of monthly rates, it's not a large sum)
  2. Who believes Automattic, or any company, is likely to still be around anno 2123 (unless they pivot to brewing or banking)? Or that they or their successor will honor such century-old commitments (state-guaranteed Russian railway shares are now just over 100 years old)?

I think it’s a way of getting attention for the last part of Matt’s quote at the end:

I hope this plan gets people and other companies thinking about building for the long term.

Matt Mullenweg

That is a relevant thing to talk about. People's digital estates after they pass are becoming more important. I know how much time it took me to deal with it after my parents died, even with their tiny digital footprint, and even though it mostly wasn't about digital preservation. Building code, hardware and systems to last is a valuable topic.

However, if I want to ensure my blog can still be read in 100 years, there is an easy fix: I would submit it to the national library. I don't think my blog is in the subset of sites the Dutch Royal Library already automatically tracks and archives, even though at 20+ years it's one of the oldest still existing blogs (at the same URL too). However, I can register an ISBN for my collected postings. Anything published in the Netherlands that has an ISBN will be added to the national library's collection, and one can submit it digitally (that is even the preferred way).

I think I just saved myself 38.000 USD in exchange for betting the Royal Library will still exist in 2123! Its founding was in 1798, 225 years ago, so the Lindy effect suggests it’s likely a good bet to give it another century or two.

Bookmarked Wildlife surveys using ‘DNA vacuums’! by Dr. Christina Lynggaard

Environmental DNA sampling sounds very cool: capturing DNA from the air (or other environments), without needing to sample DNA directly from organisms. Dr. Christina Lynggaard says that in three days she captured airborne DNA from dozens of animals in a natural setting. Downloaded the cited paper to read (DOI). I wonder if something like this is within reach of citizen science groups' capabilities? Perhaps just the sampling, or maybe even the sequencing and determination?

This was our first exploration of airborne eDNA in a natural setting and we were especially surprised by the high number of bird taxa detected

Christina Lynggaard

Bookmarked Disinformation and its effects on social capital networks (Google Doc) by Dave Troy

This document by US journalist Dave Troy positions resistance against disinformation not as a matter of fact-checking and technology, but as one of reshaping social capital and cultural network topologies. I plan to read this; especially the premises part looks interesting. Some upfront associations are with Valdis Krebs' work on the US Democratic/Republican party divide, where he visualised it based on cultural artefacts, i.e. books people bought (2003-2008), to show spheres and overlaps, and with the Finnish work on increasing civic skills, which to me seems a mix of critical crap-detection skills woven into a social/societal framework. Networks around a belief or a piece of disinformation for me also point back to what I mentioned earlier about generated (and thus fake) texts: how attempts to detect such fakes usually center on the artefact, not on the richer tapestry of information connections around it (last 2 bullet points and final paragraph). I recently called this provenance and entanglement as indicators of authenticity, entanglement being the multiple ways something is part of a wider network fabric. And there's the more general notion of Connectivism, where learning and knowledge are situated in networks too.

The related problems of disinformation, misinformation, and radicalization have been popularly misunderstood as technology or fact-checking problems, but this ignores the mechanism of action, which is the reconfiguration of social capital. By recasting these problems as one problem rooted in the reconfiguration of social capital and network topology, we can consider solutions that might maximize public health and favor democracy over fascism …

Dave Troy