I realised I had an ical file of all my appointments from the period I used Google Calendar: from January 2008, when I started as an independent consultant, until February 2020, when I switched to the calendar in my company’s NextCloud.
I never searched through that file, even though I sometimes wonder what I did at a certain moment in that period. Then I got a nudge from Martijn Aslander, who wrote on a community platform we both frequent about backfilling his daily activities into Obsidian, for instance based on the photos in his archive taken on a given day through the years. It made me decide to import that ical file and turn it into day logs listing my appointments for each date.

I tried to find a ready-made parser or library I could use in PHP, but most of what I found was aimed at importing a live calendar rather than an archived file, and none of it at outputting that file’s contents in a different format. Looking at the ical file I realised that writing my own simple parser should be easy enough.

I wrote a small PHP script that reads the ical file line by line until it finds one that says BEGIN:VEVENT. It then reads on until it finds the closing line END:VEVENT, and interprets the information between those two lines, lifting out the date, location, name and description while ignoring everything else.
After each event it finds, it writes to a file ‘Daylog [date].md’ in a folder ./year/month (creating the file, or appending the event as a new line if the file already exists). It uses the format I use for my current day logs.
This repeats until it has processed all 4,714 events in my calendar from 2008 to 2020.
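The loop described above can be sketched in a few lines. This is a minimal illustration in Python rather than the PHP I actually used; the ‘- summary (location)’ day log line format is a simplified stand-in for my real format, and it ignores ical line folding and timezone details.

```python
import os

def parse_ical_events(path):
    """Collect one dict per VEVENT block from an ical archive file."""
    events, current = [], None
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.rstrip("\r\n")
            if line == "BEGIN:VEVENT":
                current = {}
            elif line == "END:VEVENT":
                events.append(current)
                current = None
            elif current is not None and ":" in line:
                key, value = line.split(":", 1)
                key = key.split(";", 1)[0]  # drop parameters, e.g. DTSTART;TZID=...
                if key in ("DTSTART", "SUMMARY", "LOCATION", "DESCRIPTION"):
                    current[key] = value
    return events

def append_to_daylog(event, root="."):
    """Append the event to 'Daylog [date].md' in ./year/month, creating it if needed."""
    date = event.get("DTSTART", "")[:8]  # e.g. 20080220 from 20080220T090000
    folder = os.path.join(root, date[:4], date[4:6])
    os.makedirs(folder, exist_ok=True)
    entry = f"- {event.get('SUMMARY', '')} ({event.get('LOCATION', '')})\n"
    with open(os.path.join(folder, f"Daylog {date}.md"), "a", encoding="utf-8") as f:
        f.write(entry)
```

Looping append_to_daylog over the parsed events then produces the year/month folder layout with one day log file per date.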

A screenshot of all the folders with Daylogs created from 2008-2020

Screenshot of the newly created Daylog for 20 February 2008, based on the appointments I had that day, taken from the calendar archive. This one mentions a preparatory meeting for the open GovCamp I helped organise that year in June, which kicked off my work in open data since then.

Received this yesterday. It’s a 1987 collection of interviews with German sociologist Niklas Luhmann, on the occasion of his 60th birthday. I bought it after I saw Chris Aldrich sharing some annotations.

Started exploring. The context is important to take into account: 1980s Germany, after ’68 and before the Wall came down, and the contrast between Habermas and Luhmann, who were both ‘famous’ in leftist and intellectual circles at the same time.

Last Friday our 7yo daughter was allowed to bring some toys to school, as it was the last day before a week off and they would spend the last hour or so playing.
The evening before, she thought about which toys she would take to school, and made a list after we brought her to bed…

This is how personal knowledge management starts.
The list also has a few icons (such as for 6 Playmobil figurines and 3 animal figures). She wanted to also bring a book (in case it would get boring at some point), but added 0% and an image of a battery, because the teacher had said anything with a screen or battery wasn’t allowed. So it had to be a paper book. The list also mentions earplugs, because ‘it will likely get noisy’.

Friday morning when she got up she showed me the list, as I was making my own notes, about ODRL.

I marvel at the level of detail in her list as she thought it through the evening before. In the end, that morning she decided against the earplugs and the book. I was an active note writer from early on in primary school. Not so much focused on the schoolwork, which was usually a boring breeze; I focused on what I saw happening around me, very often the social connections I noticed between others, things I found puzzling or that stood out. I had the notion that things and people would make more sense if I could suss out the connections between them.

Last weekend E and I visited Groningen together, and spent some 90 minutes or more browsing the Godert Walter bookstore. We had the place mostly to ourselves and it was a pleasure to browse the shelves unhurriedly. In the back, in a reduced-price box, I found the 900+ page tome ‘Information: A Historical Companion‘. It aims to dive deeply into the information strategies and personal knowledge management practices from Roman times until now. It was published in 2021 by Princeton University Press. With chapters covering specific historical periods, and chapters covering specific concepts (cards, memory techniques, accounting, databases etc.), plus a detailed index, this promises to provide me with lots of interesting material to browse.

The book Information: A Historical Companion, after I brought it back to the hotel.

These index cards provide improvisation prompts. They contain words to use and suggestions for actions to use in a game of improvisation. One grouping of words and actions per index card. Seeing them laid out next to each other obviously reminded me of the use of index cards in personal learning/knowledge systems that are based on physical cards or made digitally (keeping one thing per note file), as well as of flash cards (like for spaced repetition). And it made me think of Chris Aldrich who collects examples of using index cards like these, as well as of Peter who is part of an improv group.

This set contains 108 cards with ‘nuclei’ of words and actions for improv. Jackson Mac Low created them in 1961 as ‘nuclei for Simone Forti‘ after he saw her perform in Yoko Ono’s loft. They were used by her as well as by Trisha Brown.

I came across this set of cards at the ‘Fondation du doute‘, the institute of doubt, in Blois, in an exhibition on the postmodern ‘Fluxus‘ movement that Jackson Mac Low participated in for some time.

I have a little over 25 years worth of various notes and writings, and a little over 20 years of blogposts. A corpus that reflects my life, interests, attitude, thoughts, interactions and work over most of my adult life. Wouldn’t it be interesting to run that personal archive as my own chatbot, to specialise an LLM for my own use?

Generally I’ve been interested in using algorithms as personal or group tools for a number of years.

For algorithms to help, like any tool, they need to be ‘smaller’ than us, as I wrote in my networked agency manifesto. We need to be able to control its settings, tinker with it, deploy it and stop it as we see fit.
Me, April 2018, in Algorithms That Work For Me, Not Commoditise Me

Most if not all of our exposure to algorithms online, however, treats us as a means, manipulating our engagement. I see algorithms as potentially very valuable tools for working with lots of information. But not in their current common incarnations.

Going back to a less algorithmic way of dealing with information isn’t an option, nor something to desire I think. But we do need algorithms that really serve us, perform to our information needs. We need less algorithms that purport to aid us in dealing with the daily river of newsy stuff, but really commoditise us at the back-end.
Me, April 2018, in Algorithms That Work For Me, Not Commoditise Me

Some of the things I’d like my ideal RSS reader to be able to do are along such lines, e.g. to signal new patterns among the people I interact with, or outliers in their writings. Basically to signal social eddies and shifts among my network’s online sharing.

LLMs are highly interesting in that regard too, as in contrast to the engagement optimising social media algorithms, they are focused on large corpora of text and generation thereof, and not on emergent social behaviour around texts. Once trained on a large enough generic corpus, one could potentially tune it with a specific corpus. Specific to a certain niche topic, or to the interests of a single person, small group of people or community of practice. Such as all of my own material. Decades worth of writings, presentations, notes, e-mails etc. The mirror image of me as expressed in all my archived files.

Doing so with a personal corpus has, for me, a few prerequisites:

  • It would need to be a separate instance of whatever tech it uses. If possible self-hosted.
  • There should be no feedback to the underlying generic and publicly available model, there should be no bleed-over into other people’s interactions with that model.
  • The separate instance needs an off-switch under my control, where off means none of my inputs are available for use someplace else.

Running your own Stable Diffusion image generator set-up, as E currently does, complies with this, for instance.

Doing so with a LLM text generator would create a way of chatting with my own PKM material, ChatPKM, a way to interact (differently than through search and links, as I do now) with my Avatar (not just my blog though, all my notes). It might adopt my personal style and phrasing in its outputs. When (not if) it hallucinates it would be my own trip so to speak. It would be clear what inputs are in play, w.r.t. the specialisation, so verification and references should be easier to follow up on. It would be a personal prompting tool, to communicate with your own pet stochastic parrot.

Current attempts at chatbots in this style seem to focus on things like customer interaction. Feed it your product manual, have it chat to customers with questions about the product. A fancy version of ‘have you tried switching it off and back on?‘ These services allow you to input one or a handful of docs or sources, and then chat about its contents.
One of those is Chatbase, another is ChatThing by Pixelhop. The latter has the option of continuously adding source material to, presumably, the same chatbot(s), but more or less on a per-file and per-URL basis, and limited in the number of words per month. That’s not like starting out with half a GB in markdown text of notes and writings covering several decades, let alone tens of GBs of e-mail interactions for instance.
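Under the hood, such services presumably do some form of retrieval: chunk the uploaded sources, find the passages most relevant to a question, and prepend those to the prompt sent to the model. As a toy illustration of just that retrieval step over a folder of markdown notes, here is a sketch using plain word overlap instead of embeddings; the function names and scoring are simplified assumptions on my part, not how any of these services actually work.

```python
import os
import re
from collections import Counter

def index_notes(folder):
    """Build a word-count index over all markdown files under a notes folder."""
    index = {}
    for root, _, files in os.walk(folder):
        for name in files:
            if name.endswith(".md"):
                path = os.path.join(root, name)
                with open(path, encoding="utf-8") as f:
                    index[path] = Counter(re.findall(r"[a-z]+", f.read().lower()))
    return index

def top_notes(index, question, n=3):
    """Rank notes by word overlap with the question; a chatbot would
    prepend the top notes' text to the prompt sent to the model."""
    q = set(re.findall(r"[a-z]+", question.lower()))
    scored = sorted(index.items(), key=lambda kv: -sum(kv[1][w] for w in q))
    return [path for path, _ in scored[:n]]
```

A real set-up would use embeddings and chunking rather than word counts, but the shape is the same: retrieve relevant material first, then generate from it.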

Pixelhop is currently working with Dave Winer however to do some of what I mention above: use Dave’s entire blog archives as input. Dave has been blogging since the mid 1990s, so there’s quite a lot of material there.
Checking out ChatThing suggests they build on OpenAI’s GPT-3.5 through its API. So it wouldn’t qualify per the prerequisites I mentioned. Yet purposely feeding it a specific online blog archive is less problematic than including my own notes, as all the source material involved is public anyway.
The resulting Scripting News bot is a fascinating experiment, the work on which you can follow on GitHub. (As part of that, Dave also shared a markdown version of his complete blog archives (33MB), which for fun I loaded into Obsidian to search through, and also to compare with the chatbot’s generated outputs, such as when Dave asked the bot when he had first written about the iPhone on his blog.)

Looking forward to more experiments by Dave and Pixelhop. Meanwhile I’ve joined Pixelhop’s Discord to follow their developments.