How could I not buy these small notebooks? Made by my friend Peter from the paper cut-offs of boxes he made and printed in Tuscany, they use Magnani 1404 paper. Magnani started making paper in Pescia in 1404 (that company ceased operation in recent years, but another Magnani, in business since 1481, still makes paper), right at the moment the literate population of Tuscany started using paper notebooks to make everyday notes, and lots of them. Paper had become affordable and available enough roughly a century earlier, with Tuscany at the heart of that development, and Florentine merchants used their bookkeeping system, and the paper notebooks it required, to build a continent-spanning trade network. After the Black Death personal note-taking took off too, and from 1400 onwards it had become commonplace:

At the end of the Middle Ages, urban Tuscans seemed stricken with a writing fever, a desire to note down everything they saw. But they remained a peculiarly local phenomenon: there was something uniquely Florentine (or more accurately ‘Tuscan’, as examples also survive from Siena and Lucca) about them, …

Allen, Roland. The Notebook: A History of Thinking on Paper (p. 61).

Around the turn of the year I gave The Notebook to Peter as a present, thinking it would be something to his liking. My own notes have helped me learn and work for decades. When E and I lived in Lucca for a month, we passed through Pescia by train en route to Firenze.

Tuscany, paper from a company that was there at the start of everyday note-taking, The Notebook, personal knowledge management, and friendship, all coming together in this piece of craftsmanship. How could I not buy them? So I did.

I have a little over 25 years’ worth of various notes and writings, and a little over 20 years of blogposts. A corpus that reflects my life, interests, attitudes, thoughts, interactions and work over most of my adult life. Wouldn’t it be interesting to run that personal archive as my own chatbot, to specialise an LLM for my own use?

Generally I’ve been interested in using algorithms as personal or group tools for a number of years.

For algorithms to help, like any tool, they need to be ‘smaller’ than us, as I wrote in my networked agency manifesto. We need to be able to control their settings, tinker with them, deploy them and stop them as we see fit.
Me, April 2018, in Algorithms That Work For Me, Not Commoditise Me

Most if not all of our exposure to algorithms online, however, treats us as a means to manipulate our engagement. I see algorithms as potentially very valuable tools for working with lots of information, but not in their current common incarnations.

Going back to a less algorithmic way of dealing with information isn’t an option, nor something to desire I think. But we do need algorithms that really serve us, perform to our information needs. We need fewer algorithms that purport to aid us in dealing with the daily river of newsy stuff, but really commoditise us at the back-end.
Me, April 2018, in Algorithms That Work For Me, Not Commoditise Me

Some of the things I’d like my ideal RSS reader to be able to do are along such lines, e.g. to signal new patterns among the people I interact with, or outliers in their writings. Basically to signal social eddies and shifts among my network’s online sharing.

LLMs are highly interesting in that regard too, as in contrast to the engagement-optimising social media algorithms, they are focused on large corpora of text and generation thereof, and not on emergent social behaviour around texts. Once trained on a large enough generic corpus, one could potentially tune it with a specific corpus: specific to a certain niche topic, or to the interests of a single person, small group of people or community of practice. Such as all of my own material. Decades worth of writings, presentations, notes, e-mails etc. The mirror image of me as expressed in all my archived files.

Doing so with a personal corpus has, for me, a few prerequisites:

  • It would need to be a separate instance of whatever tech it uses. If possible self-hosted.
  • There should be no feedback to the underlying generic and publicly available model, there should be no bleed-over into other people’s interactions with that model.
  • The separate instance needs an off-switch under my control, where off means none of my inputs are available for use someplace else.

Running your own Stable Diffusion image generator set-up, as E currently does, complies with this, for instance.

Doing so with an LLM text generator would create a way of chatting with my own PKM material, ChatPKM: a way to interact with my avatar (not just my blog, but all my notes), different from the search and links I use now. It might adopt my personal style and phrasing in its outputs. When (not if) it hallucinates, it would be my own trip, so to speak. It would be clear which inputs are in play with respect to the specialisation, so verification and references should be easier to follow up on. It would be a personal prompting tool, a way to communicate with your own pet stochastic parrot.
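A common way to let an LLM answer from a personal corpus without retraining it is to first retrieve the most relevant notes for a question and prepend them to the prompt. The sketch below is illustrative only: the note titles and the naive word-overlap scoring are my stand-ins for proper embedding-based retrieval, not a description of any existing tool.

```python
# Minimal retrieval sketch: rank personal notes by relevance to a question.
# The top notes would then be prepended to the prompt sent to a self-hosted
# model, keeping all data local per the prerequisites above.
import re
from collections import Counter


def tokenise(text: str) -> Counter:
    # Crude tokeniser: lowercase words of three letters or more.
    return Counter(re.findall(r"[a-z]{3,}", text.lower()))


def top_notes(question: str, notes: dict, k: int = 3) -> list:
    """Rank note titles by vocabulary shared with the question."""
    q = tokenise(question)
    scores = {title: sum((tokenise(body) & q).values())
              for title, body in notes.items()}
    return [t for t, s in sorted(scores.items(), key=lambda x: -x[1])
            if s > 0][:k]


# Hypothetical notes standing in for a real Obsidian vault.
notes = {
    "Federated bookshelves": "OPML book lists shared between blogs",
    "Networked agency": "tools smaller than us provide agency",
}
print(top_notes("how do OPML book lists work", notes))
```

In a real set-up the scoring function would be swapped for vector similarity over embeddings, but the overall shape (retrieve, then generate) stays the same.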

Current attempts at chatbots in this style seem to focus on things like customer interaction. Feed it your product manual, have it chat to customers with questions about the product. A fancy version of ‘have you tried switching it off and back on?’ These services allow you to input one or a handful of docs or sources, and then chat about their contents.
One of those is Chatbase, another is ChatThing by Pixelhop. The latter has the option of continuously adding source material to presumably the same chatbot(s), but more or less on a per-file and per-URL basis, and limited in the number of words per month. That’s not like starting out with half a GB of markdown notes and writings covering several decades, let alone tens of GBs of e-mail interactions.

Pixelhop is currently working with Dave Winer however to do some of what I mention above: use Dave’s entire blog archives as input. Dave has been blogging since the mid 1990s, so there’s quite a lot of material there.
Checking out ChatThing suggests that they built on OpenAI’s GPT-3.5 through its API, so it wouldn’t qualify per the prerequisites I mentioned. Yet purposely feeding it a specific online blog archive is less problematic than including my own notes, as all the source material involved is public anyway.
The resulting Scripting News bot is a fascinating experiment, the work on which you can follow on GitHub. (As part of that, Dave also shared a markdown version of his complete blog archives (33MB), which for fun I loaded into Obsidian to search through, and to compare with the generated outputs from the chatbot, such as when Dave asked the bot when he had first written about the iPhone on his blog.)

Looking forward to more experiments by Dave and Pixelhop. Meanwhile I’ve joined Pixelhop’s Discord to follow their developments.

Having created a working flow to generate OPML booklists directly from the individual book notes in my PKM system, I did the first actual run in production of those scripts today.

It took a few steps to get to using the scripts in production.

  • I have over 300 book note files in my Obsidian vault.
Of course most lacked the templated inline data fields that allow me to create lists. For the 67 fiction books I read in 2021 I already had a manual list with links to the individual files; where needed I added the templated data fields.
  • Having added those inline fields where they were missing, I can easily build lists in Obsidian with the Dataview plugin. (The Dataview query and a screenshot of the resulting list are not reproduced here.)
  • The same inline data fields are used by my scripts to read the individual files and build the same list in OPML.
  • That gets automatically posted to my website, where the file is both machine and human readable.
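The core of such a script is simple: parse the `key:: value` inline fields Obsidian’s Dataview understands out of each book note, then emit one OPML `outline` element per book. This is a sketch of the idea, not my actual script; the field names (`title`, `author`, `status`) are illustrative assumptions, not necessarily the ones used in the real vault.

```python
# Sketch: read Obsidian-style inline data fields (key:: value) from book
# notes and build an OPML outline list from them.
import re
import xml.etree.ElementTree as ET

# Matches lines like "author:: Roland Allen".
FIELD = re.compile(r"^(\w+)::\s*(.+)$", re.M)


def parse_fields(note_text: str) -> dict:
    """Extract all inline data fields from one note."""
    return dict(FIELD.findall(note_text))


def build_opml(books: list, list_title: str) -> str:
    """Turn a list of per-book field dicts into an OPML document string."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = list_title
    body = ET.SubElement(opml, "body")
    for fields in books:
        ET.SubElement(body, "outline",
                      text=fields.get("title", ""),
                      author=fields.get("author", ""),
                      status=fields.get("status", ""))
    return ET.tostring(opml, encoding="unicode")


note = "title:: The Notebook\nauthor:: Roland Allen\nstatus:: read"
print(build_opml([parse_fields(note)], "fiction2021"))
```

The final step, uploading the generated file to the webserver, would be a separate small script or an FTP/SSH call.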

Doing this in production made me discover a small typo in the script that builds the OPML, now fixed (also in the GitHub repository). It also made me realise I want to add a way of ordering the OPML outline entries by month read.

Lists to take into production next are those for currently reading (done), non-fiction 2021, and the anti-library. That last one will be the most work, I have a very long list of books to potentially read. I will approach that not as a task of building the list, but as an ongoing effort of evaluating books I have and why they are potentially of interest to me. A way, in short, to extend my learning, with the list as a useful side effect. The one for currently reading is the least work, and from it the lists for fiction 2022 and non-fiction 2022 will automatically follow. The work is in the backlog, getting history to conform to the convention I came up with, not in moving forward from this point.

In parallel it is great to see that Tom Critchlow is also looking at creating such book lists, in JSON, and at digesting such lists from others. The latter would implement the ‘federated’ part of federated bookshelves. Right now I just point to other people’s lists and RSS feeds in my ‘list of lists’. To me getting to federation doesn’t require a ‘standard’, because JSON, OPML and e.g. schema.org have enough specificity and overlap between them to allow both publishers of lists and parsers of such lists the freedom to use or discard data fields as they see fit. But there is definitely a discussion to be had on identifying that overlap and how best to use it. Chris Aldrich is planning an IndieWeb event on this and other personal-libraries-related topics next month. I look forward to participating in that; quite a number of interesting people have expressed interest, and I hope we’ll get to not just talk but also experiment with book lists.
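To make the overlap concrete, a single book entry in an OPML list might carry its data as outline attributes, which a JSON or schema.org consumer could map onto its own field names. This snippet is illustrative only; the attribute names are my assumptions, not a proposed standard.

```xml
<!-- Hypothetical book entry in a federated bookshelf OPML list.
     A parser can use or ignore any attribute it doesn't recognise. -->
<outline type="book"
         text="The Notebook: A History of Thinking on Paper"
         author="Roland Allen"
         status="read" />
```

The point of federation-without-a-standard is exactly this tolerance: publishers add the fields they care about, parsers keep what they understand.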

As a form of WAB* I’ve made it easier for myself to update my OPML book lists. I created those lists earlier this year as a proof of concept of publishing federated bookshelves. Updating OPML files residing on my hosted webserver is not a fun manual task. Ultimately I want to automate pushing lists from my personal working environment (notes in Obsidian) to my site. Given my limited coding skills I took an easier first step and created a webform in a PHP script that allows me to add a book to an OPML list. It has a drop-down menu for the various OPML lists I keep (e.g. fiction2021, non-fiction2021, currently reading, anti-library), provides the right fields to add the right OPML data attributes, and then writes them to the correct list (each list is a separate file).

That now works well. Now that I have a way to post to my book lists by submitting a form, I can take the next step: generating such form submissions to replace filling out the form manually.

* Work Avoiding Behaviour, a continuation of the SAB, Study Avoiding Behaviour that I excelled in at university. WAB seems to fit very well with the current locked down last days until the end of year. The Dutch terms ‘studie/werk ontwijkend gedrag’ SOG/WOG lend themselves to the verb to ‘sog’ and to ‘wog’. Yesterday when Y asked E what she had been doing today, E said ‘I’ve been wogging’, and I realised I had been too.

Could one redo any of the useful apps, for that matter, that now fill the start-up cemetery?

I was reminded of this as Peter mentioned Dopplr, a useful and beautifully designed service in the years 2007-2010. The Dopplr service died because it was acquired by Nokia and left to rot. Its demise had nothing to do with the use value of the service, but everything to do with it being a VC-funded start-up that exited to a big corporation in an identity crisis, which proved unequipped to do something useful with it.

Some years ago I kept track of hundreds of examples of open data re-use in applications, websites and services. These included many that at some point ceased to exist; I had them categorised by the phase in which they stalled. This was because it was of interest not just which examples were brought to market, but also to keep track of the ideas that materialised in the many hackathons yet never turned into an app or service: things that stalled at some stage between idea and market. An idea that came up in France but found no traction might prove to be the right idea for someone in Lithuania a year later. An app that failed to get to market because it had a one-sided, tech-oriented team might have succeeded with another team, meaning the original idea and application still had intrinsic use value.

Similarly Dopplr did not cease to exist because its intrinsic value as a service was lost, but because everything around it was hollowed out. Hollowed out on purpose, as a consequence of its funding model.

I bet many such now-lost valuable services could lead a healthy life if not tied to the ‘exit-or-bust’ cycle. If they can be big enough, in the words of Lee Lefever; if they can be a zebra, not aiming to become a unicorn.

So, what are the actual impediments to bringing a service like Dopplr back? IP? If you tried to replicate it, perhaps, or if you used technology originally created for the service you’re emulating. But not the ideas, which aren’t protected. In the case of Dopplr there seems to have been an attempt at resurrection in 2018 (but it looked like a copy, not a redo of the underlying idea).

Of course you would have to rethink such a service redo for a changed world, with new realities concerning platforms and commonly used hardware. But are there actual barriers preventing you from repeating something or creating variations?

Or is it that we silently assume that if a single thing has failed at some point, there’s no point in trying something similar in new circumstances? Or that there can only ever be one of something?



Repetitions and Variations, a beautiful Matisse exhibit we saw in 2012 in the Danish national art gallery in Copenhagen. Image by Ton Zijlstra, license CC BY-NC-SA


12 stages, 1 painting. I’m thinking the reverse, 1 sketch, 12 paintings. Image by Ton Zijlstra, license CC BY-NC-SA


Normandy Cliff with fish, times 3. Matisse ‘Repetitions and Variations’ exhibit. Image by Ton Zijlstra, license CC BY-NC-SA

First in Peter’s favourites from his feedreader, then in Matt Webb’s feed directly, both of which showed up right beneath each other when I opened my feedreader this morning, I read Personal Software vs Factory Produced Software.

In that posting Matt points to Rev Dan Catt’s recent week notes, in which he describes the types of tools he makes for himself. Like Matt I love this kind of stuff. I have some small tools for myself like that, and it is the primary reason I have been running a local webserver on my laptop: it allows me to do anything I could do online right on my laptop, as home cooking. Transposing code snippets into safe HTML output, for instance, or converting bank statements into something I can import into my accounting spreadsheet. Those are, however, of a somewhat mechanical nature. They’re by me, but not about me. And that is the qualitative difference of the letter/card-tracking tool described in Rev Dan Catt’s post.
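The snippet-to-HTML tool mentioned above is about as small as home cooking gets; a minimal sketch of that kind of single-purpose utility might be nothing more than an escape-and-wrap function (the function name here is mine, not from the actual tool):

```python
# Tiny "home cooking" utility: make a code snippet safe to paste into a
# blog post by escaping HTML-significant characters.
import html


def snippet_to_html(code: str) -> str:
    """Escape a code snippet and wrap it for display in a web page."""
    return "<pre><code>" + html.escape(code) + "</code></pre>"


print(snippet_to_html('if x < 10: print("small")'))
```

The value of such tools is less in their sophistication than in their fit: they do exactly one thing you personally need, on your own machine.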

That is more akin to what I have been trying to slowly build for myself since forever: something that closely follows my own routines and processes, and guides me along. Not just as a reference, like my notes or wiki, or as a guide, like my todo-lists and weekly overviews, but something that welcomes me in the morning by starting me on my morning routine (“Shall I read some feeds first, or shall I start with a brief review of today’s agenda?”) and nudges me kindly (“it’s been 15 mins, shall I continue with …?”, or “shall I review …, before it becomes urgent next week?”). A coach and PA rolled into one, that is basically me, scripted, I suppose. I’ve always been an avid note taker and list keeper, even way before I started using computers in 1983. Those lists weren’t always very kind, I realised in 2016: they became more a musts/shoulds thing than mights/coulds. Too harsh on myself, which reduces their effectiveness (not just to zero at times, but to an active hindrance causing ineffectiveness). I wanted a kinder thing, a personal operating system of sorts. Rev Dan Catt’s correspondence tool feels like that. It reminds me of what Rick Klau described earlier about his contacts ‘management’, although that stays closer to the mechanical, the less personal I feel, and skirts closer to the point where it feels impersonal (or rather, it challenges the assumption ‘if you don’t know it yourself and keep a list, it’s not authentic’ more).

Building personalised tools that are synchronised with the personality and routines of the person using them, not as an add-on (“you can add your own filter rules to our e-mail client!”) but as their core design, is mostly unexplored terrain I think. From a business perspective it doesn’t obviously “scale”, so no unicorn potential. That sort of generic scaling is unneeded anyway, I think, and there is another, readily available path to scaling: through the invisible hand of networks, where solutions and examples are replicated and tweaked across contexts, people and groups. That way lie the tools that are smaller than us, and that therefore really provide agency.

It’s also why I think the title of Matt’s post, Personal Software versus Factory Produced Software, presents a false dilemma. It’s not just a choice between personal and mass, between n=1 and statistics. There is a level in between, which is also where the complexity lives that makes us search for new tools in the first place: the level of you and your immediate context of relationships and the things relevant to them. It’s the place where the thinking behind IndieWeb extends to all technology and methods. It’s where federation of tools lives, and why I think you should run personal instances of tools that federate, rather than join someone else’s server, unless it is a pre-existing group launching a server and adopting it as their collective hang-out. Running personal or group tools that can talk to others if you want them to, and that are potentially more valuable when connected to others: tools that have the network effect built in as an option.