I extended the capabilities of my microsub feed reader with the option to save web articles directly from the reader to my Obsidian notes in markdown format.

Until now if I wanted to save an entire article I found in my feed reader, I would open it in the browser and then use the markdownclipper browser add-on to add some context and then save the article in markdown in my notes. I wanted to cut out that step of opening it in the feed reader, by saving it directly to my markdown notes. In my feedreader I already have a response form to e.g. post a reply to a posting on my own site. Posting it to my notes means adding a path to how I process that form.

I had to find a suitable script for converting HTML to MarkDown first. Which I found in PHP League’s HTML-to-Markdown, as suggested by Jan Boddez. It requires Composer which I already had installed on my laptop.

I tweaked my feed reader’s response form to also (as a hidden field) include the original HTML of a posting (using htmlentities to stuff it into a form field value). The script that processes the form I altered to both have a path for posting to websites (using micropub) and a new path to make a note in Obsidian, which is then saved as a .md file to the folder I store all clipped articles in.
To make a note I shape the available input the same way I template clipping things from the browser. At the top is my rationale for clipping something and reference to the source, followed by the original posting after which I add some keywords as tags and again the reference to the source.

In the images below you see the corresponding elements marked both as they appear in the reader as well as the resulting note.

The article as shown in my feed reader:

1: the original HTML content from a feed
2: title of the article (prefilled by my feed reader)
3: name of the author (prefilled by my feed reader)
4: original article’s URL (prefilled by my feed reader)
5: the reason and context why I am saving this to notes (also used to write a reply to a post, or the reason for bookmarking something if it will be posted on my site)
6: a quote I want to highlight
7: keywords that will become tags or categories on my site, and tags in my notes
8: selector for which site to post to (zyl is my blog), or ‘obs’ for making a note in Obsidian

Except for that last one those numbers are marked on the image of the resulting markdown note.

The resulting note in Obsidian:

1: the original HTML content from a feed shown in Markdown as the main body of the note
2: title of the article, both shown as part of the content of the note, as well as the title of the note (where a timestamp is added)
3: name of the author (mentioned with the source both at the top and bottom)
4: original article’s URL (mentioned with the author both at the top and bottom)
5: the reason and context why I am saving this, always at the top as it helps me process the content better
6: a quote I wanted to highlight
7: keywords that have become hashtags

(This posting was also written in my notes and, except for the images, posted directly from Obsidian to my site. Meaning I can both automatically move material into Obsidian, as well as automatically move material out of Obsidian. I quite enjoy the feeling of using that ‘magic’.)

I have a working proof of concept to take individual book notes from Obsidian, turn them into an OPML list of books, and publish them on this webserver. As I had time off these past days I’ve allowed myself to do some code tinkering, resulting in the set-up shown in the image below.


A sketch of my set-up, made in Excalidraw within Obsidian. The blue items now exist, the grey items are still to be done.

The workflow is now as follows:

  • Within Obsidian I have made a template for book notes, which has a number of inline data fields (shaped ‘field:: value’). These fields contain the same attributes that I use in my OPML files, using the data structure I made earlier. It also contains one additional field, the booklist it is part of.
  • When I first create a new book note I use the template and fill out the inline data fields. If it changes status (to read, reading, read) I update the attributes if needed. Next to those data fields it can contain anything else (e.g. my Kindle hihglights and remarks end up in those book notes too.)
  • I can create lists of books in Obsidian using the Dataview plugin, which can find and interpret the inline data fields.
  • I can run a PHP script, on my laptop, that iterates through all the files in the folder that contains my book notes. It reads the inline data fields and turns them into OPML lines with the same attributes. It saves it in the correct OPML file using the booklist field. This means that when I move a book in Obsidian from my “anti-library” to the “currently reading” list and then to the “non-fiction 2021” or “fiction 2021” list by changing that single booklist data field, that will get reflected in the OPML as well. The OPML files are saved in a folder, and both human and machine readable.
  • I have a second PHP script that also runs locally on my laptop, that iterates through the files in the folder. For each of the .opml files it finds that have changed in the past week, it will get the filename and the file content. It then sends those two data fields (and an access code) to a script on my web server as POST form data.
  • The script on my web server accepts POST form data and if the access code is ok, will save the submitted file content using the submitted file name. After that the OPML files on my webserver have the same content as my Dataview overviews within Obsidian, and are fully based on the inline data fields in my individual book notes.

I’ve tested this flow and it works correctly. There’s one important improvement to still make. It currently goes through all my book notes and creates all opml files anew. I want to change that to start from the recently changed book notes and then generate the corresponding opml files. For now it is fast enough locally to not be an issue though that it iterates through the entire folder of book notes.
A second step to take is an addition: to render the same information as JSON files. Dave Winer’s OPMLpackage is likely useful here. Early on there was some discussion on which format to use, and I don’t see a need to choose. I’ve created it using my preference, but the same information can be formatted differently in parallel if it aids usage and federation.
To fully automate this, I still need to set a cron job that calls the first and second local script in turn, every other week or so.

Now that it all works, I will need to see how it goes in practice when I pick a new book, or finish one.
I also need to clean up the code (removing the tests I added in various steps) and translate some of the comments in English. Then I can share I’ve shared the scripts on GitHub, so others can use it for inspiration.

Future steps may include generating book postings in my blog here, directly based on Obsidian notes as well.