As I’ve run into trouble with my TinyTinyRSS install, I’m switching to FreshRSS, to see how that works for me.

My TinyTinyRSS has the issue where many calls to the file backend.php keep timing out. It seems to have as effect that feed updates are not coming through, and worse that the repeated resource use flagged something with my hoster, making them blocking my home IP. That’s a blunt instrument to wield without checking whether that IP is your client’s own IP, but still.

Next to TinyTinyRSS my hoster also supports FreshRSS for self-hosting, so I installed that. I wanted to try FreshRSS out anyway, so this is a good opportunity to make the switch.

Writing it down may help in getting out of the loop…

I’m continuing my tinkering with federated bookshelves, for which I made an OPML based way of publishing both lists of books, as well as point to other people’s lists and to RSS feeds of content about books. I now changed my XSL style sheet to parse my OPML files to be able to also parse mentions of RSS feeds.

Meanwhile I read Matt Webb’s posting on using RSS (and OPML) a few more times, and I keep thinking, “yes, but where do you leave the actual data?”
Then I read Stephen Downes’ recent posting on distributing reading material and entire books for courses through RSS, and realised it gave me the same sense of not sounding quite right, like Matt’s posting. That feeling probably means I’m not fully understanding their argument.

RSS is a by design simple XML format as a way to syndicate web content, including videos and podcasts. Content is an important word here, as is syndication: if you have something where new material gets added regularly, an RSS feed is a good way to push it out to those interested.
OPML is another by design simple XML format as a way to share outlines. Outlines are content themselves, and outlines can contain links to other content (including further outlines). One of the common uses of OPML is to share a list of RSS feeds through it, ‘these are the blogs I follow’.

In Matt’s and Stephen’s posts I think there are examples that fail to satisfy either the content part of RSS, or the syndication of new content part. In Matt’s case he talks about feeds of postings about books, like my book category in this site, which is fine, but also in terms of lists of books, which is where I struggle: a list doesn’t necessarily list pieces of content, let alone pieces of web content which RSS seems to require. It more likely is just a list. At the same time he mentions OPML as ‘library’, to use to point to such lists of books. Why would you use OPML for the list of lists, but not for the lists themselves, when those book lists themselves have no content per book, only a number of data attributes which aren’t the content items but only descriptions of items? And when the whole point of OPML outlines is branching lists? When a library isn’t any different from a list, other than maybe in size? Again it is different for actual postings about books, but you can already subscribe to those feeds as existing rivers of content, and point to those feeds (in the same OPML, as I do in my experimental set-up now as well).
In Stephen’s posting he talks about providing the content of educational resources through RSS. He suggests it for the distribution of complete books, and for course material. I do like the idea of providing the material for a course as a ‘blob’. We’re talking about static material here, a book is a finished artefact. Where then is the point in syndication through RSS (other than maybe if the book is a PDF or EPUB or something that might be an enclosure in a RSS feed)? Why not provide the material from its original web source, with its original (semantic) mark-up? Is it in any way likely that such content is going to be read in the same tool the RSS feed is loaded into? And what is the ‘change’ the RSS feed is supposed to convey here, when it’s a one-off distribution and no further change beyond that moment of distribution is expected?

OPML outlines can have additions and deletions, though at a slower pace than e.g. blogs. You could have an RSS feed for additions to an OPML outline (although OPML isn’t web content). But you could also monitor OPML outlines themselves for changes (both additions and deletions) over time. Or reload and use the current version as is, without caring about the specific changes in them.

The plus side of OPML and RSS is that there are many different pieces of code around that can deal with these formats. But most won’t be able to deal as-is with adding data attributes that we need to describe books as data, but aren’t part of the few basic mandatory attributes RSS and OPML are expected to contain. Both RSS and OPML do allow for the extension of attributes, if you follow existing name spaces, such as e.g. schema.org’s for creative works, which seems applicable here (both for collections of books, i.e. a shelf or a library, as well as books themselves). If the use of RSS (and OPML for lists of RSS files) is suggested because there’s an existing eco-system, but we need to change it in a way that ensures the existing ecosystem won’t be able to use it, then where’s the benefit of doing so? To be able to build readers and to build OPML/RSS creators, it is useful to be able to re-use existing bits and pieces of code. But is that really different from creating ones own XML spec? At what point are our adaptations to overcome the purposeful simplicity of OPML and RSS destroying the ease of use we hope to gain from using that simplicity?

Another thing that I keep thinking about is that book lists (shelves, libraries) and book data, basically anything other than web published reviews of books, don’t necessarily get created or live on the web. I can see how I could easily use my website to create OPML and RSS feeds for a variety book lists. But it would require me to have those books and lists as content in my website first, which isn’t a given. Keeping reading lists, and writing reading notes, are part of my personal knowledge management workflow, and it all lives in markdown textfiles on my local harddrive. I have a database of e-books I own, which is in Calibre. I have an old database of book descriptions/data of physical books I owned and did away with in 2012, which is in Delicious Library. None of that lives on the web, or online in any form. If I am going to consistently share bookshelves/lists, then I probably need to create them from where I use that information already. I think Calibre has the ability to work with OPML, and has an API I could use to create lists.
Putting that stuff first into my website in order to generate one some or all of XML/OPML/RSS/JSON from it there, is work and friction I don’t want. If it is possible to automatically put it in my website from my own local notes and databases, that is fine, but then it is just as possible to automatically create all the XML/OPML/RSS/JSON stuff directly from those local notes and databases as well. Even if I would use my website to generate sharable bookshelves, I wouldn’t work with other people’s lists there.

I also think that it is very unlikely that a ‘standard’ emerges. There will always be differences in how people share data about books, because of the different things they care about when it comes to reading and books. Having namespaces like schema.org is useful of course, but I don’t expect everyone will use them. And even if a standard emerges, I am certain there will be many different interpretations thereof in practice. It is key to me that discoverability, of both people sharing book lists and of new to me books, exists regardless. That is why I think, in order to read/consume other people’s lists, other than through the human readable versions in a browser/reader, and to tie them into my information filtering and internal tools/processes, I likely need to have a way to flexibly map someone else’s shared list to what I internally use.

I’m not sure where that leaves me. I think somewhere along these lines:

  • Discovery, of books and people reading them, is my core aim for federation
  • OPML seems useful for lists (of lists)
  • RSS seems useful for content about books
  • Both depend on using specific book related data attributes which will have limited standardisation, even if they follow existing namespaces. It is impossible to depend on or assume standardisation, something more flexible is needed
  • My current OPML lists points to other lists by me and others, and to RSS feeds by me and others
  • I’m willing to generate OPML, RSS and JSON versions of the same lists and content if useful for others, other than templating there’s no key difference after all
  • Probably my website is not the core element in creating or maintaining lists. It is for publishing things about books.
  • I’m interested in other people’s RSS feeds about books, and will share my list of feeds I follow as OPML
  • I need to figure out ways to create OPM/RSS/JSON etc directly from where that information now lives in my workflow and toolset
  • I need to figure out ways to incorporate what others share with me into my workflow and toolset. Whatever is shared through RSS already fits existing information strategies.
  • For a limited number of sources shared with me by others, it might make sense to create mappings of their content to my own content structures, so I can import/integrate them more fully.

Related postings:
Federated Bookshelves (April 2020)
Federated Bookshelves Revisited (April 2021)
Federated Bookshelves Proof of Concept (May 2021)
Booklist OPML Data Structure (May 2021)

After I built a proof of concept of using OPML to share and federate book lists yesterday (UPDATE: description of the data structure for booklists), Tom Chritchlow asked me about subscribing to OPML lists in the comments. I also reread Matt Webb’s earlier posting about using OPML and RSS for book lists.
That results in a few remarks and questions I’d like to make and ask:

  • OPML serves 2 purposes
    1. In the words of Dave Winer, opml’s creator, OPML is meant as a “transparently simple, self-documenting, extensible and human readable format that’s capable of representing a wide variety of data that’s easily browsed and edited” to create and manipulate outlines, i.e. content structured hiearchically / tree-like.
    2. the format is a way to exchange such outlines between outliner tools.
  • In other words OPML is great for making (nested) lists, and for exchanging them. I use outlines to build my talks and presentations. It could be shopping lists like in Doug Engelbart’s 1968 ‘mother of all demos’. And indeed it can be lists of books.
  • A list I regard as an artefact in itself. A list of something is not just iterating the somethings mentioned, the list itself has a purpose and meaning for its creator. It’s a result of some creative act, e.g. curation, planning, writing, or desk research.
  • A book list I regard as a library, of any size. The list can be as short as the stack on my night reading table is high, as long as a book shelf in my home is wide, or as enormous as the full catalogue of the Royal Library. Judging by Tom Critchlow’s name for his booklist data ‘library.json‘ he sees that similarly.
  • A book list, as I wrote in my posting about the proof of concept, can have books in them, and other book lists by myself or others. That is where the potential for federation lies. I can from a book point to Tom’s list as the source of inspiration. I could include one of Tom’s booklists into my own booklists.
  • A list of books is different from a group of individual postings about books as also e.g. presented on my blog’s reading category page. I blog about books I read, but not always. In fact I haven’t written any postings at all this year, but have read 25 books or so since January 1st. It is easier to keep a list of books, than to write postings about each of the books listed. This distinction is expressed too in Tom Macwright’s set-up. There’s a list of books he’s read, which points to pages with a posting about an entry in that list, but the list is useful without those postings.
  • The difference between booklists as artefacts and groups of postings about books that may also be listed has impact on what it means to ‘subscribe’ to them.
    • A book list, though it can change over time, is a steady artefact. Books may get added or removed just like in a library, but those changes are an expression of the will of its maker, not a direct function of time.
    • My list of blogsposts about books, in contrast is fully determined by time: new entries get added on top, older ones drop off the list because the list has a fixed length.
    • OPML is very suited for my lists as artefacts
    • RSS is very suited for lists as expression of time, providing the x most recent posts
    • Subscribing to RSS feeds is widely available
    • Subscription is not something that has a definition for OPML (that you can use OPML to list RSS subscriptions may be confusing though)
    • Inclusion however is a concept in OPML: I can add a list as a new branch in another list. If you do that once you only clone a list, and go your own seperate way again. You could also do it dynamically, where you always re-import the other list into your own. Doing it dynamically is a de-facto subscription. For both however, changes in the imported list are non-obvious.
    • If you keep a previously seen copy and compare it to the current one, you could monitor for changes over time in an OPML list (Inoreader did that in 2014 so you could see and subscribe to new RSS feeds in other people’s OPML feed lists, also see Marjolein Hoekstra’s posting on the functionality she created.).
  • I am interested in both book lists, i.e. libraries / bookshelves, the way I am interested in browsing a book case when I visit somebody’s home, and in reading people’s reviews of books in the form of postings. With OPML there is also a middle ground: a book list can for each book include a brief comment, without being a full review or opinion. In the shape of ‘I bought this because….’ this is useful input for social filtering for me.
  • While interested in both those types, libraries, and reviews, I think we need to treat them as completely different things, and separate them out. It is fine to have an OPML list of RSS feeds of reviews, but it’s not the same as having an OPML book list, I think.
  • I started at the top with quoting Dave Winer about OPML being a “simple, self-documenting, extensible and human readable format that’s capable of representing a wide variety of data that’s easily browsed and edited“. That is true, but needs some qualification:
    • While I can indeed add all kinds of data attributes, e.g. using namespaces and standardised vocabularies like schema.org, there’s no guarantee nor expectation that any OPML parser/reader/viewer would do anything with them.
    • This is the primary reason I used an XSL template for my OPML book lists, as it allows me to provide a working parser right along with the data itself. Next to looking at the raw file content itself, you can easily view in a browser what data is contained in it.
    • In fact I haven’t seen any regular outliner tool that does anything with imported OPML files beyond looking at the must have ‘text’ attribute for any outline node. Tinderbox, when importing OPML, does look also at URL attributes and a few specific others.
    • I know of no opml viewer that shows you which attributes are available in an OPML list, let alone one that asks you whether to do something with them or not. Yet exploring the data in an OPML file is a key part of discovery of other people’s lists, of the aim to federate booklists, and for adopting better or more widely shared conventions over time.
    • Are there generic OPML attribute explorers, which let you then configure what to pay attention to? Could you create something like an airtable on the fly from an OPML list?
    • Monitoring changes in OPML list you’re interested in is possible as such, but if OPML book lists you follow have different structures it quickly becomes a lot of work. That’s different from the mentioned Inoreader example because OPML lists of RSS feeds have a predefined expected structure and set of attributes right in the OPML specification.
    • Should it be the default to provide XSL templates with OPML files, so that parsing a list as intended by the creator of the list is built right into the OPML list itself?
    • Should we ‘dumb down’ lists by moving data attributes of an outline node to a sub-node each? You will reduce machine readability in favor of having basic OPML outliners show all information, because there are no machines reading everything yet anayway.

I think for the coming weeks I’ll be on the lookout for sites that have book lists and book posting feeds, to see what commonalities and differences I find.

Two years ago I wrote about the features I’d like to see in an ideal feed reader. Today I’ve tried to sketch out the various components, based on the various IndieWeb protocols, and then match the features I listed previously to the component that I think should provide it.

Separating feed reading into IndieWeb components

Feeds from blogs, to the left in the sketch below, end up in a microsub server, whose function is to fetch and store feeds. A microsub client is what presents those fetched feeds to me. A regular feed reader usually does both these things, but splitting them like the IndieWeb does allows multiple clients to use the same collection of feeds and feed items. It allows a wider diversity of readers, if the fetching is dealt with separately.
The microsub client also contains a micropub client. This allows having action buttons underneath each item presented in your microsub client. Hitting a button posts your reply or remarks, or shares something to your own website, and through your site to e.g. Twitter or Mastodon. Ideally it would be possible to have different websites to share to, next to storing locally in a specified format. The website that receives such an action has a micropub endpoint, and may need authentication through IndieAuth.

Such a reader set-up should be able to run locally on my laptop, as well as on a basic hosting package, so likely php / mysql based. Locally because I want to run my stuff local first, but the same thing online, even if a home server, because I’m not always working on my laptop and would like access from my tablet too, and point my Android Indigenous app to my subscriptions. Locally I would not need authentication from the microsub client to the microsub server, but in all other situations I would.

The sketch above is completely congruent with how I sketched my information gathering and filtering process in 2005:

filter1.jpg

Matching the ideal functionality to IndieWeb components

When I match the features of my ideal feed reader to those various IndieWeb components I think this is what results:

Microsub server needs to be able to :
* use various kinds of feeds (rss, atom, json, h-feed)
* allow folders (so I can arrange feeds on social distance)
* recognize and store tags if feed items have them
* allow me to tag _feeds_, really meaning tagging authors
* keep track of posting frequency, last post seen of feeds
* keep track of tags or predefined topics mentions/frequency
* pull in machine translations by default for certain feeds and store them with the orginal item

Microsub client needs to be able to:
* present the feed items in the server’s folder structure as a long list (the classic feed reader view)
* present views based on patterns in current feed items: what’s hot, what’s unique? Also set against social distance. (Topics discussed in my communities today)
* present views based on feed tags (show me all Germany based blog items of this morning, show me every feed of from EU based coders)
* present views based on feed tags and item tags: show me Germany based blog items talking about topic X.
* show full text search results of all items mentioning a certain topic.
* store full text search queries
* present visually which topics seem to be hot in which community, or where the frequency of mentioning a topic has changed
* provide a search of feeds (not feeds content): do I already have this feed in my list, where’s the feed of author Y?
* pull in a machine translation on request

Micropub client needs to be embedded in the microsub client and should support:
* saving an article as markdown or as html, to disk, to Zotero
* creating a todo from it by amending a textfile,
* bookmarking it, either to my blog or some other target
* sharing something about it to my blog, to Mastodon through my blog
* replying to it, through my blog to Mastodon, Twitter, other blogs
* allow configuring new actions.
* choosing from multiple micropub endpoints

What do you think, should some of these features be provided by other components than they’re currently listed? Are features missing that you’d like to see in your ideal feed reader?

Replied to Really Simple Syndication by Erik VisserErik Visser (publicworks.nl)
Vanaf begin dit jaar ben ik RSS weer aan het afstoffen. ... Netvibes weer eens bekeken en 99% van alle rss feeds werkten niet meer. ... Ik ben blij met mijn hervonden feed (die potentieel het hele internet bestrijkt). En als je nog mooie, bijzondere mensen met een site kent. Ik hoor het graag.

Naast RSS heb je, las ik bij Frank, ook Webmention aan dus schrijf ik een reactie vanaf mijn eigen blog. Ik heb 3 jaar geleden toen ik weer vaker ging bloggen en FB achter me liet, mijn feedreader leeg gegooid en opnieuw langzaam gevuld. Ik lees wat mensen schrijven (geen ‘bronnen’) en orden feeds op basis van mijn gevoelde sociale afstand tot hen. De aloude blogrol heb ik ook weer terug, nu in de vorm van een mens- en machineleesbare OPML file, die je in de meeste rss readers kunt importeren om te zien of er schrijvende mensen tussen zitten van je gading.