Dutch Provinces publish open data, but it always looks like it is mostly geo-data and hardly anything else. When talking to provinces I also get the feeling they struggle to think of data that isn’t of a geographic nature. That isn’t very surprising: a lot of the public tasks provinces carry out have to do with spatial planning, nature and environment, and geographic data is a key tool for them. But now that we are aiding several provinces with extending their data provision, I wanted to find out in more detail.

My colleague Niene took the API of the Dutch national open data portal for a spin, and made a list of all datasets listed as stemming from a province.
I took that list and zoomed in on various aspects.
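For those wanting to replicate this: the harvesting itself is simple. Below is a minimal sketch in Python of such a query, assuming the portal exposes the standard CKAN search API and that provinces appear as organisations there — the endpoint path and the organisation slugs are assumptions to verify against the portal’s documentation.

import requests

CKAN_SEARCH = "https://data.overheid.nl/data/api/3/action/package_search"

def datasets_for(organization, page_size=100):
    """Yield every dataset record published by one organisation."""
    start = 0
    while True:
        resp = requests.get(CKAN_SEARCH, params={
            "fq": "organization:" + organization,  # hypothetical org slug
            "rows": page_size,
            "start": start,
        })
        resp.raise_for_status()
        result = resp.json()["result"]
        yield from result["results"]
        start += page_size
        if start >= result["count"]:
            break

# e.g. count datasets per province (slugs are illustrative)
for org in ("provincie-drenthe", "provincie-zuid-holland"):
    print(org, sum(1 for _ in datasets_for(org)))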

At first glance there are strong differences between the provinces: some publish a lot, others hardly anything. The Province of Utrecht publishes everything twice to the national data portal, once through the national geo-register and once through their own dataplatform; the graph below has been corrected for this.

What explains those differences? And what is the nature of the published datasets?

Geo-data is dominant
First I made a distinction between data that stems from the national geo-register (NGR), to which all provinces publish, and data that stems from another source (either regional dataplatforms, or for instance direct publication through the national open data portal). The NGR is in theory the place where all provinces share geo-data with other government entities, part of which is then marked as publicly available. In practice the numbers suggest provinces publish to the NGR in roughly the same proportions as the graph above shows (meaning that of what they publish in the NGR, they mark about the same percentage as open data).

  • Of the over 3000 datasets that are published by provinces as open data in the national open data portal, only 48 don’t come from the national geo-register. This is about 1.5%.
  • Of the 12 provinces, 4 do not publish anything outside the NGR: Noord-Brabant, Zeeland, Flevoland, Overijssel.

Drenthe stands out in terms of the number of geo-datasets published: over 900. A closer look at their list shows that they publish more historic data, and that they seem to be more complete (apparently more of what they share in the NGR is marked for open data). The average is between 200 and 300, with provinces like Zuid-Holland, Noord-Holland, Gelderland, Utrecht, Groningen, and Fryslan in that range. Overijssel, like Drenthe, publishes more, though less than Drenthe at about 500. This seems to be the result of a direct connection between their regional geo-portal and the NGR, and thus publishing by default. Overijssel deliberately does not publish historic data, explaining some of the difference with Drenthe. (When something is updated in Overijssel the previous version is automatically removed. This clashes with open data good practice, but is currently hard to fix in their processes.)

If it isn’t geo, it hardly exists
Of the mere 48 data sets outside the NGR, just 22 (46%) are not geo-related. Overall this means that less than 1% of all open data provinces publish is not geo-data.
Of those 22, exactly half are published by Zuid-Holland alone. They for instance publish several photo-archives, a subsidy register, politician’s expenses, and formal decisions.
Fryslan is the only province publishing an inventory of their data holdings, which is one of their only three non-geo datasets.
Gelderland stands out as the single province that publishes all their geo-data through the NGR, hinting at a neatly organised process. Their non-NGR open data is also all non-geo (as it should be). They publish 27% of all open non-geo data by provinces, and together with Zuid-Holland account for 77% of it.

Taking these numbers and comparing them to inventories like the one Fryslan publishes (which we made for them in 2016), and the one for Noord-Holland (which we did in 2013), the dominance of geo-data is not surprising in itself. Roughly 80% of the data provinces hold is geo-related. Just about a fifth to a quarter of this geo-data (15%-20% of the total) is on average published at the moment, yet it makes up over 99% of all provincial open data published. This lopsidedness means that hardly anything on the inner workings of a province, the effectiveness of policy implementation etc. is available as open data.

Where the opportunities are
To improve both the volume and the breadth of scope of the data provinces publish, two courses of action are open.
First, extend the availability of the geo-data provinces hold. Most provinces will have a clear process for this, so it should be relatively easy to do, and possible for most to get to where Drenthe currently is.
Second, take a much closer look at the in-house data that is not geo-related. About 20% of data holdings fall into this category, and based on the inventories we did, some 90% of that should be publishable, maybe after some aggregation or other adaptations.
The lack of an inventory is an obstacle here, but existing inventories should at least be able to point the other provinces in the right direction.

Make the provision of provincial open geo-data complete: embrace its dominance and automate it with proper data governance. Focus your energy on publishing ‘the rest’, where all the data on the inner workings of the province is. Provinces perpetually complain that nobody is aware of what they are doing and of their role in Dutch governance. Make it visible, publish your data. Stop making yourself invisible behind a stack of maps.

We’re just over 3 weeks away from our 31 August event, the Smart Stuff That Matters unconference (#stm18).

With our summer hiatus nearing its end, I built a (still growing) list of currently registered participants. It’s a very nice mix of different backgrounds, ages, origins and interests. Some have been to our very first birthday unconference 10 years ago, others only recently became part of our personal or professional networks. Some live almost next door, some live half a world away. All have interesting stories to share, so if you haven’t registered yet but would like to come, do let me know, and bring your curiosity.

We’re now at some 30 people attending, from half a dozen countries or so. Likely we’ll end up closer to 50 participants for the unconference. Check out the list, and click some links to get a feeling for who’s coming.

Some of the participants in the 2014 event. Photo: Paolo Valdemarin, CC-BY-NC-SA

On August 31st Elmine and I host the 4th Birthday Unconference and BBQ Party at our home in Amersfoort. The unconference is titled “Smart Stuff That Matters”.

So what is Smart, and what Matters?

A year ago we moved to Amersfoort. A different house, a different neighbourhood, a different city. The city where our daughter will grow up.

A new environment means lots of exploration. What makes a house a home? How can you smartly adapt your house to your needs? Who lives in the neighbourhood, how do you settle in it? What makes a city your city? Which existing initiatives appeal to you, and in what ways can you contribute to them?
Whether it’s a new habit, a new device in your home, your contacts and networks, or your approach: what are smart ways to act and contribute to your residence and environment so it supports you and the others in it? In the context of much wider developments and global issues, that is, both social and technological, at home, in your neighbourhood, in your city. It’s important to approach things in ways that create meaning and enable the important things, both for you and others. Smart Stuff That Matters, therefore.

Our house, in the middle of our street

A full day long we’ll explore ‘smart’ in all its facets.
Smart homes (and around the home), smart neighbourhoods, smart cities.
Socially: how do we learn, communicate, organise and share? How do we act, how do we contribute? How do we find the power of collaborative agency?
And technologically: which technologies help us, which only pretend to do so, and are these technologies sufficiently ours?
We will have the Frysklab Team joining us again with their mobile FabLab, and have plenty of space to experiment with technology that way. Such as sensors, internet of things and programming. Or to build non-digital hacks for around the home.

Frysklab’s truck parked at our old home in Enschede during the previous unconference

Together we’ll explore what smart means to you and us.
Bring your small and big experiences and skills, but above all bring your curiosity, and let yourself be surprised with what the others bring.
Do you have ideas about what you’d like to show, discuss, present or do?
Have ideas about what you would like to hear from others about? Let us know! We’ll build the program together!

You’ll find all relevant information about the unconference on this site. You’re also welcome to join our Facebook group for the event.

This reads like a design approach for institutions, for what I call Networked Agency:

This is not the book to convince you that the world is changing and our systems are currently under stress. The purpose here is to begin codifying the practises of innovators who are consciously rethinking institutions to better meet the challenges of today. We describe this as stewardship: the art of getting things done amidst a complex and dynamic context. Stewardship is a core ability for agents of change when many minds are involved in conceiving a course of action, and many hands in accomplishing it.

The Helsinki Design Lab (HDL) wrote this already in 2013, a certain addition to my summer reading list: Legible Practices.
The HDL was in operation from 2008 to 2013, and maintains its archive online under a Creative Commons license (BY-SA). There’s more there to read, on using projects as probes, on hiring, and on how openness isn’t enough to scale.


Image: Helsinki Design Lab, CC-BY-SA

Peter, like me getting to grips with Webmention, has now used it to retroactively send a webmention for every link between his own old postings. So now his comment database holds a full list of all the links between his own postings.

He says “I wish I had a way of visualizing the interconnections between my posts”.
This type of thing is of interest to me too, in several forms. Like using a network mapping tool such as NodeXL by Marc Smith/The Social Media Research Foundation for e.g. Twitter topics. Like having ‘live’ network mappings of how distributed conversations I am part of are shaped, such as the images I recently showed of blog conversations, but then interactively. Like visualising the links between posts as Peter went on to do.


Visualisation of blog conversations (a grey box is a cluster of posts referencing each other)


Peter’s visual of links between blogposts


Anjo Anjewierden’s 2007 visual of Lilia’s blog‘s self references on a time axis

For these types of visualisation the researcher Anjo Anjewierden did some interesting work in 2003-2008, such as building those network maps around my blog. He also looked at visualising self-referencing in blogs. There’s just one dimension there, time, he says. I disagree: linking to oneself is just as much a distributed conversation as linking between others, and Peter’s experimental visualisation above supports that thought. So I’d be interested to see a network map of self-references: which blogposts over time turn out to be more central to our writing/thinking/reflection? Much like citations are a metric in academia, they are of interest in the blogosphere as well. Anjo also released several tools as open source if I remember correctly, so some archive digging is needed.
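Such a self-reference map is not hard to sketch, by the way. A rough Python outline of what I have in mind, assuming you can list your own post permalinks (uses requests, BeautifulSoup and networkx):

import networkx as nx
import requests
from bs4 import BeautifulSoup

def self_reference_graph(post_urls):
    """Directed graph of the links between one's own posts."""
    own = set(post_urls)
    graph = nx.DiGraph()
    graph.add_nodes_from(own)
    for url in post_urls:
        soup = BeautifulSoup(requests.get(url).text, "html.parser")
        for a in soup.find_all("a", href=True):
            if a["href"] in own and a["href"] != url:
                graph.add_edge(url, a["href"])
    return graph

posts = ["https://example.org/post-a", "https://example.org/post-b"]  # your permalinks
g = self_reference_graph(posts)
# posts with the most incoming internal links = most central to one's thinking
print(sorted(g.in_degree, key=lambda kv: kv[1], reverse=True)[:10])

The in-degree ranking at the end is exactly the citation-count analogue mentioned above.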

To do what Peter did, retroactively making all the links between my own blogpostings visible, I would first also need to fix the older links. Those older links are structured differently than more recent ones and now return 404s. The corresponding postings still exist but have a different URL now.

I very much appreciate how Sven Knebel responded extensively to my previous posting on some Webmention issues I came across. Some of his responses raise new questions for me.

About the wrong URL, i.e. not the source of the webmention, showing up in a Webmention, Sven writes:

…. There’s a href=”https://news.indieweb.org/nl” class=”u-syndication” as the only top-level link inside his post, and no explicit url property set. This causes the microformats parser to assume that this link points to the canonical location of the post, and it is thus used for comment display. This seems like a problem with the microformats specification, and I’ll follow up on it there, but for now the easy fix would be for Frank’s posts to mark up their permalink, e.g. by adding a class=”u-url” to the link on the headline.

To me this reads as a vulnerability. I would expect my site to always take the source from the webmention message as the URL. That is the only one that has been checked from my end for the presence of a reference to my site (the target). If the source page is allowed to set a different URL, even by mistake as here, that feels extremely counterintuitive: it opens things up to spam. In this case the faulty link is to a benign site, but it could have been pills or malware. It is also strange to me that my server correctly stores the source URL in the comments table of the database, but in the metadata table stores a URL at the discretion of the source’s website. (Meanwhile Frank has fixed it for now on his end, as demonstrated by his webmention to my previous post, but my point remains.)
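To see what a microformats parser actually derives from a source page, and thus where a wrong URL can sneak in, here is a small debugging sketch using the mf2py parser library (an illustration, not the plugins’ actual code):

import mf2py

def declared_post_url(source_url):
    """Return the u-url the page's h-entry declares, if any."""
    parsed = mf2py.parse(url=source_url)
    for item in parsed["items"]:
        if "h-entry" in item["type"]:
            return item["properties"].get("url", [None])[0]
    return None

source = "http://example.org/some-post"  # the verified source from the webmention
declared = declared_post_url(source)
if declared and declared != source:
    print("page declares", declared, "but the verified source is", source)

A defensive receiver could run exactly this comparison, and prefer the verified source whenever the two differ.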

About no content of the linking blogpost being shown underneath mine, Sven says:

This is intentional. Frank’s post only mentions your post (=includes a link to it), it is not marked up as an explicit reply. Only replies are shown with content, since for mentions this is often misleading.

This to me doesn’t make a lot of sense. [Update: and for my site at least it isn’t true either; I linked back as an explicit reply to my own posting, but it still shows as a mention.]
There is indeed a difference between a direct reply to something (@Frank….) and mentioning that something as part of something else (As Frank says….). Yet that doesn’t warrant a difference in presentation where a reply is shown with its content, but a mention with just the address of the site. It also gives the source control over how something is shown on my site (by setting a different microformat for a link), while I do not have that control.
From the perspective of the reader of my blog, it is not enough to see only that ‘some site links to this blogpost’ and have to click the link to find out whether it might be of interest; it is tremendously helpful to see a piece of the referring page, to determine the context in which it refers to my blogpost.

Most if not all of my mentions of others’ blogposts aren’t meant as a direct response but as building or continuing on a line of reasoning, riffing off other people’s ideas. This is the way distributed conversations take place, how ambient humanity is established. Distributed conversations are a fundamental part of blogging to me. It’s not back and forth replies, it’s a jam session. To enjoy the jam session, you need to see the whole band at a glance, not just a list of the line-up while listening to a sole musician. Discoverability and serendipity flow from it.
It used to be that trackbacks did precisely that: show the context in which someone else referred to my blogposts. Showing that context underneath my own posts enriches them. See below how that looked a long time ago, in a post on information strategies from 2005.

Three trackbacks on an old post of mine, showing context of the linking blogpost



These three posts are not in response to me, but reflections triggered by my posts and extensions of my contribution

So I’d definitely want to show that context for webmentions. What strikes me as odd is how little control I have over how the Webmention and Semantic Linkbacks plugins actually deal with webmention data. The stuff I’d like to show is stored in my database, but through the plugins I can’t determine how it is shown.
The same is true on the flipside: my site adds microformats so others can machine-read my blog, but apparently it doesn’t do that right. Yet I have no control through the plugins’ interfaces over how that is done, nor do I have documentation or insight into how the plugins are designed to comply with the microformats specifications. So the next step is: read up on the microformats specifications, and dive into the code of the plugins to see where they do what, and whether I can change that in ways that won’t simply be overwritten by the first update of WordPress or the plugins. [UPDATE: I installed a different WordPress theme, called SemPress, as it should be better at adding the correct microformats for this site]

Webmention is what makes it possible for me to write here about someone else’s blogpost and have my response show up beneath theirs, and vice versa. Earlier mechanisms such as pingback and trackback did the same thing, but slipped under the radar or succumbed to spam. Webmention is a W3C recommendation.

The webmention itself is simple
The core of webmention is straightforward: if I write something here, my webserver will try to let every site I link to in my text know that I link to them. It does this by checking whether the sites I link to have an ‘endpoint’, an antenna basically, for webmentions. If a site does, it will send a simple message to that antenna stating two web addresses: the source (here, my blogpost) and the target (here, your blogpost). When your site receives a webmention it will do some checking: does my source blogpost indeed link to your target address?
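That whole exchange fits in a screenful of code. A minimal sender sketch in Python, following the discovery steps in the W3C recommendation (a robust sender would also handle redirects and more edge cases):

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def discover_endpoint(target):
    """Find the target site's webmention 'antenna', if it has one."""
    resp = requests.get(target)
    link = resp.links.get("webmention")        # 1) HTTP Link header
    if link:
        return urljoin(target, link["url"])
    soup = BeautifulSoup(resp.text, "html.parser")
    el = soup.find(["link", "a"], rel="webmention", href=True)
    return urljoin(target, el["href"]) if el else None  # 2) link/a element

def send_webmention(source, target):
    endpoint = discover_endpoint(target)
    if endpoint:  # the message itself: just two URLs, form-encoded
        return requests.post(endpoint, data={"source": source, "target": target})

send_webmention("https://mysite.example/my-post", "https://yoursite.example/your-post")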

What happens next is less simple
What happens next can quickly get confusing.
When my site receives a webmention (this source x links to your target y), all it knows is just the URL of a page that links to me. What my site displays and how it displays that as a consequence of a webmention message depends on multiple factors:

My server will try to read the source blogpost, to see what machine readable information it contains and what it can learn about it. These machine readable parts are in the form of microformats.
My server will store some of the information it finds.
Then my website template will show some of what the server stored, underneath the target blogpost on my site.
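Before the first of those steps, my server should verify the claim in the webmention message itself. A sketch of that receiving-side check:

import requests
from bs4 import BeautifulSoup

def verify_webmention(source, target):
    """Does the source page really link to the target?"""
    resp = requests.get(source)
    if not resp.ok:
        return False
    soup = BeautifulSoup(resp.text, "html.parser")
    return any(a["href"] == target for a in soup.find_all("a", href=True))

Only when this check passes should anything about the mention be stored or shown.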

How well that works depends on multiple factors therefore:

  1. The available machine readable info in the source blogpost, and whether that info is properly encoded
  2. The settings of my server for what it stores
  3. The settings of my site template for what it shows

When something seems to be going wrong, it could be a problem with your site, my site template or my server settings, and it is never obvious which one it is, or whether it is the aggregation of multiple issues. Whether you can repair or change things when webmentions are not properly dealt with also depends on how easy it is to alter any of those settings. Supposedly the Webmention and Semantic Linkbacks plugins I use should take care of those issues, but it is not obvious that they indeed do.

An example: Frank’s site and mine webmentioning each other
Frank Meeuwsen and I have been mentioning each other several times and we’ve seen some strange webmention behaviours. For instance, in one case Frank’s blog displayed not just a short part of my posting mentioning him, but my entire page including header, footer and sidebar. Clearly something wrong, likely with some of my machine readable encoding, but maybe also something wrong on his end. I suspect my machine readable encoding is indeed faulty, but there’s no clear way in which I can change how my webmention plugins deal with that. And if I alter the code, which I could, the next software update will likely simply overwrite it.

Yesterday Frank posted in Dutch about the puzzle webmention is to him. Here are some screenshots of how pieces of that puzzle look on my end of things.

Frank’s posting lives at http://diggingthedigital.com//Waar-te-beginnen-met-Webmentions/ and in it he refers to a posting on my site. He did not send a webmention, but I can do that myself, using a simple form at the bottom of my posting (visible at the bottom of this page too). In that webform I pasted the URL of Frank’s posting as the source, which sends the simple webmention message. That message has been received and stored on my server, with the correct source and target address and a timestamp:

What ended up underneath my posting is:

Or as it looks for me as the site’s owner:

A few things stand out:

  • There’s no link to the actual blogpost by Frank (the source), just to his general domain
  • There’s a link to news.indieweb.org, which is a completely different domain
  • There’s no image of the author or an avatar in absence of an image
  • There isn’t any content from Frank’s post shown as part of the mention

So what’s happening? Is this an issue at Frank’s end, is it an issue with what I store on my server, or what I show in my site template? One, two, all three of them?

Puzzling over the pieces in this example

The missing avatar. My site tries to look for an avatar in the source, and if there isn’t one, it shows a generic one. Here neither happens; it’s just a blank space. The HTML source of my page reveals it does try to show an avatar, the one Frank sets in his own blog page as the one to use. His site’s source code says:

<a href="/" class="site-avatar"><img src="/images/dtd-avatar.png" class="u-photo" /></a>

The microformat u-photo is interpreted correctly by my site, and it tries to show the linked image. When you go to that image in your browser it works, but if you try to embed it in your own page it doesn’t.

Frank’s image should be visible below this line,
Frank's avatar
and above this one, but it isn’t.

Probably Frank’s web server prevents bandwidth theft by sending back a white pixel instead of the requested image.
[UPDATE] The issue, as Sven points out in the comments, is that this site is https and Frank’s is http. My browser is set up to reject http material on an otherwise https site. A case of my browser being my castle.[/UPDATE]
Either way, the avatar fails because my site doesn’t store a local copy of it.
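A sketch of what storing a local copy could look like inside a webmention processor (the path and approach are illustrative, not the plugin’s actual behaviour):

import hashlib, pathlib, requests

AVATAR_DIR = pathlib.Path("wp-content/uploads/avatars")  # hypothetical location

def cache_avatar(remote_url):
    """Fetch the avatar once, then serve it from my own https domain."""
    AVATAR_DIR.mkdir(parents=True, exist_ok=True)
    name = hashlib.sha1(remote_url.encode()).hexdigest() + ".img"
    local = AVATAR_DIR / name
    if not local.exists():
        resp = requests.get(remote_url, timeout=10)
        resp.raise_for_status()
        local.write_bytes(resp.content)
    return "/" + local.as_posix()

That would sidestep both hotlink protection and http-on-https blocking at once.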

The link to news.indieweb.org and the absence of a link to Frank’s actual blogpost. The source (Frank’s blogpost) was sent and received correctly, as we saw. In the machine readable part of Frank’s site a value is set as the ‘canonical’ address for his blogpost.

There is an extra / in that URL, and I’m not sure what that might cause, but on my end the canonical that gets saved is very different: it’s that indieweb address.

The odd bit is that the indieweb.org address is not mentioned in the source of Frank’s page. At the same time, it seems it isn’t unique to my server: underneath a posting about webmentions by Sebastiaan Andeweg you see the same thing happening. Frank’s webmention from May 12th shows the indieweb link (and no avatar), and Sebastiaan doesn’t use WordPress or the plugins I use, as far as I can tell.

So where’s the actual link to Frank’s blogpost? The canonical URL Frank’s post provides is stored on my server, in the database table for comments, as the URL for the author. The indieweb URL however is stored as the canonical URL in the comment metadata table in my database, and that is what gets used for displaying the webmention underneath my blogposting.

The same is true for the absence of the content of Frank’s mention of me. It is collected and stored in the comment table of my site’s database, yet what is shown underneath my blogpost as a mention is constructed only from the comment metadata table, not the comment table.


Frank’s mention’s content is in my comment database, yet not shown


The metadata fields stored for Frank’s mention in my database

So what’s happening here is a mix of elements from Frank’s site, my webmention plugins and my site template. But how to influence the behaviour of my plugins without seeing that undone with the next update is not clear to me at this point. Nor is how to alter the plugins so I can improve the machine readable microformats on my site.

We’re currently helping the Province of South-Holland extend their open data provision. Alongside data they hold relevant to key policy domains, we also look at what other data is available elsewhere for those domains, for instance nationwide datasets with locally granular detail. In such cases it can be of interest to take the subset relevant to the Province and republish it through their own channels.

One of the relevant topics is the energy transition (to sustainable energy sources). Current and historic household usage is of interest here. The companies that maintain the grid publish yearly data per postcode, or at least some of the seven such companies do.
Luckily all three companies active in South-Holland do publish that data.


In South-Holland three companies are active (numbers 3, 5 and 6)
(Source: Energieleveranciers.nl)

Having this subset of data is useful for any organisation in the region that wants to limit the amount of data they have to dig through to get what they need, for the provincial organisation itself, and for individual citizens. Households that have digital meters have access to their daily energy usage readings online; this data allows them to easily compare their personal usage with their neighbours and the wider surrounding area. For instance, I established that our usage of both electricity and gas is lower than the average in our street. The subset is also easier to map, or otherwise visualise, in a meaningful way for the province and relevant regional stakeholders.

Here’s a brief overview of the steps we’re taking to get to a province-wide data set.

  • Download the data for the years available for Westland, Liander and Stedin (Westland goes back to 2010, the others to 2008)
  • Check the data formats: Westland and Stedin provide CSV, Liander XLSX
  • Check data structure: all use the same structure of fields and conventions
  • To get only the data for South-Holland we use the postcode that is mentioned in the data.
  • The Dutch postcode zones do not conform to provincial boundaries however, so we take the list of four position postcodes and determine the ones that fall within South-Holland:
    • 1428-1429
    • 2159-2164
    • 2170-3381
    • 3465-3466
    • 4126-4129
    • 4140-4146
    • 4163-4169
    • 4200-4209
    • 4213
    • 4220-4249
  • The data contains 6-position postcodes of the structure 1234AB. We need to split them into the four digits and the two letters, to be able to match them against the ranges that fall within the province (see the sketch after this list).
  • For personal data protection purposes, where a 6-position postcode contains fewer than 10 addresses, the data is aggregated with a neighbouring postcode until the number of addresses is higher than 9. It is not certain that those aggregations fall within a single province. Each record therefore provides a ‘from’ and a ‘to’ 6-position postcode; these are the same value where a postcode has enough addresses, but can span a wider range.
    • We need to test if the entire postcode range in a single data record falls within one of the ranges of postcodes that belong in South-Holland.
    • For the small number of aggregates that fall into two provinces we can adopt the average usage number, but need to mark that the number of households in that area is unknown,
    • or retrieve the actual number of addresses from the national address and building database, and mark that the average energy usage values are from a larger number of addresses.
    • Alternatively we can keep the entire range, including the part outside the province,
    • or we exclude the entire range and leave a ‘hole in the map’.
    • In any case we need to mark in the data what we did, and why.
  • The result is then a data set in CSV that consolidates the three sources for all those records that fall within the province.
  • This dataset can then be mapped, e.g. in Q-GIS or other tools in use within the province South-Holland.
  • We provide a recipe and/or script from the above steps that can take the future yearly data sets from the three sources and turn them into a consolidated subset for South-Holland, so that the province can automate keeping the data up to date.
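The core of that postcode filtering, sketched in Python (the CSV column names are assumptions about the source files; the ranges are the ones listed above):

import csv

ZH_RANGES = [(1428, 1429), (2159, 2164), (2170, 3381), (3465, 3466),
             (4126, 4129), (4140, 4146), (4163, 4169), (4200, 4209),
             (4213, 4213), (4220, 4249)]

def pc4(pc6):
    """'2611AB' -> 2611: the four-digit part of a 6-position postcode."""
    return int(pc6.strip()[:4])

def in_zuid_holland(pc6):
    return any(lo <= pc4(pc6) <= hi for lo, hi in ZH_RANGES)

def record_in_province(row):
    # both ends of an aggregated range must fall inside the province;
    # records straddling a border get one of the fallbacks described above
    return in_zuid_holland(row["postcode_from"]) and in_zuid_holland(row["postcode_to"])

with open("stedin_2017.csv", newline="") as f:  # hypothetical file name
    subset = [row for row in csv.DictReader(f) if record_in_province(row)]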

When I talk about Networked Agency, I talk about reducing the barrier to entry for all kinds of technology as well as working methods, that we know work well in a fully networked situation. Reducing those barriers allows others to adopt these tools more easily and find power in refound ability to act. Networked agency needs tech and methods that can be easily deployed by groups, and that work even better when federated across groups and the globe-spanning digital human network.

The IndieWeb’s principles (own your own data, use tools that work well on their own, and better when federated, avoid silos as the primary place of where you post content) fit well with that notion.

Recently I said that I was coming back to a lot of my material on information strategies and metablogging from 2003-2006, but now with more urgency and a change in scope. Frank asked what I meant, and I answered

that the principles of the open web (free to use, alter, tinker, control, trust by you/your group) also apply to other techs (for instance energy production, blockchain, biohacking, open source hardware, cheap computing hardware, algorithms, IoT sensors and actuators) and methods (p2p, community building, social media usage/production, group facilitation etc.). Only then are they truly empowering, otherwise you’re just the person it is ‘done to’.

Blockchain isn’t empowering you to run your own local currency if you can only run it on de-facto centralised infrastructure, where you’re exposed to propagating negative externalities, whether that is sudden Ethereum forks, or the majority of BTC transactions being run on opaque Chinese computing clusters. It is empowering only if it is yours to deploy for a specific use. Until you can e.g. run a blockchain-based LETS easily for your neighbourhood or home town on nodes that are Raspberry Pis attached to the LETS members’ routers, there is no reliable agency in blockchain.

IoT is not empowering if it means Amazon listening in on all your conversations, or your fire alarm sensors running through centralised infrastructure operated by a telco. It is empowering if you can easily deploy your own sensors and have them communicate through an open infrastructure for which you can run your own gateway or trust your neighbour’s gateway, and on top of which your group does its own data crunching.

Community building methods are not empowering if they are only used to purposefully draw you closer to a clothing brand or football club so they can sell you more of their stuff, where tribalism is used to drive sales. They are empowering if you can, with your own direct environment, use those methods to strengthen local community relationships, learn how to collectively accommodate differences in opinions, needs, strengths and weaknesses, and timely reorient yourself as a group to keep momentum. Dave Winer spoke about working together at State of the Net, and three years ago wrote about working together in the context of the open web. To work together there are all kinds of methods, but like community building methods, they aren’t widely known or adopted.

So, what applies to the open web and IndieWeb, I see applying to any technology and method we think helps increase the agency of groups in our networked world. More so as technologies and methods often need to be used in tandem. All these tools need to be ‘smaller’ than us, be ours. This is a key element of Networked Agency, next to seeing the group, you and a set of meaningful relationships, as the unit of agency.

Not just IndieWeb. More IndieTech. More IndieMethods.

How would the ‘Generations‘ model of the IndieWeb look if transposed to IndieTech and IndieMethods? What is Selfdogfooding when it comes to methods?

More on this in the coming months I think, and in the runup to ‘Smart Stuff That Matters‘ late August.

Triggered by some of the previous postings on RSS, I started thinking about what my ideal set-up for RSS reading would be. Because maybe there’s a way to create that for myself.

A description of how I approach my feeds, and what I would ideally like to be able to do, I already penned a decade ago, and it hasn’t really changed much.

The basic outline is:

  • I think of feed subscriptions as subscribing to people. I don’t follow your blog, but I follow and interact with you. I used to have a blogroll that reflected that, showing the faces of people whose writing I read. Basically, the web is my social network, always. In my feed reader every feed title is the name of the author, not the blog’s title.
    my blogroll in 2005, people’s faces, not site names
  • The feeds I subscribe to, I group in folders by subjective social distance, roughly following Dunbar-style group sizes: the dozen closest to me, the 50, the 150, the 500 beyond that, and above that 999 for people I don’t have a direct connection with at all. So my wife’s blog feed is in folder a12, and if I’ve just come across your blog this week and we never met, your feed will be in e999. The Keep Track folder holds my own content feeds from various platforms.
    the folders in my current feedreader by social distance
  • There are three reading styles I’d like my reader to support, of which it only does one.
    • I read to see what is going on with people I know, by browsing through their writing from closer to further away, so from the a12 folder towards the e999 folder. This my reader supports, by way of allowing a folder structure for it
    • I read outside-in, looking at the general patterns in all the new postings of a day: what topics come up, what are people working on, what do they care about. This is not supported yet, other than scrolling through the whole thing. A quick overview of topics versus social distance would be useful here.
    • I read inside-out, where I have specific questions, ideas or topics on my mind and want to see if some of the people in my reader have been writing about them recently. This is not supported yet. A good way to search my feeds would be needed.
  • I would like to be able to tag feeds. So I can contextualise the author (coder, lives in Portugal, interested in privacy by design, works independently). This allows me to look at different groups of people across the social distance related folders. E.g. “what are the people I follow in Berlin up to this week, as I will be visiting in a few days?” “What are the current concerns in the IndieWeb community?” Ten years ago I visualised that as below
    Plotting contexts

    Social distances with community and multi-faceted contexts plotted on them

  • I would like to be able to pull in tags of postings and have full content search functionality. This would support my inside-out reading. “What is being said today in my feeds about that conference I didn’t go to?” “Any postings today on privacy by design?” (A sketch of this follows below this list.)
  • I think I’d like visual representations of which communities are currently most active, and for topics, like heat maps. Alerts on when the level of activity for a feed or a community or subsets of people changes would be nice too.
  • From the reader, actions should follow: saving an article, creating a todo from it, bookmarking it, or sharing it in some channel. An ideal reader should support all those actions, or let me configure my own.
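As an illustration of the inside-out reading and search items above, a sketch of full-content search across subscriptions, ordered by the social-distance folders (feed URLs are placeholders; uses the feedparser library):

import feedparser

feeds = {
    "a12": ["https://example.org/feed"],    # closest circle
    "e999": ["https://example.com/feed"],   # strangers
}

def search_feeds(term):
    term = term.lower()
    for folder, urls in sorted(feeds.items()):
        for url in urls:
            parsed = feedparser.parse(url)
            for entry in parsed.entries:
                text = entry.get("title", "") + " " + entry.get("summary", "")
                if term in text.lower():
                    yield folder, parsed.feed.get("title", url), entry.get("link")

for folder, author, link in search_feeds("privacy by design"):
    print(folder, author, link)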

From the whole IndieWeb exploration of late, I realised that while no feedreader does all of the above, it might be possible to build something myself. TinyTiny RSS seems a good starting point. It’s an open source tool you can run as your own instance. It comes with features such as filtering and auto-tagging that might fit my needs. It can be hosted on my own domain, and it has a database I have back-end access to, so I can build features it doesn’t have itself (such as visualisations and specific sharing actions). It can also produce RSS feeds, so it seems that with TinyTiny RSS I could do all kinds of things to the RSS feeds I pull in on my server, and push the results out again as RSS feeds themselves. Those I could load into my regular reader, or republish, etc.
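As a first experiment with that back-end access, one could pull raw material for the activity overviews mentioned earlier straight from the database. A sketch assuming a PostgreSQL install and TinyTiny RSS’s default ttrss_entries table; the table and column names are from memory and need checking against the actual schema:

import psycopg2

conn = psycopg2.connect("dbname=ttrss user=ttrss")  # hypothetical credentials
cur = conn.cursor()
cur.execute("""
    SELECT date_entered::date AS day, count(*)
    FROM ttrss_entries
    GROUP BY day ORDER BY day DESC LIMIT 14
""")
for day, n in cur.fetchall():  # posting volume per day, past two weeks
    print(day, n)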

Now I need to find a bit of time to set it up and play with it.

Came across this post by Ruben Verborgh from last December, “Paradigm Shifts for the Decentralised Web“.

I find it helpful because of how it puts different aspects of wanting to decentralise the web into words. Ruben Verborgh mentions 3 simultaneous shifts:

1) End-users own their data. This is the shift most often highlighted, in light of things like the Cambridge Analytica / Facebook scandal.

2) Apps become views. Once disconnected from the data, apps are no longer the single way to see that data.

3) Interfaces become queries, when data is spread out over many sources.

Those last two specifically help me think of decentralisation in different ways. Do read the whole thing.

Slate saw their traffic from Facebook drop by 87% in a year after changes in how FB prioritises news and personal messages in your timeline. Talking Points Memo reflects on it and doing so formulates a few things I find of interest.

TPM writes:
Facebook is a highly unreliable company. We’ve seen this pattern repeat itself a number of times over the course of company’s history: its scale allows it to create whole industries around it depending on its latest plan or product or gambit. But again and again, with little warning it abandons and destroys those businesses.” …”Google operates very, very differently.”..”Yet TPM gets a mid-low 5-figure check from Google every month for the ads we run on TPM through their advertising services. We get nothing from Facebook.”..”Despite being one of the largest and most profitable companies in the world Facebook still has a lot of the personality of a college student run operation, with short attention spans, erratic course corrections and an almost total indifference to the externalities of its behavior.

This first point I think is very much about networks and ecosystems: do you see others as part of your ecosystem, or merely as a temporary leg-up until you can ditch them or dump externalities on them?

The second point TPM makes is about visitors versus ‘true audience’.
“we are also seeing a shift from a digital media age of scale to one based on audience. As with most things in life, bigger is, all things being equal, better. But the size of a publication has no necessary connection to its profitability or viability.” It’s a path to get to a monopoly that works for tech (like FB) but not for media, the author Josh Marshall says. “…the audience era is vastly better for us than the scale era”

Audience, or ‘true audience’ as TPM has it, are the people who have a long-term connection to you, who return regularly to read articles. The ones you’re building a connection with, for whom TPM, or any newsy site, is an important node in their network. Scaling there isn’t about the numbers, although numbers still help, but about the quality of those numbers and the quality of what flows through the connections between you and your readers. The invisible hand of networks, more than trying to get ever more eyeballs.

Scale thinking would make blogging like I do useless; network thinking makes it valuable, even if there are just 3 readers, myself included. It’s ‘small b’ blogging, as Tom Critchlow wrote a few months ago: “Small b blogging is learning to write and think with the network“. Or as I usually describe it: thinking out loud, and having distributed conversations around it. Big B blogging, Tom writes, in contrast “is written for large audiences. Too much content on the web is designed for scale” and pageviews, where individual bloggers seem to mimic mass media companies, because that is the only example they encounter.

Google Reader five years dead

Five years ago, on July 1st 2013, Google killed Google Reader. It was then probably the most used way to keep track of websites through RSS.

RSS allows you to see the latest articles from a website, and thus makes it easier to keep track of many different blogs and sites all at once. RSS is a very important part of the plumbing of the internet.

I used to read everything through RSS, but over time people migrated to FB, stopped blogging, fell silent. Services like Twitter stopped providing RSS, website owners forgot it was a standard feature. Google Reader stopping meant that many casual users stopped reading the web with RSS. Browsers stopped visibly supporting it.

Five years on, Dave Winer writes that Google Reader centralised a decentralised technology. We should have been more alert, chosen an independent tool to read with, and not handed it to a silo. Frank Meeuwsen says similar things: removing the biggest RSS reader made room for new growth and variety.

Therefore:

Long Live RSS

Aral Balkan, timed with the 5th anniversary of Google Reader’s demise, blogged “Reclaiming RSS“, explaining what it does. RSS is an important part of letting the web be what it is best at: a decentralised space where all can read and write. Earlier, in April, Wired called for an RSS revival as well.

How I read RSS
I interact with people, I don’t follow sites. That is how I shape and perceive my RSS reading: all the feeds are named after their authors, and grouped in my reader roughly along perceived social distance, from a group of people closest to me, to people I know well, to people I don’t know very well, to strangers whose writing I came across. Roughly following Dunbar-like group sizes. See the image below. This is because I see blogging as distributed conversations: you write something, I respond with some of my own writing. Blogposts building upon blogposts. So I set up my feedreader to reflect that conversational aspect. I currently use ReadKit, as I prefer an offline reader.


Screenshot of part of my RSS reader, in rough groups of social distance

You can find the link to my blog’s RSS feed in the right side column (with the orange icon). Any feedreader should also be able to detect my blog’s feed automatically if you point it to my web address.
I also publish which feeds I am reading: on the right hand side there is a link to ‘OPML Blogroll‘, a machine readable file of all the feeds I currently read (the list is a bit out of date, but I update it every month or so). Any feed reader should be able to import that file.
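OPML is just XML, so the blogroll is easy to use even outside a feed reader. A sketch using only the Python standard library, assuming the file has been downloaded as blogroll.opml:

import xml.etree.ElementTree as ET

tree = ET.parse("blogroll.opml")
for outline in tree.iter("outline"):
    url = outline.get("xmlUrl")  # OPML stores the feed address here
    if url:
        print(outline.get("title") or outline.get("text", ""), url)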

In 2005 I described how I read RSS then, in response to Lee Lefever asking about people’s RSS reading routines. The current ‘grouping by social distance’ way of reading I have is the result of the search I mention at the end of that blogpost, on how to deal with a larger number of feeds.

How do you read RSS feeds?

Thirteen years on, I’m curious how you use RSS. How many feeds, what (daily) routine, what topics? How do you read RSS feeds?
I look forward to reading your take on it in my feedreader.

Dave Winer, one of the earliest bloggers, if not the earliest, asks what became of the blogosphere. It was a topic of the conversations in Trieste two weeks ago at State of the Net, where we both were on the program.

I get what he says about losing the center, and seeing that center as a corporation back then. This is much in the way Tantek Celik talked about the silos first being friendly and made by people we knew, but then getting sold, which I wrote about yesterday. Creating a new center, or centers, is worthwhile, I concur with Dave, and if it can’t be a company at the center, then maybe it should be a network, or an organisational manifestation thereof such as a cooperative. An expression of networked agency.

Because of that I wonder about Dave’s last point “There used to be a communication network among bloggers, but that’s gone now.”

I asked (on Facebook), “What to you was that previous communications network, and what was it built on? What type of communications would you like to see re-emerge?” The answer is about being able to discover other bloggers, like Dave’s Weblogs.com platform used to do (and still does, but most updates are spam).

Blogs to me are distributed conversations. Look at the unbridled enthusiasm I expressed 11 years ago when I wrote about 5 years of blogging in this space, and the list of people I then regarded as my regular group of people I had blogged conversations with. It is currently harder to create those, and it has become harder for me to notice when something I write is reacted to as well. Much of the IndieWeb discussion is about at least being able to discover all online facets of someone from their own domain, and pulling responses to it back there too. Something I need to explore more how to do in a way that fits me.

In terms of communication and connecting, it would be great if I could explore the blogosphere much as in the picture below. Created by Anjo Anjewierden and presented at the AOIR conference in Chicago in 2005 by Lilia Efimova, it shows a representation of my blog network based on text analysis of my and other people’s blogs. It’s a pretty good picture of what my blog ‘neighbourhood’ looked like then.

Or this one, also by Anjo Anjewierden, from 2008, titled “the big one”. It shows conversations between my and others’ blogs. Grey boxes are conversations across blogs (the bigger the box, the more blogpostings), the other dots are postings that refer to such a conversation but aren’t part of it. Top left, a box is ‘opened up’ to show there are different postings (coloured dots) inside it.

Makes me want to have a personal crawler that maps out connections between blogs! Are there any ‘personalised’ crawlers out there?

When Hossein Derakshan came back online after a six-year absence in 2015, he was shocked to find how the once free-flowing web had ended up in walled gardens and silos. Musing about what he presented at State of the Net earlier this month, I came across Frank Meeuwsen’s posting about the IndieWeb Summit starting today in Portland (livestream on YT). That sent me off on a short trip around the IndieWeb and related topics.

I came across this 2014 video of Tantek Celik. (He, Chris Messina and Andy Smith organised the first ever BarCamp in 2005, followed by a second one in Amsterdam where I met the latter two and many other fellow bloggers/techies.)

In his talk he looks back at how the web got siloed, and talks from a pure techie perspective about much the same things Hoder wrote about in 2015 and talked about this month. He places ‘peak open web’ in 2003, just before the web 2.0 silos came along. Those first silos (like Flickr, Delicious etc.) were ‘friendly silos’. We knew the people who built them, we trusted their values, and we assumed the open web was how it was, is and would remain.

The friendly silos got sold, and other, less friendly silos emerged.
The silos have three things that make them hugely attractive. One is a ‘dark pattern’: adding functionality that feeds your dopamine cravings, such as like and heart buttons. The other two are where the open web is severely lacking: the seamless integration of reading and writing into one user interface, making it very easy to respond to others or add to the river of content; and the ability to find people and walk the social graph, by jumping from a friend to their list of friends and so on. The open web never got there. We had things like Qumana that tried to combine reading and writing, but it never really took off. We had FOAF, but it never became easy.

So, Tantek and others set out in 2011 to promote the open web, IndieWeb, starting from those notions. Owning your data and content, and federating to participate. In his slides he briefly touches upon many small things he did, some of which I realised I could quickly adopt or do.
So I

  • added IndieAuth to my site (using the IndieAuth WP plugin), so that I can use my own website to authenticate on other services, such as my user profile at the IndieWeb wiki (a bit like Facebook Connect, but from my own server).
  • added new sharing buttons to this site that don’t track you simply by being displayed (using the GDPR-compliant Shariff plugin), including Diaspora and Mastodon sharing buttons
  • followed Tantek’s notion of staying in control of the URLs you share, e.g. by using your own URLs such as zylstra.eu/source/apple-evernote-wordpress to redirect to my GitHub project of that name (so should GitHub be eaten alive after Microsoft’s takeover, you can run your own Git node or migrate, while the URLs stay valid).
  • decided to go to IndieWeb Camp in Nuremberg in October, together with Frank Meeuwsen

My stroll on the IndieWeb this morning leaves me with two things:

  • I really need to explore more deeply how to build loops between various services and my site, so that for all kinds of interactions my site is the actual repository of content. This likely also means making posting much easier for myself. The remaining challenge is my need to more fluidly cater to different circles of social distance / trust: layers that aren’t public but open to friends
  • The IndieWeb concept is more or less the same as what I think any technology or method should be to create networked agency: within control of the group that deploys it, useful on its own, more useful federated, and easy enough to use so my neighbours can adopt it.