[TL;DR: I think a long tail is needed for distributed technology to be sustainable; otherwise it’s just centralisation and single points of failure in a different form. A long tail means the bottom 80% take over 50% of a market, and the top 20% under 50%. Mastodon currently has over 85% of its participants in the top 20% of instances, and it’s worse than that, as 77% of participants are in 0.7% of instances. Just 15% are in the bottom 80% of instances. There’s a power law distribution, but it’s not a long tail. What can Mastodon do to get there, and to sustainability?]

On 6 October 2016 Mastodon was launched, and its originator Eugen Rochko looks back in a blogpost on the journey of the past two years.

I joined on 7 April 2017, 6 months after its launch, at the Mastodon.cloud instance. I posted some messages for a month, then fell quiet for half a year. A few messages last March, and then I started using it more frequently last month, in the run-up to figuring out how to run Mastodon for myself (which for now means a hosted solution, but still aiming for running it from the home router). It’s now part of my daily information diet, but no guarantee yet it will last, although being certain I have ‘my half’ of the conversation on a domain I own helps a lot towards maintaining worthwhile exchanges.

Eugen’s blogpost is rightfully proud of what has been accomplished. It is not yet proof of the sustainability of federated solutions, though, as he suggests it is.

He shares a few interesting numbers about the usage of Mastodon. The median of the 3460 known instances is 8 users. In total there are 1.627.557 registered accounts. The largest instance has 415.941 members, while the top 3 together have 52% of users, meaning that numbers 2 and 3 average 215.194 accounts. The 25 largest instances have 77% of users, or 1.253.219 members, meaning that numbers 4-25 average 18.495 users. As the median is 8, the smallest 1730 instances have at most 8*1730 = 13.840 users. It also means that instances number 26 to 1730 have at least 360.498 members, or an average of at least 211. This tells us there’s a Pareto power law distribution: the top 20% of instances hold at least 85% of users at the moment. That also means there is no long tail, just a stub that holds at most 15% of Mastodon users. For a long tail to exist, the smallest 80% of instances should account for over 50% of users, or over three times their current share.
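The arithmetic above can be retraced in a few lines. This is only a sketch: the input figures are the ones quoted from Eugen’s post (with the thousands separators dropped), the 15% share for the bottom 80% of instances is my own estimate from the paragraph above rather than something computed here, and the variable and function names are mine.

```python
# Figures quoted above from the anniversary post.
total_accounts = 1_627_557
known_instances = 3460
median_users = 8
largest_instance = 415_941
top3_share = 0.52
top25_share = 0.77

# Accounts held by the top 3 and the top 25 instances.
top3_accounts = round(total_accounts * top3_share)
top25_accounts = round(total_accounts * top25_share)

# Averages for instances 2-3 and instances 4-25.
avg_2_to_3 = (top3_accounts - largest_instance) / 2
avg_4_to_25 = (top25_accounts - top3_accounts) / 22

# The smaller half of all instances each has at most the median number of users.
smaller_half = known_instances // 2
smaller_half_max = median_users * smaller_half

# What is left must sit on instances 26 up to 1730.
middle_min = total_accounts - top25_accounts - smaller_half_max
avg_middle_min = middle_min / (smaller_half - 25)


def is_long_tail(bottom80_share: float) -> bool:
    """Long tail as defined in this post: the bottom 80% hold over half."""
    return bottom80_share > 0.5


# Assumption: 0.15 is the estimated upper bound from the text, not derived here.
estimated_bottom80_share = 0.15
growth_needed = 0.5 / estimated_bottom80_share

print(f"instances 2-3 average: {avg_2_to_3:.0f}")
print(f"instances 4-25 average: {avg_4_to_25:.0f}")
print(f"smaller half holds at most: {smaller_half_max}")
print(f"instances 26-1730 hold at least: {middle_min}")
print(f"long tail today? {is_long_tail(estimated_bottom80_share)}")
print(f"bottom 80% would need to grow by a factor of {growth_needed:.1f}")
```

The last line is where the ‘over three times’ comes from: 50% divided by 15%.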

As the purpose of Mastodon is distribution, where federation allows everyone to connect regardless of their instances (sort of like e-mail), I think Mastodon can only be deemed sustainable if there is a true long tail. That means that while the number of users goes up, the number of instances should go up at a faster rate, so that over 50% of all Mastodon users will be on the 80% smallest or even individual instances. In the current numbers we should be most interested in the 50% of instances that now have 8 or fewer users, and find out what drives those instances, so we may have many more of them. We should also think about what a bigger-to-smaller-instances funnel for members can look like, and not just leave it to chance. I think that the top 25 Mastodon instances, which is just 0.7% of the total, currently having 77% of all users is very problematic from a sustainability perspective, because that level of concentration is completely at odds with the stated purpose of Mastodon: distribution.

Eugen Rochko in his anniversary posting points at a critical article from April 2017 in Mashable, implying that its critic has been proven wrong definitively. I disagree. While many of the ‘predictions’ in that article are indeed silly, it also contains a few hints as to where sustainability may be found. The critic doesn’t get federation (yet likely uses e-mail every day), and complains about discovery (yet is likely relieved that not all his personal e-mail addresses are to be found in Google). Yet if we can’t explain distribution and federation, and can’t or don’t communicate how discovery works in such a setting, then we won’t be able to make a long tail grow. For more people to adopt small or individual instances we need to bring the threshold for running your own instance way down, and then way down again. To the level of at most a one-click install of a script on any regular hosting service, and creating a first account.

Using open protocols, like ActivityPub which Mastodon supports, is key in getting more people out of walled gardens and silos, and on the open web. Tracking its adoption is a useful measure of success, but 2 years of existence is not a sign of sustainability at all. What Eugen Rochko has kicked off with Mastodon is valuable and very laudable, but we have barely started getting to where we need to be for it to stick.

We’re in a time where whatever is presented to us as discourse on Facebook, Twitter or any of the other platforms out there may or may not come from humans, bots, or someone or some group with a specific agenda, irrespective of what you say or respond. We’ve seen it at the political level, with outside influences on elections, we see it in things like Gamergate, and in critiques of the last Star Wars movie. It creates damage on a societal level, and it damages people individually. To quote Angela Watercutter, the author of the mentioned Star Wars article,

…it gets harder and harder to have an honest discussion […] when some of the speakers are just there to throw kerosene on a flame war. And when that happens, when it’s impossible to know which sentiments are real and what motivates the people sharing them, discourse crumbles. Every discussion […] could turn into a […] fight — if we let it.

Discourse disintegrates, I think, specifically when there’s no meaningful social context in which it takes place, nor social connections between the speakers in that discourse. The effect stems not just from the fact that you can’t or don’t really know who you’re conversing with, but, I think more importantly, from anyone on a general platform being able to bring themselves into the conversation, or worse, force themselves into it. Which is why you should never wade into newspaper comments, even though we all read them at times, because watching discourse crumble from the sidelines has a certain addictive quality. This can happen because participants themselves don’t control the setting of any conversation they are part of, and none of those conversations are limited to a specific (social) context.

Unlike in your living room, over drinks in a pub, or at a party with friends of friends of friends. There you know someone. Or if you don’t, you know them in that setting, you know their behaviour at that event thus far. All have skin in the game as well: misbehaviour has immediate social consequences. Social connectedness is a necessary context for discourse, either stemming from personal connections, or from the setting of the place or event it takes place in. Online discourse often lacks both, discourse crumbles, entropy ensues. Without consequence for those causing the crumbling. Which makes it fascinating when missing social context is retroactively restored, outing the misbehaving parties, as in the book I once bought by Tinkebell, in which she matches death threats she received against the senders’ very normal Facebook profiles.

Two elements therefore are needed I find, one in terms of determining who can be part of which discourse, and two in terms of control over the context of that discourse. They are point 2 and point 6 in my manifesto on networked agency.

  • Our platforms need to mimic human networks much more closely: our networks are never ‘all in one mix’ but a tapestry of overlapping and distinct groups and contexts. Yet centralised platforms put us all in the same space.
  • Our platforms also need to be ‘smaller’ than the group using it, meaning a group can deploy, alter, maintain, administrate a platform for their specific context. Of course you can still be a troll in such a setting, but you can no longer be one without a cost, as your peers can all act themselves and collectively.
  • This is unlike on e.g. FB, where defending against trollish behaviour by design takes more effort than being a troll, and never carries a cost for the troll. There must, in short, be a finite social distance between speakers for discourse to be possible. Platforms that dilute that, or allow for infinite social distance, are where discourse can crumble.

    This points to federation (a platform within control of a specific group, interconnected with other groups doing the same), and decentralisation (individuals running a platform for one, and interconnecting them). Doug Belshaw recently wrote in a post titled ‘Time to ignore and withdraw?’ about how he first saw individuals running their own Mastodon instance as quirky and weird. Until he read a blogpost by Laura Kalbag where she writes about why you should run Mastodon yourself if possible:

    Everything I post is under my control on my server. I can guarantee that my Mastodon instance won’t start profiling me, or posting ads, or inviting Nazis to tea, because I am the boss of my instance. I have access to all my content for all time, and only my web host or Internet Service Provider can block my access (as with any self-hosted site.) And all blocking and filtering rules are under my control—you can block and filter what you want as an individual on another person’s instance, but you have no say in who/what they block and filter for the whole instance.

    Similarly I recently wrote,

    The logical end point of the distributed web and federated services is running your own individual instance. Much as in the way I run my own blog, I want my own Mastodon instance.

    I also do see a place for federation, where a group of people from a single context run an instance of a platform. A group of neighbours, a sports team, a project team, some other association, but always settings where damaging behaviour carries a cost because social distance is finite and context defined, even if temporary or emergent.

    Is this why Bridgy can’t find my web address on Twitter and returns an error when I try to post to Twitter from my WordPress blog? Bridgy expects a rel="me" reference to my site’s URL on both my blog and my Twitter profile. I have that, but the Twitter one is a t.co shortened version and only shows my actual URL as title, not as the link. Like rel="me" href="https://t.co/OaBGAJ7WV6" title="https://zylstra.org/blog". So no 2-way confirmation of the relationship? [UPDATE: It’s a missing www on my site. Tested with indiewebify.me and works now]
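To make the 2-way confirmation concrete, here is a minimal sketch of such a check using only Python’s standard library. It is not how Bridgy or indiewebify.me actually implement it (real tools also follow redirects such as t.co links, which this sketch does not). The example.org addresses, the HTML snippets and the function names are made up for illustration.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collects the href of every <a> or <link> element carrying rel="me"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").split()
        if "me" in rel_values and attributes.get("href"):
            self.links.append(attributes["href"])


def rel_me_links(html):
    parser = RelMeParser()
    parser.feed(html)
    return parser.links


def links_back(html, expected_url):
    """True if the page has a rel="me" link to exactly expected_url."""
    wanted = expected_url.rstrip("/")
    return wanted in [link.rstrip("/") for link in rel_me_links(html)]


# Hypothetical pages: the profile points at the www address of the blog.
profile_html = '<a rel="me" href="https://www.example.org/blog">my blog</a>'
blog_html = '<link rel="me" href="https://twitter.example/someone">'

# The blog links to the profile, so that direction is confirmed.
print(links_back(blog_html, "https://twitter.example/someone"))
# A literal comparison fails when the site identifies itself without www.
print(links_back(profile_html, "https://example.org/blog"))
# With matching addresses both directions are confirmed.
print(links_back(profile_html, "https://www.example.org/blog"))
```

A strict string comparison like this is deliberately naive, but it shows why a missing www on one side is enough to break the confirmation.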

    Some links I thought worth reading the past few days

    Many tech companies are rushing to arrange compliance with GDPR, Europe’s new data protection regulations. What I have seen landing in my inbox thus far is not encouraging. Like Facebook, other platforms clearly struggle with, or hope to get away with, partially or completely ignoring the concepts of informed consent, unforced consent, and proving consent. One would suspect the latter, as Facebook’s removal of 1.5 billion users from EU jurisdiction is a clear step to reduce potential exposure.

    Where consent by the data subject is the basis for data collection: Informed consent means consent needs to be explicitly given for each specific use of person-related data, based on an explanation, clear to laypeople, of the reason for collecting the data and how precisely it will be used.
    Unforced means consent cannot be tied to core services of the controlling/processing company when that data isn’t necessary to perform a service. In other words “if you don’t like it, delete your account” is forced consent. Otherwise, the right to revoke one or several consents given becomes impossible.
    Additionally, a company needs to be able to show that consent has been given, where consent is claimed as the basis for data collection.

    Instead I got this email from Twitter earlier today:

    “We encourage you to read both documents in full, and to contact us as described in our Privacy Policy if you have questions.”

    and then

    followed by

    You can also choose to deactivate your Twitter account.

    The first two bits mean consent is not informed and that it’s not even explicit consent, but merely assumed consent. The last bit means it is forced. On top of that, Twitter will not be able to show consent was given (as it is merely assumed from using their service). That’s not how this is meant to work. Non-compliant, in other words. (IANAL though)

    My friend Peter Rukavina blogged how he will no longer push his blogpostings to Facebook and Twitter. The key reason is that he no longer wants to feed the commercial data-addicts that they are, and really wants to be in control of his own online representation: his website is where we can find him in the various facets he likes to share with us.

    Climbing the Wall
    Attempting to scale the walls of the gardens like FB that we lock ourselves into

    This is something I often think about, without coming to a real conclusion or course of action. Yes, I share Peter’s sentiments concerning Facebook and Twitter, and how everything we do there just feeds their marketing engines. And yes, in the past two years I purposefully have taken various steps to increase my own control over my data, as well as build new and stronger privacy safeguards. Yet my FB usage has not yet been impacted by that; in fact, I know I use it more intensively than a few years ago.

    Peter uses his blog differently from me, in that he posts much more about all the various facets of himself in the same spot. In fact that is what makes his blog so worthwhile to follow: the mixture of technology how-tos and philosophical musings, very much integrated with the daily routines of getting coffee, or helping out a local retailer, or buying a window ventilator. It makes the technology applicable, and turns his daily routines into a testing ground for them. I love that, and the authentic and real impact that creates where he lives. I find that with my blog I’ve always more or less only published things of profession-related interest, which, because I don’t talk about clients or my own personal life per se, always remain abstract thinking-out-loud pieces that likely provide little direct applicability. I use Twitter to broadcast what I write. In contrast I use FB to also post the smaller things, more personal things etc. If you follow me on Facebook you get a more complete picture of my everyday activities, and random samplings of what I read, like and care about beyond my work.

    To me FB, while certainly exploiting my data, is a ‘safer’ space for that (or at least succeeds in pretending to be), to the extent it allows me to limit the visibility of my postings. The ability to determine who can see my FB postings (friends, friends of friends, public) is something I use intensively (although I don’t have my FB contacts grouped into different layers, as I could do). Now I could post Tumblr-like on my own blog, but would not be able to limit the visibility of that material (other than by virtue of no-one bothering to visit my site). That my own blog content is often abstract is partly because it is all publicly available. To share other things I do, I would want to be able to determine its initial social distribution.

    That is, I think, the thing I would like to solve: can I shape my publications and sharings in much the same way I shape my feed reading habits, in circles of increasing social distance? This is the original need I have for social media, and which I have had for a very long time, basically since social media were still just blogs and wikis. Already in 2006 (building on postings about my information strategies in 2005) I did a session on putting the social in social media front and center, together with Boris Mann at Brussels Barcamp, where I listed the following needs, all centered around the need to let social distance and quality of relationships play a role in publishing and sharing material:

    • tools that put people at the center (make social software even more social)
    • tools that let me do social network analysis and navigate based on that (as I already called for at GOR 2006)
    • tools that use the principles of community building as principles of tool design (an idea I had writing my contribution to BlogTalk Reloaded)
    • tools that look at relationships in terms of social distance (far, close, layers in between) and not in terms of communication channels (broadcasting, 1 to 1, and many to many)
    • tools that allow me to shield or disclose information based on the depth of a relationship, relative to the current content
    • tools that let me flow easily from one to another, because the tools are the channels of communication. Human relationships don’t stick to channels, they flow through multiple ones simultaneously and they change channels over time.

    All of these are as yet unsolved in a distributed way, with the only option currently being getting myself locked into some walled garden and running up the cost of moving outside those walls with every single thing I post there. Despite the promise of the distributed net, we still end up in centralized silos, until the day that our social needs are finally met in distributed ways in our social media tools.

    In the past few weeks I have seen several discussions on how to deal with an increase in Twitter follower requests stemming from unknown people or from ‘spammy’ sources. I think finding your own guidelines in how to deal with following and followers can be straightforward if you look at how you want to interact through Twitter.
    For me social media is all about conversation. True conversation in the definition of Habermas, but also by everyone’s experience, is symmetrical. It is an even exchange of ideas, views, where both have the same level of effort to be able to take part, and the same power within the exchange. I bring that notion of conversational symmetry to tools like Jaiku and Twitter.

    Large antenna array profile type in twitter

    It means that my postings there are not public, but only visible to contacts, thus ensuring that we can see each other’s postings.
    It means that if you request to follow me, and there is a large imbalance between your number of followers and the number you follow, I will deny your request.

    Spammy profile type in twitter

    If you follow orders of magnitude more people than follow you, one-on-one interaction is apparently not your goal. You’re building a phonebook, or you’re soaking up a large collage of everything that’s being posted, as a large antenna array. It’s a way of usage, but it’s not conversation.
    If you follow orders of magnitude fewer people than follow you, you’re someone others like to keep track of. Kind of like the A-list bloggers of old. It’s the celebrity profile, so to speak. You can use it as mass media then. If you also want conversation as a ‘celebrity’ it might be useful to keep a separate, non-public Twitter account for that, while maintaining a public one to inform people of your public actions.

    A-lister profile type in Twitter

    If the number of followers divided by the number you’re following is near 1 chances are you use Twitter for conversation style exchanges. It’s what I do.
    I respond to what others write, and write stuff that might trigger conversation. Much like particles that help freeze water quicker, or create more bubbles in your soda.
    Also I seem to have different circles of people active in Twitter compared to Jaiku, different circles of conversation.

    Conversational symmetry profile type in Twitter
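The rule of thumb above fits in a few lines of code. This is a sketch under my own assumptions, not a fixed rule: I take ‘orders of magnitude’ to mean a factor of 10 or more, and ‘near 1’ to mean a ratio between 0.5 and 2. The function name and the labels are made up for this example.

```python
def classify_profile(followers, following, factor=10.0, symmetry_band=2.0):
    """Label a profile by its followers/following ratio.

    Assumptions: `factor` stands in for 'orders of magnitude',
    `symmetry_band` stands in for 'near 1'. Both are adjustable.
    """
    if followers == 0 and following == 0:
        return "unknown"
    if following == 0:
        # Followed by others while following nobody: pure broadcast.
        return "a-lister"
    ratio = followers / following
    if ratio >= factor:
        return "a-lister"
    if ratio <= 1 / factor:
        return "antenna array"
    if 1 / symmetry_band <= ratio <= symmetry_band:
        return "conversational"
    return "in between"


# Made-up example profiles as (followers, following).
for followers, following in [(50_000, 120), (40, 5_000), (300, 280), (300, 1_500)]:
    print(followers, following, classify_profile(followers, following))
```

Requests from profiles labelled ‘antenna array’ or ‘a-lister’ are the ones I would deny; the ‘in between’ cases still need a human look.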