Last week I changed this site to provide better language mark-up. However, even though the mark-up was now correct, it didn’t solve the issue that made me look into it in the first place: if you click a link to a posting in my RSS feed, your browser still would not detect the right language and translate the posting for you.

As it turns out, Google Translate doesn’t make any real effort to detect the language or languages of a page. It only checks whether a default language is indicated in the very first <html> tag of the page (which my WordPress sets to English for the entire website). Only if no such default is set does it use a machine learning model (CLD2) to detect which language was likely used, and even then it picks only the most likely one. It never checks for language mark-up elsewhere in the page. It also never considers that multiple languages may be used in a single page, even though the machine learning model returns probabilities for more than one language if they are present.
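To illustrate (a constructed example, not this site’s actual source): Google Translate bases its decision solely on the first line below, and ignores the in-page language mark-up in the paragraph that follows.

    <html lang="en-US">  <!-- the only thing Google Translate looks at -->
    ...
    <p lang="nl">Deze alinea is in het Nederlands.</p>  <!-- ignored, even though it correctly marks this Dutch paragraph -->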

This is surprising on two levels. One, it disregards usable information even when provided (either the language mark-up, or the probabilities from the ML model). Two, it makes an entire family of wrong assumptions, of which the assumption that something or someone will always be monolingual is only the first. While discussing this in a conversation with Kevin Marks, he pointed to Stephanie Booth’s presentation at Google that he helped set up 12 years ago, listing all that is wrong with the simplistic monolingual world-view of platforms and tech silos. A dozen years on it is still all true and relevant; nothing’s changed. No wonder Stephanie and I have been talking about multi-lingual blogging off and on for as long as we’ve been blogging.

Which all goes to say that my previous changes weren’t very useful. I realised that to make auto-translation of clicked links from my feed work, I needed to set the language attribute for an entire page in the <html> tag, and not try to mark up only the sections that aren’t in English. (Even though that is the wrong thing to do, because it also means I am claiming that everything that isn’t content, the menus, tags etc., is in the declared language. And that isn’t the case: when I write postings in Dutch or German, the entire framework of my site is still in English.) After some web searching, I found a reference to writing a small function to change the default language setting, and calling that when writing the header of a page, which I adapted. The disadvantage is that this gets called for every page, regardless of whether it’s needed (it’s only ever needed for a single post page, or for the overview pages of Dutch and German postings). The advantage is that almost all language adaptations are now in a single spot in my theme. I’ve rolled back all previous changes to the single and category templates. Only the changes to the front page template I’ve kept, so that there is still correct language mark-up around front page postings that are not in English.


The function I added to functions.php in my child theme:
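Since the code itself isn’t shown inline here, this is a minimal sketch of what such a function can look like, hooked into WordPress’s language_attributes filter (which runs when the theme’s header prints the <html> tag). The function name and the category slugs ‘dutch’ and ‘german’ are assumptions for illustration, not necessarily what this site uses:

    // Minimal sketch: override the lang attribute on the <html> tag for
    // postings in Dutch or German, and for their category overview pages.
    // The function name and the category slugs are assumed for illustration.
    function my_language_attributes( $output ) {
        if ( is_singular( 'post' ) && in_category( 'dutch' ) ) {
            return 'lang="nl"';
        }
        if ( is_singular( 'post' ) && in_category( 'german' ) ) {
            return 'lang="de"';
        }
        if ( is_category( 'dutch' ) ) {
            return 'lang="nl"';
        }
        if ( is_category( 'german' ) ) {
            return 'lang="de"';
        }
        // Everything else keeps the site-wide default, e.g. lang="en-US".
        return $output;
    }
    add_filter( 'language_attributes', 'my_language_attributes' );

As described above, this runs on every page load whether needed or not, but it keeps nearly all the language logic in one place in the child theme.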


An example of the changed page language setting (to German) for a posting in German. (If you follow that link and view the page source, you’ll see it.)
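Roughly what the top of the page source looks like for such a German posting after the change (a sketch; the actual tag will carry additional attributes):

    <html lang="de">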

Google’s Chrome is not a browser, it’s advertisement delivery software. Adtech, after all, is where their profit is. This is incompatible with Doc Searls’ Castle doctrine of browsers, so Chrome isn’t fit for purpose.

Removing Chrome. Image by Matthew Oliphant, license CC BY ND

Read: Chrome to limit full ad blocking extensions to enterprise users (9to5Google)

Google shared that Chrome’s current ad blocking capabilities for extensions will soon be restricted to enterprise users. From Google’s SEC filing: “New and existing technologies could affect our ability to customize ads and/or could block ads online, which would harm our business.”

Paper sales. Doing this online is a neighbouring right in the new EU Copyright Directive. Photo by Alper, license CC BY

A move that surprises absolutely no one: Google won’t pay French publishers for snippets. France is the first EU country to transpose the new EU Copyright Directive into national law. This directive contains a new neighbouring right that says if you link to something with a snippet of that link’s content (e.g. a news link, with the first paragraph of the news item), you need to seek permission to do so, and that permission may come with a charge. In the run-up to the directive this was dubbed the ‘link tax’, although that falsely suggests it concerns any type of hyperlinking.
Google, not wanting to pay publishers for the right to use snippets with their links, will stop using snippets with those links.

Reading the newspaper. Photo by Nicolas Alejandro, license CC BY

Ironically, the link at the top is to a publisher, Axel Springer, that lobbied intensively for the EU Copyright Directive to contain this neighbouring right. Axel Springer is also why we knew with certainty up front that this part of the Copyright Directive would fail. Years ago (2013), after lobbying by the same Axel Springer publishing house, Germany created this same neighbouring right in its copyright law. Google refused to buy a license and stopped using snippets. Axel Springer saw its traffic from search results drop by 40%, others by 80%. They soon caved and provided Google with a free-of-charge license, to recoup some of the traffic to their sites.

Read news. Photo by CiaoHo, license CC BY

This element of the law failed in Germany, and it failed in Spain in 2015 as well. Axel Springer, far from being discouraged, touted this as proof that Google needed to be regulated, and continued lobbying for the same provision to be included in the EU Copyright Directive. With success, despite everyone else explaining how it wouldn’t work there either. It therefore comes as no surprise that now that the Copyright Directive is coming into force in French law, it has the exact same effect. Wait for French publishers to not exercise their new neighbouring rights in 3, 2, 1…

Week 32/52.2012. Photo by The JH Photography, license CC BY

News publishers have problems, I agree. Extorting anyone linking to them is no way to save their business model though (dropping toxic adtech, however, might actually help). It will simply mean less effective links to them, resulting in less traffic, in turn resulting in even less advertising revenue for them (a loss exceeding any revenue they might hope to get from link snippet licenses). This does not demonstrate the monopoly of Google (though I don’t deny its real dominance); it demonstrates that you can’t have your cake and eat it (determine how others link to you and get paid for it, yet keep all your traffic as is), and it doesn’t change that news as a format is toast.

BELGIUM. Photo by Willy Verhulst, license CC BY ND

Donald Clark writes about the use of voice tech for learning. I find I struggle enormously with voice. While I recognise several aspects put forward in that posting as likely useful in learning settings (auto transcription, text to speech, oral traditions), others remain barriers to adoption for me.

For taking in information as voice: podcasts are mentioned as a useful tool, but they don’t work for me at all. I get distracted after about 30 seconds. The voices drone on, and there’s often tons of fluff as the speaker is trying to get to the point (often from a lack of preparation, I suppose). I don’t have the moments in my day that I know others use to listen to podcasts: walking the dog, sitting in traffic, going for a run. Reading a transcript is very much faster, also because you get to skip the bits that don’t interest you, or reread sections that do. You can’t do either when listening, because you don’t know when an uninteresting segment will end, or when it might segue into something of interest. And then you’ve listened to the end and can’t get those lost minutes back. (Videos have the same issue, or rather I have the same issue with videos.)

For using voice to ask for or control things: there are obvious privacy issues with voice assistants. Having active microphones around, for one. Even if they are supposed to only fully activate upon the use of the wake-up word, they get triggered by false positives. And they don’t distinguish between me and other people they maybe shouldn’t respond to. A while ago I asked around in my network how people use their Google and Amazon microphones, and the consensus was that most settle on a small range of specific uses. For those, cloud processing of what those microphones record in your living room shouldn’t be needed; they should be handled locally, with only novel questions or instructions being processed in the cloud. (Of course that’s not the business model of these listening devices.)

A very different factor in using voice to control things, or for instance to dictate, is self-consciousness. Switching on a microphone in a meeting usually has a silencing effect. As for dictation, I won’t dictate text to software at a client’s office, for example, or in public (like on a train). Nor will I talk to my headset while walking down the street. I might do it at home, but only if I know I’m not distracting others around me. In the cases where I did use dictation software (which nowadays works remarkably well), I found it clashes with my thinking and formulating. Ultimately it’s easier for me to shape sentences on paper or screen, where I see them take shape in front of me. When dictating, it easily descends into meaninglessness, and it’s impossible to structure. Stream-of-thought dictation is the only bit that works somewhat, but that needs a lot of cleaning up afterwards. Judging by all the podcasts I’ve sampled over the years, this is something that happens to more people when confronted with a microphone (see the paragraph above). Maybe with something more prepared, like a lecture or presentation, it might be different, but those types of speech have usually been prepared in writing, so there is likely a written source for them already. In any case, dictation never saved me any time. It is of course very different if you don’t have the use of your hands. Then dictation is your door to the world.

It makes me wonder: how are voice services helping you? How do they save you time or effort? In which cases are they more novelty than effective?

Some things I thought worth reading in the past days

  • A good read on how machine learning (ML) currently merely obfuscates human bias, by moving it into the training data and coding, to arrive at peace of mind through pretend objectivity. By claiming that it’s ‘the algorithm deciding’, you make ML a kind of digital alchemy. It introduced some fun terms to me, like fauxtomation and Potemkin AI: Plausible Disavowal – Why pretend that machines can be creative?
  • These new Google patents show how problematic the current smart home efforts are, including their precursors, the Alexa and Echo microphones in your house. They are stripping you of agency, not providing it. These particular ones also nudge you to treat your children much the way surveillance capitalism treats you: as a suspect to be watched, relationships denuded of the subtle human capability to trust. Agency only comes from being in full control of your tools. Adding someone else’s tools (here not just Google’s, but your health insurer’s, your landlord’s etc.) to your home doesn’t make it smart, but turns it into a self-censorship-promoting escape room. A fractal of the panopticon. We need to start designing more technology that is based on distributed use, not on a centralised controller: Google’s New Patents Aim to Make Your Home a Data Mine
  • An excellent article by the NYT about Facebook’s slide to the dark side. When the student dorm room “we didn’t realise, we messed up, but we’ll fix it for the future” defence fails, you weaponise your own data-driven machine against its critics. Thus proving your critics right. Weaponising your own platform isn’t surprising, but it is very sobering and telling. Will it be a tipping point in how the public views FB? Delay, Deny and Deflect: How Facebook’s Leaders Fought Through Crisis
  • Some of these takeaways from the article just mentioned we should keep top of mind when interacting with or talking about Facebook: FB knew very early on about being used to influence the US 2016 election and chose not to act. FB feared backlash from specific user groups and opted to unevenly enforce their terms of service/community guidelines. Cambridge Analytica is not an isolated abuse, but a concrete example of the wider issue. FB weaponised their own platform to oppose criticism: How Facebook Wrestled With Scandal: 6 Key Takeaways From The Times’s Investigation
  • There really is no plausible deniability for FB’s execs on their “in-house fake news shop”: Facebook’s Top Brass Say They Knew Nothing About Definers. Don’t Believe Them. So when you do need to admit it, you fall back on the ‘we messed up, we’ll do better going forward’ tactic.
  • As Aral Balkan says, that’s the real issue at hand, because “Cambridge Analytica and Facebook have the same business model. If Cambridge Analytica can sway elections and referenda with a relatively small subset of Facebook’s data, imagine what Facebook can and does do with the full set.”: We were warned about Cambridge Analytica. Why didn’t we listen?
  • [update] Apparently all the commotion is causing Zuckerberg to think FB is ‘at war’, with everyone it seems, which is problematic for a company whose mission is to open up and connect the world, and which is based on a perception of trust. A bunker mentality also probably doesn’t bode well for FB’s corporate culture and hence its future: Facebook At War.