I thought I had the language stuff all sorted, not least because I had tested it. As it turns out, Google Translate works slightly differently from what I concluded earlier.
It does look at the language declared in the <html> tag, but not exclusively. Even when a language is declared, it still seems to run its own machine-learning language detection as well.
The effect is that when a posting here in Dutch or German is very short, tweet-like, it will still detect that most of the page is in English (navigation structure, sidebar etc.) and treat the entire page as English. This makes even less sense than my earlier notion that it follows the declared language and falls back on machine learning only if nothing is declared, as it seems to actively distrust even the little bit of language mark-up it bothers to check in the first place.

It does mean that adding machine translation links at the end of Dutch and German postings is a good service to provide. Since I can’t trust Google Translate’s auto-detect feature (see above), I must force the correct source language in the link I provide. This doubles the code needed (once for Dutch, once for German), but it works. The code is in the same function I previously adopted from Frank and Jan. I’ve added the translation links only to the RSS feed, not to the website. My reasoning is that most of my regular readers read through RSS, and that they are the ones who might be interested in also reading my non-English postings.
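
To illustrate the idea of forcing the source language, here is a minimal sketch in Python. It is not the actual code running on this site. It assumes Google Translate's `translate?sl=…&tl=…&u=…` URL pattern, and the function name `translation_link`, the link texts and the example URL are all made up for illustration.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical link texts, one per non-English language used for postings.
LINK_TEXT = {
    "nl": "Machine translation into English (from Dutch)",
    "de": "Machine translation into English (from German)",
}


def translation_link(post_url: str, source_lang: str, target_lang: str = "en") -> str:
    """Return an HTML paragraph linking to a machine translation of post_url.

    The source language is set explicitly (sl=...) instead of relying on
    auto-detection. Returns an empty string for languages that have no
    entry in LINK_TEXT, e.g. English postings.
    """
    if source_lang not in LINK_TEXT:
        return ""
    # Assumed URL pattern: sl = source language, tl = target language, u = page URL.
    query = urlencode({"sl": source_lang, "tl": target_lang, "u": post_url})
    href = "https://translate.google.com/translate?" + query
    # Escape the URL so the ampersands are valid inside an HTML attribute.
    return f'<p><a href="{escape(href, quote=True)}">{LINK_TEXT[source_lang]}</a></p>'


if __name__ == "__main__":
    # Example: a short Dutch posting at a made-up address.
    print(translation_link("https://example.org/kort-bericht/", "nl"))
```

The resulting paragraph would then be appended to the posting's content in the feed only. Keeping the languages in one lookup table is one way to avoid writing the link code twice.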

A posting in Dutch as it appears on the site

The same posting in Dutch as it appears in the RSS feed, with added link to machine translation

For now I’m done with language adaptations. That said, having looked at some of the older conversations about multilingualism I’ve had over the years, I also considered how Stephanie Booth adds English excerpts at the start of a posting in French and vice versa. That might be something to emulate. However, it should not clutter up the postings or feeds too much, so it should likely be a separate field. As I’m already using the excerpt field for other things (posting to Twitter and Mastodon mostly), that’s something to figure out in the future.

4 reactions on “Adding Better Language Support III”

  1. I set up different sites altogether, for my English and Dutch rants, respectively, but will occasionally mark short bits of text as written in another language using HTML’s `lang` attribute. I’ve no idea how that affects Google Translate, but I thought screen readers would treat them accordingly.

  2. Both sites cover different topics, too—and are probably not that interesting at all, to anyone but myself—but I sometimes do wish I had everything in one place. Both online “personas,” or identities, are still very much part of, well, me (a complete idiot).
