Author Archives: Ton Zijlstra

ELLIS as the CERN for AI

In an open letter (PDF) a range of institutions call upon their respective European governments to create ELLIS, the European Lab for Learning and Intelligent Systems. It’s an effort to fortify against brain drain, and instead attract top talent to Europe. It points to Europe’s currently weak position in AI, caught between what is happening in the USA and in China, which adds a geo-political dimension. The letter calls not so much for an institution with a large headcount, but for a commitment to long term funding to attract and keep the right people. These are similar reasons to those that led to the founding of CERN, now a global center for physics (and a key driver of things like open access to research and open research data), and more recently the European Molecular Biology Laboratory.

At the core the signatories see France and Germany as most likely to act to start this inter-governmental initiative. It seems this nicely builds upon the announcement by French president Macron late March to invest heavily in AI, and keep / attract the right people for it. He too definitely sees the European dimension to this, and even puts European and enlightenment values at the core of it, although he acted within his primary scope of agency, France itself.

(via this Guardian article)

Time for an RSS Revival

Wired is calling for an RSS revival.

RSS is the most important piece of internet plumbing for following new content from a wide range of sources. It allows you to download new updates from your favourite sites automatically and read them at your leisure. Dave Winer, forever dedicated to the open web, created it.
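To make the plumbing a bit more concrete, here is a minimal sketch in Python of what a reader does on each check: parse the feed and keep only the items it has not seen before. The sample feed, the function name `new_items` and the idea of tracking items by their `guid` are assumptions for illustration; a real reader would download the XML from the site and cope with many more feed variations.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 document standing in for a downloaded feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/</link>
    <item>
      <title>Second post</title>
      <link>https://example.com/2</link>
      <guid>https://example.com/2</guid>
    </item>
    <item>
      <title>First post</title>
      <link>https://example.com/1</link>
      <guid>https://example.com/1</guid>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return (title, link) for every item whose guid is not in seen_guids.

    seen_guids is updated in place, so a second call with the same feed
    returns nothing new.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when a feed omits the guid element.
        guid = item.findtext("guid") or item.findtext("link")
        if guid not in seen_guids:
            fresh.append((item.findtext("title"), item.findtext("link")))
            seen_guids.add(guid)
    return fresh


if __name__ == "__main__":
    seen = {"https://example.com/1"}  # pretend the first post was read earlier
    for title, link in new_items(SAMPLE_FEED, seen):
        print(title, link)
```

Keeping the set of seen identifiers on your own machine is what puts the reader, and not a platform, in charge of what counts as new.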

I used to be a very heavy RSS user. I tracked hundreds of sources on a daily basis. Not as news but as a way to stay informed about the activities and thoughts of people I was interested in. At some point, that stopped working. Popular RSS readers were discontinued, most notably Google’s RSS reader, many people migrated to the Facebook timeline, platforms like Twitter stopped providing RSS feeds to make you visit their platform, and many people stopped blogging. But with FB in the spotlight, there is some interest in refocusing on the open web, and with it on RSS.

Currently I am repopulating from scratch my RSS reading ‘antenna’, following around 100 people again.

Wired in its call for an RSS revival suggests a few RSS readers. I, as I always have, use a desktop RSS reader, which currently is ReadKit. The FB timeline presents stuff to you based on their algorithmic decisions. As mentioned I definitely would like to have smarter ways of shaping my own information diet, but then with me in control and not the one being commoditised.

So it’s good to read that RSS Reader builders are looking at precisely that.
“Machines can have a big role in helping understand the information, so algorithms can be very useful, but for that they have to be transparent and the user has to feel in control. What’s missing today with the black-box algorithms is where they look over your shoulder, and don’t trust you to be able to tell what’s right,” says Edwin Khodabakchian, cofounder and CEO of RSS reader Feedly (which currently has 14 million users). That is more or less precisely my reasoning as well.

Suggested Reading: GDPR, Fintech, China and more

Some links I think worth reading today.

GDPR as De Facto Norm: Sonos Speakers

Just received an email from Sonos (the speaker system for streaming) about the changes they are making to their privacy statement. Like with FB in my previous posting this is triggered by the GDPR starting to be enforced from the end of May.

The mail reads in part

We’ve made these changes to comply with the high demands made by the GDPR, a law adopted in the European Union. Because we think that all owners of Sonos equipment deserve these protections, we are implementing these changes globally.

This is precisely the hoped-for effect, I think. Setting high standards in a key market will lift those standards globally. It is usually more efficient to work internally according to one standard than to maintain two or more in parallel. Good to see it happening, as it is a starting point for the positioning of Europe as a distinct player in global data politics, with ethics by design as the distinctive proposition. GDPR isn’t written as a source of red tape and compliance costs, but to level the playing field and enable companies to compete by building on data protection compliance (by demanding ‘data protection by design’ and following ‘state of the art’, which are both rising thresholds). Non-compliance in turn is becoming the more costly option (if GDPR really gets enforced, that is).

Facebook GDPR Changes Unimpressive

It seems, from a preview for journalists, that the GDPR changes that Facebook will be making to its privacy controls, and especially the data controls a user has, are rather unimpressive. I had hoped that with the new option to select ranges of your data for download, you would also be able to delete specific ranges of data. This would be a welcome change as current options are only deleting every single data item by hand, or deleting everything by deleting your account. Under the GDPR I had expected more control over data on FB.

It also seems they still keep the design imbalanced, favouring ‘let us do anything’ as the simplest route for users to click through, and presenting other options very low key, while the account deletion option is still not directly accessible in your settings.

They may or may not be deemed to have done enough towards implementing GDPR by the data protection authorities in the EU after May 25th, but that’s of little use to anyone now.

So my intention to delete my FB history still means the full deletion of my account. Which will be effective end of this week, when the 14 day grace period ends.

Available Energy Data in The Netherlands

Which energy data is available as open data in the Netherlands, asked Peter Rukavina. He wrote about postal codes on Prince Edward Island where he lives, and in the comments I mentioned that postal codes can be used to provide granular data on e.g. energy consumption, while still aggregated enough to not disclose personally identifiable data. This as I know he is interested in energy usage and production data.

He then asked:

What kind of energy consumption data do you have at a postal code level in NL? Are your energy utilities public bodies?
Our electricity provider, and our oil and propane companies are all private, and do not release consumption data; our water utility is public, but doesn’t release consumption data and is not subject (yet) to freedom of information laws.

Let’s provide some answers.

Postal codes

Dutch postal codes have the structure ‘1234 AB’, where 12 denotes a region, 1234 denotes a village or neighbourhood, and AB a street or a section of a street. This makes them very useful as geographic references in working with data. Our postal code begins with 3825, which places it in the Vathorst neighbourhood, as shown on this list. In the image below you see the postal code 3825 demarcated on Google maps.
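As a small illustration of why this structure is handy as a geographic reference, here is a sketch in Python that splits a postal code into those three levels. The function name `parse_postal_code` and the returned field names are made up for this example, and the pattern assumes four digits without a leading zero followed by two letters.

```python
import re

# Assumed pattern: four digits (no leading zero), an optional space, two letters.
POSTAL_CODE = re.compile(r"^([1-9][0-9]{3})\s?([A-Za-z]{2})$")


def parse_postal_code(code):
    """Split a Dutch postal code like '1234 AB' into its geographic levels."""
    match = POSTAL_CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a Dutch postal code: {code!r}")
    digits, letters = match.groups()
    letters = letters.upper()
    return {
        "region": digits[:2],          # '12': the region
        "neighbourhood": digits,       # '1234': village or neighbourhood
        "street": letters,             # 'AB': street or section of a street
        "normalised": f"{digits} {letters}",
    }


if __name__ == "__main__":
    print(parse_postal_code("1234ab"))
```

Grouping data by the `neighbourhood` or `region` value is then a matter of using that field as the key.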

Postal codes are available both commercially and as open data. The full set is only available commercially. Available as open data are only those postal codes that are connected to addresses tied to physical buildings. This is because the base register of all buildings and addresses is open data in the Netherlands, and that register includes postal codes. It means that e.g. postal codes tied to P.O. Boxes are not available as open data. In practice getting at postal codes as open data is still hard, as you need to extract them from the base register, and finding that base register for download is actually hard (or at least used to be, I haven’t checked back recently).

On Energy Utilities

All energy utilities used to be publicly owned, but have since been privatised. Upon privatisation all utilities were separated into energy providers and energy transporters, called network maintainers. The network maintainers are private entities, but are publicly owned. They maintain both electricity mains and gas mains. There are 7 such network maintainers of varying sizes in the Netherlands.

(Source: Energielevernanciers.nl)

The three biggest are Liander, Enexis and Stedin.
These network maintainers, although publicly owned, are not subject to Freedom of Information requests, nor subject to the law on Re-use of Government Information. Yet they do publish open data, and are open to data requests. Liander was the first one, and Enexis and Stedin both followed. The motivation for this is that they have a key role in the government goal of achieving full energy transition by 2050 (meaning no usage of gas for heating/cooking and fully CO2 neutral), and that they are key stakeholders in this area of high public interest.

Household Energy Usage Data

Open data is published by Liander, Enexis and Stedin, though not all publish the same type of data. All publish household level energy usage data aggregated to the level of 6 position postal codes (1234 AB), in addition to asset data (including sub soil cables etc) by Enexis and Stedin. The service areas of all 7 network maintainers are also open data. The network maintainers are also all open to additional data requests, e.g. for research purposes or for municipalities or housing associations looking for data to plan for energy saving projects. Liander indicated to me in a review for the European Commission (about potential changes to the EU public data re-use regulations), that they currently deny about 2/3 of data requests received, mostly because they are uncertain about which rules and contracts apply (they hold a large pool of data contributed by various stakeholders in the field, as well as all remotely read digital metering data). They are investigating how to improve on that response rate.

Some postal code areas are small and contain only a few addresses. In such cases this may lead to personally identifiable data, which is not allowed. Liander, Stedin and I assume Enexis as well, solve this by aggregating the average energy usage of the small area with an adjacent area until the number of addresses is at least 10.
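A minimal sketch in Python of what such aggregation could look like. This is my own illustration, not the network maintainers’ actual method: it assumes the areas come in a list where neighbours in the list are also adjacent on the map, and it folds a too-small remainder at the end into the preceding group. The function name `merge_small_areas` and the sample numbers are made up.

```python
MIN_ADDRESSES = 10  # the threshold mentioned above


def merge_small_areas(areas, minimum=MIN_ADDRESSES):
    """Merge adjacent areas until each group has at least `minimum` addresses.

    areas: list of (postal_code, address_count, average_usage), assumed to be
    ordered so that consecutive entries are adjacent areas.
    Returns a list of (code_from, code_to, address_count, average_usage),
    where the average is weighted by the number of addresses.
    """
    merged = []
    current = None  # [code_from, code_to, address_count, total_usage]
    for code, count, average in areas:
        if current is None:
            current = [code, code, count, count * average]
        else:
            current[1] = code
            current[2] += count
            current[3] += count * average
        if current[2] >= minimum:
            merged.append(current)
            current = None
    if current is not None:
        if merged:
            # Leftover group is too small: fold it into the previous group.
            last = merged[-1]
            last[1] = current[1]
            last[2] += current[2]
            last[3] += current[3]
        else:
            # All areas together stay below the minimum; nothing to merge with.
            merged.append(current)
    return [(start, end, count, total / count)
            for start, end, count, total in merged]


if __name__ == "__main__":
    sample = [
        ("1234 AA", 4, 3000.0),
        ("1234 AB", 8, 4500.0),
        ("1234 AC", 25, 4000.0),
        ("1234 AD", 3, 2000.0),
    ]
    for group in merge_small_areas(sample):
        print(group)
```

The result is a range of postal codes per row instead of a single one, with an address count that is high enough to not point at individual households.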

Our address falls in the service area of Stedin. The most recent data is that of January 1st 2018, containing the energy use for all of 2017. Searching for our postal code (which covers the entire street) in their most recent CSV file yields matches on lines 151,624 and 151,625.


The first line shows electricity usage (ELK), and says there are 33 households in the street, and the average yearly usage is 4599 kWh. (We are below that at around 3700 kWh / year, which is higher than we were used to in our previous home). The next line provides the data for gas usage (heating and cooking), “GAS”, which is 1280 m3 on average for the 33 connections. (We are slightly below that at 1200 m3).
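For those who prefer a script over searching in a spreadsheet, here is a sketch in Python of such a lookup. The column names, the semicolon delimiter and the postal code 1234AB in the sample are all assumptions for illustration, and the real Stedin file may well be laid out differently; only the ELK and GAS labels and the figures 33, 4599 and 1280 come from the lines described above.

```python
import csv
import io

# Made-up excerpt; the real file's column names and delimiter may differ.
SAMPLE_CSV = """postal_code_from;postal_code_to;product;connections;average_yearly_usage
1234AB;1234AB;ELK;33;4599
1234AB;1234AB;GAS;33;1280
1234AC;1234AC;ELK;18;3100
"""


def usage_for(csv_file, postal_code):
    """Return {product: (connections, average_yearly_usage)} for a postal code.

    A row matches when the postal code falls inside its from/to range, which
    also covers rows where small areas were merged into one range.
    """
    wanted = postal_code.replace(" ", "").upper()
    result = {}
    for row in csv.DictReader(csv_file, delimiter=";"):
        if row["postal_code_from"] <= wanted <= row["postal_code_to"]:
            result[row["product"]] = (
                int(row["connections"]),
                float(row["average_yearly_usage"]),
            )
    return result


if __name__ == "__main__":
    # With a downloaded file you would pass open("file.csv", newline="") instead.
    print(usage_for(io.StringIO(SAMPLE_CSV), "1234 AB"))
```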

SmugMug Buys Flickr, End of the Yahoo Era

I’ve been using Flickr to store photos since March 2005. It’s at the same time an easy way to embed photos in my blog without using up storage space in the hosting account, and an online remote back-up. Over the years I’ve uploaded some 24,000 photos, though I’ve been using Flickr less in the last 2 years.

My account is from just before the moment Yahoo bought Flickr from its founders, which was also in March 2005, and it forced me to create a Yahoo account for it in 2007. Yahoo never seemed to have much vision for Flickr, but as an early user (Flickr was founded in 2004) the original functionality I signed up and paid for was all I really needed.

Yahoo was bought by Verizon last year, and since then it was likely they’d sell some parts of it. SmugMug acquired Flickr last week, and that at least means that photography is now the main focus again. That hopefully means further evolution of Flickr, or it might mean a switch to SmugMug in the future.

Tellingly one needs to accept the new terms of service by 25th May 2018, which is the day the EU data protection regulation GDPR becomes enforceable.

It also means that I will be able to delete my Yahoo account, which I only had because Flickr users were forced to.
Yahoo is an internet dinosaur, launched in 1994. Its best days already lie way back. Deleting my Yahoo account as such is also an end of an era, an end that felt long overdue for years already.

Week Notes #16

This week was another unhurried one, although I’m left with some sense of urgency as I didn’t do all I wanted to focus on.

  • Received a request for a new project offer, for a provincial government, and worked on the offer
  • Worked on the Serbian open data impact study
  • Conversations with several leads and network partners to discuss our respective views on what lies ahead
  • Planned some more conversations like that
  • Repopulated my RSS reader to improve my information diet
  • Enjoyed working outside in the beautiful weather, as well as hanging out in the garden with the little one
  • Visited the Big Data Festival, organized by a Dutch Ministry, where I mostly valued a session on ethics
  • Blogged a lot, read a lot
  • Visited the local FabLab to fix the humidity sensor on my Measure Your City sensor hub, and attended an interesting presentation there by the national government institute for health and environment (RIVM) on particulate matter pollution measurements from fireworks around New Year’s Eve, using a partly citizen generated sensor network

Backdoors and Futile Stamping

Russia is trying to block Telegram, an end-to-end encrypted messaging app. The reason for blocking is that Telegram refused to provide the authorities with keys with which messages can be decrypted. Not for a specific case, but for listening in on general traffic.

Asking for keys to have a general backdoor (even if technologically possible) is a very bad idea. It will always be misused by others. And yes, you do have something to hide. Your internet banking is encrypted, your VPN connection from home to your work computer is too. You use passwords on websites, mail accounts and your wifi. If you don’t have anything to hide, please leave your Facebook login details along with your banking details in the comments. I promise I won’t use them. The point isn’t whether I or government keep our promises (and I or government might not), it’s that others definitely won’t.

As a result of Telegram not providing the keys, Russia is now trying to block people from using it. This results in millions of IP addresses now being blocked, more than 1 IP address for each of the around 14 million users of Telegram in Russia. (Telegram reports about 200 million users globally per month). This is because the service partly runs on servers in Amazon and Google data centers, and those are getting blocked. This impacts other services as well, which use the same data centers to flexibly scale their computing needs. The blocking attempts aren’t working though.

It shows how fully distributed systems are hard to stamp out: they will merely pop up somewhere else. The internet routes around damage; it is what it was designed to do.

Let’s see if actions will now be taken by Russian authorities against persons and assets of Telegram, as that really is the only (potential, not guaranteed) way to stamp out something: dismantling it. In the case of Telegram, a private company, there are indeed people and assets one could target. And Telegram is pledging to deploy those assets in resisting. Yet dismantling Telegram, even if successful and disregarding other costs and consequences for a government, defeats the original purpose of wanting to listen in to message traffic. Traffic will easily move into other encrypted tools, like Signal, while new even more distributed applications will also emerge in response.

Summary:

  • General backdoors, bad idea, regardless of whether you can trust the one you give back door access to.
  • Blocking is hard to do with distributed systems.
  • If you don’t accept attempts to do either from data driven authoritarian governments, you need to accept that the same objections to general back door access apply to other situations where you think the stated aim has more merit.
  • Do use an encrypted messaging app, like Signal, as much as possible