Nir Eyal, of the Nir and Far blog, on dealing with distractions.
The US government is considering whether to start charging again for satellite imagery and data from the Landsat satellites, according to an article in Nature.
Officials at the Department of the Interior, which oversees the USGS, have asked a federal advisory committee to explore how putting a price on Landsat data might affect scientists and other users; the panel’s analysis is due later this year. And the USDA is contemplating a plan to institute fees for its data as early as 2019.
Asking a committee to “explore how putting a price on Landsat data might affect” the users of that data will result in predictable answers, I feel.
- Digital data held by government, such as Landsat imagery, is both non-rivalrous and non-excludable.
- The initial production costs of such data may be very high, and surely are in the case of satellite data as it involves space launches. Yet these costs are incurred in the execution of a public and mandated task, and as such are sunk costs. They are not incurred so that others can re-use the data, but would be incurred anyway for an internal task (such as national security in this case).
- The costs of copying and distributing additional copies of such digital data are marginal, tending to zero.
- Government held data usually, and certainly in the case of satellite data, constitutes a (near) monopoly, with no easily available alternatives. As a consequence price elasticity is above 1: when the price of such data is reduced, the demand for it will rise non-linearly. The inverse is also true: setting a price for government data that currently is free will not mean all current users will pay. It will mean a disproportionate part of current usage simply evaporates, and usage will be much lower both in number of users and in volume of usage per user.
- Data sales from one public entity to another publicly funded one, such as in this case academic institutions, are always a net loss to the public sector, due to administration costs, transaction costs and enforcement costs. It moves money from one pocket to another of the same outfit, but that transfer costs money itself.
- The (socio-economic) value of re-use of such data is always higher than the possible revenue of selling that data. That value will also accrue to the public sector in the form of additional tax revenue. Loss of revenue from data sales will always over time become smaller than that. Free provision or at most at marginal costs (the true incremental cost of providing the data to one single additional user) is economically the only logical path.
- Additionally the value of data re-use is not limited to the first order of re-use (in this case e.g. academic research it enables), but knows “downstream” higher order and network effects. E.g. the value that such academic research results create in society, in this case for instance in agriculture, public health and climatic impact mitigation. Also “upstream” value is derived from re-use, e.g. in the form of data quality improvement.
This precisely was why the data was made free in 2008 in the first place:
Since the USGS made the data freely available, the rate at which users download it has jumped 100-fold. The images have enabled groundbreaking studies of changes in forests, surface water, and cities, among other topics. Searching Google Scholar for “Landsat” turns up nearly 100,000 papers published since 2008.
That 100-fold jump in usage? That’s the price elasticity above 1 that I mentioned. It is a regularly occurring pattern wherever fees for data are dropped, whether it concerns statistical, meteorological, hydrological, cadastral, business register or indeed satellite data.
The economic benefit of the free Landsat data was estimated by the USGS in 2013 at $2 billion per year, while the programme costs about $80 million per year. That’s an ROI factor for the US government of 25. Even if the total combined tax burden (payroll, sales/VAT, income, profit, dividend etc.) on that economic benefit were as low as 4%, free provision would still be no loss to the US government.
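The arithmetic behind that break-even point can be written out as a small sketch. The two figures are the ones quoted above; the function names are mine, and it simplistically assumes that the tax burden applies to the full estimated benefit.

```python
def roi_factor(annual_benefit: float, annual_cost: float) -> float:
    """Economic benefit generated per dollar of programme cost."""
    return annual_benefit / annual_cost


def break_even_tax_rate(annual_benefit: float, annual_cost: float) -> float:
    """Share of the benefit that must flow back as tax to cover the cost."""
    return annual_cost / annual_benefit


# Figures quoted in the text (USGS estimate, 2013).
benefit = 2_000_000_000  # estimated economic benefit per year, in dollars
cost = 80_000_000        # programme cost per year, in dollars

print(f"ROI factor: {roi_factor(benefit, cost):.0f}")
print(f"Break-even tax rate: {break_even_tax_rate(benefit, cost):.0%}")
```

Any combined tax burden above that break-even rate would mean the programme returns more in tax than it costs, before counting any of the downstream effects.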
It’s not surprising, then, that when a committee was asked in 2012 to look into reinstating fees for Landsat data, it concluded:
“Landsat benefits far outweigh the cost”. Charging money for the satellite data would waste money, stifle science and innovation, and hamper the government’s ability to monitor national security, the panel added. “It is in the U.S. national interest to fund and distribute Landsat data to the public without cost now and in the future,”
European satellite data open by design
In contrast, the European Space Agency’s Copernicus program, a multiyear effort to launch a range of Sentinel satellites for earth observation, is designed to provide free and open data. In fact my company, together with EARSC, is documenting more than 25 cases over the past 2 and the coming 3 years to establish the socio-economic impact of the usage of this data, showing both primary and network effects, for instance for ice breakers in Finnish waters, Swedish forestry management, Danish precision farming, and Dutch gas mains preventative maintenance and infrastructure subsidence.
(Nature article found via Tuula Packalen)
Just received an email from Sonos (the speaker system for streaming) about the changes they are making to their privacy statement. Like with FB in my previous posting this is triggered by the GDPR starting to be enforced from the end of May.
The mail reads in part:
We’ve made these changes to comply with the high demands made by the GDPR, a law adopted in the European Union. Because we think that all owners of Sonos equipment deserve these protections, we are implementing these changes globally.
This is precisely the hoped for effect, I think. Setting high standards in a key market will lift those standards globally. It is usually more efficient to internally work according to one standard, than maintaining two or more in parallel. Good to see it happening, as it is a starting point for the positioning of Europe as a distinct player in global data politics, with ethics by design as the distinctive proposition. GDPR isn’t written as a source of red tape and compliance costs, but to level the playing field and enable companies to compete by building on data protection compliance (by demanding ‘data protection by design’ and following ‘state of the art’, which are both rising thresholds). Non-compliance in turn is becoming the more costly option (if GDPR really gets enforced, that is).
Stephanie Booth, a long time blogging connection, has been writing about reducing her Facebook usage and increasing her blogging. She says at one point:
As the current “delete Facebook” wave hits, I wonder if there will be any kind of rolling back, at any time, to a less algorithmic way to access information, and people. Algorithms came to help us deal with scale. I’ve long said that the advantage of communication and connection in the digital world is scale. But how much is too much?
I very much still believe there’s no such thing as information overload, and fully agree with Stephanie that the possible scale of networks and connections is one of the key affordances of our digital world. My rss-based filtering, as described in 2005, worked better when dealing with more information, than with less. Our information strategies need to reflect and be part of the underlying complexity of our lives.
Algorithms can help us with that scale, just not the algorithms that FB uses around us. For algorithms to help, like any tool, they need to be ‘smaller’ than us, as I wrote in my networked agency manifesto. We need to be able to control their settings, tinker with them, deploy them and stop them as we see fit. The current application of algorithms, as they usually need lots of data to perform, more or less demands a centralised platform like FB to work. The algorithms that will really help us scale will be the ones we can use for our own particular scaling needs. For that, the creation, maintenance and usage of algorithms needs to have a much lower threshold than now. That is why I placed them in my ‘agency map’.
Going back to a less algorithmic way of dealing with information isn’t an option, nor something to desire I think. But we do need algorithms that really serve us and perform to our information needs. We need fewer algorithms that purport to aid us in dealing with the daily river of newsy stuff, but really commoditise us at the back-end.
“The medium was no longer the message, it was just an asshole.
I want my attention back.
Attention is a muscle. It must be exercised.
We deserve our attention.”
Craig Mod on attention in January 2017. In his case he got his attention back by disconnecting, which for all intents and purposes isn’t a viable option. Completely disconnecting in our networked societies is just a reactionary exercise in privilege. But it does sum up my current sentiment, e.g. concerning Facebook, quite nicely.
Some disturbing key data points, reported by the Guardian, from a Congressional hearing in the US last week on the usage of facial recognition by the FBI: “Approximately half of adult Americans’ photographs are stored in facial recognition databases that can be accessed by the FBI, without their knowledge or consent, in the hunt for suspected criminals. About 80% of photos in the FBI’s network are non-criminal entries, including pictures from driver’s licenses and passports. The algorithms used to identify matches are inaccurate about 15% of the time, and are more likely to misidentify black people than white people.” It makes you wonder how many false positives have ended up in jail because of this.
I am in favor of mandatory radical transparency of government agencies. Not just in terms of releasing data to the public, but also / more importantly specifying exactly what it is they collect, for what purpose, and what amount of data they have in each collection. Such openness I think is key in reducing the ‘data hunger’ of agencies (the habit of just collecting stuff because it’s possible, and ‘well, you never know’), and forces them to more clearly think about information design and the purpose of the collection. If it is clear up-front that either the data itself, or the fact that you collect such data and in which form you hold them, will be public at a predictable point in time, this will likely lead to self-restraint / self-censorship by government agencies. The example above is a case in point: The FBI did not publish a privacy impact assessment, as legally required, and tried to argue it would not need to heed certain provisions of the US Privacy Act.
If you don’t do such up-front mandatory radical transparency you get scope creep and disturbing collections like the one above. It is also self-defeating, as this type of all encompassing data collection is not increasing the number of needles found, but merely enlarging the haystack.
Every day I save a bunch of links from my explorations over the interwebs. Stuff that passes my radar, may become fodder for my writing at some point, but often gets piled and forgotten. I thought maybe it is good to share some of the unsought links I encounter, and some of the notions why I bookmarked them. Blogging of course used to be linklogging, sharing links to your blog neighbourhood, so let’s say it’s returning to a respected tradition. Here is a fistful of links from this week.
- I met Cindy Kohtala (@ckohtala) at the Koppelting unconference yesterday, who just published her dissertation (PDF) at Aalto University in Finland on the sustainability of FabLabs. She included the Dutch network in her research because of the more diverse forms of labs, and the network they form. Will read her dissertation with interest. It reminds me of my posting about the failings of FabLabs, and of discussing the BeNeLux FabLabs network, which because of its unparalleled density and connectedness I named the City of FabLabs in my Lift ’10 talk.
- Foss4Geo is the annual global conference of the open source for geo community. This year Bonn in Germany is the host city and I will be keynoting there next week, on the value of open data and the peculiar role geo data has in it.
Internet of Things
- Peter Bihr, of ThingsCon fame, writes about The arena’s of IoT. Please note that the internet-connected fridge is not an IoT arena.
- IPFS, a distributed way of delivering webpages and files. Pointed out to me in the context of my postings on distributedness and agency. Napsterizing/torrenting everything. Also seems to want to preserve everything on the web better.
- Steem is a blockchain based social media platform. Aims to ‘pay’ you for contributing, and do the bookkeeping in a blockchain ledger. Not sure that will work, nor that permanent records of each social media utterance are desirable. Like with IPFS mentioned above, ‘not forgetting’ may not be a feature but a very concerning social bug. My friend Boris Mann is trying it out, looking forward to reading more of his reflections. I may not understand, I never understood the purpose of Medium either, which superficially seems to be the same thing but without the bookkeeping.
- Anil Dash reflects on the lost infrastructure of social media. This resonates strongly with me in terms of what made blogging so exciting 10-15 years ago, as well as with my recent writings about agency. Part of the picture is weaving a tapestry of functionality across different services and tools that together are a potent mix. It needs plumbing like RSS, trackback and discoverability over the lines of conversations distributed over the individual blogs of the participants. My friend Lilia did her PhD on those distributed conversations. And as Hoder wrote on seeing the web again after six years in an Iranian prison: much of our web now, such as Facebook, is just TV, not coffee house interaction.
- Free private cities. Sign up to live in one, so you have an ‘equal’ position based on contracted service provision. Because tinkering with democracy and the fact that others have different needs is bothersome, or such. Apparently the social contract isn’t good enough. This has strong overtones of the Burbclaves in Snow Crash, and of the micro-democracy states (100.000 people each, and with every election there is freedom of movement globally to pick the government, whether corporate, value or ethnicity based, of your choice) in the very entertaining near-future SF book Infomocracy by Malka Ann Older. These private city contracts don’t seem to account for the cost of leaving if you cancel your contract: as it is still territory bound, finding a new service provider means physically moving, with all the social and monetary cost of doing that. It also seems to me that the Principality of Monaco (held up as a good practice example), incorporated US towns, or the City of London for that matter, provide ample demonstration of why this may not be the way forward to a more inclusive global society.
- The Ribbon Farm, a blog by Venkatesh Rao, newly added to my feed-reader. His recent newsletter edition on premature synchronization as a cause of problems chimes with a lot of my experience. Converging too early (because there are just 10 minutes left in the meeting), or forcing convergence in a group, usually doesn’t help much. The leading example in the link being military reminds me of an anecdote I once heard about “the world championship of armies”, where the US military units were failing because they waited or tried to confirm orders continuously, and the Dutch fared better because, upon receiving orders, they did what seemed worth doing based on context and observation, not seeking further orders and disregarding the literal meaning of orders in the process. Desyncing as a practice seems valuable advice, and similar to making stuff distributed by design, or probe-based evolution. Seek out new perspectives and let yourself be challenged as part of your routines.
Earlier this week I came across a Lifehacker posting “Get a Better Creative Workflow in Evernote by Ditching Tags” by Melanie Pinola, quoting Tiago Forte who’s into productivity, which proposes one might as well get rid of tags in Evernote because:
- “When you rely heavily on tags, you have to perfectly recall every single tag you’ve ever used, and exactly how it is spelled and punctuated.”
- “The real problem with tags, and why they not only fail to help, but actually even hurt people’s creative self-esteem, is that they give the impression that keeping a useful collection of personal notes requires nothing less than a heroic feat of comprehensive planning, followed by years of meticulous, unwavering cataloging and annotating”
This does not make much sense to me at all. For me tags are a key ingredient in provoking serendipity, as well as a navigational aid. Both play a strong role in my creativity process. If you think tags limit your creativity, I think it is likely because of how you use tags.
Tags vs Categories
It seems both Forte and Pinola see tags as categories. Tags aren’t categories. Yes, categories do require you have a good understanding of how they are organized, and need you to stick with it thoroughly, as otherwise everything ends up in the ‘miscellaneous basket.’ Categories are things you make up before you start categorizing. Tags work the other way around: you add tags to things as you go along. Over time a structure may emerge from the tags which you could adopt as categories, but that isn’t the purpose of tagging. With tags everything starts out as miscellaneous. This key difference is the difference between approaching information from a hierarchical perspective (categories) and from a connected perspective (tags). In the networked age, Everything Is Miscellaneous, as David Weinberger put forth in 2007.
Categories in Evernote
Evernote has no explicit categories functionality, but allows you to work with categories in 2 ways.
- One is dividing your notes in different notebooks. This is something you can use for fixed and mutually exclusive categories. I have different notebooks for different areas of responsibility.
- The other is using the tagging functionality. These can be used for non-exclusive categories (as a note cannot be in more than 1 notebook at the same time, but can be in several categories). I use tags like this as categories as well, for instance to indicate project status, or that a note is related to a specific project. However those tags as categories are just a small part of the tags I use.
How I add tags (e.g. in Evernote)
My tags do not form a structure of categories / a taxonomy. They are reflective of my associations with a piece of information. I add tags to capture what a piece of information means to me, what I associate it with, or how I might use it. All of this in a non-prescribed way, and not as a ‘must’ either. There’s plenty of stuff I don’t tag at all, and there is no planned consistency in my tagging. It simply evolves with my own internal dialogue and idiom (something I would have tagged socialsoftware in 2002, would maybe have been tagged socialmedia in 2009 and socmed in 2015).
Key here is that with my tags I do not try to capture what something is “objectively” about, like the echo of systematic categories, but why I saved it. A piece about an animal may be tagged with collaboration or with business_models based on the associations I had while reading it.
My tags may very well not be used or present in the information I tag with it. (In general if you ask people to tag stuff or title it based on what it means to them, there is a good chance they use words not present in the tagged information itself).
I also save material in about half a dozen languages, and then tagging is a way of connecting material together and making it findable in ways that full-text search cannot, as search is monolingual.
There is likely a power-law distribution in my use of tags: most will get used maybe once or twice, some will get used heavily. The more heavily used ones, if I notice it as a pattern, can become a sort-of de facto category. So I don’t need to remember all my tags and how I used them, as suggested in the linked article above, I usually only remember the less than 10 I use frequently. I am not bothered if I don’t use them.
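As a rough sketch of how such de facto categories could be spotted, the snippet below counts tag usage across a handful of notes and keeps the tags that pass a usage threshold. The notes, the tags on them and the threshold of 3 are all made up for illustration; this is not how Evernote itself works.

```python
from collections import Counter

# Hypothetical notes, each with the tags that came to mind when saving it.
notes_tags = {
    "n1": ["opendata", "collaboration"],
    "n2": ["opendata", "socmed"],
    "n3": ["opendata", "business_models"],
    "n4": ["collaboration"],
}


def tag_counts(notes_tags):
    """Count how often each tag is used across all notes."""
    return Counter(tag for tags in notes_tags.values() for tag in tags)


def de_facto_categories(notes_tags, min_uses=3):
    """Tags used at least `min_uses` times: the ones worth remembering."""
    return sorted(t for t, n in tag_counts(notes_tags).items() if n >= min_uses)


print(tag_counts(notes_tags).most_common())
print(de_facto_categories(notes_tags))
```

The long tail of tags used once or twice never needs to be remembered: it only has to be there when a search or a pivot happens to hit it.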
How tags help my creativity
There are two ways in which tagging aids my creativity.
The first is that it aids my serendipity. If I search my notes, it surfaces things not only based on the content of those notes, but also on the associations I used as tags, including words that are not in the content itself. That way unexpected results that are nevertheless relevant to, or overlapping with, my search question can pop up. When I search for business models, for example, the article about the animal I mentioned above will pop up. That way I find things I did not realize I was looking for.
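A minimal sketch of that kind of search, assuming a simple in-memory list of notes (the `Note` class, the sample notes and the matching rule are illustrative assumptions, not Evernote’s actual search): a note matches when the query appears in its body or in one of its tags.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    title: str
    body: str
    tags: set = field(default_factory=set)


def normalise(text):
    """Lower-case and treat underscores as spaces, so business_models matches."""
    return text.lower().replace("_", " ")


def search(notes, query):
    """Return titles of notes whose body or tags contain the query."""
    q = normalise(query)
    hits = []
    for note in notes:
        in_body = q in normalise(note.body)
        in_tags = any(q in normalise(tag) for tag in note.tags)
        if in_body or in_tags:
            hits.append(note.title)
    return hits


notes = [
    Note("How ant colonies divide labour",
         "Ants coordinate without a central planner.",
         {"collaboration", "business_models"}),
    Note("Pricing open data",
         "Notes on business models around free government data.",
         {"opendata"}),
    Note("Holiday packing list", "Socks, charger, passport."),
]

print(search(notes, "business models"))
```

The piece about ants never mentions business models in its text, yet it turns up next to the note that does, purely because of the association recorded in its tags.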
The second is that tags allow me to navigate and pivot through my collected material. I see social software / networked tools as working in triangles (see my 2006 posting Social Software Works in Triangles).
Such a triangle is formed out of an information item (a Flickr photo, a Delicious bookmark, or indeed a note in Evernote), the person that created/shared it (in Evernote usually myself), and one or more descriptors (tags, locations etc.).
The point is that tags are not just descriptors, they are also turning points on the path through my data. These pivots or forks in the road, allow me to hop-step-jump from an article to other things within the same context through a tag, like another article, and then through to the author of it and maybe onwards to one of their other writings, to somebody’s bookmark collection of which it is a part, to that person’s blog etc.
It allows for navigation and triangulation that way, bringing me to places I didn’t know about. That is a richness in association, multiple viewpoints etc., that a category system cannot produce. (I even dreamt about tags and pivots once, in 2007.)
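The triangles and pivots described above can be sketched as a list of (item, person, tag) triples, where any corner can be used as the turning point to reach the other two. All item names, people and tags here are hypothetical, and the helper functions are my own illustration of the idea rather than the workings of any particular tool.

```python
from typing import NamedTuple


class Triangle(NamedTuple):
    item: str    # a note, photo or bookmark
    person: str  # who created or shared it
    tag: str     # a descriptor attached to it


triangles = [
    Triangle("article-a", "me", "agency"),
    Triangle("article-b", "me", "agency"),
    Triangle("article-b", "alice", "conversations"),
    Triangle("bookmark-c", "alice", "agency"),
]


def pivot(triangles, **fixed):
    """Return all triangles matching the given corners, e.g. tag='agency'."""
    return [t for t in triangles
            if all(getattr(t, corner) == value for corner, value in fixed.items())]


def items_via_tags(triangles, item):
    """Hop from one item, through its tags, to other items sharing a tag."""
    tags = {t.tag for t in triangles if t.item == item}
    return sorted({t.item for t in triangles if t.tag in tags and t.item != item})


def people_via_tag(triangles, tag):
    """Hop from a tag to the people who used it."""
    return sorted({t.person for t in pivot(triangles, tag=tag)})


print(items_via_tags(triangles, "article-a"))
print(people_via_tag(triangles, "agency"))
print([t.item for t in pivot(triangles, person="alice")])
```

Each function is one hop; chaining them (item to tag, tag to person, person to their other items) is the hop-step-jump that a fixed category tree has no equivalent for.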
So, don’t ditch tags because they cramp your style. Uncramp your style so you can use tags fruitfully.
Streetfilms have made a great video making the case for opening up (US) public transit data. It nicely illustrates what can be done if private people have access to public information in a reusable way. (what is reusable public service information?)
Some notable quotes from the video:
Chris Dempsey, Massachusetts Department of Transportation on why it makes sense for his organization to release the data:
“If you take the model of the national weather service and apply it to the transit agencies you realize you can have just as many ways to get transit information as you do to get weather information. And the beauty of it is that it’s no cost to the transit agencies.”
But above all I liked what Tim O’Reilly said (emphasis mine):
“Government should think of itself as the platform that society builds on. Rather than government as a vending machine of actual service delivery. The idea of being a platform provider is you do the least possible, not the most possible, to enable others to build on what you do.”
I think that remark bears repeating wherever the initial government reflex is to turn anything into something large and expensive. When you talk to those parts of government and mention the word ‘portal’ they immediately envision a multi-million euro project. But that is completely unnecessary. I’ve spoken to different EU open data catalogue initiatives in the past few weeks and all of them are sticking to rules of simplicity and small size in terms of organization and budget, as that is what allows them to be successful. Currently I am working with the Dutch government on how a national open data catalogue should be organized, and I think Tim O’Reilly sums up nicely what the leading thought of my advice will be.
In blog based discussions there has been talk of ‘effective’ group sizes and network sizes in the past (see some of it here from 2003 and 2004). Most of that however was always based on anecdotal ‘laws’ or Dunbar’s number (the application of which I usually see as the mis-interpretation of Dunbar’s theory).
Working in different groups.
Of course I know from personal experience the size of groups I am comfortable with in different settings. I like working on concrete tasks with 1 or 2 others, I like teams of 5, I like doing interactive sessions with 8 to 16 people, with an optimum of 12, I enjoyed working for a company where the communication habits didn’t scale beyond 16, I like to do open conversational sessions with 20 to 25 people, and I like to present to larger audiences.
But what are the ‘transition points’ in group size? How many people do you need to have enough variety in a group to increase the learning in that group during learning activities? When does communication overhead become too big to stay with 1 on 1 connections, so that additional group roles or tools to facilitate communication are needed?
I can imagine all kinds of variables coming into play: variety of skills in the group, group inertia (though the work of Olson seems proven to be false), organizational overhead needed, cognitive overhead, communication needs, in-/outgroup aspects, peer pressure, etc.
All these factors probably depend on what needs to be done: group learning, a concrete task, problem solving, collective action etc.
Is there any academic source you are aware of, or empirical studies you’ve seen that cover this, or at least aspects of it? Any pointers are welcome. I will of course blog what I find / receive.