In the past week the University of Strasbourg and APIE published a report on charging policies for government data re-use. Its conclusion was that charging for commercial re-use of government data can make sense if the price reflects the costs government incurred and lies below the price private companies are willing to pay.

For two specific reasons I cannot share the conclusion of that study: first, the way value added is defined as the effort government puts into making data available and re-usable; and second, the use of the data-information-knowledge ladder as a yardstick for that added value. Such a ladder is fundamentally flawed, as it implies a causality and hierarchy that in reality isn’t there.

Below, however, I list a few reasons why I have difficulty in general with charging for government data, and with distinguishing between non-commercial and commercial re-use. I posted them to the OKFN mailing list (see the entire discussion there), but post them here as well so I can easily retrieve them later.

The question ‘isn’t it reasonable to charge for commercial re-use of data?’ is simple to ask, but has no simple answer. The answer is the sum of a number of arguments that together must be weighed against that single question.
In my view, some of those arguments are:

The customer always pays, and therefore we all pay twice.
Any company will pass the cost of the data it bought on in the price of its product, so the customer pays. However, all customers already paid for generating the data through their taxes. Unless the revenue goes directly towards cutting taxes for all, citizens are actually paying twice. So are the owners of the companies involved (they are, after all, also citizens). Charging for data hence becomes a ‘data-tax’.

Setting a price has consequences if it’s based on the perceived market-value
Pricing this way will always follow what one sees happening now in established markets. That damages innovation, as any price creates a threshold for novel ways of doing things. Innovation is not merely about a novel product; it is also about novel cost structures, novel distribution channels, and novel customer groups (the ones that cannot be served by the incumbents’ cost structures and distribution channels). Setting a price on perceived market value therefore helps incumbents in the re-use niche involved and impedes new entrants, as it assumes that innovation changes nothing about the playing field apart from the product. Government is then basically saying: below this price threshold nothing much can happen, and in fact we are actively making it impossible by setting that minimum. Setting a fixed price this way creates a minimum threshold, which asymmetrically favours the companies that can best afford it and is thus a market intervention. Setting a relative price is effectively an additional revenue tax on the companies involved. This way of pricing also turns the government into a market player.

Setting a price has consequences if it’s based on incurred costs (other than incremental costs of distribution)
What are the costs of getting data and getting it ready for release? Is collection part of it? Is putting it on-line part of it? Even if it’s in the same format you collected it in? There are no separate systems or processes for it; the datasets are a result of the entire system of government executing its tasks, which are paid for through our taxes. And are you charging all of those costs to commercial re-users? Also the costs of non-commercial re-use, which is basically part of government’s duty to actively inform the citizenry about its ongoing tasks?
Also, data release has positive effects on the internal workings of government itself (transparency, participation etc., as well as higher effectiveness and efficiency in its own tasks). Are you going to account for that offset first when determining the costs of the datasets? If you cannot clearly demarcate the costs, you are in fact charging what you think others will be willing to pay (see the item above). I have a feeling that isolating specific costs will leave you with a load of trivial costs, ending at the conclusion “let’s charge incremental distribution costs”. Anything above that is again wholly arbitrary price setting. Government is not supposed to act arbitrarily, but predictably and verifiably.

Innovation is not non-commercial
The idea that innovation happens as research, analysis etc., and then after a while magically becomes a commercially viable product, is far from reality. Innovation happens not in the lab (new knowledge can arise in the lab, but innovation and new knowledge are not synonymous) but through experimenting in the market. Innovation is a trajectory, a path of exploration, of repeated ‘live’ attempts at getting a still changing and developing product to gain a foothold in the market, not an incident of something shooting out of a lab. Therefore it cannot be argued that charging for commercial re-use does not impede innovation because that ‘can take place as non-commercial re-use’. Innovation includes successful market entry and commercial success. Thus the non-commercial / commercial distinction will indeed have an effect on possible innovation.

Collecting revenue is not cheap
Collecting revenue costs a lot of money. If you charge at the level of ‘production costs’ for commercial re-use, government incurs a lot of new costs: administering billing, and monitoring and actively policing commercial re-use (to make sure nobody gets a freebie and all commercial re-users are treated equally).
This will either cost more than the revenue collected, or it will drive up prices for the datasets enough to preclude a large part of the potential commercial use of the data. That raises the threshold towards the high end of any re-use market and excludes not just new entrants but also the smaller incumbents, again constituting a market intervention for the sake of monetizing a single transaction.

Commercial re-use may save government costs
Commercial use may in fact lighten government’s workload, for example through products that help informed citizens make better decisions about their own lives and thus place less demand on other government services.

Who will set the price?
Who is actually mandated to set a price for commercial re-use of any given dataset? How do we know it is fairly and reasonably calculated? Where are the checks and balances for those decisions? If every government institution sets its own price (which seems to be the case now), you create a confusing and unpredictable landscape. Unpredictability in resources is not something companies take lightly: they will move away from it and stop using the data. At the same time this might be the real revenue opportunity: charge not for the data, but for the service level companies want on top of what a government body does as part of its own tasks. Data with no strings attached, and SLAs with a price tag.
If all government institutions are to adhere to the same price-setting process, this probably needs legislation first as well. Is there currently a legal basis for pricing other than the phrase in the PSI directive (costs plus reasonable profit margin, which applies to all re-use, not just commercial)?

All in all, it seems to me that charging for commercial use of data is only interesting if you look at the basic transaction alone, not at the chain or ecosystem that transaction is part of. If you look further than just the transaction, as I am sure we must, then a government that starts monetizing commercial re-use is for all intents and purposes raising a data-tax and intervening in the market. That may be a government goal. It may also be that it wants to keep people from using data, using money to influence behaviour (like raising tobacco taxes), or that it really wants to get some short-term revenue at the cost of less innovation and higher taxation. None of those, however, is the stated goal of any government at this point when it comes to data re-use.