Are data markets necessarily failing?

Written by Pekka Nikander on 6.3.2019. Posted in Yleinen (General).

It has been estimated that there will be tens of billions of devices connected to the Internet in a few years. This will result in very large amounts of data, including “raw” data from sensors, cameras, etc., as well as processed and annotated data derived from it. At the same time, it looks extremely likely that the large majority of this data will be kept private. Data does not move. So far we have failed to create proper markets for data, and there are reasons why this may remain the case.

There are several reasons why people want to keep data private. Firstly, it is extremely hard to evaluate the value of any given data set, since the value depends on the context in which it is used. Secondly, many companies are afraid of their data being used against them. Thirdly, there are difficulties in controlling where the data may end up, issues with legislation, difficulties in determining what information is sensitive, etc. As a result of such factors, the majority of industrial data is very likely to remain private to the companies for a long time to come, just as it is today.

Overall, the situation can be described as a massive market failure. In general, a market failure is a situation where goods do not get allocated efficiently, leading to a loss in welfare. With data today this is clearly the case: most industrial data does not move, leading to a loss in production efficiency, and most personal data is collected by a few corporations that have a virtual monopoly in their respective fields. The latter phenomenon is often called winner-takes-all, and it occurs again and again on data platforms.

We believe that there are structural reasons for the virtual monopolies and for the unwillingness to share industrial data. Taking the earlier argument of Kemkes and going further, we believe that money itself is a structurally inefficient method of compensation for data. Furthermore, reflecting Bauwens and Kostakis on the one hand and Carballa on the other, we also believe that ownership is an inefficient model for governing data.

The reason is conceptually simple: Data is anti-rival, while money is rival, and ownership, as it is defined today, is an appropriate method of governance only for rival goods. However, understanding the difference between rival and anti-rival goods may not be that simple.

Most of the goods we are familiar with are rival. If I use a rival good, such as drinking a cup of coffee, you cannot use the very same good. You cannot drink the same cup of coffee.

Anti-rival goods are different. Essentially, an anti-rival good is one that gains value the more it is used. Today a classic example of an anti-rival good is a massively multiplayer online game, which becomes more enjoyable to its users the more other people play the same game. (The game console used to play the game is, of course, itself rival, as are the servers where the game backend runs. However, the game itself is anti-rival.)
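
The distinction can be made concrete with a toy model. The sketch below is only an illustration, not a model taken from the economics literature: the function name `per_user_value`, the linear growth rule, and the `network_gain` parameter are all assumptions made for this example. It treats a rival good as something whose value is split among its users, a non-rival good as something whose value per user stays constant, and an anti-rival good as something whose value per user grows with every additional user.

```python
def per_user_value(kind: str, base_value: float, users: int,
                   network_gain: float = 0.1) -> float:
    """Toy value of a good to each of its users (illustrative assumptions only)."""
    if users < 1:
        raise ValueError("users must be at least 1")
    if kind == "rival":
        # Assumption: a rival good is used up, so its value is divided among users.
        return base_value / users
    if kind == "non-rival":
        # Assumption: each user gets the full value regardless of other users.
        return base_value
    if kind == "anti-rival":
        # Assumption: each additional user adds a fixed fraction of the base value.
        return base_value * (1 + network_gain * (users - 1))
    raise ValueError(f"unknown kind: {kind}")


if __name__ == "__main__":
    for users in (1, 2, 10, 100):
        row = {kind: round(per_user_value(kind, 10.0, users), 2)
               for kind in ("rival", "non-rival", "anti-rival")}
        print(users, row)
```

Under these assumptions, the per-user value of the rival good shrinks as the number of users grows, while that of the anti-rival good rises, which is the property that makes an online game more enjoyable the more people play it.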

Because money is rival by its nature and data is anti-rival, we claim that it is impossible to use money in an efficient way to compensate for sharing data. Other, completely new methods are needed. These new compensation methods should be ones that gain value the more they are used, as anti-rival goods do.

In practical terms, this means that the new compensation methods might resemble reputation or honour. Reputation is earned, a little bit like money, but, unlike money or tokens, it cannot be disbursed.

To illustrate the difference between rival, non-rival and anti-rival goods, we prepared a short video, which is available through the Aalto channel.