[rant] Increasingly, in the contexts I operate in, I feel the distinction between data and information is something of a pre-digital, pre-networked hang-up. Yes, there’s a difference between e.g. measurements (-1, 0, 1, 2, 4) and an informative conclusion drawn from them (the world’s getting hotter), but in the common perception of both data and information as objects, there isn’t much useful distinction anymore between a database and a document. When digitised, they’re both objects that can be either, as it is in the eye of the beholder and depends on their use case. Context, as always, is key. If it was used as data, it was data. If the same thing was used as information, it was information. (An example is the European Commission’s documents: information to most of us, but data for Google’s translation algorithms, as it’s the largest body of text on the planet carefully translated into 23 languages.)
There is often a difference in how difficult it is to process with machines, yes. Most of what is called information in that sense is, to machines, badly packaged and badly marked-up data. Structured data with metadata and expressed relations (linked data, e.g.) is in that sense a large document that is hard to read for human eyes. But is there any practical gain in terms of agency in making the distinction between data and information, in the context of digital processes? You can make a distinction between a datum (‘42’) and a collection of that datum with more of it or other stuff (‘The Hitchhiker’s Guide to the Galaxy’). But a singular datum on its own is never what we deal with in real use cases where we discuss data and information as separate objects. As a pragmatist, I find I’ve mostly dropped the distinction.
Oh, and please don’t extend the data-information sequence to data-information-knowledge-wisdom. The 1970s DIKW model has been the CS/IS mantra for decades, but there is no linearity or hierarchy between those four terms, and the implication that the latter two are objectifiable is actively destructive. The D-I part once served to help explain how data was a strategic resource, which is still a very valid proposition, more than ever even, as data is a geopolitical factor now. But don’t assume a wider purpose for the model than that.