In discussions about data usage and sharing and who has a measure of control over what data gets used and shared how, we easily say ‘my data’ or get told about what you can do with ‘your data’ in a platform.
While it sounds clear enough, I think it is a very imprecise thing to say. It distracts from a range of issues about control over data, and causes confusion in public discourse and in addressing those issues. Such distraction is often deliberate.
Which one of these is ‘my data’?
- Data that I purposefully collected (e.g. temperature readings from my garden), but isn’t about me.
- Data that I purposefully collected (e.g. daily scale readings, quantified self), that is about me.
- Data that is present on a device I own or external storage service, that isn’t about me but about my work, my learning, my chores, people I know.
- Data that describes me, but was government created and always rests in government databases (e.g. birth/marriage registry, diploma’s, university grades, criminal records, real estate ownership), parts of which I often reproduce/share in other contexts while not being the authorative source (anniversaries, home address, CV).
- Data that describes me, but was private sector created and always rests in private sector databases (e.g. credit ratings, mortgage history, insurance and coverage used, pension, phone location and usage, hotel stays, flights boarded)
- Data that describes me, that I entered into my profiles on online platforms
- Data that I created, ‘user generated content’, and shared through platforms
- Data that I caused to be through my behaviour, collected by devices or platforms I use (clicks through sites, time spent on a page, how I drive my car, my e-reading habits, any IoT device I used/interacted with, my social graphs), none of which is ever within my span of control, likely not accessible to me, and I may not even be aware it exists.
- Data that was inferred about me from patterns in data that I caused to be through my behaviour, none of which is ever within my span of control, and which I mostly don’t know about or even suspect exists. Which may say things I don’t know about myself (moods, mental health) or that I may not have made explicit anywhere (political or religious orientation, sexual orientation, medical conditions, pregnancy etc)
Most of the data that holds details about me wasn’t created by me, and wasn’t within my span of control at any time.
Most of the data I purposefully created or have or had in my span of control, isn’t about me but about my environment, about other people near me, things external and of interest to me.
They’re all ‘my data’. Yet, whenever someone says ‘my data’, and definitely when someone says ‘your data’, that entire scope isn’t what is indicated. My data as a label easily hides the complicated variety of data we are talking about. And regularly, specifically when someone says ‘your data’, hiding parts of the list is deliberate.
The last bullets, data that we created through our behaviour and what is inferred about us, is what the big social media platforms always keep out of sight when they say ‘your data’. Because that’s the data their business models run on. It’s never part of the package when you click ‘export my data’ in a platform.
The core issues aren’t about whether it is ‘my data’ in terms of control or provenance. The core issues are about what others can/cannot will/won’t do with any data that describes me or is circumstantial to me. Regardless in whose span of control such data resides, or where it came from.
There are also two problematic suggestions packed into the phrase ‘my data’.
One is that with saying ‘my data’ you are also made individually responsible for the data involved. While this is partly true (mostly in the sense of not carelessly leaving stuff all over webforms and accounts), almost all responsibility for the data about you resides with those using it. It’s other’s actions with data that concern you, that require responsibility and accountability, and should require your voice being taken into account. "Nothing about us, without us" holds true for data too.
The other is that ‘my data’ is easily interpreted and positioned as ownership. That is a sleight of hand. Property claims and citizen rights are very different things and different areas of law. If ‘your data’ is your property, all that is left is to haggle about price, and each context is framed as merely transactional. It’s not in my own interest to see my data or myself as a commodity. It’s not a level playing field when I’m left to negotiating my price with a global online platform. That’s so asymmetric that there’s only one possible outcome. Which is the point of the suggestion of ownership as opposed to the framing as human rights. Contracts are the preferred tool of the biggest party, rights that of the individual.
Saying ‘my data’ and ‘your data’ is too imprecise. Be precise, don’t let others determine the framing.