Researching Personal Data
5 Stars of Personal Data Access
As a volunteer ‘data donor’ at the Midata Innovation Lab, I’ve recently been attempting to get my data back from a range of suppliers. As our lives become more data-driven, an increasing number of people want access to a copy of the data gathered about them by service providers, personal devices and online platforms. Whether it’s financial transactions data, activity records from a Fitbit or Nike Fuelband, or gas and electricity usage, access to our own data has the potential to drive new services that help us manage our lives and gain self-insight. But anyone who has attempted to get their own data back from service providers will know the process is not always simple. I encountered a variety of complicated access procedures, data formats, and degrees of detail.
For instance, BT gave me access to my latest bill as a CSV file, but previous months were only available as PDF documents. And my broadband usage was displayed as a web page in a seperate part of the site. Wouldn’t it be useful to have everything – broadband usage, landline, and billing – in one file, covering, say, the last year of service? Or, even better, a secure API which would allow trusted applications to access the latest data directly from my BT account, so I don’t have to?
Another problem was that in order to get my data, I sometimes had to sign up for unwanted services. My mobile network provider, GiffGaff, require me to opt-in to their marketing messages in order to receive my monthly usage report. FitBit users need to pay for a premium account to get access to the raw data from their own device.
Wouldn’t it be nice to rate these services according to a set of best practices? In 2006, when the open data movement was in its infancy, Tim Berners-Lee defined ‘Five Stars of Open Data‘ to describe how ‘open’ a data source is. If it’s on the web under an open license, it gets one star. Five stars means that it is in a machine-readable, non-proprietary format, and uses URI’s and links to other data for context. While we don’t necessarily want our private, personal data to be ‘open’ in Berners-Lee’s sense, we do want standard ways to get access to our personal data from a service. So, here are my suggested ‘Five Stars of Personal Data Access’ (to be read as complementary, not necessarily hierarchical):
1. My data is made available to me for free in a digital form. For instance, through a web dashboard, or email, rather than as a paper statement. There are no strings attached; I do not need to pay for premium services or sign up to marketing alerts to read it.
2. My data is machine-readable (such as CSV rather than PDF).
3. My data is in a non-proprietary format (such as CSV, XML or JSON, rather than Excel).
4. My data is complete; all the relevant fields are included in the same place. For instance, usage history and billing are included in the same file or feed.
5. My data is up-to-date; available as a regularly-updated feed, rather than a static file I have to look up and download. This could be via a secure API that I can connect trusted third-party services to.
The Midata programme has considered these issues from the outset, calling for suppliers to adopt common procedures and formats. Simplifying this process is an important step towards a world where individuals are empowered by their own data. My initial attempts to get my data back from suppliers point to a number of areas for improvement, which I’ve tried to reflect in these star ratings. Of course, there’s lots of room for debate over the definitions I’ve given here. And I’m sure there are other important aspects I’ve missed out. What would you add?