I recently delivered a seminar for the Southampton University Cyber Security seminar series. My talk introduced some of the research I’ve been doing into the UK’s Data Protection Register, and was entitled ‘Data Controller Registers: Waste of Time or Untapped Transparency Goldmine?’.
The idea of a register of data controllers came from the EU Data Protection Directive, which set out a blueprint for member states’ data protection laws. Data controllers (any entity responsible for the collection and use of personal data) must provide details about the purposes of collection, categories of data subjects, categories of personal data, any recipients, and any international data transfers, to the supervisory authority (in the UK, this is the Information Commissioner’s Office). This represents a rich data source on the use of personal data by over 350,000 UK entities.
My talk explored some initial results from my research into three years’ worth of data from this register. I identified a number of broad trends, including:
The amount of personal data collection reported is increasing. This is measured in terms of the number of distinct register entries for individual instances of data collection, which has increased by around 3% each year.
There are over 60 different stated reasons for collection of data, with ‘Staff Administration’, ‘Accounts & Records’ and ‘Advertising, Marketing & Public Relations’ being the most popular (outnumbering all other purposes combined).
The categories of personal data collected exhibit a similar ‘long tail’, with ten very common categories (including ‘Personal Details’, ‘Financial Details’ and ‘Goods or Services Provided’) accounting for the majority of instances.
The vast majority of data transfers outside the EU are described as ‘Worldwide’. Of those controllers who do specify destinations, the most popular countries are the U.S., Canada, Australia, New Zealand and India.
Beyond these general trends, I explored one particular category of personal data collection which has been raised as a concern in studies of EU public attitudes, namely, trading and sharing of personal data. The kinds of data likely to be collected for this purpose are broadly reflective of the general trends, with the exception of ‘membership details’, which are far more likely to be collected for the purpose of trading.
Digging further into this category, I selected one particularly sensitive kind of data, ‘Sexual Life’, to see how this was being used. This uncovered 349 data controllers who hold data about individuals’ sexual lives for the purpose of trading and sharing with other entities (from the summer 2012 dataset). I visualised this activity as a network graph, looking at the relationship between individual data controllers and the kinds of entities they share this information with. By clicking on blue nodes you can see individual data controllers, while categories of recipients are in yellow (note: WordPress won’t allow me to embed this in an iframe): Trading / Sharing Data about Sexual Life
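For readers who want to try something similar, here is a minimal sketch of the filtering step behind a graph like this. It is not the code I used for the talk: the record layout, the field names (`controller`, `purpose`, `data_classes`, `recipients`), the purpose label and the sample organisations are all assumptions made up for illustration, and the real register is structured differently.

```python
from collections import defaultdict

# Hypothetical, simplified register entries. Field names, labels and
# organisations are invented for illustration only.
ENTRIES = [
    {"controller": "Example Clinic Ltd",
     "purpose": "Trading and Sharing",
     "data_classes": ["Personal Details", "Sexual Life"],
     "recipients": ["Business associates", "Traders in personal data"]},
    {"controller": "Example Retailer plc",
     "purpose": "Trading and Sharing",
     "data_classes": ["Personal Details", "Membership Details"],
     "recipients": ["Traders in personal data"]},
    {"controller": "Example Charity",
     "purpose": "Staff Administration",
     "data_classes": ["Personal Details", "Sexual Life"],
     "recipients": ["Employees"]},
]


def build_sharing_graph(entries, purpose, data_class):
    """Map each matching controller to the set of recipient categories.

    An entry matches when it is registered for `purpose` and lists
    `data_class` among its categories of personal data.
    """
    graph = defaultdict(set)
    for entry in entries:
        if entry["purpose"] == purpose and data_class in entry["data_classes"]:
            graph[entry["controller"]].update(entry["recipients"])
    return dict(graph)


def to_edges(graph):
    """Flatten the mapping into sorted (controller, recipient) pairs."""
    return sorted((c, r) for c, recipients in graph.items() for r in recipients)


graph = build_sharing_graph(ENTRIES, "Trading and Sharing", "Sexual Life")
for controller, recipient in to_edges(graph):
    print(controller, "->", recipient)
```

The resulting pairs form a two-mode (bipartite) edge list, with controllers on one side and recipient categories on the other, which is the shape most graph visualisation tools expect as input.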
I also explored how this dataset can be used to create personalised transparency tools, or to ‘visualise your digital footprint’. By identifying the organisations, employers, retailers and suppliers who have my personal details, I can pull in their entries from the register in order to see who knows what about me, what kinds of recipients they’re sharing it with and why. A similar interactive network graph shows a sample of this digital footprint.
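The footprint idea can be sketched in the same spirit. Again, this is an illustrative sketch rather than the tool itself: the `register` records, their field names and the organisations in them are hypothetical, and matching on exact names is a simplification (in practice, matching organisations to their register entries is likely to be messier).

```python
# Hypothetical, simplified register entries, invented for illustration.
register = [
    {"controller": "Example Retailer plc",
     "purpose": "Advertising, Marketing & Public Relations",
     "data_classes": ["Personal Details", "Goods or Services Provided"],
     "recipients": ["Suppliers"]},
    {"controller": "Example Retailer plc",
     "purpose": "Accounts & Records",
     "data_classes": ["Personal Details", "Financial Details"],
     "recipients": ["Financial organisations"]},
    {"controller": "Example Employer Ltd",
     "purpose": "Staff Administration",
     "data_classes": ["Personal Details", "Financial Details"],
     "recipients": ["Pension providers"]},
    {"controller": "Unrelated Organisation",
     "purpose": "Accounts & Records",
     "data_classes": ["Personal Details"],
     "recipients": ["Suppliers"]},
]


def digital_footprint(register, my_controllers):
    """Summarise, per controller I deal with, why data is collected,
    what kinds of data are held and who they may be shared with."""
    wanted = set(my_controllers)
    footprint = {}
    for entry in register:
        name = entry["controller"]
        if name not in wanted:
            continue
        summary = footprint.setdefault(
            name,
            {"purposes": set(), "data_classes": set(), "recipients": set()},
        )
        summary["purposes"].add(entry["purpose"])
        summary["data_classes"].update(entry["data_classes"])
        summary["recipients"].update(entry["recipients"])
    return footprint


mine = digital_footprint(register, ["Example Retailer plc", "Example Employer Ltd"])
for name, summary in sorted(mine.items()):
    print(name)
    for key in ("purposes", "data_classes", "recipients"):
        print("  ", key, "=", sorted(summary[key]))
```

Merging all of a controller’s entries into one summary keeps the picture readable: one node per organisation, with its purposes, data categories and recipient types hanging off it.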
Open data is often seen as in tension with privacy. However, through this research I hope to demonstrate some of the ways that open data can address privacy concerns. These concerns often stem from a lack of transparency about the collection and use of personal data by data controllers. By providing knowledge about data controllers, open data can be a basis for accountability and transparency about the use (or abuse) of personal data.