Anonymization: The UK Response to Big Data

“Everywhere you look, the quantity of information in the world is soaring.”

IDC has predicted that, by 2012, mankind will have created 2.7 zettabytes of data! The numbers are mind-boggling – a zettabyte is 1 billion terabytes. With all of that data comes the Next Big Thing – namely, Big Data.

What is Big Data?

Big Data refers to the commercial “aggregation, mining, and analysis” of very large, complex and unstructured datasets such as images, videos, MP3 files, and files based on social media and web-enabled workloads. This data is rich in (often personal) information but until recently has been difficult to understand and analyse – that is, without a supercomputer or two at your disposal! New data and analytics technologies, coupled with scalable, distributed data processing models (i.e. cloud computing), are enabling commercial and research organisations to take advantage of Big Data processing techniques with a relatively low investment in technology.

Why does it matter?

Simple really: it’s a huge market opportunity. According to a research report from the McKinsey Global Institute, Big Data is the next frontier for innovation, competitive advantage and productivity, although, as McKinsey notes, it is not without its challenges, “including a shortage of skilled analysts and managers.” IT analyst Gartner suggests worldwide IT spending on Big Data in 2013 will be $34 billion. Big Data is Big Business.

As businesses move online, the number one issue is customer engagement. Data (demographic, behavioural and real-time) is the key to connecting businesses with customers. Early movers such as Amazon use collaborative filtering technology to develop automatic recommendations for customers based on their purchase history. Global pharmaceutical company GlaxoSmithKline (Sensodyne, Lucozade and lots of other brands) uses data analytics tools to track consumers online and repurpose the data to benefit particular brands. GSK aims to build direct relationships with one million consumers using social media. Somewhat more controversially, US discount retailer Target, in a story first reported by the New York Times then picked up in more sensationalist form by Forbes.com, used Big Data analysis “to figure out whether you have a baby on the way long before you need to start buying diapers”. With all this analysis being applied to commercial ends, privacy advocates are concerned that individuals may be harmed, or at least annoyed, by the use of “their” data in ways they had not expected.

Is anonymity an answer?

When thinking about the legal issues, data protection laws seem to throw up more roadblocks than solutions. Just take, as an example, the EU Data Protection Principles, which impose requirements around user notice and choice, purpose limitation, data minimisation, data retention and data export. These principles are shortly to be bolstered by the new General Data Protection Regulation, which proposes a new “right to be forgotten”.

Data rendered anonymous falls outside the scope of EU data protection laws. However, there have long been concerns that anonymised data can be re-identified with a particular individual by matching it with other data. This has led to official EU guidance that, for data to be considered anonymised, re-identification must no longer be possible.
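
To make the re-identification concern concrete, here is a minimal Python sketch. It is an illustration only and not a method taken from the EU guidance: it borrows the well-known idea of k-anonymity, and the records, the field names (postcode, birth_year, sex) and the helper functions are all hypothetical. The idea is that even with names removed, a record whose combination of remaining attributes is unique in the dataset is the easiest to match against other data.

```python
from collections import Counter

# Hypothetical "anonymised" records: names removed, but some attributes remain.
# All values and field names here are invented for illustration.
records = [
    {"postcode": "M13", "birth_year": 1975, "sex": "F"},
    {"postcode": "M13", "birth_year": 1975, "sex": "F"},
    {"postcode": "M13", "birth_year": 1982, "sex": "M"},
    {"postcode": "SO17", "birth_year": 1990, "sex": "F"},
    {"postcode": "SO17", "birth_year": 1990, "sex": "F"},
    {"postcode": "SO17", "birth_year": 1990, "sex": "F"},
]

# Attributes assumed to be matchable against other datasets.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def smallest_group(rows, keys=QUASI_IDENTIFIERS):
    """Return k, the size of the smallest group of indistinguishable rows."""
    return min(group_sizes(rows, keys).values())


def unique_fraction(rows, keys=QUASI_IDENTIFIERS):
    """Return the fraction of rows whose attribute combination appears only once."""
    sizes = group_sizes(rows, keys)
    singles = sum(1 for row in rows if sizes[tuple(row[k] for k in keys)] == 1)
    return singles / len(rows)


if __name__ == "__main__":
    print("smallest group size (k):", smallest_group(records))
    print("fraction of unique records:", round(unique_fraction(records), 2))
```

In this toy dataset one record out of six is unique on the three attributes, so the smallest group size is 1. Coarsening or dropping an attribute (for example, grouping on postcode alone) raises the smallest group size, which is the kind of trade-off a re-identification risk assessment would weigh.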

In the UK, the Information Commissioner’s Office (ICO) has just come out on the side of business with pragmatic guidance on the use of anonymisation. In an approach modelled on UK case law, the ICO stated that a business which wants to anonymise data need only prove that it has assessed the risk of re-identification and, having done so, can reasonably conclude that there is only a remote risk of re-identification. The ICO code of practice “Anonymisation: managing data protection risk” is essential reading for UK-based data controllers looking at developing and implementing compliant Big Data strategies. The ICO has also established the Anonymisation Network (not to be confused with Anonymous, the notorious hacker network). This is a consortium led by the University of Manchester, together with the University of Southampton, the Office for National Statistics and the UK government’s new Open Data Institute (ODI), set up to provide greater access to more detailed expertise and advice.

It remains to be seen if other EU countries will adopt the business-friendly approach of the ICO.