The topic of the day appears to be “big data,” meaning the aggregation, mining, and analysis of data. Such analytics help build customer profiles so that companies can tune their offerings and sell more of the right things to the right customers. As recently reported in the New York Times Magazine, Target used such analytics to infer from a teen’s purchases that she was pregnant before her father knew. This allowed Target to adjust its coupon offers based on its knowledge of the buying practices of mothers-to-be. But at what cost do these analytics come?
Caribou Honig, writing on Forbes.com, makes a case “In Defense of Small Data”: collecting, storing, and processing mounds of data is costly and yields no more (and perhaps less) useful information than analyzing only the limited data set that really matters. In addition, storing this volume of data has its own direct costs.
And this is only half of the story . . . There are also legal costs and risks to big data.
With every item of data collected and retained comes increased data privacy risk. Nearly every state and the District of Columbia has a data breach law that requires companies to take affirmative actions in conjunction with any release of personally identifiable information. The net result is that if information is improperly disclosed, a company can face huge financial and reputational risk. Any time a company collects more data, it increases its risk of disclosing personally identifiable information. This means that added security is required, additional insurance may be required, and there is still the risk of a disclosure. These problems are compounded in the international space, where some countries have laws even more stringent than those in the US about how personally identifiable information can be used, particularly without the consent of the relevant individual.
Even seemingly anonymous data can become personally identifiable. For example, as noted in a study by CIO Magazine, a mobile application that uses anonymous geolocation data can quite readily identify me, simply because most of the time I am either at my home or at my office. That combination alone is likely sufficient to uniquely identify me.
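For readers who want to see the mechanics, here is a minimal sketch of how a home-and-office pair might single someone out. It is an illustration under stated assumptions, not a description of how any real application works: the pings, the anonymous IDs, the grid cell names, and the night and working-hour windows are all made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical pings: (anonymous_id, hour_of_day, grid_cell). All values are invented.
PINGS = [
    ("a1", 2, "cell_17"), ("a1", 3, "cell_17"), ("a1", 10, "cell_42"),
    ("a1", 14, "cell_42"), ("a1", 19, "cell_08"),
    ("b2", 1, "cell_17"), ("b2", 4, "cell_17"), ("b2", 11, "cell_55"),
    ("b2", 15, "cell_55"),
    ("c3", 23, "cell_30"), ("c3", 2, "cell_30"), ("c3", 9, "cell_42"),
    ("c3", 13, "cell_42"),
]


def is_night(hour):
    # Assumed "at home" window: 10 p.m. to 6 a.m.
    return hour >= 22 or hour < 6


def is_workday(hour):
    # Assumed "at the office" window: 9 a.m. to 5 p.m.
    return 9 <= hour < 17


def home_work_pairs(pings):
    """Guess each ID's home and office as its most frequent night and daytime cells."""
    night = defaultdict(Counter)
    day = defaultdict(Counter)
    for anon_id, hour, cell in pings:
        if is_night(hour):
            night[anon_id][cell] += 1
        elif is_workday(hour):
            day[anon_id][cell] += 1
    pairs = {}
    # Only IDs seen in both windows get a (home, office) pair.
    for anon_id in night.keys() & day.keys():
        home = night[anon_id].most_common(1)[0][0]
        office = day[anon_id].most_common(1)[0][0]
        pairs[anon_id] = (home, office)
    return pairs


def unique_pairs(pairs):
    """Keep only the IDs whose (home, office) pair is shared with no one else."""
    counts = Counter(pairs.values())
    return {anon_id: pair for anon_id, pair in pairs.items() if counts[pair] == 1}


if __name__ == "__main__":
    pairs = home_work_pairs(PINGS)
    for anon_id, pair in sorted(unique_pairs(pairs).items()):
        print(anon_id, pair)
```

In this toy data set, two IDs share a home cell and two share an office cell, yet every home-and-office pair is still distinct. Anyone who can match one such pair to a street address and an employer has effectively put a name on the "anonymous" ID.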
Of course, the proponents of “big data” are also correct. There is a big benefit in mining data to better target a company’s offerings (frankly, I prefer banner ads that are relevant to me over completely irrelevant ones, and I am more likely to click through them). But when forming a data analytics strategy, a company also needs to focus on the cost side of the equation and should collect and process only the data that really provides a commercial benefit. There is no single right answer; this is a commercial balance between business benefit and risk.
As they say, “bigger isn’t always better.” Sometimes, the smallest bowl of porridge is just right.