The use of big data by enterprises is almost commonplace at this point, with advocates claiming it aids decision-making, increases revenue and productivity and decreases operational costs. But it comes at a cost to data privacy.
With these advantages, national and international companies, banks and government organizations have now amassed huge data sets. However, it seems, based on the number of hacks in the last ten years, they are unable to protect this data, with breached data often ending up for sale on the Dark Web.
The result is that the breached enterprise suffers some reputational damage and, hopefully, a regulatory fine, after which data harvesting from customers continues unabated. The customers themselves are blameless, yet their information (whether financial, clinical or perhaps usernames and passwords) has been leaked, leaving them vulnerable to identity theft, financial loss or additional breaches on other accounts. It’s no surprise that privacy is an issue when data mining takes place.
Why do enterprises insist on harvesting data without our express permission? What do they expect to achieve with data analytics that basic human intelligence cannot determine from what should be a service or product provider and client relationship? Despite all the hype, for commercial enterprises it all comes down to one thing: monetize everything.
Personally, I believe data science should be left in the shadowy corridors of academia (where hopefully a sense of ethics still applies) and not used for profit or for surveillance by state actors. It’ll never happen but…
Big data is being used for many worthwhile applications, such as weather prediction, climate change analysis, city operations and social justice. Such applications involve the use of big data but not data about the individual, i.e. no personally identifiable information (PII) is compromised.
Return to Paper Files?
Given the number of hacks previously mentioned, where millions of users were impacted by breaches, it’s fair to say that companies cannot protect digital data. Many do not even encrypt it or are lax in permission management (weak passwords, for example). We’ve known for quite some time now that financial and healthcare institutions are attractive targets for cyber criminals. Why do holders of valuable data ignore basic cyber security principles? I advocate a return to paper for all sensitive transactions (financial and medical) and the storage of same in filing cabinets, just like in the good old days. Being connected to the global Internet has its drawbacks, when hackers can remotely launch attacks. Launching a remote attack on a filing cabinet requires more thought, doesn’t it?
Of course, big data sets are not just databases of financial transactions or clinical visits. Every aspect of our digital and personal lives is tracked, logged and stored for future analysis. They call it quantitative or predictive analytics rather than an invasion of privacy. It's not enough that we have used their service or purchased their products; other insights are necessary, based on algorithms created by data scientists or off-the-shelf software.
Privacy is Dead, Long Live Surveillance
If you believe that you have privacy online, I hate to burst your bubble, as without using specific tools (Tor, Signal and ProtonMail, for example) everything you do online is tracked in some way. Your browser searches, site visits, social media interactions and online comments are all part of various data sets. The official line is that it’s to personalize the user experience but, in most cases, it’s about personalized marketing, ad placement and so on.
Big Tech companies such as Google, Facebook and Amazon are heavily invested in big data, as their revenue depends on it. You’ll have noticed that comments on a subject will result in related ads on Facebook. Amazon’s Alexa, Apple’s Siri and, until recently, Google’s Assistant all gather audio by default to enhance future efforts in voice recognition and related AI tech, compromising privacy in the process.
Luckily, the times are a-changing, with many countries enforcing data and privacy protection laws, taking on big business and its operational methods as necessary.
Often cited as the biggest issue of data mining, compliance with applicable regulations is a known problem for those involved in invading our privacy. Google and Facebook have already received millions in fines and many others are under investigation. Facebook is expecting a $5 billion fine from the FTC.
These are clear indications that governments are prepared to take a tougher stance on data privacy, with the EU’s GDPR perceived as key legislation that protects users. In fact, legislation in other countries often reflects the key principles of GDPR. In the U.S., several states (including California, with the California Consumer Privacy Act, or CCPA) have adopted similar principles in their legislation. An equivalent federal law has yet to be passed but has been called for.
Obviously, there are privacy concerns with data mining, and enterprises that already collect data must comply with regulations or face penalties. I’ve focused on global giants but the same GDPR principles should be applied to all data collection activities, even if you do not have European clients. Surely, it’s worth protecting your client or user data?
Those considering big data initiatives must also consider the challenges involved. I’m delighted to point out that big data requires skills that are not readily available. There’s a known shortage of data scientists and related professionals. Those that are available are expensive.
Ensuring data quality is problematic, especially if the data is collected from several sources. Removing PII is difficult, especially if you consider that a combination of several ‘anonymous’ facts could inadvertently identify a specific person.
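To make the point about ‘anonymous’ facts concrete, here is a minimal sketch of one common way to check for it: counting how many records share the same combination of seemingly harmless fields (often called quasi-identifiers). Everything in it is an assumption for illustration only; the records, the field names and the helper functions are invented, and real de-identification involves far more than this.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, other fields kept.
records = [
    {"zip": "90210", "birth_year": 1975, "gender": "F", "diagnosis": "asthma"},
    {"zip": "90210", "birth_year": 1975, "gender": "F", "diagnosis": "flu"},
    {"zip": "90210", "birth_year": 1982, "gender": "M", "diagnosis": "diabetes"},
    {"zip": "10001", "birth_year": 1990, "gender": "F", "diagnosis": "migraine"},
    {"zip": "10001", "birth_year": 1990, "gender": "F", "diagnosis": "asthma"},
]

# Fields that are not PII on their own but may identify someone in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def group_sizes(rows, keys):
    """Count how many rows share each combination of values for `keys`."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def k_anonymity(rows, keys):
    """Return the size of the smallest group; 1 means someone is unique."""
    return min(group_sizes(rows, keys).values())


def unique_rows(rows, keys):
    """Return the rows whose combination of `keys` appears exactly once."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


if __name__ == "__main__":
    print("k for zip alone:", k_anonymity(records, ("zip",)))
    print("k for all three fields:", k_anonymity(records, QUASI_IDENTIFIERS))
    for row in unique_rows(records, QUASI_IDENTIFIERS):
        print("re-identifiable record:", row)
```

In this toy data set, no single field singles anyone out by postcode alone, yet the combination of postcode, birth year and gender leaves one record unique. Anyone who knows those three facts about that person would also learn the diagnosis.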
Users demand privacy. That will not go away and companies that casually monetize and resell all data associated with user or client activity must ensure that PII is removed. Or suffer the consequences…
In conclusion, while big data has its advantages, is it necessary to analyze everything? Product and service sales used to be about the quality of that service or product and a reputation gained by referrals or positive reviews. Now, it seems that being a customer is not enough. Having the money to pay is not enough. A visit to a hospital should not result in being placed on marketing databases for insurance, real estate or any other industry you never asked to hear from.
If you use multiple online services, why not act as I have? Ensure that big data becomes pointless by using VPNs (to make geolocation data useless) and ad blockers while online. Only activate location services and GPS on smartphones when you need them. The less information you give them (all who track us) the better. Is personalization a good thing when the so-called insights served to business travellers arrive in a language they do not speak? With all the ways they gather information from us, websites and search engines still cannot detect the user’s system language?
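On that last point, the information is usually there for the taking: browsers typically send an Accept-Language header listing the user's preferred languages with every request. The sketch below shows roughly how a site could read it. The function name and the sample header values are my own assumptions for illustration, and it is a simplified parser rather than a complete implementation of the HTTP specification.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, most preferred first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        # Each entry looks like "en-GB" or "en;q=0.9".
        tag, _, params = part.partition(";")
        quality = 1.0  # default weight when no q-value is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as least preferred
        prefs.append((tag.strip(), quality))
    # sorted() is stable, so equal weights keep the order the browser sent.
    return [tag for tag, _ in sorted(prefs, key=lambda p: p[1], reverse=True)]


if __name__ == "__main__":
    # A traveller in Germany whose browser is still set to British English.
    print(parse_accept_language("en-GB,en;q=0.9,de;q=0.7"))
```

A site that honoured this header would serve the traveller English regardless of where the VPN or hotel Wi-Fi says they are.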
In my view, big data is failing to deliver in many areas and those concerned about privacy should stop using services that harvest personal data with no consideration for our privacy. “You have a Facebook page,” you may say. Sure, I do, but I only post what I want everyone to see: no family photos or commentary that my family or employers would object to. My aversion to certain green vegetables, simian presidents or unethical business practices is common knowledge and I defy anyone to make it commercially viable. Everything else is private… apart from the tracking in every aspect of my waking life: CCTV, the payments I make, where I stay or travel to and of course my smartphone, itself a surveillance powerhouse. Yes, I value the little privacy I have. How about you?