Protecting Data is Good. Protecting Information Generated from Big Data is Priceless

By E.G. Nadhan, HP

This was the key message that came out of The Open Group® Big Data Security Tweet Jam on Jan 22 at 9:00 a.m. PT, which addressed several key questions centered on Big Data and security. Here is my summary of the observations made in the context of these questions.

Q1. What is Big Data security? Is it different from data security?

Big data security is more about information security. It is typically external to the corporate perimeter. IT is not prepared today to adequately monitor its sheer volume in brontobytes of data. The time period of long-term storage could violate compliance mandates. Note that storing Big Data in the Cloud changes the game with increased risks of leaks, loss, breaches.

Information resulting from the analysis of the data is even more sensitive and therefore, higher risk – especially when it is Personally Identifiable Information on the Internet of devices requiring a balance between utility and privacy.

At the end of the day, it is all about governance or as they say, “It’s the data, stupid! Govern it.”

Q2. Any thoughts about security systems as producers of Big Data, e.g., voluminous systems logs?

Data gathered from information security logs is valuable but rules for protecting it are the same. Security logs will be a good source to detect patterns of customer usage.

Q3. Most BigData stacks have no built in security. What does this mean for securing Big Data?

There is an added level of complexity because it goes across apps, network plus all end points. Having standards to establish identity, metadata, trust would go a long way. The quality of data could also be a security issue — has it been tampered with, are you being gamed etc. Note that enterprises have varying needs of security around their business data.

Q4. How is the industry dealing with the social and ethical uses of consumer data gathered via Big Data?

Big Data is still nascent and ground rules for handling the information are yet to be established. Privacy issue will be key when companies market to consumers. Organizations are seeking forgiveness rather than permission. Regulatory bodies are getting involved due to consumer pressure. Abuse of power from access to big data is likely to trigger more incentives to attack or embarrass. Note that ‘abuse’ to some is just business to others.

Q5. What lessons from basic data security and cloud security can be implemented in Big Data security?

Security testing is even more vital for Big Data. Limit access to specific devices, not just user credentials. Don’t assume security via obscurity for sensors producing bigdata inputs – they will be targets.

Q6. What are some best practices for securing Big Data? What are orgs doing now and what will organizations be doing 2-3 years from now?

Current best practices include:

  • Treat Big Data as your most valuable asset
  • Encrypt everything by default, proper key management, enforcement of policies, tokenized logs
  • Ask your Cloud and Big Data providers the right questions – ultimately, YOU are responsible for security
  • Assume data needs verification and cleanup before it is used for decisions if you are unable to establish trust with data source

Future best practices:

  • Enterprises treat Information like data today and will respect it as the most valuable asset in the future
  • CIOs will eventually become Chief Officer for Information

Q7. We’re nearing the end of today’s tweet tam. Any last thoughts on Big Data security?

Adrian Lane who participated in the tweet jam will be keynoting at The Open Group Conference in Newport Beach next week and wrote a good best practices paper on securing Big Data.

I have been part of multiple tweet chats specific to security as well as one on Information Optimization. Recently, I also conducted the first Open Group Web Jam internal to The Cloud Work Group.  What I liked about this Big Data Security Tweet Jam is that it brought two key domains together highlighting the intersection points. There was great contribution from subject matter experts forcing participants to think about one domain in the context of the other.

In a way, this post is actually synthesizing valuable information from raw data in the tweet messages – and therefore needs to be secured!

What are your thoughts on the observations made in this tweet jam? What measures are you taking to secure Big Data in your enterprise?

I really enjoyed this tweet jam and would strongly encourage you to actively participate in upcoming tweet jams hosted by The Open Group.  You get to interact with a wide spectrum of knowledgeable practitioners listed in this summary post.

NadhanHP Distinguished Technologist and Cloud Advisor, E.G.Nadhan has more than 25 years of experience in the IT industry across the complete spectrum of selling, delivering and managing enterprise level solutions for HP customers. He is the founding co-chair for The Open Group SOCCI project, and is also the founding co-chair for the Open Group Cloud Computing Governance project. Connect with Nadhan on: Twitter, Facebook, LinkedIn and Journey Blog.