Thinking About Big Data

By Dave Lounsbury, The Open Group

“We can not solve our problems with the same level of thinking that created them.”

- Albert Einstein

The growing consumerization of technology and convergence of technologies such as the “Internet of Things”, social networks and mobile devices are causing big changes for enterprises and the marketplace. They are also generating massive amounts of data related to behavior, environment, location, buying patterns and more.

Having massive amounts of data readily available is invaluable. More data means greater insight, which leads to more informed decision-making. So far, we are keeping ahead of this data by smarter analytics and improving the way we handle this data. The question is, how long can we keep up? The rate of data production is increasing; as an example, an IDC report[1] predicts that the production of data will increase 50X in the coming decade. To magnify this problem, there’s an accompanying explosion of data about the data – cataloging information, metadata, and the results of analytics are all data in themselves. At the same time, data scientists and engineers who can deal with such data are already a scarce commodity, and the number of such people is expected to grow only by 1.5X in the same period.

It isn’t hard to draw the curve. Turning data into actionable insight is going to be a challenge – data flow is accelerating at a faster rate than the available humans can absorb, and our databases and data analytic systems can only help so much.

Markets never leave gaps like this unfilled, and because of this we should expect to see a fundamental shift in the IT tools we use to deal with the growing tide of data. In order to solve the challenges of managing data with the volume, variety and velocities we expect, we will need to teach machines to do more of the analysis for us and help to make the best use of scarce human talents.

The Study of Machine Learning

Machine Learning, sometimes called “cognitive computing”[2] or “intelligent computing”, looks at the study of building computers with the capability to learn and perform tasks based on experience. Experience in this context includes looking at vast data sets, using multiple “senses” or types of media, recognizing patterns from past history or precedent, and extrapolating this information to reason about the problem at hand. An example of machine learning that is currently underway in the healthcare sector is medical decision aids that learn to predict therapies or to help with patient management, based on correlating a vast body of medical and drug experience data with the information about the patients under treatment

A well-known example of this is Watson, a machine learning system IBM unveiled a few years ago. While Watson is best known for winning Jeopardy, that was just the beginning. IBM has since built six Watsons to assist with their primary objective: to help health care professionals find answers to complex medical questions and help with patient management[3]. The sophistication of Watson is the reaction of all this data action that is going on. Watson of course isn’t the only example in this field, with others ranging from Apple’s Siri intelligent voice-operated assistant to DARPA’s SyNAPSE program[4].

Evolution of the Technological Landscape

As the consumerization of technology continues to grow and converge, our way of constructing business models and systems need to evolve as well. We need to let data drive the business process, and incorporate intelligent machines like Watson into our infrastructure to help us turn data into actionable results.

There is an opportunity for information technology and companies to help drive this forward. However, in order for us to properly teach computers how to learn, we first need to understand the environments in which they will be asked to learn in – Cloud, Big Data, etc. Ultimately, though, any full consideration of these problems will require a look at how machine learning can help us make decisions – machine learning systems may be the real platform in these areas.

The Open Group is already laying the foundation to help organizations take advantage of these convergent technologies with its new forum, Platform 3.0. The forum brings together a community of industry thought leaders to analyze the use of Cloud, Social, Mobile computing and Big Data, and describe the business benefits that enterprises can gain from them. We’ll also be looking at trends like these at our Philadelphia conference this summer.  Please join us in the discussion.


2 Comments

Filed under Cloud, Cloud/SOA, Data management, Enterprise Architecture

2 responses to “Thinking About Big Data

  1. Interesting blog, Dave. Good to think about.
    My first reaction is that the most important paragraph is where you state “first need to understand the environments in which they will be asked to learn”. We really do need to understand ourselves what it is we want the computers to look at. At the same time, as you say, we should be looking at how (emphasis on how) machines can help us in this space. I could imagine a process involving both the machines themselves and the sort of crowd sourcing NASA’s Chris Gerty told us about in Newport Beach. I don’t see machines replacing the human element but I think you’re right that some degree of synergy is both possible and probably essential.

  2. There are some environments where I expect machines learning will prove useful, particularly those where there are large bodies of knowledge to use in drawing inferences. Law immediately comes to mind, where searching laws and case histories to find precedent and case law appropriate to the decision at hand is a necessary part of presenting a winning case. One could easily imagine a learning machine providing “Legal Precedents as a Service” a few years down the road.