“The goal is to turn data into information, and information into insight.” – Carly Fiorina, Former CEO of HP
Is there a hidden problem with Big Data adoption?
If there is one thing that pretty much every analyst agrees on, it is that the Big Data (or Analytics) age is here. On the face of it, there is no arguing with that. Consider the sheer volume of data being generated, stored, secured and analysed today. Data center company ViaWest estimated that a staggering “2.5 quintillion bytes of data are created (one followed by 18 zeros) every day”. Every activity, whether online or offline, is logged, and the devices we use send that data to the organizations that can best leverage it.
Scratch the surface a bit, though, and a slightly different picture is revealed. One clue emerges from IDC’s most recent “Trends in Enterprise Hadoop Deployments” survey, which showed that only 32% of the surveyed organizations had deployed Hadoop. Late last year, an InformationWeek survey found that only 13% of the surveyed organizations had Hadoop in production or pilot. If we take Hadoop as a reasonable proxy for Big Data, both surveys fall almost farcically short of Deloitte’s estimate from a few years ago that over 90% of the Fortune 500 would have functioning Big Data initiatives by 2012. So what’s the big deal about Big Data adoption?
The gap between input and insight
Our view is that, as a general rule, organizations are still only flirting with these initiatives because they see an incompleteness in the way these solutions are put together. We believe that an obvious gap exists between the maths of the data scientists and the technology of the Big Data paradigm. This gap gets in the way of the organization as it seeks to leverage its data well enough to draw out the most impactful insights for the business.
Data Science in this context can be defined as residing everywhere between the processes that collect the inputs to the business and the insights that the business gets after that data is crunched. Mathematician John Tukey said, “The greatest value of a picture is when it forces us to notice what we never expected to see.” The focus of the data scientist is the data itself – organizing and applying statistical models to it to identify trends not otherwise apparent. There is also the sense that in many ways Data Science is a more mathematical practice and to that extent can even exist independent of the underlying business aims and values.
Which data is relevant?
One issue that is immediately apparent is that many organizations don’t know for sure what data they should be gathering. This could be a huge problem. As Alex Peiniger, CEO of quintly, said, “.. you should only measure and look the numbers that drive action, meaning the data tells you what you should do next”. Organizations need good advice on which data points are relevant to their business and on the best ways to gather that information. Once that data is available to the data scientists, they can work their magic and the right patterns can emerge.
Are we missing something?
The other problem is that organizations often don’t have the capability to bring together a cohesive view of the facts that could have relevance to the business and the technology trends out there that could contribute valuable data that impacts those facts. A comprehensive strategy is also about experimenting with other kinds of data that you may not think of as relevant to the business, e.g. weather, stock market movements or news events. Examples are becoming common in the healthcare space. A Stanford School of Medicine team was able to establish a previously theorized but never adequately established correlation between pre-term births and exposure to environmental toxins by studying medical records, birth census information, weather data and pollution-related information from the Environmental Protection Agency.
A complete solution
The third big challenge seems to be putting together an end-to-end implementation of the acceptable solutions. This requires an understanding of the business, the maths of the data science and the technology of Big Data. The need is to establish a seamless handover from one to the other while keeping the business objectives in mind at all times. This is where specialists (like us) come in – organizations know their business, but it would be too much to expect them to stay ahead of the technology and also to build up the mad maths skills their data cries out for.
For most organizations there is definite value to be had from a well-implemented Big Data initiative, but the operative term is “well-implemented”. The alternative is, well, scary. According to Geoffrey Moore, “Without Big Data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.”