The world is exploding with data. Just google the words “data explosion” and have a brief look at the number of results you get. I did, and it is probably no surprise that the first search result I got was a Forbes article published last year under the title “Why Most Companies Can’t Deal with the Data Explosion”.
Earlier this month I attended the first large Big Data event of 2017 – Strata Hadoop in San Jose, California. This is probably my 5th or 6th Strata Hadoop event in the past few years, and it is interesting to see how fast the Big Data industry is growing – some would say as fast as the data itself, while others may say not fast enough.
The number of open source technologies in the Big Data space continues to grow rapidly. While it’s great to see the rapid pace of innovation, this also presents a challenge for companies: how to make the best use of these technologies and absorb them into their own ecosystems. And as data volumes grow exponentially, the Cloud is fast becoming a key component in Big Data implementations and a facilitator of large Big Data projects, thanks to its flexibility and its capacity to deliver more storage and compute power.
It was an analyst at one of these events who, a couple of years ago, said the words I have been quoting (never enough, apparently) ever since – “Hadoop is not an island”. Now, more than ever, there seems to be an across-the-board understanding that if you want your Big Data project to be successful, you need solutions in place that ensure you are enterprise ready and can scale.
Let’s talk about scalability. It seems to me that some variation of the words scale / scalable / scalability came up in most if not all of the sessions I attended, not to mention the various conversations I had in the exhibition hall. While many companies have seen success in a Big Data pilot, being able to scale and support growing business demands remains one of the major challenges facing organizations implementing Big Data, along with the need to deliver applications fast.
Clearly, time to value remains a hot topic on everyone’s agenda – the need to deliver value to the business, and to do it rapidly. This need is driving organizations to look into automation solutions or, more specifically, scheduling solutions. I remember attending my first Big Data event many years ago and people looking at me, puzzled, when I asked them what they were using to schedule their Hadoop jobs. Well, not anymore.
Enterprise-grade workload automation solutions allow customers not only to schedule their Big Data processes but also give them the much-needed connectivity between the various platforms, applications and technologies they use to support their business initiatives. They provide the “management layer” so that Big Data developers can focus on how to obtain maximum value from the data. As Darren Chinen, Senior Director of Data Science and Engineering at Malwarebytes, said in his interview with the CUBE, “We had to evaluate where we wanted to spend our time” ( Malwarebytes Cube Interview ).
One last thought I had as I wandered around the exhibition hall, speaking to various companies and vendors: Big Data is no longer just a “cool” initiative. It has an important role to play in the digital world and is relevant to each and every industry. Be there or be …