Is Hadoop Maturity in Sight?

by Drew Robb

Gartner predicts 'anemic' growth for Hadoop in the near term. What gives?

There is no escaping the outpouring of hype about Hadoop. Some see it as heading toward maturity in terms of its enterprise readiness, while others think it is the enabler of a new class of so-called data lakes. But more than a few aren’t convinced.

Gartner analyst Nick Heudecker said enterprises remain tentative about heavy investment in Hadoop. This is due to concerns about overall business value as well as a shortage of skills in this specialized area of technology, he said.

Gartner surveys reveal more than half of enterprises have no plans to do anything at all with Hadoop within the next two years. Fewer than one in five plan Hadoop investments over the next two years.

So what’s going on with this darling of the Big Data analytics community?

There are plenty of well-publicized success stories about Hadoop's ability to distribute the processing of large data sets across clusters and scale out to thousands of servers. That said, Gartner believes that while Hadoop may be great for particular use cases, it could be overkill for the average business that does not need to derive instant insight from mountains of data, Twitter feeds, transactions and systems. That’s one of the reasons why the analyst firm called the outlook for Hadoop growth "anemic" for at least the next two years.

$100 Million Hadoop

Others are far more optimistic about Hadoop. Cloudera has been heralded as only the second open source company in history to beat the $100 million mark in annual revenues, a feat accomplished on the back of Hadoop. Mike Olson, the company's chief strategy officer and one of its original founders, is, not surprisingly, bullish on the subject.

Big Data is transforming industry, and Hadoop is faced with an enormous opportunity, he said. "The ability to ingest, store, process and analyze any kind of data is transformative. By being able to scale to hundreds of terabytes, you no longer can only keep a quarter’s worth of transactional data online, but a decade’s worth to analyze."

In particular, he believes Hadoop's capabilities are opening the door to interactive and real-time data analytics for such functions as better fraud monitoring, recognition of money laundering and improved security. Far from floundering, he sees Hadoop as steadily gaining ground in the enterprise.

Because organizations can now build apps on Hadoop far more easily, Cloudera is seeing more usage of mobile data analytics, said Olson. He gave the example of Tableau as one of many companies providing tools that sit on top of Hadoop. Tableau connects to Hadoop distributions (Cloudera, Pivotal, Hortonworks, MapR and IBM InfoSphere BigInsights) to provide data analytics and visualization.

Hadoop has yet to attain the maturation needed to provide true end-to-end enterprise analytics, Olson conceded. But he cited the emergence of ERP running on relational databases as a historical example of technology evolution. After the advent of those early databases, it took at least a decade for a complete toolset to emerge to take advantage of these new capabilities, he pointed out.

Hadoop's Role in Data Infrastructure

Far from conquering all, Olson noted that Hadoop is not leading to the elimination of large swaths of traditional infrastructure. On the contrary, people are running data warehouses and relational databases to solve the problems they were designed to address.

What Cloudera is picking up is business related to ingesting and storing data from many different sources in order to do analytics and processing more cheaply at a much larger scale. These systems are gradually being integrated with legacy platforms into enterprise data hubs, he said.

High-end analytic workloads, for instance, are likely to remain on OLAP within a traditional data warehouse structure. And don’t expect many people to migrate transactional systems to Hadoop. Traditional databases will continue to deal with that traffic.

Hadoop and Influential Analytics

Paul Maritz, CEO of Pivotal, is another who remains positive about Hadoop’s future. He sees its biggest value in catching someone or something in the act and influencing the outcome, which in turn powers new experiences. In a health care use case, an app could tell you not to eat that burger and fries, or remind you to set up a medical appointment.

"To become part of our daily lives, Big Data analytics will have to provide a compelling experience that users will want and that can be used to change behavior," Maritz said.

He also talked about car company Tesla Motors, which has done pioneering work in making the app almost as important as the car – with the idea being that the relationship between the driver, the automotive supplier and the vehicle largely comes through the app.

Pivotal Goes Open Source

Whatever direction this takes, Pivotal has taken steps to let many more companies get involved. Through Cloud Foundry, the company has surrendered control of core components of its Hadoop technology to a consortium of more than 40 companies that will pool their resources to push the technology forward as an open source project. Pivotal will provide its own distribution based on that Cloud Foundry core.

"The Pivotal Big Data Suite lets you analyze data at scale by exploiting low-cost cloud infrastructure to sort through data and analyze it in real time," said Maritz. "This means you can take a data warehouse, a relational database and Hadoop clusters and make it one platform."

The suite is available as an appliance or in the cloud. Pricing is per CPU core per year. Pivotal has also upgraded the suite with a query optimizer that is said to boost the performance of queries against the Pivotal Greenplum database.

Hadoop Skills Gap

While Maritz and Olson are avid proponents of Hadoop, it could well be that its technology adoption curve is hitting a roadblock in terms of available talent. According to Gartner, nearly 60 percent of enterprises feel the skills gap is a major inhibitor of Hadoop adoption. Those with the requisite skills are currently in high demand and are probably beyond the pay range of the average business.

It will take a surge in qualified talent to bridge the gap to more widespread Hadoop adoption. Gartner’s Heudecker believes another two to three years are required before enough skilled labor will be available.

Drew Robb is a freelance writer specializing in technology and engineering. Currently living in Florida, he is originally from Scotland, where he received a degree in geology and geography from the University of Strathclyde. He is the author of Server Disk Management in a Windows Environment (CRC Press).

This article was originally published on Thursday, June 4, 2015.