In the world of Big Data, Hadoop is the big thing. Yet even though Hadoop has become the defining application of the Big Data world, until today it had not officially attained 1.0 status.
The Apache Software Foundation today officially announced the release of Hadoop 1.0, marking a significant milestone for the stability of the Big Data effort. The new release provides additional security as well as support for the Hadoop HBase database.
"Typically when people talk about Hadoop, it's much more than just the Apache Hadoop project," Arun C. Murthy, vice president of Apache Hadoop, told InternetNews.com. "One of the big things with the 1.0 release is that it finally brings complete support for HBase, which previously had not been as reliable as it should have been."
The core Apache Hadoop project includes the HDFS filesystem and the MapReduce processing engine. Murthy noted that with 1.0, the main development components of Hadoop are now contained in a single coherent release.
Going a step further, the 1.0 release also includes WebHDFS, an HTTP interface to the Hadoop filesystem.
"This way you don't have to have a Java or C client to talk to the filesystem; you can do it over HTTP, which makes it easy to integrate and work with the filesystem," Murthy said.
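As a rough illustration of what talking to the filesystem over HTTP looks like, the sketch below builds a WebHDFS request URL in Python. The `/webhdfs/v1` path prefix and the `op` query parameter follow the WebHDFS REST convention; the helper name `webhdfs_url`, the host `namenode.example.com`, and the file path are hypothetical, and port 50070 is assumed to be the NameNode's HTTP port, which may differ on a given cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=50070, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path.

    host and port are assumptions about where the NameNode's HTTP
    endpoint is listening; adjust them for a real cluster.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # The operation name goes first, followed by any extra parameters.
    query = urlencode({"op": op, **params})
    # quote() percent-encodes unsafe characters but keeps "/" intact.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical host and directory, used only for illustration.
print(webhdfs_url("namenode.example.com", "/user/alice/logs", "LISTSTATUS"))
# prints http://namenode.example.com:50070/webhdfs/v1/user/alice/logs?op=LISTSTATUS
```

Any HTTP client, in any language, could then issue a GET against such a URL and parse the JSON response, provided WebHDFS is enabled on the cluster.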
From a security perspective, the Hadoop 1.0 release includes support for strong authentication via Kerberos to help secure Big Data.
"At this point we figured that as a community we can support this release and be compatible for the foreseeable future. That makes this release an ideal candidate to be called 1.0," Murthy said.
A 1.0 release in an open source context means that the project is committing to binary compatibility for all 1.x releases, making the project more stable. The Hadoop 1.0 release is API compatible with the entire 0.20.x release branch and binary compatible with the 0.20.205 release, Murthy explained.
While Hadoop is just now hitting the 1.0 release milestone, Murthy stressed that the project has long been stable and reliable enough for large-scale production use. He started working on Hadoop six years ago at Yahoo and knows firsthand how well it worked, even prior to a 1.0 release.
"Hadoop is widely used at Yahoo, Facebook, LinkedIn and other big sites so I don't think it matters as much about the 1.0 number," he said. "Having said that, it will be nice to have 1.0 because it means it can fit into more places. For those that do care about a number, now it doesn't matter."
The 1.0 release is a strong statement from the Apache community that Hadoop will be compatible and supported for the foreseeable future, Murthy added.
"We hope it will aid quicker adoption," he said.