The world of Big Data is full of big money. The latest group to raise capital is Big Data application vendor Concurrent, the leading sponsor of the open source Cascading framework used to build Big Data apps. Concurrent today announced it has raised $4 million in a Series A funding round.
"The whole Big Data space is frothy, so a lot of VCs have approached Concurrent to put money in," Concurrent CEO Gary Nakamura told Enterprise Apps Today. "The timing is pretty good, since Cascading as a development framework for Big Data applications has accelerated over the last 12 months, to the tune of 75,000 downloads a month."
Hiding Hadoop Complexity
Nakamura explained that Hadoop adopters are now starting to move into the next phase of maturity. While the initial phase is typically about small node counts and trial applications, the phase that Nakamura now sees evolving is one in which mainstream enterprises start deploying Hadoop clusters with 100 nodes or more and running applications on top of them.
"Cascading is really taking off mostly because it's a simple way to build applications and it obfuscates the complexity of Hadoop," Nakamura said.
For Concurrent, the opportunity lies in extending what is available as open source with enterprise offerings. Nakamura did not provide specifics on what the enterprise version of Cascading might include, though he stressed that the core project will always remain open source.
The enterprise version of Cascading is expected to have its first beta this summer.
What About SQL?
A SQL query-based approach is one of the ways that people currently use Hadoop to wrangle Big Data. The Hadoop ecosystem has the Apache Hive project for that. Other industry efforts, including Cloudera's Impala project, are emerging to do the same thing.
Cascading is taking a different approach, however.
"Cascading is really just a generalized platform for doing computation," Cascading founder Chris Wensel explained. "It's a Java API that lets people build their own applications on top."
There is also a SQL layer on top of Cascading called Lingual.
"If you can move SQL-based workloads onto Hadoop, you're winning," Wensel said. "If you can move your data off Hadoop through a SQL interface so you can do your job, you're winning."