Databases are proliferating like, well, cockroaches. The overall database market is expected to reach $50 billion by 2017, according to IDC, with the landscape changing quickly among many new players.
Deep, impactful forces are underneath these changes, said Tarun Thakur, co-founder and CEO of cloud data recovery startup Datos IO, who mentions economics, open source databases, new applications and the cloud among them.
Here are some of the most significant database trends:
Next Generation Applications
The majority of next-gen applications have a high volume of real-time data coming in.
"Today's application includes many, many different endpoints; they might be browsers, mobile devices or sensors on machines. It’s what I call a 'follow you everywhere' system," explained Robin Schumacher, vice president of products for DataStax, the commercial company building upon Apache Cassandra.
"It tends to be geographically distributed; they're all around the globe. They're intensely transactional; there's information being passed back and forth constantly. The system can never go down. And they tend to be very responsive. It's fast and it's intelligent: what to do next, where to click next, what to buy next."
Traditional databases are ill-equipped to handle applications that run everywhere at once, have data synchronized everywhere and provide the same levels of data access and performance to all those endpoints, he said.
New Data Types
Kurt Mackey, CEO of Compose, a database-as-a-service vendor acquired by IBM, points to the sophistication of the features developers want to deliver to users.
Using Facebook as an example, he said, "Facebook will happily tell you, 'Hey, there’s a friend of yours nearby' and these new features spawn all this imagination [about the possibilities among other companies], so developers are getting a lot of requests for features that use a lot of different types of data, like geodata."
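A "friend nearby" feature of the kind Mackey describes comes down to a distance query over geodata. The following is a minimal sketch, not how Facebook or any particular database implements it: the function names `haversine_km` and `friends_nearby`, the 1 km default radius and the sample coordinates are all assumptions made for illustration. It uses the haversine formula to estimate great-circle distance between two latitude/longitude points.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def friends_nearby(me, friends, radius_km=1.0):
    """Return names of friends whose (lat, lon) lies within radius_km of `me`.

    `me` is a (lat, lon) tuple; `friends` maps a name to a (lat, lon) tuple.
    Both shapes are hypothetical, chosen only for this example.
    """
    my_lat, my_lon = me
    return [
        name
        for name, (lat, lon) in friends.items()
        if haversine_km(my_lat, my_lon, lat, lon) <= radius_km
    ]


if __name__ == "__main__":
    # Made-up coordinates for demonstration only.
    sample_friends = {"ana": (0.005, 0.0), "bo": (1.0, 0.0)}
    print(friends_nearby((0.0, 0.0), sample_friends))
```

A linear scan like this is fine for a handful of points. Databases with geospatial support typically use a spatial index so the query does not have to touch every row.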
Multi-model Database Strategy
DataStax is among the companies pursuing a multi-model strategy, in which one database handles multiple types of data and different data models.
An e-commerce vendor's application might include a product catalog, user-profile management, a shopping cart, a recommendation engine, a fraud-detection component and analysis, for instance. All these components likely will have different data management and storage requirements, Schumacher said.
Gartner has predicted that by 2017, all leading operational database management systems will offer multiple data models, including relational and NoSQL, in a single platform.
However, Mackey warned that multi-model can get database vendors in trouble. "[Especially when they raise a lot of money,] they expand the scope of what they're trying to solve and lose sight of why they're really good at things," he said.
Database Scalability
Rapid growth remains an ongoing problem for database users, and scale has been a particular weakness for relational databases. High-scale heavyweights including Facebook, Google, LinkedIn and Twitter have teamed up to solve this problem in MySQL with a project called WebScaleSQL.
It's a problem that new entrants to the market are also trying to solve. Crate, for example, offers a distributed SQL database that supports Docker containers and microservices.
NoSQL databases, meanwhile, boast of their scalability, such as the "push button" scalability of Riak KV through its integration with Mesos orchestration.
Open Source Databases
While Oracle, MySQL and Microsoft SQL Server still dominate the market -- and though owned by Oracle, MySQL is still open source -- other open source options including Cassandra, Redis, Postgres and MongoDB make the top 10 in the list of most popular databases at DB-Engines.
In Datos IO's recent survey, MongoDB was cited as the leading choice for next-generation database deployment, followed by Cassandra, then cloud-native databases such as Amazon Aurora and Amazon DynamoDB.
Database as a Service
"There's an entire midmarket customer segment that does not want to worry about infrastructure. They say, 'Why do I want to worry about deployment and nodes?'" said Shalabh Goyal, Datos IO's product manager for Big Data and cloud databases, of the database-as-a-service (DBaaS) trend.
The DBaaS market is expected to be worth $1.8 billion in the coming year, according to 451 Research.
Ninety percent of database managers and executives in a survey by 451 Research said their selection of database technologies relates to the vendors' cloud strategy. While the report states that adoption of DBaaS offerings from cloud providers, especially Amazon Web Services (AWS), has posed a challenge to the big database vendors, survey results indicate the big players will retain their dominance as DBaaS moves mainstream.
Ease of Database Use
Lack of knowledge and experience was cited as one of the biggest barriers to use of next-gen databases in the Datos IO survey.
Deep Information Sciences, meanwhile, says it is addressing these issues with its database deepSQL, which it bills as "self-tuning," with no human tuning required.
Florida State University's Research Computing Center, a high-performance computing lab that has no dedicated DBAs, is using the database. The center's interim director, Paul Van Der Mark, praises its performance on massive data sets and calls it a good choice for less-technical users. One faculty member had never performed an SQL query before, he said.
Database Analytics Capabilities
Database vendors are adding more analytics capabilities as they try to help customers struggling with increasingly large and diverse data streams.
The new analytic workloads are more complex than the typical business intelligence (BI) use case, notes database pioneer Michael Stonebraker, an adjunct professor at MIT's Computer Science and Artificial Intelligence Laboratory.
Internet advertisers want to know how women who bought an Apple computer differ from women who bought a Ford pickup, and the most profitable ads to present to each, for example.
Yet analytics platforms are in a "wild West" phase, and "all solutions we know of have scalability and performance problems," he writes in his Readings in Database Systems or "Red Book" analysis.
OLAP + OLTP = HTAP
Not too long ago, a database could process transactions or perform analytics, but not both and certainly not at the same time. But new systems have OLAP (online analytical processing) and OLTP (online transaction processing) running simultaneously, a function called hybrid transaction/analytical processing (HTAP).
Gartner coined the HTAP moniker and outlined four key benefits: data doesn't have to move from operational databases to data warehouses; transactional data is readily available for analytics when created; drill-down from analytic aggregates always points to fresh HTAP application data; and you eliminate or at least reduce the need for multiple copies of the same data.
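The first two benefits, no data movement and analytics on data as soon as it is created, can be illustrated with a toy example. This is only a sketch of the idea: SQLite is not an HTAP system, and the `orders` table and the helper names `make_store`, `record_order` and `revenue_by_customer` are hypothetical. It shows transactional writes and an analytic aggregate running against the same store, with no export step in between.

```python
import sqlite3


def make_store():
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount REAL NOT NULL)"
    )
    return conn


def record_order(conn, customer, amount):
    """Transactional side: insert one order, committed atomically."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)",
            (customer, amount),
        )


def revenue_by_customer(conn):
    """Analytic side: aggregate over the same table the transactions write to."""
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    store = make_store()
    record_order(store, "ana", 10.0)
    record_order(store, "bo", 5.0)
    print(revenue_by_customer(store))  # reflects both orders immediately
    record_order(store, "ana", 2.5)
    print(revenue_by_customer(store))  # the new order is visible without any ETL
```

In a traditional split architecture, the aggregate would run against a separate warehouse that is loaded periodically, so the last order would not show up until the next load.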
APIs and Data Accessibility
Application programming interfaces have become the way of the world, with even once-considered-laggard federal agencies using APIs to make their databases more available to researchers and developers in projects such as openFDA.
Predictive APIs are becoming the latest thing, harnessing machine learning to delve into the new world of artificial intelligence.