"Applications come and go, but data is forever," said Justin Cave, lead database consultant at Distributed Database Consulting, noting that he frequently sees programmers designing a database to cater to a specific application without considering future data needs. While they are often under pressure to deliver a database in a relatively short period of time, Cave said taking design shortcuts to help meet a deadline can make it much tougher to add new functionality in the future.
"If you properly design a database upfront, it is going to work for everybody. If you skip some functionality because 'we don't really need that,' you can count on someone needing it in the future," he said.
Too many programmers "treat databases like a data dump or a black box," Cave said, and choose to build database-independent applications. While there are instances where this is appropriate, it is not a database design best practice. "When you are not using the power of the database, you end up making things less efficient and harder to maintain," he said.
Utilizing the database makes it relatively easy to reuse design elements and add functionality. "But if everything is built into the application, you'll have to rebuild validation and a lot of the logic and you'll never get the same information," he said.
"There are so many ways that data gets used, it's hard for someone whose task is to 'build X' to envision that," Cave said, adding that it's not uncommon for developers to wait until the final stages of development to interact with database administrators. The result may be a lack of attention to details important to DBAs, such as naming conventions which make it easy to identify objects contained in the database. Given that DBAs may not have much time to spare to work with developers, Cave suggested it's a good idea for developers to "learn to think like a DBA."
Despite widespread agreement that software engineering is important, many programmers forego data models in database design, said Michael Blaha, a partner at Modelsoft Consulting and author of seven books including "Object-Oriented Modeling and Design." Noting that "data is the memory of an enterprise," he said, "Code is important, but replaceable; an organization can always rewrite an app or purchase a new app. There is no excuse for not constructing at least a summary data model."
Many developers lack imagination in how applications can interact with a database, Blaha said. They may choose to use a database layer, hiding the database with programming objects that make database data indistinguishable from in-memory data. While this is a viable strategy, it has limitations.
"There are additional data interaction strategies," Blaha said. "For example, dedicated methods can encapsulate data access. Some methods can be written as stored procedures. Others can be written in programming code. The remaining programming logic accesses the database via these special methods."
Rather than "obsessing on use cases" and trying to capture requirements from scratch, Blaha said developers can reuse database design ideas that have worked well in the past.
"For example, I often reverse engineer databases -- databases of an application to be replaced as well as databases of related applications," he said. "These existing databases often do not have an available data model. But a data model is implicit in the database schema and can be at least partially extracted with database reverse engineering techniques."
Another good example of reuse is incorporating data patterns, Blaha said, noting that one of his books, "Patterns of Data Modeling," explains many data patterns. "There are tried-and-true data representations that often occur and need not be recreated from scratch."
Every foreign key in a database should be covered by an index, Blaha said. While some foreign keys are covered by implicit indexes that are created as a side effect of primary and alternate key definitions, developers should create an explicit index for the other foreign keys, he explained.
"Database queries often traverse from a foreign key to its primary key," Blaha said. "The resulting joins are efficient if primary keys are defined and every foreign key has an index. Just one missing index can degrade query performance by several orders of magnitude. It's hard to predict the traversals that might occur."
Even if you're not planning on partitioning now, the application will likely need to evolve to support it at some point. Bryn Rhodes, co-owner and software architect for Database Consulting Group, said selection of a key type that easily supports partitioning is a database design best practice that can mean the difference between a smooth transition and a difficult upgrade.
Optimal structures for one use case are often different than for another, especially when considering access paths to data. Make sure you take multiple perspectives into account so you find these potential issues early, Rhodes said, and build structures that enable data access in various ways.
When modeling the structures, be sure to consider that everything you build is going to have to be consumed by a data access layer at some point. Avoid using proprietary extensions to SQL unless there is a very good reason to do so. And if you must use proprietary extensions, try to provide some insulation for them.