With data becoming increasingly central to business strategies, data quality management has never been more important. So it is a little disheartening to see that just 40 percent of companies surveyed by 451 Research were very confident in their organization's data quality or its data quality management practices.
In fact, the research, sponsored by Blazent, found that a surprising 8.5 percent of respondents had no data quality management practices at all.
IT departments are primarily accountable for data quality at most of the surveyed companies, the research revealed. Cross-functional teams and other employees are largely not held responsible. This creates a big barrier to data quality, said Dan Ortega, Blazent's VP of marketing.
The Data Quality Divide
IT organizations and business teams need to work together on data quality initiatives, Ortega said, although this can be tough because of the traditional divide between IT and the business.
"Historically the two areas do not communicate well. I've been in IT since the early '80s, and I've never seen IT with a seat at the table at the planning sessions companies have at the beginning of the year," he said. "The only time most users deal with IT is when something breaks; they do not see IT as proactive and enabling. People do not think to look at IT and ask 'how can you help us use our customer information to launch more effective advertising programs? How can we improve our customer satisfaction levels?'"
In addition, he said, IT typically does not take a strategic view of data.
Data entry by employees was the top reason for poor data quality, cited by 57.5 percent of respondents, followed by data migration or conversion projects, mentioned by 47 percent, and mixed entries by multiple users (44 percent).
(It may be worth noting that 37 percent of respondents were in IT operations, vs. just 21 percent in business operations.)
Writing for Enterprise Apps Today, Jon Green, director of Product Management for BackOffice Associates, a provider of data management solutions, offered several suggestions for making business users more accountable for data quality. One is to develop a set of central data quality scorecards, then ask people to review them and drill down to their specific areas of responsibility to see whether any data assigned to them requires cleansing or resolution.
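To make the scorecard idea concrete, here is a minimal sketch in Python of what a per-owner data quality scorecard might look like. Everything in it is an assumption for illustration: the sample records, the "owner" field standing in for an area of responsibility, and the three validation rules are hypothetical and are not drawn from Green's article or from any particular product.

```python
from collections import defaultdict

# Hypothetical records; "owner" stands in for an area of responsibility.
RECORDS = [
    {"id": 1, "owner": "sales", "email": "ann@example.com", "country": "US"},
    {"id": 2, "owner": "sales", "email": "", "country": "US"},
    {"id": 3, "owner": "support", "email": "bob@example", "country": None},
    {"id": 4, "owner": "support", "email": "cy@example.com", "country": "DE"},
]


def find_issues(record):
    """Return the names of the (illustrative) rules this record violates."""
    issues = []
    email = record.get("email") or ""
    if not email:
        issues.append("missing_email")
    elif "@" not in email or "." not in email.split("@")[-1]:
        # Very rough shape check, not a real email validator.
        issues.append("malformed_email")
    if not record.get("country"):
        issues.append("missing_country")
    return issues


def build_scorecard(records):
    """Group records by owner and report the share that passes every rule."""
    totals = defaultdict(int)
    flagged = defaultdict(list)
    for record in records:
        owner = record["owner"]
        totals[owner] += 1
        issues = find_issues(record)
        if issues:
            flagged[owner].append((record["id"], issues))
    return {
        owner: {
            "records": totals[owner],
            "clean_pct": round(
                100 * (totals[owner] - len(flagged[owner])) / totals[owner], 1
            ),
            # The drill-down list: records this owner needs to cleanse.
            "needs_cleansing": flagged[owner],
        }
        for owner in totals
    }


if __name__ == "__main__":
    for owner, row in build_scorecard(RECORDS).items():
        print(owner, row)
```

Each owner sees one summary figure plus the specific records assigned to them that need attention, which is the "review and drill down" step in miniature. A real scorecard would presumably draw its rules and ownership assignments from the organization's own systems rather than from hard-coded values.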
Because business users are not too concerned about data quality when capturing data or entering it into systems, IT often ends up employing multiple data cleansing techniques, many of them manual in nature, according to the research. Nearly 45 percent of respondents said they found data errors by using reports and then taking subsequent actions to correct data.
Automated Data Quality Management
While use of automated data quality management methods is increasing, Ortega said, automation is happening more quickly with structured data than with unstructured data.
"Companies do not have a way to contextualize what is going on so they can end up with wildly inconsistent information on the same data point," he said. "Machine learning tools are not set up to handle unstructured data very well. Semantic processing tools can do it, but we are in the very early stages with those."
Most data quality management tools are geared toward technical users, although that is beginning to change, Ortega said.
"The holy grail is a business-level tool that has tight integration to the back end so that IT and the business are looking at the same data and there is more data consistency," he said. "Our whole premise is to get data consistent and contextualized so at least everyone is looking at the same thing."
The majority of respondents appear to be in the early stages of data quality management. Some 31 percent are implementing a data quality management plan, while another 33 percent are developing a plan. Twenty-four percent report that their current plan is working, while 6.5 percent said they need a new plan because the one they implemented is not working well.
Looking ahead, the research found strong interest in machine learning. Twenty-two percent of respondents already have a machine learning program. About 40 percent want to implement one within 12 months, and 14.5 percent would like to do so within 24 months. The two top uses for machine learning, each mentioned by about 67 percent of respondents, are predictive analytics and recommendation systems.
Machine learning currently works well for specific tasks, Ortega said, though organizations will face challenges as they broaden its usage. As with many data-oriented initiatives, data integration will likely be one of those challenges.
"Right now companies are building an algorithm that does a specific process," he said. "But how can you connect all of the different algorithms? To be really effective, all of the machine pieces need to be exchanging information with each other. The minute you open up the information silos, that is when things will really take off."