In this continuing series, we're looking at root causes of data quality problems and the business processes you can put in place to solve them. Companies rely on data to make significant decisions that can affect customer service, regulatory compliance, supply chain and many other areas. As you collect more and more information about customers, products, suppliers, transactions and billing, you must attack the root causes of poor data quality.
Root Cause Number Nine: Defining Data Quality
More and more companies recognize the need for data quality, but there are different ways to clean data and improve its quality, and different groups within a company often choose different ones. You can:
- Write some code and cleanse manually
- Handle data quality within the source application
- Buy tools to cleanse data
Root Cause Attack Plan
- Standardize Tools – Whenever possible, choose tools that aren’t tied to a particular solution. Having data quality only in SAP, for example, won’t help your Oracle, Salesforce and MySQL data sets. When picking a solution, select one that is capable of accessing any data, anywhere, at any time. It shouldn't cost you a bundle to leverage a common solution across multiple platforms and solutions.
- Data Governance – By setting up a cross-functional data governance team, you will have the people in place to define a common data model.
Root Cause Number Ten: Loss of Expertise
On almost every data-intensive project, there is one person whose legacy data expertise is outstanding. These are the folks who understand why some employee date of hire information is stored in the date of birth field and why some of the name attributes also contain tax ID numbers.
Data might be a kind of historical record for an organization. It might have come from legacy systems. In some cases, the same value in the same field will mean a totally different thing in different records. Knowledge of these anomalies allows experts to use the data properly.
If your project depends on this kind of undocumented expertise, there are some business processes you can follow to keep it from being lost.
Root Cause Attack Plan
- Profile and Monitor – Profiling the data will help you identify most of these types of issues. For example, if you have a tax ID number embedded in the name field, analysis will let you quickly spot it. Monitoring will prevent a recurrence.
- Document – Make sure experts document all of the anomalies and transformations that need to happen every time the data is moved, even though they may be reluctant to do so for fear of losing job security.
- Use Consultants – Expert employees may be so valuable and busy that there is no time to document the legacy anomalies. Outside consulting firms are usually very good at documenting issues and providing continuity between legacy and new employees.
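To make the Profile and Monitor step more concrete, here is a minimal sketch of a profiling check for the tax-ID-in-the-name-field anomaly mentioned above. It is an illustration under stated assumptions, not a substitute for a profiling tool: the function name `profile_name_field`, the `name` field, the sample rows, and the pattern (US-style SSN, EIN, or a bare nine-digit run) are all hypothetical and would need adjusting to your own data.

```python
import re

# Assumption: tax IDs look like US SSNs (123-45-6789), EINs (12-3456789),
# or a bare run of exactly nine digits. Adjust the pattern for your own data.
TAX_ID_PATTERN = re.compile(
    r"(?<!\d)(?:\d{3}-\d{2}-\d{4}|\d{2}-\d{7}|\d{9})(?!\d)"
)


def profile_name_field(records, field="name"):
    """Return (row_index, value) pairs whose field contains a tax-ID-like token."""
    suspects = []
    for index, record in enumerate(records):
        value = record.get(field) or ""  # treat missing or None values as empty
        if TAX_ID_PATTERN.search(value):
            suspects.append((index, value))
    return suspects


# Hypothetical sample rows, invented for illustration only.
sample_records = [
    {"name": "Maria Lopez"},
    {"name": "John Smith 123-45-6789"},
    {"name": "Acme Supply 12-3456789"},
    {"name": None},
]

suspects = profile_name_field(sample_records)
for index, value in suspects:
    print(f"row {index}: possible tax ID in name field: {value!r}")
```

Running the same check on every new load, and alerting when the list of suspects is not empty, is one simple way to cover the monitoring half of the step.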
This post is an excerpt from a white paper available here. More to come on this subject in the days ahead.
See also:
- Part One: The Basics
- Part Two: Renegades and Pirates
- Part Three: Secret Code and Corporate Evolution
- Part Four: Data Flow