
Tuesday, August 30, 2011

Top Ten Root Causes of Data Quality Problems: Part Four

Part 4 of 5: Data Flow
In this continuing series, we're looking at root causes of data quality problems and the business processes you can put in place to solve them.  In part four, we examine some of the areas involving the pervasive nature of data and how it flows to and fro within an organization.

Root Cause Number Seven: Transaction Transition

More and more data is exchanged between systems through real-time (or near real-time) interfaces. As soon as the data enters one database, it triggers procedures necessary to send transactions to other downstream databases. The advantage is immediate propagation of data to all relevant databases.

However, what happens when transactions go awry? A malfunctioning system could cause problems with downstream business applications.  In fact, even a small data model change could cause issues.

Root Cause Attack Plan
  • Schema Checks – Employ schema checks in your job streams to make sure your real-time applications are producing consistent data.  Schema checks will do basic testing to make sure your data is complete and formatted correctly before loading.
  • Real-time Data Monitoring – One level beyond schema checks is to proactively monitor data with profiling and data monitoring tools.  Tools like the Talend Data Quality Portal and others will ensure the data contains the right kind of information.  For example, if your part numbers are always a certain shape and length, and contain a finite set of values, any variation on that attribute can be monitored. When variations occur, the monitoring software can notify you.
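The part-number scenario above can be sketched as a simple monitoring rule. This is a minimal illustration, not any particular tool's API; the pattern and sample values are hypothetical:

```python
import re

# Hypothetical rule: part numbers are two digits, a dash, then three
# digits (e.g. "21-998"). Anything else is flagged for review.
PART_NUMBER_PATTERN = re.compile(r"^\d{2}-\d{3}$")

def monitor_part_numbers(values):
    """Return the values that violate the expected shape."""
    return [v for v in values if not PART_NUMBER_PATTERN.match(v)]

# "2-1-998" and "21998" break the rule and would trigger a notification
violations = monitor_part_numbers(["21-998", "22-104", "2-1-998", "21998"])
```

A real monitoring tool would evaluate rules like this continuously against incoming transactions and alert a steward when the violation rate spikes.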

Root Cause Number Eight: Metadata Metamorphosis

A metadata repository should be shareable across multiple projects, with an audit trail maintained on usage and access.  For example, your company might have part numbers and descriptions that are universal to CRM, billing, ERP systems, and so on.  When a part number becomes obsolete in the ERP system, the CRM system should know. Metadata changes and needs to be shared.

In theory, documenting the complete picture of what is going on in the database and how various processes are interrelated would allow you to mitigate the problem entirely. The descriptions and part numbers need to be shared among all applicable applications. With that documentation in place, you could analyze the data quality implications of any change in code, processes, data structure, or data collection procedures, and thus eliminate unexpected data errors. In practice, this is a huge task.

Root Cause Attack Plan
  • Predefined Data Models – Many industries now have basic definitions of what should be in any given set of data.  For example, the automotive industry follows certain ISO 8000 standards.  The energy industry follows Petroleum Industry Data Exchange standards or PIDX.  Look for a data model in your industry to help.
  • Agile Data Management – Data governance is achieved by starting small and building out a process that first fixes the most important problems from a business perspective. You can leverage agile solutions to share metadata and set up optional processes across the enterprise.

This post is an excerpt from a white paper available here. My final post on this subject will come in the days ahead.

Monday, August 29, 2011

Top Ten Root Causes of Data Quality Problems: Part Three

Part 3 of 5: Secret Code and Corporate Evolution
In this continuing series, we're looking at root causes of data quality problems and the business processes you can put in place to solve them.  In part three, we examine secret code and corporate evolution as two of the root causes for data quality problems.

Root Cause Number Five: Corporate Evolution
Change is good… except for data quality
An organization undergoes business process change to improve itself. Good, right?  Prime examples include:
  • Company expansion into new markets
  • New partnership deals
  • New regulatory reporting laws
  • Financial reporting to a parent company
  • Downsizing
If data quality is defined as “fitness for purpose,” what happens when the purpose changes? It’s these new data uses that bring about changes in the perceived level of data quality, even though the underlying data is the same. It’s natural for data to change.  As it does, the data quality rules, business rules and data integration layers must also change.

Root Cause Attack Plan
  • Data Governance – By setting up a cross-functional data governance team, you will always have a group looking at the changes your company is undergoing and considering their impact on information. In fact, this should be in the charter of a data governance team.
  • Communication – Regular communication and a well-documented metadata model will make the process of change much easier.
  • Tool Flexibility – One of the challenges of buying data quality tools embedded within enterprise applications is that they may not work in all enterprise applications. When you choose tools, make sure they are flexible enough to work with data from any application and that the vendor is committed to flexibility and openness.

Root Cause Number Six: Secret Code
Databases rarely begin their life empty. The starting point is typically a data conversion from some previously existing data source. The problem is that while the data may work perfectly well in the source application, it may fail in the target. It’s difficult to see all the custom code and special processes that happen beneath the data unless you profile.

Root Cause Attack Plan
  • Profile Early and Often – Don’t assume your data is fit for purpose because it works in the source application. Profiling will give you an exact evaluation of the shape and syntax of the data in the source.  It also will let you know how much work you need to do to make it work in the target.
  • Corporate Standards - Data governance will help you define corporate standards for data quality.
  • Apply Reusable Data Quality Tools When Possible – Rather than custom code in the application, a better strategy is to let data quality tools apply standards.  Data quality tools will apply corporate standards in a uniform way, leading to more accurate sharing of data.
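As an illustration of what profiling reveals, here is a minimal sketch of column profiling. The digits-as-9, letters-as-A pattern notation is a common profiling convention; the sample values are hypothetical:

```python
from collections import Counter

def profile_column(values):
    """Minimal profile: counts, lengths, and most common value patterns."""
    def pattern(v):
        # Map each character to a class: digit -> 9, letter -> A
        return "".join("9" if c.isdigit() else "A" if c.isalpha() else c
                       for c in v)

    non_null = [v for v in values if v]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "min_len": min((len(v) for v in non_null), default=0),
        "max_len": max((len(v) for v in non_null), default=0),
        "top_patterns": Counter(pattern(v) for v in non_null).most_common(3),
    }

profile_column(["21-998", "22-104", "", "A13"])
# The profile exposes one null and two shapes: "99-999" and "A99"
```

A report like this, run against the source before conversion, tells you immediately which values will fail the target system's rules.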

This post is an excerpt from a white paper available here. The final posts on this subject will come in the days ahead.

Wednesday, August 24, 2011

Top Ten Root Causes of Data Quality Problems: Part One

Part 1 of 5: The Basics
We all know data quality problems when we see them.  They can undermine your organization’s ability to work efficiently, comply with government regulations and generate revenue. The specific technical problems include missing data, misfielded attributes, duplicate records and broken data models, to name just a few.
But rather than merely patching up bad data, most experts agree that the best strategy for fighting data quality issues is to understand the root causes and put new processes in place to prevent them.  This five part blog series discusses the top ten root causes of data quality problems and suggests steps the business can implement to prevent them.
In this first blog post, we'll confront some of the more obvious root causes of data quality problems.

Root Cause Number One: Typographical Errors and Non-Conforming Data
Despite a lot of automation in our data architecture these days, data is still typed into Web forms and other user interfaces by people. A common source of data inaccuracy is that the person manually entering the data just makes a mistake. People mistype. They choose the wrong entry from a list. They enter the right data value into the wrong box.

Given complete freedom on a data field, those who enter data have to go from memory.  Is the vendor named Grainger, WW Granger, or W. W. Grainger? Ideally, there should be a corporate-wide set of reference data so that forms help users find the right vendor, customer name, city, part number, and so on.
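A corporate reference table like the one described could be sketched in code as a simple alias lookup. The alias list here is hypothetical, built around the Grainger example above:

```python
# Hypothetical corporate reference table mapping known variants of a
# vendor name to a single canonical form.
VENDOR_ALIASES = {
    "grainger": "W. W. Grainger",
    "ww grainger": "W. W. Grainger",
    "w. w. grainger": "W. W. Grainger",
}

def canonical_vendor(entered):
    """Resolve a typed-in vendor name to its canonical form.

    Unknown entries fall through unchanged, where a data steward
    would review them and extend the alias table.
    """
    key = " ".join(entered.lower().split())  # case- and space-insensitive
    return VENDOR_ALIASES.get(key, entered)
```

In a form, the same table would drive a type-ahead list so users pick the canonical name instead of typing one from memory.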

Root Cause Attack Plan
  • Training – Make sure that those people who enter data know the impact they have on downstream applications.
  • Metadata Definitions – By locking down exactly what people can enter into a field using a definitive list, many problems can be alleviated. This metadata (for vendor names, part numbers, and so on) can become part of data quality in data integration, business applications and other solutions.
  • Monitoring – Make public the results of poorly entered data and praise those who enter data correctly. You can keep track of this with data monitoring software such as the Talend Data Quality Portal.
  • Real-time Validation – In addition to forms validation, data quality tools can be implemented to validate addresses, e-mail addresses and other important information as it is entered. Ensure that your data quality solution provides the ability to deploy data quality in application server environments, in the cloud or in an enterprise service bus (ESB).

Root Cause Number Two: Information Obfuscation
Data entry errors might not be completely by mistake. How often do people give incomplete or incorrect information to safeguard their privacy?  If there is nothing at stake for those who enter data, there will be a tendency to fudge.

Even if the people entering data want to do the right thing, sometimes they cannot. If a field is not available, an alternate field is often used. This can lead to such data quality issues as having Tax ID numbers in the name field or contact information in the comments field.
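The Tax-ID-in-the-name-field problem can be caught with a simple scan. This is a minimal sketch assuming U.S. EIN-style numbers ("12-3456789"); the pattern and sample records are hypothetical:

```python
import re

# Hypothetical check: a U.S. Tax ID (EIN) looks like "12-3456789".
# A name field should never contain that shape.
TAX_ID_SHAPE = re.compile(r"\b\d{2}-\d{7}\b")

def find_misfielded_names(records):
    """Flag records whose name field appears to hold a Tax ID."""
    return [r for r in records if TAX_ID_SHAPE.search(r.get("name", ""))]

rows = [{"name": "Acme Corp"}, {"name": "12-3456789"}]
find_misfielded_names(rows)  # flags the second record for steward review
```

The real fix, of course, is to add the missing Tax ID field to the form so users stop borrowing the name field in the first place.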

Root Cause Attack Plan
  • Reward – Offer an incentive for those who enter personal data correctly. This should be focused on those who enter data from the outside, like those using Web forms. Employees should not need a reward to do their job. The type of reward will depend upon how important it is to have the correct information.
  • Accessibility – As the technologist in charge of data stewardship, be open and accessible to criticism from users. Give them a voice when process changes require technology changes.  If you’re not accessible, users will look for quiet ways around your forms validation.
  • Real-time Validation – In addition to forms validation, data quality tools can be implemented to validate addresses, e-mail addresses and other important information as it is entered.
This post is an excerpt from a white paper available here. More to come on this subject in the days ahead.

Thursday, March 10, 2011

My Interview in the Talend Newsletter

Q. Some people would say that data quality technology is mature and that the topic is sort of stale. Are there major changes happening in the data quality world today?
A. Probably the biggest over-arching change we see today is that the distinction between those managing data from the business standpoint and those managing the technical aspects of data quality is getting more and more blurry. It used to be that data quality was... read more

Friday, December 10, 2010

Six Data Management Predictions for 2011

This time of year everyone makes prognostications about the state of the data management field for 2011. I thought I’d take my turn by offering my predictions for the coming year.

Data will become more open
In the old days, good-quality reference data was an asset kept in the corporate lockbox. If you had a good reference table for common misspellings of parts, cities, or names, for example, the mindset was to keep it close and away from falling into the wrong hands.  The data might have been sold for profit or simply not made available.  Today, there really is no “wrong hands”.  Governments and corporations alike are seeing the societal benefits of sharing information. More reference data is there for the taking on the internet from sites like data.gov and geonames.org.  That trend will continue in 2011.  Perhaps we’ll even see some of the bigger players make announcements as to the availability of their data. Are you listening, Google?

Business and IT will become blurry
It’s becoming harder and harder to tell an IT guy from the head of marketing. That’s because in order to succeed, the IT folks need to become more like the marketer and vice versa.  In the coming year, the difference will be less noticeable as business people get more and more involved in using data to their benefit.  Newsflash One: If you’re in IT, you need marketing skills to pitch your projects and get funding.  Newsflash Two: If you’re in business, you need to know enough about data management practices to succeed.

Tools will become easier to use
As the business users come into the picture, they will need access to the tools to manage data.  Vendors must respond to this new marketplace or die.

Tools will do less heavy lifting
Despite the improvements in the tools, corporations will turn to improving processes and reporting in order to achieve better data management. Dwindling are the days when we deal with data so poorly managed that it requires overly complicated data quality tools.  We’re getting better at the data management process, and therefore the burden on the tools becomes lighter. Future tools will focus on supporting process improvement with workflow features, reporting and better graphical user interfaces.

CEOs and Government Officials will gain enlightenment
Feeding off the success of a few pioneers in data governance as well as failures of IT projects in our past, CEOs and governments will gain enlightenment about managing their data and put teams in place to handle it.  It has taken decades of our sweet-talk and cajoling for government and CEOs to achieve enlightenment, but I believe it is practically here.

We will become more reliant on data
Ten years ago, it was difficult to imagine where we would be today with respect to our data addiction. Today, data is a pervasive part of our internet-connected society, living in our PCs, our TVs, our mobile phones and many other devices. It’s a huge part of our daily lives. As I’ve said in past posts, the world is addicted to data, and that bodes well for anyone who helps the world manage it. In 2011, no matter if the economy turns up or down, our industry will continue to feed the addiction to good, clean data.

Wednesday, July 28, 2010

DGDQI Viewer Mail

From time to time, people read my blog or book and contact me to chat about data governance and data quality. I welcome it. It’s great to talk to people in the industry and hear their concerns.

Occasionally, I see things in my in-box that bother me, though.  Here is one item that I’ll address in a post. The names have been changed to protect the innocent.

A public relations firm asked:

Hi Steve,
I wonder if you could answer these questions for me.
- What are the key business drivers for the advent of data governance software solutions?
- What industries can best take advantage of data governance software solutions?
- Do you see cloud computing-based data governance solutions developing?

I couldn’t answer these questions, because they all pre-supposed that data governance is a software solution.  It made me wonder if I have made myself clear enough on the fact that data governance is mostly about changing the hearts and minds of your colleagues to re-think their opinion of data and its importance.  Data governance is a company’s mindful decision that information is important and they’re going to start leveraging it. Yes, technology can help, but a complete data governance software solution would have more features than a Workchamp XL Swiss Army Knife. It would have to include data profiling, data quality, data integration, business process management, master data management, wikis, a messaging platform, a toothpick and a nail file in order to be complete. 

Can you put all this on the cloud?  Yes.  Can you put the hearts and minds of your company on a cloud?  If only it were that easy...

Thursday, May 13, 2010

Three Conversations to Have with an Executive - the Only Three

If you’re reading this, you’re most likely in the business of data management. In many companies, particularly large ones, the folks who manage data don’t talk much to the executives. But every so often, there is that luncheon, a chance meeting in the elevator, or even a break from a larger meeting where you and an executive are standing face to face.  (S)he asks what you’re working on. Like a Boy Scout, be prepared.  Keep your response to one of these three things:

  1. Revenue – How has your team increased revenue for the corporation?
  2. Efficiency – How has your team lowered costs by improving efficiency for the corporation?
  3. Risk – How have you and your team lowered the risk to the corporation with better compliance to corporate regulations?

The executive doesn’t want to hear about schemas, transformations or even data quality. Some examples of appropriate responses might include:

  • We work on making the CRM/ERP system more efficient by keeping an eye on the information within it. My people ensure that the reports are accurate and complete so you have the tools to make the right decisions.
  • We’re doing things like making sure we’re in compliance with [HIPAA/Solvency II/Basel II/Antispam] so no one runs afoul of the law.
  • We’re speeding up the time it takes to get valuable information to the [marketing/sales/business development] team so they can react quickly to sales opportunities.
  • We’re fixing [business problem] to [company benefit].

When you talk to your CEO, it’s your opportunity to get him/her in the mindset that your team is beneficial, so when it comes to funding, it will be something they remember. It’s your chance to get out of the weeds and elevate the conversation.  Let the sales guys talk about deals. Let the marketing people talk about market forces or campaigns. As data champions, we also need to be prepared to talk about the value we bring to the game.

Tuesday, February 16, 2010

The Secret Ingredient in Major IT Initiatives

One of my first jobs was that of assistant cook at a summer camp.  (In this case, the term ‘cook’ was loosely applied meaning to scrub pots and pans for the head cook.) It was there I learned that most cooks have ingredients that they tend to use more often.  The cook at Camp Marlin tended to use honey where applicable.  Food TV star Emeril likes to use garlic and pork fat.  Some cooks add a little hot pepper to their chocolate recipes – it is said to bring out the flavor of the chocolate.  Definitely a secret ingredient.
For head chefs taking on major IT initiatives, the secret ingredient is always data quality technology. Attention to data quality doesn’t make the recipe of an IT initiative on its own so much as it makes the whole initiative better.  Let’s take a look at how this happens.

Profiling
No matter what the project, data profiling provides a complete understanding of the data before the project team attempts to migrate it, helping the team create a more accurate plan for integration.  Migrating data to your new solution as-is is ill-advised; it can lead to major cost overruns and project delays as you load and reload the data.

Customer Relationship Management (CRM)
By using data quality technology in CRM, the organization will benefit from a cleaner customer list with fewer duplicate records. Data quality technology can work as a real-time process, limiting the number of typos and duplicates in the system and thus improving call center efficiency.  Data profiling can also help an organization understand and monitor the quality of a purchased list before integration, avoiding issues with third-party data.

Enterprise Resource Planning (ERP) and Supply Chain Management (SCM)

If data is accurate, you will have a more complete picture of the supply chain. Data quality technology can be used to more accurately report inventory levels, lowering inventory costs. When you make it part of your ERP project, you may also be able to improve bargaining power with suppliers by gaining better intelligence about your corporate buying power.

Data Warehouse and Business Intelligence
Data quality helps disparate data sources act as one when migrated to a data warehouse; by standardizing disparate data, it makes the warehouse possible. You will be able to generate more accurate reports when analyzing sales patterns, revenue, customer demographics and more.
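As a small illustration of standardizing disparate sources before they land in the warehouse, here is a sketch that normalizes dates to a single format. The three source formats are hypothetical stand-ins for whatever your systems actually use:

```python
from datetime import datetime

# Hypothetical: three source systems store dates in different formats.
SOURCE_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%d %b %Y"]

def standardize_date(raw):
    """Normalize any known source format to ISO 8601 for the warehouse."""
    for fmt in SOURCE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None  # unparseable: route to a data-quality exception queue

standardize_date("07/28/2010")  # -> "2010-07-28"
```

The same pattern applies to units, codes and names: map every source representation to one warehouse standard, and send the stragglers to an exception queue rather than loading them dirty.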

Master Data Management (MDM)
Data quality is a key component of master data management. An integral part of making applications communicate and share data is to have standardized data.  MDM enhances the basic premise of data quality with additional features like persistent keys, a graphical user interface to mitigate matching, the ability to publish and subscribe to enterprise applications, and more.

So keep in mind, when you decide to improve data quality, it is often because of your need to make a major IT initiative even stronger.  In most projects, data quality is the secret ingredient to make your IT projects extraordinary.  Share the recipe.

Monday, February 1, 2010

A Data Governance Mission Statement

Every organization, including your data governance team, has a purpose and a mission. It can be very effective to communicate that mission in a mission statement to show the company that you mean business.  When you show the value of your team, it can change your relationship with management for the better.

The mission statement should pay tribute to the mission of the organization with regard to values, while defining why the data governance organization exists and setting a big picture goal for the future.
The data governance mission statement could revolve around any of the following key components:

  • increasing revenue
  • lowering costs
  • reducing risks (compliance)
  • meeting any of the organization’s other policies such as being green or socially responsible

The most popular format seems to follow:
Our mission is to [purpose] by doing [high level initiatives] to achieve [business benefits]

So, let’s try one:
Our mission is to ensure that the highest quality data is delivered via a company-wide data governance strategy for the purpose of improving the efficiency, increasing the profitability and lowering the risk of the business units we serve.
Flopped around:
Our mission is to improve the efficiency, increase the profitability and lower the business risks of Acme’s business units by ensuring that the highest quality data is delivered via a company-wide data governance strategy.
Not bad, but a mission statement should be inspiring to the team and to management. Since the passions of the company described above are unknown, it’s difficult for a generic mission statement to be inspirational about the data governance program. That’s up to you.
 
Goals & Objectives
There are mission statements and there are objectives. While every mission statement should say who you are and why you exist, every objective should specify what you’re going to do and the results you expect.  Objectives include activities that can be easily tracked, measured and achieved, and that, of course, serve the mission.  When you start data governance projects, you can look back to the mission statement to make sure you’re on track. Are you using your people and technology in a way that will benefit the company?

Staying On Mission
When you take on a new project, the mission statement can help protect you and ensure that the project is worthwhile for both the team and the company. The mission statement should be considered a way to block busy-work and unimportant projects.  In our mission statement example above, if the project doesn’t improve efficiency, lower costs or lower business risk, it should not be considered.


In this case, you can clearly map three projects to the mission, but the fourth project is not as clear.  Dig deeper into the mainframe project to see if any efficiency will come out of the migration.  Is the data being used by anyone for a business purpose?

A Mission Never Ends
A mission statement is a written declaration of a data governance team's purpose and focus. This focus  normally remains steady, while objectives may change often to adapt to changes in the business environment. A properly crafted mission statement will serve as a filter to separate what is important from what is not and to communicate your value to the entire organization.


Tuesday, November 10, 2009

Overcoming Objections to a Data Governance Program


You’ve created a wonderful proposal for a comprehensive data governance program. You’ve brought it up to management, but the chiefs tell you there’s just no budget for data governance. Now what?

The best thing you can do is keep at it. It often takes time to win the hearts and minds of your company. You know that any money spent on data governance will usually come back with multipliers. It just may take some time for others to get on board. Be patient and continue to promote your quest.

Here are some ideas for thinking about your next steps for your data governance program:

Corporate Revenue
Today, companies manage spending tightly, looking at the expenses and revenue each fiscal quarter and each month to optimize the all-important operating income (revenue minus expenses equals operating income). If sales and revenue are weak, management gets miserly. On the other hand, if revenue is high and expenses are low, your high-ROI proposal will have a better chance for approval.

For many people, this corporate reality is hard to deal with. Logical thinkers would suggest that if something is broken, it should be fixed, no matter how well the sales team is performing. The people who run your business have their first priorities set on stockholder value. You too should pay attention to your company’s sales figures as they are announced each quarter. If your company has a quarterly revenue call, use it to strike when the environment for spending is right.

Cheap Wins
If there is no money to spend on information quality, there still may be potential for information quality wins for you to exploit. For example, let’s say you were to profile or make some SQL queries into your company’s supply chain system database and you found a part that has a near duplicate. So, part number “21-998 Condenser” and part number “2-1-998 Cndsr” exist as duplicated parts in your supply chain.

After verifying the fairly obvious duplicate, you can ask your friend on the procurement side how much it costs to store and hold these condensers in inventory. Then use some guerrilla marketing techniques to extol the virtues of data governance. After all, if you could find this with just SQL queries, consider how much you could find with a data discovery/profiling tool. Better yet, consider how much you could find with a company-wide initiative.  In a previous blog post, I referred to this as the low-hanging fruit.
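The condenser example can also be found programmatically. Here is a minimal near-duplicate sketch using Python's standard difflib; the 0.6 similarity threshold is an arbitrary assumption you would tune against your own data:

```python
import re
from difflib import SequenceMatcher

def likely_duplicates(a, b, threshold=0.6):
    """Flag two (part_number, description) tuples as probable duplicates
    when the numbers match after stripping punctuation and the
    descriptions are similar enough."""
    num_a = re.sub(r"[^0-9]", "", a[0])   # "2-1-998" -> "21998"
    num_b = re.sub(r"[^0-9]", "", b[0])
    desc_sim = SequenceMatcher(None, a[1].lower(), b[1].lower()).ratio()
    return num_a == num_b and desc_sim >= threshold

# The two hypothetical supply-chain entries from the text flag as duplicates
likely_duplicates(("21-998", "Condenser"), ("2-1-998", "Cndsr"))
```

A dedicated profiling tool does the same thing at scale, with tuned matching algorithms instead of a one-line similarity ratio.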

Case Studies
Case studies are a great way to spread the word about data governance. They usually contain real-world examples, often of your competitors, who are finding gold with better attention to information quality. Vendors in the data governance space will have case studies on their websites, or you can get unpublished studies by asking your sales representative.

Consider that built-in desire of your company to be competitive, and keep your Google searches and alerts tuned to what data management projects are underway at your competitors.

Analysts
Analysts are another valuable source for proving your point about the virtues of data governance. Your boss may have installed his own custom spam filter against your cajoling on data governance. But he doesn’t have to take your word for it; he can listen to an industry expert.

If you own a subscription to an analyst firm, use it to sell the power of data governance. Analysts offer telephone consultations, reports and webinars to clients. These offerings may be useful to sway your team.  If you are not a client of these firms, go to the vendors. If there is a crucial report, they will often license it to offer on their website for download, particularly if it speaks well about their solution.

Data Governance Expert Sessions
This technique also falls within the category of “don’t just take my word for it.” You can find a data governance workshop from many vendors to assist your organization with developing your data quality strategies. Often conducted for a group, the session leader interacts with a group of your choosing and presents the potential for improving the efficiency of your business with data governance. As the meeting leader, you would invite both technologists and business users. Include those who are skeptical of the value a data-quality program will bring to their company; a third-party opinion may sway them. The cost is usually reasonable and it can help the group understand and share key concepts of data governance.

Guerrilla Marketing
Why not start your own personal crusade, your own marketing initiative to drive home the power of information quality? In my previous installment of the data governance blog, I offer graphics for use in your signature file to drive home the importance of IQ to your organization. Use the power of a newsletter, blog, or e-mail signature to get your message across.


Excerpt from Steve Sarsfield's book "The Data Governance Imperative"

Monday, August 24, 2009

9 Questions CEOs Should Ask About Data Governance

When it comes to data governance, the most influential power in an organization is the executive team (presidents, vice presidents, managing directors, and CxOs). Sure, business users control certain aspects of the initiative and may even want to hold them back to maintain data ownership. It’s also true that the technology team is influential, but it may be short on staff, short on budget and busy with projects like software upgrades. So, it sometimes falls to executives to push data governance as a strategic initiative when the vision doesn’t come from elsewhere.

It makes sense. Executives have the most to gain from a data governance program. Data governance brings order to the business, offering the ability to make effective and timely decisions. By implementing a data governance program, you can make fewer decisions based on ‘gut’ and better decisions based on knowledge. It’s an executive’s job to strive for greater control and lower risk, and that can’t be achieved without some form of data governance.

Rather than issuing edicts, a tactic many smart executives employ is to ask questions. Questioning your IT and business teams is a form of fact-checking your decisions, understanding shortcomings in skills and resources, and empowering your people. It ultimately allows your people to come to the same decision you may have already reached. It is a very gracious way to manage.

Therefore, asking questions about data governance is an important job for a CEO. Some of the questions you should be asking your technology leaders are as follows:

Question: Do we have a data management strategy?

Impact: Ask this question to understand whether your people have considered data governance. If you have a strategy, you should know who the people are, how they are organized around providing information to the corporation, and what the processes for information are.

Question: Are we ahead of or behind our competitors with regard to business intelligence and data governance?

Impact: Case studies on managing data are widely available on vendor web sites. It’s important to understand whether any of your competitors are outflanking you on the efficiencies gained from data governance.

Question: What is poor information quality costing us?

Impact: Has your technology team even considered the business impact of information quality on the bottom line, or are they just accepting these costs as standard operating procedure?

Question: What confidence level do you have in my revenue reports?

Impact: Has your team considered the impact of information quality on business intelligence, and therefore on the reports they are handing you?

Question: Are we in compliance with all laws regarding our governance of data?

Impact: Executives are often culpable for non-compliance, so you should be concerned about any laws that govern the company’s industry. This holds especially true in banking and healthcare, but even in unregulated industries, organizations must comply with spam laws and “do not mail” laws for marketing, for example.

Question: Are you working across business units toward data governance, or is data quality handled in silos?

Impact: To provide the utmost efficiency, information quality processes should be reusable and implemented in a similar manner across business units. This is done for exactly the same reason you might standardize on a type of desktop computer or software package for your business – it’s more efficient to share training resources and support, and to work better as a team. Taking successful processes from one business unit and extending them to others is the best strategy.

Question: Do you have access to the data you need?

Impact: The CEO should understand whether office politics are getting in the way of ensuring that the business has the information it needs. This question opens the door to that discussion.

Question: How many people in your business unit are managing data?

Impact: To understand whether you need a unified process for managing data, it often helps to look at the organizational chart and figure out how many people already manage it. A centralized strategy for data governance may prove more efficient.

Question: Who owns the information in your business unit? If something goes right, who should I praise, and if something goes wrong, who should I reprimand?

Impact: The business should understand who is accountable for adverse events with regard to information. If, for example, you lose revenue by sending the wrong type of customer discount offers, or you can’t deliver your product because of problems with inventory data, there should be someone responsible. Take action if the answer cannot easily be given.

By asking these questions, you’ll open up the door to some great discussions about data governance. It should allow you to be a maverick for all of your company’s data needs. Thanks to Ajay Ohri for posing this question to me in last week’s interview; it’s something every executive should consider.

Tuesday, July 21, 2009

Data Quality – Technology’s Prune

Prunes. When most of us think of prunes, we tend to think of a cure for older people suffering from constipation. In reality, prunes are not only sweet but also highly nutritious: they are a good source of potassium and dietary fiber. Prunes suffer from a stigma that just isn’t there for dried apricots, figs and raisins, which have similar nutritional and medicinal benefits. Prunes suffer from bad marketing.

I have no doubt that data quality is considered technology’s prune by some. We know that information quality is good for us, having many benefits to the corporation. It also can be quite tasty in its ability to deliver benefit, yet most of our corporations think of it as a cure for business intelligence constipation – something we need to “take” to cure the ills of the corporation. Like the lowly prune, data quality also suffers from bad marketing.

In recent years, prune marketers in the United States have begun marketing their product as "dried plums” in an attempt to get us to change the way we think about them. Commercials show the younger, soccer Mom crowd eating the fruit and being surprised at its delicious flavor. It may take some time for us to change our minds about prunes. I suppose if Lady Gaga or Zac Efron would be spokespersons, prunes might have a better chance.

The biggest problem in making data quality beloved by the business world is that it’s, well… hard to explain. When we talk about it, we get crazy with metadata models and profiling metrics. That’s great when we’re communicating among data professionals, but that talk tends to plug up business users.

In my recent presentations and in recent blog posts, I’ve made it clear that it’s up to us, the data quality champions, to market data quality, not as a BI laxative, but as a real business initiative with real benefits. For example:

  • Take a baseline measurement and track ROI, even if you think you don’t have to
  • If the project has no ROI, you should not be doing it. Find the ROI by asking the business users of the data what they use it for.
  • Aggregate and roll up our geeky metrics of nulls, accuracy, conformity, etc. into metrics that a business user would understand – like “according to our evaluation, 86.4% of our customers are fully reachable by mail.”
  • Create and use the aggregated scores similar to the Dow Jones Industrial Average. Publish them at regular intervals. To raise awareness of the data quality, talk about why it’s up and talk about why it has gone down.
  • Have a business-focused elevator pitch ready when someone asks you what you do. “My team is saving the company millions by ensuring that the ERP system accurately reflects inventory levels.”
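To make the roll-up idea concrete, here’s a minimal sketch of turning raw field-level checks into one business-facing number. The customer records, field names and the “all four fields required” rule are my own assumptions for illustration:

```python
# Hypothetical sketch: roll field-level completeness checks up into one
# business-facing metric ("% of customers fully reachable by mail").
customers = [
    {"name": "Acme Corp", "street": "12 Main St", "city": "Boston", "zip": "02101"},
    {"name": "Widget Co", "street": "", "city": "Hartford", "zip": "06103"},
    {"name": "Bolt Ltd", "street": "9 Elm Ave", "city": "Albany", "zip": ""},
]

# Assumption: all four fields must be populated for a mailing to succeed.
REQUIRED_FIELDS = ["name", "street", "city", "zip"]

def reachable_by_mail(record):
    """A customer is mailable only when every required field is populated."""
    return all(record.get(f) for f in REQUIRED_FIELDS)

mailable = sum(reachable_by_mail(c) for c in customers)
score = 100.0 * mailable / len(customers)
print(f"{score:.1f}% of our customers are fully reachable by mail")
```

The published number speaks the business’s language (“reachable by mail”), while the null-and-conformity checks stay under the hood.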
Of course, there’s more – in my previous blog posts, in posts yet to come, and in my book, The Data Governance Imperative. Marketing the value of data quality is just something we all need to do more of. Not selling the business importance of data quality… it’s just plum-crazy!

Monday, July 13, 2009

Data Quality Project Selection

What if you have five data intensive projects that are all in need of your very valuable resources for improving data quality? How do you decide where to focus? The choice is not always clear. Management may be interested in accurate reporting from your data warehouse, but revenue may be at stake in other projects. So, just how do you decide where to start?

To aid in a choice between projects, it may help to plot your projects on a “Project Selection Quadrant” as I’ve shown here. The quadrant chart plots the difficulty of completing a project versus the value it brings to the organization.

[Figure: the Project Selection Quadrant – project difficulty on the X axis vs. business value on the Y axis]
Project Difficulty
To find the project on the X axis, you must understand how your existing system is being used, how various departments use it differently, and whether there are special programs or procedures that impact the use of the data. To predict project length, you have to rely heavily on your understanding of your organization’s goals and business drivers.

Some of the things that will affect project difficulty:
• Access to the data – do you have permission to get the data?
• Window of opportunity – how much time do you have between updates to work on the data?
• Number of databases – more databases will increase complexity.
• Languages and code pages – is it English or Kanji? Is it ASCII or EBCDIC? If you have mixed languages and code pages, you may have more work ahead of you.
• Current state of data quality – the more non-standard your data is to begin with, the harder the task.
• Volume of data – data standardization takes time, and the more you have, the longer it’ll take.
• Governance, risk and compliance mandates – is your access to the data blocked by regulation?

Project Value
For assessing project value (the Y axis), there is really one thing that you want to look at – money. It comes from your discussions with the business users around their ability to accomplish things like:
• being able to effectively reach/support customers
• call center performance
• inventory and holding costs
• exposure to risk such as being out of compliance with any regulations in your industry
• any business process that is inefficient because of data quality

The Quadrants
Now that you’ve assessed your projects, they will naturally fall into the following quadrants:

Lower left: The difficult and low value targets. If management is trying to get you to work on these, resist. You’ll never get anywhere with your enterprise-wide appeal by starting here.
Lower right: These may be easy to complete, but if they have limited value, you should hold off until you have complete corporate buy-in for an enterprise-wide data quality initiative.
Upper left: Working on high value targets that are hard to complete will likely only give your company sticker shock when you show them the project plan. Or, they may run into major delays and be cancelled altogether. Again, proceed with caution. Make sure you have a few wins under your belt before you attempt them.
Upper right: Ah, low-hanging fruit. Projects that are easier to complete with high value are the best places to begin. As long as you document and promote the increase in value that you’ve delivered to the company, you should be able to leverage these wins into more responsibility and more access to great projects.
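As a sketch of how the quadrant assignment might be automated, the whole exercise reduces to two threshold comparisons. The project names, scores and the 0–10 scale here are hypothetical:

```python
# Hypothetical sketch: place projects on the selection quadrant.
# Difficulty and value are scored 0-10; 5 is the midpoint of each axis.
# On the X axis, "right" means easier (lower difficulty), as in the chart.
projects = {
    "Data warehouse reporting": {"difficulty": 8, "value": 9},
    "CRM address cleanup":      {"difficulty": 3, "value": 8},
    "Legacy archive migration": {"difficulty": 7, "value": 2},
    "Newsletter list dedupe":   {"difficulty": 2, "value": 3},
}

def quadrant(difficulty, value, midpoint=5):
    """Return the quadrant label for a (difficulty, value) pair."""
    if value >= midpoint:
        return ("upper right: start here" if difficulty < midpoint
                else "upper left: proceed with caution")
    return ("lower right: hold off" if difficulty < midpoint
            else "lower left: resist")

for name, p in projects.items():
    print(f"{name}: {quadrant(p['difficulty'], p['value'])}")
```

Even a rough scoring like this forces the conversation about which projects are genuinely low-hanging fruit.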

Keeping an eye on both the business value of the data and the technical difficulty of standardizing it will help you decide where to go and how to make your business stronger. It will also ensure that you and your business co-workers understand the business value of improving data quality within your projects.

Thursday, June 25, 2009

Evil Dictators: You Can’t Rule the World without Data Governance

Buried in the lyrics of one of my favorite heavy metal songs are these beautiful words:

Now, what do you own the world? How do you own disorder, disorder? – System of a Down, Toxicity


System of the Down’s screamingly poetic lyrics reminds us of a very important lesson that we can take into the business. After all, it is the goal of many companies to “own their world”. If you’re Coke, you want to dominate over Pepsi. If you’re MacDonald’s, you want to crush Burger King. Yet to own competitive markets, you have to run your business with the utmost efficiency. Without data governance, or at least enterprise data quality initiatives, you won’t have that efficiency.

Your quest for world domination will be in jeopardy in many ways without data governance. If your evil world-domination plan is to buy up companies, poor data quality and lack of continuity will prevent you from creating a unified environment after the merger. On the day of a merger, you may be asked to produce one list of products, one list of customers, one list of employees, and one accurate financial report. Where will that data come from if it is not clean all over your company? How will the data get clean without data governance?

Data governance brings order to the business units. With order comes the ability to own the information of your business. The ownership brings the ability to make effective and timely decisions. In large companies, whose business units may be warring against each other for sales and control of the information, it’s impossible to own the chaos. It’s difficult to make good decisions and bring order to your people. If you want to own your market, you must have order.

Those companies succeeding in this data-centric world are treating their data assets just as they would treat cold, hard cash. With data governance, companies strive to protect their vast ecosystem of data like it is a monetary system. It can't be the data center's problem alone; it has to be everyone's responsibility throughout the entire company.

Data governance is the choice of CEOs and benevolent dictators, too. The choice about data governance is one about hearing the voices of your people. It’s only when you harmonize the voices of technologists, executives and business teams that you can produce a beautiful song; one that can bring your company teamwork, strategic direction and profit. When you choose data governance, you choose order, communication and hope for your world.

So megalomaniacs, benevolent dictators and CEOs pay heed. You can’t own the world without data governance.

Thursday, May 21, 2009

Guiding Call Center Workers to Data Quality

Data Governance and data quality are often the domain of data quality vendors, but any technology that can help your quest to achieve better data is worth exploring. Rather than fixing up data after it has been corrupted, it’s a good idea to use preventative technologies to stop poor data quality in the first place.

I recently met with some folks from Panviva Software to talk about how the company’s technologies do just that. Panviva is considered the leader in Business Process Guidance, an emerging set of technologies that could help your company improve data quality and lower training costs on your call centers.

The technology is powerful, particularly where the call center environment is complex – multiple environments mixed together. IT departments in the banking, insurance, telecommunications and high-tech industries have been particularly rattled by many mergers and acquisitions. Call center workers at those companies must be trained on where to navigate and which application to use to accomplish a customer service process. On top of that, processes may change often due to a change in regulation, a change in corporate policy, or the next corporate merger.

To use a metaphor, business process guidance is a GPS for your complicated call center apps.

If you think about it, the way we drive our cars has really improved over the years because of GPS. We no longer need to buy a current road map at Texaco and follow it as far as it’ll take us. Instead, GPS technology knows where we are and what construction and traffic issues we may face – we simply need to tell it where we want to go. Business Process Guidance provides that same paradigm improvement for enterprise applications. Rather than forcing training on your Customer Service Representatives (CSRs), with all of its unabridged training manuals, business process guidance provides a GPS-like function that sits on top of those systems, providing context-sensitive information on where you need to go. When a customer calls into the call center, the technology combines the context of the CSR’s screens with knowledge of the company’s business processes to guide the CSR to much faster call times and lower error rates.

In one case study, BT leveraged Panviva technology to reduce the error rate in its order entry system from 30% down to 6%, an 80% reduction. That’s powerful technology on the front end of your data stream.

Sunday, April 19, 2009

New Book - The Data Governance Imperative

My new book, The Data Governance Imperative, is making its way to Amazon, Barnes and Noble, and other outlets this week. I’m very proud of it and happy to see it finally hit the streets. It took a lot of work and dedication to get it done.

I decided to write this book because I saw common recurring questions arise during discussions about data governance. How do I get my boss to believe that data governance is important? How do I work with my colleagues to build better information and a better company? How do I break through the barriers preventing data governance maturity, like getting money, resources and expertise to accomplish the task? When it comes to justifying the costs of data governance to their organization, building organizational processes, learning how to staff initiatives, understanding the role and importance of technologies, and dealing with corporate politics, there is little information available.

In my years working at Trillium Software, I have been exposed to many great projects in Fortune 1000 companies worldwide. Over the years, I’ve made note of the success factors that contribute to strong data governance. I’ve seen successful strategies for data governance and the common threads to success within and across the industry.

I’ve written the Data Governance Imperative to help readers pioneer data governance initiatives, breaking through political barriers by shining a light on the benefits of corporate information quality. This book is designed to give data governance team members insight into the art of starting data governance. It could be helpful to:

  • Data governance teams – those looking for direction/validation in starting a corporate data governance initiative.
  • Business stakeholders – those working in marketing, sales, finance and other business roles who need to understand the goals and functions of a data governance team.
  • C-level executives – those looking to learn about the benefits of data governance without having to read excessive technical jargon, or even those who need to be convinced that data governance is the right thing to do.
  • IT executives – those who believe in the power of information quality but have faced challenges in convincing others in their corporation of its value.
This book does not focus on the technical aspects of data governance, although technologies are discussed. There are some great books on the technology of data governance in the market today. Some are listed on the left side of this blog in the carousel.

Friday, January 2, 2009

Building a More Powerful Data Quality Scorecard

Most data governance practitioners agree that a data quality scorecard is an important tool in any data governance program. It provides comprehensive information about quality of data in a database, and perhaps even more importantly, allows business users and technical users to collaborate on the quality issue.

However, if we show that 7% of all tables have data quality issues, the number is useless - there is no context. You can’t say whether it is good or bad, and you can’t make any decisions based on this information. There is no value associated with the score.

In an effort to improve processes, data governance teams should roll up these metrics into slightly higher-level formulations. In their book “Journey to Data Quality”, authors Lee, Pipino, Funk and Wang correctly suggest that making the measurements quantifiable and traceable provides the next level of transparency to the business. The metrics may be rolled up into a completeness rating, for example: if your database contains 100,000 name and address postal codes and 3,500 records are incomplete, 3.5% of your postal codes fail and 96.5% pass. Similar simple formulas exist for accuracy, correctness, currency and relevance, too. However, this first aggregation still doesn’t support data governance, because business users aren’t thinking that way. They have processes that are supported by data, and it’s still a stretch to figure out why this all matters.
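As a quick sketch of the completeness arithmetic in the example above (the function name is mine, not from the book):

```python
# Completeness rating: share of records that pass a quality dimension.
records = 100_000     # name-and-address postal codes in the database
incomplete = 3_500    # records with a missing or partial postal code

def dimension_score(total, failed):
    """Percentage of records that pass a data quality dimension."""
    return 100.0 * (total - failed) / total

completeness = dimension_score(records, incomplete)
print(f"Completeness: {completeness:.1f}% pass, {100 - completeness:.1f}% fail")
# The same ratio works for accuracy, correctness, currency and relevance.
```

The per-dimension ratio is the easy part; the harder part is aggregating these scores into views that business users care about.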

Views of Data Quality Scorecard
Your plan must be to make data quality scorecards for different internal audiences - marketing, IT, c-level, etc.

You must design the scorecards to meet the interests of the different audiences, from technical through business and up to executive. At the base of a data quality scorecard is information about the quality of individual data records. This is the default information that most profilers deliver out of the box. As you aggregate scores, the high-level measures of data quality become more meaningful. In the middle are various score sets allowing your company to analyze and summarize data quality from different perspectives. If you define the objective of a data quality assessment project as calculating these different aggregations, you will have a much easier time maturing your data governance program. The business users and the C-level will begin to pay attention.

Business users are looking for whether the data supports the business process. They want to know if the data is facilitating compliance with laws. They want to decide whether their programs are “Go”, “Caution” or “Stop” like a traffic light. They want to know whether the current processes are giving them good data so they can change them if necessary. You can only do this by aggregating the information quality results and aligning those results with business.

Sunday, May 4, 2008

Data Governance Structure and Organization Webinar

My colleague Jim Orr just did a great job delivering a webinar on data governance. You can see a replay of the webinar in case you missed it. Jim is our Data Quality Practice Leader and he has a very positive point of view when it comes to developing a successful data governance strategy.
In this webinar, Jim talks exclusively about the structure and the organization behind data governance. If you believe that data governance is people, process and technology, this webinar covers the "people" side of the equation.

Sunday, April 27, 2008

The Solution Maturity Cycle


I saw the news about Informatica’s acquisition of Identity Systems, and it got me thinking. I recognize a familiar pattern that all too often occurs in the enterprise software business. I’m going to call it the Solution Maturity Cycle. It goes something like this:

1. The Emergence Phase: A young, fledgling company emerges that provides an excellent product that fills a need in the industry. This was Informatica in the 90’s. Rather than hand coding a system of metadata management, companies could use a cool graphical user interface to get the job done. Customers were happy. Informatica became a success. Life was good.

2. The Mashup Phase: Customers begin to realize that if they mash up the features of say, an ETL tool and a data quality tool, they can reap huge benefit for their companies. Eventually, the companies see the benefit of working together, and even begin to talk to prospective customers together. This was Informatica in 2003-5, working with FirstLogic and Trillium Software. Customers could decide which solution to use. Customers were happy that they could mashup, and happy that others had found success in doing so.

3. The Market Consolidation Phase: Under pressure from stockholders to increase revenue, the company looks to buy a solution in order to sell it in-house. The pressure also comes from industry analysts, who, if they’re doing their job properly, interpret the mashup as a hole in the product. Unfortunately, the established and proven technology companies are too expensive to buy, so the company looks to a young, fledgling data quality company. The decision on which company to buy is more influenced by bean counters than technologists. Even if there are limitations on the fledgling’s technology, the sales force pushes hard to eliminate mashup implementations, so that annual maintenance revenue will be recognized. This is what happened with Informatica and Similarity Systems, in my opinion. Early adopters are confused by this and fearful that their mashup might not be supported. Some customers fight to keep their mashups; some yield to the pressure and install the new solution.

4. Buy and Grow Phase: When bean counters select technology to support the solution, they usually get some product synergies wrong. Sure, the acquisition works from a revenue-generating perspective, but from the technology solution perspective, it is limited. The customers are at the same time under pressure from the mega-vendors, who want to own the whole enterprise. What to do? Buy more technology. It’ll fill the holes, keep the mega-vendor wolves at bay, and build more revenue.

The Solution Maturity Cycle is something that we all must pay attention to when dealing with vendors. For example, I’m seeing phase 3 of this cycle occur in the SAP world, where SAP’s acquisition of Business Objects dropped several data quality solutions into SAP’s lap. Now, despite the many successful mashups of Trillium Software and SAP, customers are being shown other solutions from the acquisition. All along, history makes me question whether an ERP vendor will be committed long-term to the data quality market.

After a merger occurs, customers face a critical decision point. Should you resist pulling out mashups, or should you try to unify the solution under one vendor? It’s a tough decision, and it may affect internal IT teams, causing conflict between those who have been working on the mashup and the mega-vendor team. In making this decision, there are a few key questions to ask:

  • Is the newly acquired technology in the vendor’s core competency?
  • Is the vendor committed to interoperability with other enterprise applications, or just their own? How will this affect your efforts for an enterprise-wide data governance program?
  • Is the vendor committed to continual improvement of this part of the solution?
  • How big is the development team and how many people has the vendor hired from the purchased company? (Take names.)
  • Can the vendor prove that taking out a successful solution to put in a new one will make you more successful?
  • Are there any competing solutions within the vendor’s own company, poised to become the standard?
  • Who has been successful with this solution, and do they have the same challenges that I have?
As customers of enterprise applications, we should be aware of history and the Solution Maturity Cycle.

Wednesday, April 9, 2008

Must-read Analyst Reports on Data Governance

If you’re thinking of implementing a data governance strategy at your company, here are some key analyst reports I believe are a must-read.

Data Governance: What Works And What Doesn't
by Rob Karel, Forrester
A high-level overview of data governance strategies. It’s a great report to hand to a c-level executive in your company who may need some nudging.

Data Governance Strategies
by Philip Russom and TDWI
A comprehensive overview of data governance, including extensive research and case studies. This one is hot off the presses from TDWI. Sponsored by many of the top information quality vendors.

The Forrester Wave™: Information Quality Software by J. Paul Kirby, Forrester
This report covers the strengths and weaknesses of top information quality software vendors. Many of the vendors covered here have been gobbled up by other companies, but the report is still worth a read. $$

Best Practices for Data Stewardship
Magic Quadrant for Data Quality Tools

by Ted Friedman, Gartner
I have included the names of two of Ted’s reports on this list, but Ted offers much insight in many forms. He has written and spoken often on the topic. (When you get to the Gartner web site, you're going to have to search on the above terms as Gartner makes it difficult to link directly.) $$
Ed Note: The latest quadrant (2008) is now available here.

The case for a data quality platform
Philip Howard, Bloor Research
Andy Hayler and Philip Howard are prolific writers on information quality at Bloor Research. They bring an international flair to the subject that you won’t find in the rest.

Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.