Showing posts with label politics. Show all posts
Showing posts with label politics. Show all posts

Wednesday, August 31, 2011

Top Ten Root Causes of Data Quality Problems: Part Five

Part 5 of 5: People Issues
In this continuing series, we're looking at root causes of data quality problems and the business processes you can put in place to solve them.  Companies rely on data to make significant decisions that can affect customer service, regulatory compliance, supply chain and many other areas. As you collect more and more information about customers, products, suppliers, transactions and billing, you must attack the root causes of data quality. 

Root Cause Number Nine: Defining Data Quality

More and more companies recognize the need for data quality, but there are different ways to   clean data and improve data quality.   You can:
  • Write some code and cleanse manually
  • Handle data quality within the source application
  • Buy tools to cleanse data
However, consider what happens when you have two or more of these types of data quality processes adjusting and massaging the data. Sales has one definition of customer, while billing has another.  Due to differing processes, they don’t agree on whether two records are a duplicate.

Root Cause Attack Plan
  • Standardize Tools – Whenever possible, choose tools that aren’t tied to a particular solution. Having data quality only in SAP, for example, won’t help your Oracle, Salesforce and MySQL data sets.  When picking a solution, select one that is capable of accessing any data, anywhere, at any time.  It shouldn't cost you a bundle to leverage a common solution across multiple platforms and solutions.
  • Data Governance – By setting up a cross-functional data governance team, you will have the people in place to define a common data model.

Root Cause Number Ten: Loss of Expertise

On almost every data intensive project, there is one person whose legacy data expertise is outstanding. These are the folks who understand why some employee date of hire information is stored in the date of birth field and why some of the name attributes also contain tax ID numbers. 
Data might be a kind of historical record for an organization. It might have come from legacy systems. In some cases, the same value in the same field will mean a totally different thing in different records. Knowledge of these anomalies allows experts to use the data properly.
If you encounter this situation, there are some business processes you can follow.

Root Cause Attack Plan
  • Profile and Monitor – Profiling the data will help you identify most of these types of issues.  For example, if you have a tax ID number embedded in the name field, analysis will let you quickly spot it. Monitoring will prevent a recurrence.
  • Document – Although they may be reluctant to do so for fear of losing job security, make sure experts document all of the anomalies and transformations that need to happen every time the data is moved.
  • Use Consultants – Expert employees may be so valuable and busy that there is no time to document the legacy anomalies. Outside consulting firms are usually very good at documenting issues and providing continuity between legacy and new employees.

This post is an excerpt from a white paper available here. More to come on this subject in the days ahead.

See also:


Monday, August 29, 2011

Top Ten Root Causes of Data Quality Problems: Part Three

Part 3 of 5: Secret Code and Corporate Evolution
In this continuing series, we're looking at root causes of data quality problems and the business processes you can put in place to solve them.  In part three, we examine secret code and corporate evolution as two of the root causes for data quality problems.

Root Cause Number Five: Corporate Evolution
Change is good… except for data quality
An organizations undergoes business process change to improve itself. Good, right?  Prime examples include:
  • Company expansion into new markets
  • New partnership deals
  • New regulatory reporting laws
  • Financial reporting to a parent company
  • Downsizing
If data quality is defined as “fitness for purpose,” what happens when the purpose changes? It’s these new data uses that bring about changes in perceived level of data quality even though underlying data is the same. It’s natural for data to change.  As it does, the data quality rules, business rules and data integration layers must also change.

Root Cause Attack Plan
  • Data Governance – By setting up a cross-functional data governance team, you will always have a team who will be looking at the changes your company is undergoing and considering its impact on information. In fact, this should be in the charter of a data governance team.
  • Communication – Regular communication and a well-documented metadata model will make the process of change much easier.
  • Tool Flexibility – One of the challenges of buying data quality tools embedded within enterprise applications is that they may not work in ALL all enterprise applications. When you choose tools, make sure they are flexible enough to work with data from any application and that the company is committed to flexibility and openness.

Root Cause Number Six: Secret Code
Databases rarely start begin their life empty. The starting point is typically a data conversion from some previously existing data source. The problem is that while the data may work perfectly well in the source application, it may fail in the target. It’s difficult to see all the custom code and special processes that happen beneath the data unless you profile.

Root Cause Attack Plan
  • Profile Early and Often – Don’t assume your data is fit for purpose because it works in the source application. Profiling will give you an exact evaluation of the shape and syntax of the data in the source.  It also will let you know how much work you need to do to make it work in the target.
  • Corporate Standards - Data governance will help you define corporate standards for data quality.
  • Apply Reusable Data Quality Tools When Possible – Rather than custom code in the application, a better strategy is to let data quality tools apply standards.  Data quality tools will apply corporate standards in a uniform way, leading to more accurate sharing of data.

This post is an excerpt from a white paper available here. The final posts on this subject will come in the days ahead.

Tuesday, November 16, 2010

Ideas Having Sex: The Path to Innovation in Data Management

I read a recent analyst report on the data quality market and “enterprise-class” data quality solutions. Per usual, the open source solutions were mentioned at a passing while the data quality solutions of the past were given high marks. Some of the solutions picked in the top originated from days when mainframe was king. Some of the top contenders still contained cobbled-together applications from ill-conceived acquisitions. It got me thinking about the way we do business today and how so much of it is changing.

Back in the 1990’s or earlier, if you had an idea for a new product, you’d work with an internal team of engineers and build the individual parts.  This innovation took time, as you might not always have exactly the right people working on the job.  It was slow and tedious. The product was always confined by its own lineage.

The Android phone market is a perfect examples of the modern way to innovate.  Today, when you want to build something groundbreaking like an Android, you pull in expertise from all around the world. Sure, Samsung might make the CPU and Video processing chips, but Primax Electronics in Taiwan might make the digital camera and Broadcomm in the US makes the touch screen, plus many others. Software vendors push the platform further with their cool apps. Innovation happens at break-neck speed because the Android is a collection of ideas that have sex and produce incredible offspring.

Isn’t that really the model of a modern company?  You have ideas getting together and making new ideas. When you have free exchange between people, there is no need to re-invent something that has already been invented. See the TED for more on this concept, where British author Matt Ridley argues that, through history, the engine of human progress and prosperity is "ideas having sex.”

The business model behind open source has a similar mission.  Open source simply creates better software. Everyone collaborates, not just within one company, but among an Internet-connected, worldwide community. As a result, the open source model often builds higher quality, more secure, more easily integrated software. It does so at a vastly accelerated pace and often at a lower cost.

So why do some industry analysts ignore it? There’s no denying that there are capitalist and financial reasons.  I think if an industry analyst were to actually come out and say that the open source solution is the best, it would be career suicide. The old-school would shun the analysts making him less relevant. The link between the way the industry pays and promotes analysts and vice versa seems to favor enterprise application vendors.

Yet the open source community along with Talend has developed a very strong data management offering that should be considered in the top of its class. The solution leverages other cutting edge solutions. To name just a few examples:
  • if you want to scale up, you can use distributed platform technology from Hadoop, which enables it to work with thousands of nodes and petabytes of data.
  • very strong enterprise class data profiling.  
  • matching that users can actually use and tune without having to jump between multiple applications.
  • a platform that grows with your data management strategy so that if your future is MDM, you can seamlessly move there without having to learn a new GUI.
The way we do business today has changed. Innovation can only happen when ideas have sex, as Matt Ridley puts it. As long as we’re engaged in exchange and specialization, we will achieve those new levels of innovation.

Thursday, May 13, 2010

Three Conversations to Have with an Executive - the Only Three

If you’re reading this, you’re most likely in the business of data management. In many companies, particularly large ones, the folks who manage data don’t much talk to the executives. But every so often, there is that luncheon, a chance meeting in the elevator, or even a break from a larger meeting where you and an executive are standing face to face.  (S)he asks, what you’re working on. Like a boy scout, be prepared.  Keep your response to one of these three things:

  1. Revenue – How has your team increased revenue for the corporation?
  2. Efficiency – How has your team lowered costs by improving efficiency for the corporation?
  3. Risk – How have you and your team lowered the risk to the corporation with better compliance to corporate regulations?

The executive doesn’t want to hear about schemas, transformations or even data quality. Some examples of appropriate responses might include:

  • We work on making the CRM/ERP system more efficient by keeping an eye on the information within it. My people ensure that the reports are accurate and complete so you have the tools to make the right decisions.
  • We’re doing things like making sure we’re in compliance with [HIPAA/Solvency II/Basel II/Antispam] so no one runs afoul of the law.
  • We’re speeding up the time it takes to get valuable information to the [marketing/sales/business development] team so they can react quickly to sales opportunities
  • We’re fixing [business problem] to [company benefit].

When you talk to your CEO, it’s your opportunity get him/her in the mindset that your team is beneficial, so when it comes to funding, it will be something they remember. It’s your chance to get out of the weeds and elevate the conversation.  Let the sales guys talk about deals. Let the marketing people talk about the market forces or campaigns. As data champions, we also need to be prepared to talk about the value we bring to the game.

Tuesday, November 10, 2009

Overcoming Objections to a Data Governance Program


You’ve created a wonderful proposal for a comprehensive data governance program. You’ve brought it up to management, but the chiefs tell you there’s just no budget for data governance. Now what?

The best thing you can do it to keep at it. It often takes time to win the hearts and minds of your company. You know that any money spent on data governance will usually come back with multipliers. It just may take some time for others to get on board. Be patient and continue to promote your quest.

Here are some ideas for thinking about your next steps for your data governance program:

Corporate Revenue
Today, companies manage spending tightly, looking at the expenses and revenue each fiscal quarter and each month to optimize the all-important operating income (revenue minus expenses equals operating income). If sales and revenue are weak, management gets miserly. On the other hand, if revenue is high and expenses are low, your high-ROI proposal will have a better chance for approval.

For many people, this corporate reality is hard to deal with. Logical thinkers would suggest that if something is broken, it should be fixed, no matter how well the sales team is performing. The people who run your business have their first priorities set on stockholder value. You too should pay attention to your company’s sales figures as they are announced each quarter. If your company has a quarterly revenue call, use it to strike when the environment for spending is right.

Cheap Wins
If there is no money to spend on information quality, there still may be potential for information quality wins for you to exploit. For example, let’s say you were to profile or make some SQL queries into your company’s supply chain system database and you found a part that has a near duplicate. So, part number “21-998 Condenser” and part number “2-1-998 Cndsr” exist as duplicated parts in your supply chain.

After verifying the fairly obvious duplicate, you can ask your friend on the procurement side how much it costs to store and hold these condensers in inventory. Then use some guerilla marketing techniques to extol the virtues of data governance. After all, if you could find this with just SQL queries, consider how much you could find with a data discovery/profiling tool. Better yet, consider how much you could find with a company-wide initiative.  In a previous blog post, I referred to this as the low-hanging fruit.

Case Studies
Case studies are a great way to spread the word about data governance. They usually contain real-world examples, often of your competitors, who are finding gold with better attention to information quality. Vendors in the data governance space will have case studies on their websites, or you can get unpublished studies by asking your sales representative.

Consider that built-in desire of your company to be competitive, and keep your Google searches and alerts tuned to what data management projects are underway at your competitors.

Analysts
Analysts are another valuable source for proving your point about the virtues of data governance. Your boss may have installed his own custom spam filter against your cajoling on data governance. But he doesn’t have to take your word for it; he can listen to an industry expert.

If you own a subscription to an analyst firm, use it to sell the power of data governance. Analysts offer telephone consultations, reports and webinars to clients. These offerings may be useful to sway your team.  If you are not a client of these firms, go to the vendors. If there is a crucial report, they will often license it to offer on their website for download, particularly if it speaks well about their solution.

Data Governance Expert Sessions
This technique also falls within the category of “don’t just take my word for it.” You can find a data governance workshop from many vendors to assist your organization with developing your data quality strategies. Often conducted for a group, the session leader interacts with a group of your choosing and presents the potential for improving the efficiency of your business with data governance. As the meeting leader, you would invite both technologists and business users. Include those who are skeptical of the value a data-quality program will bring to their company; a third-party opinion may sway them. The cost is usually reasonable and it can help the group understand and share key concepts of data governance.

Guerrilla Marketing
Why not start your own personal crusade, your own marketing initiative to drive home the power of information quality? In my previous installment of the data governance blog, I offer graphics for use in your signature file to drive home the importance of IQ to your organization. Use the power of a newsletter, blog, or e-mail signature to get your message across.


Excerpt from Steve Sarsfield's book "The Data Governance Imperative"

Thursday, June 25, 2009

Evil Dictators: You Can’t Rule the World without Data Governance

Buried in the lyrics of one of my favorite heavy metal songs are these beautiful words:

Now, what do you own the world? How do you own disorder, disorder? – System of the Down, Toxicity


System of the Down’s screamingly poetic lyrics reminds us of a very important lesson that we can take into the business. After all, it is the goal of many companies to “own their world”. If you’re Coke, you want to dominate over Pepsi. If you’re MacDonald’s, you want to crush Burger King. Yet to own competitive markets, you have to run your business with the utmost efficiency. Without data governance, or at least enterprise data quality initiatives, you won’t have that efficiency.

Your quest for world domination will be in jeopardy in many ways without data governance. If your evil world domination plan is to buy up companies, poor data quality and lack of continuity will prevent you from creating a unified environment after the merge. On the day of a merger, you may be asked to produce, one list of products, one list of customers, one list of employees, and one accurate financial report. Where is that data going to come from if it is not clean all over your company? How will the data get clean without data governance?

Data governance brings order to the business units. With order comes the ability to own the information of your business. The ownership brings the ability to make effective and timely decisions. In large companies, whose business units may be warring against each other for sales and control of the information, it’s impossible to own the chaos. It’s difficult to make good decisions and bring order to your people. If you want to own your market, you must have order.

Those companies succeeding in this data-centric world are treating their data assets just as they would treat cold, hard cash. With data governance, companies strive to protect their vast ecosystem of data like it is a monetary system. It can't be the data center's problem alone; it has to be everyone's responsibility throughout the entire company.

Data governance is the choice of CEOs and benevolent dictators, too. The choice about data governance is one about hearing the voices of your people. It's only when you harmonize the voices of technologists, executives and business teams that allow you produce a beautiful song; one that can bring your company teamwork, strategic direction and profit. When you choose data governance, you choose order, communication and hope for your world.

So megalomaniacs, benevolent dictators and CEOs pay heed. You can’t own the world without data governance.

Sunday, April 6, 2008

Politics, Presidents and Data Governance

I was curious about the presidential candidates and their plans to build national ID cards and a database of citizens, so I set out to do some research on the candidates stance on this issue. It strikes me as a particularly difficult task, given the size of the database that would be needed and the complexity. Just how realistic would the data governance strategy for the candidates be?

I searched the candidate’s web sites with the following Google commands:
database site:http://www.johnmccain.com
database site:http://www.barackobama.com
database site:http://www.hillaryclinton.com

Hardly scientific, but interesting results nonetheless. The candidates have very different data management plans for the country. This simple search gave some insight into the candidate’s data management priorities.

Clinton:
Focused on national health care and the accompanying data challenges.
• Patient Health Care Records Database
• Health Care Provider Performance Tracking Database
• Employer History of Complaints
Comments: It’s clear that starting a national database of doctors and patients is a step toward a national health plan. There are huge challenges with doctor data, however. Many doctors work in multiple locations, having a practice at a major medical center and a private practice, for example. Consolidating doctor lists from insurance companies would rely heavily on unique health care provider ID numbers, doctor age and sex, and factors other than name and address for information quality. This is an ambitious plan, particularly given data compliance regulations, but necessary for a national health plan.

Obama:
Not much about actual database plans, but Obama has commented in favor of:
• Lobbyist Activity Database
• National Sex Offender Database
Comments: Many states currently monitor sex offenders, so the challenge would be coordinating a process and managing the metadata from the states. Not a simple task to say the least. I suspect none of the candidates are really serious about this, but it’s a strong talk-track. Ultimately, this may be better left to the states to manage.
As far as the lobbyist activity database, I frankly can’t see how it’d work. Would lobbyists would complete online forms describing their activities with politicians. If lobbyists have to describe their interaction with the politician, would they be given an open slate in which to scribble some notes about the event/gift/dinner/meeting topics? This would likely be chock full of unstructured data, and its usefulness would be questionable in my opinion.

McCain:
• Grants and Contracts Database
• Lobbyist Activity Database
• National Sex Offender Database
Comments: Adding in the grants and contracts database into McCain’s plan, I see this as similar to Obama’s plan in that it’s storage of unstructured data.

To succeed in any of these plans from our major presidential candidates, I see a huge effort in the “people” and “process” components of data governance. Congress will have to enact laws that describe data models, data security, information quality, exceptions processing and much more. Clearly, this is not their area of expertise. Yet the candidates seem to be talking about technology as a magic wand to heal our country’s problems. It’s not going to be easy for any of them to make any of this a reality, even with all the government’s money.
Instead of these popular vote-grabbing initiatives, wouldn't the government be better served by a president who is understands data governance? When you think about it, the US Government is making the same mistake that businesses make, growing and expanding data silos, leading to more and more inefficiencies. I can’t help but thinking what we really need is a federal information quality and metadata management agency (since the government like acronyms, shall we call it FIQMM) to oversee the government’s data. The agency could be empowered by the president to have access to government data, define data models, and provide people, process and technologies to improve efficiency. Imagine what efficiencies we could gain with a federal data governance strategy. Just imagine.

Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.