Monday, May 4, 2009

Don’t Sweat the Small Stuff, Except in Data Quality

April was a busy month. I was the project manager on a new web application, nearly completed my first German web site (also as project manager), and released my book, The Data Governance Imperative. All this real work has taken me away from something I truly love – blogging.

I did want to share something that affected my project this month, however. Data issues can crop up in the smallest of places and have a huge effect on your timeline.

For the web project I completed this month, the goal was to replace a custom-coded application with a similar application built within a content management system. We had to migrate the login data of the application’s users, all with various access levels, to the new system.

During go live, we were on a tight deadline to migrate the data, do final testing of the new application and seamlessly switch everyone over. That all had to happen on the weekend. No one would be the wiser come Monday morning. If you’ve ever done an enterprise application upgrade, you may have followed a similar plan.

We had done our profiling and knew that there were no data issues. However, when the migration actually took place, lo and behold – the old system allowed # as a character in usernames and passwords while the new system didn’t. This forced us to stop the migration and write a rule to handle the issue. Even with this simple issue, we came close to missing the Monday morning deadline.

Should we have spotted that issue? Yes, in hindsight we could have better understood the system restrictions on the username and password and set up a custom business rule in the data profiler to test it. We might have even forced the users to change the # before the switch while they were still using the old application.
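In hindsight, a rule as simple as the sketch below would have flagged the problem before the go-live weekend. This is a hypothetical Python illustration, not the actual profiling rule we wrote; the file name, the column names and the set of rejected characters are assumptions you would confirm against your own source and target systems.

```python
import csv
import re

# Characters the new CMS rejects in usernames and passwords.
# This set (and the column names below) are assumptions for this sketch --
# confirm them against the target system's actual restrictions.
DISALLOWED = re.compile(r"[#]")

def find_invalid_credentials(rows):
    """Return (user_id, field, value) for rows the target system would reject."""
    problems = []
    for row in rows:
        for field in ("username", "password"):
            value = row.get(field) or ""
            if DISALLOWED.search(value):
                problems.append((row.get("user_id"), field, value))
    return problems

if __name__ == "__main__":
    # Hypothetical export of the legacy login table.
    with open("legacy_users.csv", newline="") as handle:
        bad = find_invalid_credentials(csv.DictReader(handle))
    for user_id, field, value in bad:
        print(f"user {user_id}: {field} contains a disallowed character: {value!r}")
```

Run against an export of the legacy login table, a check like this turns a go-live surprise into a punch-list item you can fix days ahead of the switch.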

The experience reminds me that data quality is not just about making the data right; it’s about making the data fit for business purpose – fit for the target application. Data that is correct for one legacy application can be unfit for others. It also reminds me that you can plan and test all you want, but you have to be ready for hiccups during the go-live phase of the project. The tools, like profiling, are there to help you limit the damage. We were lucky in that this database was relatively small and the reload was relatively simple once we figured it all out. For bigger projects, more complete staging – a dry run before the go-live phase – would have been more effective.

Sunday, April 19, 2009

New Book - The Data Governance Imperative

My new book entitled The Data Governance Imperative is making its way to Amazon, Barnes and Noble, and other outlets this week. I’m very proud of this and happy to see it finally hit the streets. It was a lot of work and dedication to get it done.

I decided to write this book because the same questions kept recurring in discussions about data governance. How do I get my boss to believe that data governance is important? How do I work with my colleagues to build better information and a better company? How do I break through the barriers to data governance maturity, like getting the money, resources and expertise to accomplish the task? When it comes to justifying the costs of data governance to the organization, building organizational processes, learning how to staff initiatives, understanding the role and importance of technologies, and dealing with corporate politics, there is little information available.

In my years working at Trillium Software, I have been exposed to many great projects in Fortune 1000 companies worldwide. Over the years, I’ve made note of the success factors that contribute to strong data governance. I’ve seen successful strategies for data governance and the common threads to success within and across the industry.

I’ve written The Data Governance Imperative to help readers pioneer data governance initiatives, breaking through political barriers by shining a light on the benefits of corporate information quality. This book is designed to give data governance team members insight into the art of starting data governance. It could be helpful to:

  • Data governance teams – those looking for direction/validation in starting a corporate data governance initiative.
  • Business stakeholders – those working in marketing, sales, finance and other business roles who need to understand the goals and functions of a data governance team.
  • C-level executives – those looking to learn about the benefits of data governance without having to read excessive technical jargon, or even those who need to be convinced that data governance is the right thing to do.
  • IT executives – those who believe in the power of information quality but have faced challenges in convincing others in their corporation of its value.
This book does not focus on the technical aspects of data governance, although technologies are discussed. There are some great books on the technology of data governance in the market today. Some are listed on the left side of this blog in the carousel.

Thursday, April 2, 2009

Next Week’s Can’t-Miss Webinars

Presenters can either make or break a webinar. Simply put, good webinars are given by people who are passionate and knowledgeable about their topic. To give up an hour of a busy day, I have to believe that a webinar will impart some knowledge beyond product demos and brochure-ware. In looking ahead to next week, I see a couple of high points:

Data Governance: Strategies for Building Business Value
Date: Tuesday, April 14, 2009 at 11 a.m. Eastern
Trillium Software will host a Web seminar that includes featured guest speaker Rob Karel of Forrester Research presenting a discussion titled: Data Governance: Strategies for Building Business Value. If you’ve never seen Rob Karel speak, I can tell you from experience that it’s a real treat. I played emcee to a 2008 webinar with Rob on data governance. It was very well attended and very positively reviewed. At that time, the webinar concluded with a lot of great questions on selling the business case for data governance. In this session, Rob plans to tackle that topic a bit more - outlining the best practices and skills needed to obtain executive buy-in for data governance projects.

How to Boost Service, Cut Costs and Deliver Great Customer Experiences - Even in an Economic Downturn
Date: Thursday, April 16, 2009 at 11 a.m. Eastern
Teradata and the SmartData Collective will co-sponsor a webinar on dealing with a down economy. We’ve seen a couple of companies cover this topic, but the panel looks very strong. Judging from the panel and the description, this webinar looks to have a CRM focus – how technology can help you a) provide an experience that customers will love and b) cut costs and differentiate your communications strategies from your competition. Curtis Rapp from Air2Web will be in on the discussion, so I’m guessing there will be some talk about Teradata Relationship Manager Mobile and using text messaging in your Teradata apps.

The panel of experts will include:

  • Dave Schrader, Teradata - published author and long time Teradata employee
  • Lisa Loftis, CRM and BI Expert - author on CRM topics
  • Curtis Rapp, Air2Web – the partner responsible for some of Teradata’s mobile solution (CRM on your cell phone)
  • Rebecca Bucnis, Teradata - another long-time and experienced Teradata employee
For attending, you’ll also get a white paper by Lisa Loftis called Ringing in the Customers: Harnessing the power of Mobile Marketing.

Wednesday, March 25, 2009

A Brief History of Data Quality

Believe it or not, the concept of data quality has been touted as important since the beginning of the relational database. The original concept of a relational database came from Dr. Edgar Codd, who worked for IBM in the 1960s and 70s. Dr. Codd’s ideas about relational databases – storing data in cross-referenced tables – were groundbreaking, but largely ignored at IBM. It was only when Larry Ellison grabbed onto the idea and began to have success with a little company named Oracle that IBM finally paid attention. Today, relational databases are everywhere.

Even then, Dr. Codd advised about data integrity. He wrote about:

  • Entity integrity – every table must have a primary key and the column or columns chosen to be the primary key should be unique and not null.
  • Referential integrity – consistency between coupled tables. With certain values, there are obvious relationships between tables. The same ZIP code should always refer to the same town, for example.
  • Domain integrity – defining the possible values of a field stored in a database, including data type and length. So if the domain is a telephone number, the value shouldn’t be an address.

He put everything else into something he called 'business rules' to define specific standards for your company. An example of a business rule would be for companies that store part numbers. The part number field would have a certain length and data shape – domain integrity – but also certain character combinations to designate the category and type of part – a business rule.
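To make these ideas concrete, here is a minimal sketch in Python of the three integrity checks plus a part-number business rule. The table layouts, the phone-number shape and the part-number format are all hypothetical, invented for this illustration; a real data profiler expresses these as configurable rules rather than hand-written code.

```python
import re

# Hypothetical sample data for illustration only.
customers = [
    {"customer_id": 1, "zip": "01801", "phone": "781-555-0101"},
    {"customer_id": 2, "zip": "99999", "phone": "781-555-0102"},   # ZIP not in the coupled table
    {"customer_id": None, "zip": "02139", "phone": "not a phone"}, # bad key, bad phone
]
zip_towns = {"01801": "Woburn", "02139": "Cambridge"}              # coupled reference table
parts = [{"part_no": "EL-10234"}, {"part_no": "1234"}]             # second violates the rule

PHONE = re.compile(r"^\d{3}-\d{3}-\d{4}$")   # domain integrity: shape of a phone number
PART = re.compile(r"^(EL|ME|PL)-\d{5}$")     # business rule: category prefix plus five digits

def entity_violations(rows, key):
    """Entity integrity: the primary key must be present and unique."""
    seen, errors = set(), []
    for row in rows:
        value = row[key]
        if value is None or value in seen:
            errors.append(row)
        seen.add(value)
    return errors

def referential_violations(rows, key, reference):
    """Referential integrity: the value must exist in the coupled table."""
    return [row for row in rows if row[key] not in reference]

def pattern_violations(rows, key, pattern):
    """Domain integrity or business rule: the value must match the expected shape."""
    return [row for row in rows if not pattern.match(str(row[key]))]

print("entity:     ", entity_violations(customers, "customer_id"))
print("referential:", referential_violations(customers, "zip", zip_towns))
print("domain:     ", pattern_violations(customers, "phone", PHONE))
print("business:   ", pattern_violations(parts, "part_no", PART))
```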

The point is, information quality is not something new. The database pioneers understood it, at least theoretically, in the 1970s. In the old days, when systems were inflexible, you may have been forced to break those rules.

For example, a programmer who worked for you in the past may have used 99/99/9999 in a date field to designate an inactive account. That works fine as long as the data stays within the single application. However, these sorts of shortcuts cause huge headaches for the data governance team as they try to consolidate and move data from silos to the enterprise.

To solve these legacy issues, you have to:
  • Profile the data to discover that some dates contain all 9s – one of the advantages of using data profiling tools at the beginning of the process (see the sketch after this list).
  • Figure out what the 9s mean by collaborating with members of the business community.
  • Plan what to do to migrate that data over to a data model that makes more sense, like having an active/inactive account table.
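Here is that sketch, in Python. The column names, the sentinel value and the target structure are assumptions made for the example; a commercial profiler would surface the same pattern through its value-frequency and pattern analysis.

```python
from collections import Counter

# Hypothetical legacy rows where a programmer used 99/99/9999 to mean "inactive".
accounts = [
    {"account_id": "A-100", "close_date": "06/30/2008"},
    {"account_id": "A-101", "close_date": "99/99/9999"},
    {"account_id": "A-102", "close_date": "99/99/9999"},
]

# Step 1: profile the column -- a simple value-frequency count makes the sentinel obvious.
print(Counter(row["close_date"] for row in accounts).most_common())

# Step 3: once the business confirms that the 9s mean "inactive", split the field into
# a real (possibly empty) date plus an explicit status during migration.
def migrate(row):
    inactive = row["close_date"] == "99/99/9999"
    return {
        "account_id": row["account_id"],
        "close_date": None if inactive else row["close_date"],
        "status": "inactive" if inactive else "active",
    }

print([migrate(row) for row in accounts])
```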

If you take that one example and amplify it across thousands of tables in your company, you’ll begin to understand one of the many challenges that data stewards face as they work on migrating legacy data into MDM and data governance programs.

Friday, March 20, 2009

The Down Economy and Data Integration

Vendors, writers and analysts are generating a lot of buzz about poor economic conditions around the world. It’s true that in tough times, large, well-managed companies tend to put off IT purchases until the picture gets a bit rosier. Some speculate that the poor economy will affect data integration vendors and their ability to advance big projects with customers. Yet I don’t think it will have a deep or lasting impact. Here are just some of the signs that still point to a strong data integration economy.

Stephen Swoyer at TDWI wrote a very interesting article that attempts to prove that data integration and BI projects are going full-steam ahead, despite a lock-down on spending in other areas.

Research from Forrester suggests that IT job cuts in 2009 won’t be as steep as they were in the 2001/2002 dot-com bust. Forrester says that the US market for information technology jobs will not escape the recession – total IT employment will be down by 1.2% in 2009 – but the pain will be relatively mild compared with past recessions. (You have to be a Forrester customer to get this report.)

You can read the article by Doug Henschen from Intelligent Enterprise for further proof of the impact of BI and real-time analytics. The article contains success stories from Wal-Mart, Kimberly-Clark and Goodyear, too.

On this topic, SAP BusinessObjects recently asked me if I’d blog about their upcoming webinar entitled Defy the Times: Business Growth in a Weak Economy. The concept of the webinar is that you can use business intelligence and analytics to cut operating expenses and discretionary spending and to improve efficiencies. It might be a helpful webinar if you’re on a data warehouse team and trying to prove your importance to management during this economic downturn. Use vendors to help you provide third-party confirmation of your value.

So, is the poor economy threatening the data integration economy? I don’t think so. When you look at the problems of growing data volumes and the value of data integration, I don’t see how these positive stories can change any time soon. You can run out of money, but the world will never run out of data.

Sunday, March 15, 2009

Data Governance and the Coke Machine Syndrome


I was in a meeting last week and recognized the Coke Machine Syndrome, an important business parable that I learned from an old boss. All meetings can fall victim to it, not just data governance meetings. Since meeting management is so crucial to the success of a data governance initiative, you should learn to recognize it and nip it in the bud as quickly as possible.

Data Governance and the Coke Machine Syndrome
The scene is your company’s conference room. You have just presented your new plan outlining the data governance projects for the entire year. The plan details where you’re going to spend this year to improve data quality. Each department argues persuasively for support from the data governance team. With some significant growth goals for the coming year, marketing and sales claim they can’t make it without better data for promotions. Manufacturing obviously can’t reach new goals for efficiency without improving the data within the ERP system. And administration simply must have better data for better metrics in the data warehouse to understand the business.

After limited discussion, the budget is approved and 95% of your team’s spending for the year has been committed. This part of the meeting allocates millions of dollars and takes about 60 minutes.

The Coke Machine
At this point, the meeting leader mentions that the company has been considering the installation of a Coke machine in this section of the building. With a few minutes left in the meeting, he asks what drinks people want in the machine.

For the next 45 minutes, the debate rages with a heightened level of intensity. Should it be placed near the stairway or in the employee cafeteria? Should it contain Pepsi products instead of Coke? Should it contain Red Bull? Should the bottles be recyclable, and how will the recyclable materials be handled?

By the time the meeting adjourns, nearly as much time has been spent on the Coke machine as has been spent on the entire data governance budget for the year. The Coke machine discussion is an incredible waste of management time and effort.

Why does it Happen
Coke machine syndromes happen because everyone knows about Coke machines and everyone has a stake in the decision. Familiarity with the issue makes it easier to speak up about the Coke machine than about a complicated issue like the budget.

Managing it
To manage the Coke machine syndrome, you must recognize it when it occurs. You can identify this syndrome whenever a small, easily understood issue begins to consume more time than it should. There is usually a full range of logical, well-supported, and totally divergent opinions of what must be done, too.

Make sure you call it what it is. In other words, label it with the term Coke Machine Syndrome and define it for your team. When it happens, you’ll have a shorthand term to describe what’s happening.

Before each meeting, think about what items on your meeting agenda might turn into a Coke machine syndrome. If you can recognize it, that can be a big help. Many find it helpful to conduct pre-meetings with certain team members to prepare them for simple decisions without having to vet ideas in a meeting.

Finally, if calling it the Coke machine syndrome doesn't work, just say “let’s take it offline” and move on.

Monday, March 2, 2009

Top Six Traits of a Data Champion


Data champions play a crucial role in making data governance successful. Data champions are enthusiastic about the power of data, and in just about every company that has successfully implemented data governance, they lead the way.

Let's take a look at what you must do in order to lead your organization to data governance. Here are the top six characteristics:

1. Passion. Champions are passionate about data governance and promote its benefits to all whom they meet. They embody the vision of data governance, developing new, efficient processes and working through any issues of non-cooperation that arise. If data champions find themselves losing their passion for data management, it’s time for a regime change.

2. Respect. A data champion is someone who is the glue between executives, business, IT and third-party providers. The data champion role requires someone who has both technology and business knowledge – someone who can communicate with others and build relationships as needed. In a way, a data champion is a translator, translating the technologist's jargon of schemas and metadata into business value, and vice versa. To do that, you really need to understand what makes all sides tick and have the respect of the team.

3. Maven-dom. A ‘maven’ is someone who wants to solve other people's problems, generally by solving his own, according to Malcolm Gladwell, author of The Tipping Point (another good book for data champions to read). A maven’s social skills and ability to communicate are powerful tools in evangelizing data governance. A data champion needs to be socially connected and willing to reach out and share what is known about data governance. It is not easy for some to create and maintain relationships. If you’re the type of person who prefers closing the office door to avoid others, you may not be an effective data champion.

4. Persuasiveness. One of the success traits of a good data champion is having a vision and being able to sell it. Working with others within your organization to develop the vision is important, but the data champion is the primary marketer of that vision. Successful data champions understand the power of the elevator pitch and are willing to use it to promote the data governance vision to all who will listen. The term elevator pitch describes a sales message that can be delivered in the time span of an elevator ride. The pitch should have a clear, consistent message and reflect your goal of making the company more efficient through data governance. The more effective the speech, the more interested your colleagues will become.

5. Positive Attitude. Data champions must smile and train themselves to think positively. Why? Positive thinking is contagious, and your optimism will build positive energy for your project. Data champions smile and speak optimistically to give others the confidence to agree with them. As a champion, you will encounter negative people who will attempt to set up roadblocks in front of you. But as long as you’re optimistic and respond positively, you will inspire team members to join your quest and share in your success.

6. Leadership. A data champion is a leader above all, so studying the qualities of successful leaders will serve you well. This is a catch-all category because leadership has many faces and traits. Before you begin to champion the cause of data governance, read books like The 21 Indispensable Qualities of a Leader: Becoming the Person Others Will Want to Follow, where author John Maxwell identifies areas for you to work on.

Those are my top six qualities of a data champion. You’ll notice that I didn’t include anything specifically about technical expertise, although it is implied in number two. That’s because being a data champion is as much about managing people and resources as it is about technical know-how.

Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.