Monday, February 1, 2010

A Data Governance Mission Statement

Every organization, including your data governance team, has a purpose and a mission. It can be very effective to communicate your mission in a mission statement to show the company that you mean business. When you show the value of your team, it can change your relationship with management for the better.

The mission statement should pay tribute to the mission of the organization with regard to values, while defining why the data governance organization exists and setting a big picture goal for the future.
The data governance mission statement could revolve around any of the following key components:

  • increasing revenue
  • lowering costs
  • reducing risks (compliance)
  • meeting any of the organization’s other policies such as being green or socially responsible

The most popular format seems to follow:
Our mission is to [purpose] by doing [high level initiatives] to achieve [business benefits]

So, let’s try one:
Our mission is to ensure that the highest quality data is delivered via a company-wide data governance strategy for the purpose of improving the efficiency, increasing the profitability and lowering the risk of the business units we serve.
Flipped around:
Our mission is to improve the efficiency, increase the profitability and lower the business risks of Acme’s business units by ensuring that the highest quality data is delivered via a company-wide data governance strategy.
Not bad, but a mission statement should be inspiring to the team and to management. Since the passions of the company described above are unknown, it’s difficult for a generic mission statement to be inspirational about the data governance program. That’s up to you.
 
Goals & Objectives
There are mission statements and there are objectives. While every mission statement should say who you are and why you exist, every objective should specify what you’re going to do and the results you expect. Objectives include activities that can be easily tracked, measured and achieved and that, of course, serve the mission. When you start data governance projects, you can look back to the mission statement to make sure you’re on track. Are you using your people and technology in a way that will benefit the company?

Staying On Mission
When you take on a new project, the mission statement can help protect you and ensure that the project is worthwhile for both the team and the company. The mission statement should be considered a way to block busy-work and unimportant projects. In our mission statement example above, if the project doesn’t improve efficiency, increase profitability or lower business risk, it should not be considered.


In this case, you can clearly map three projects to the mission, but the fourth project is not as clear. Dig deeper into the mainframe project to see if any efficiency will come out of the migration. Is the data being used by anyone for a business purpose?
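As a rough illustration of using the mission statement as a filter, here is a minimal Python sketch. The three benefits come from the example mission statement above; the project names and the benefits each one claims are hypothetical, made up purely for illustration (only the mainframe migration is mentioned in this post).

```python
# Benefits named in the example mission statement above.
MISSION_BENEFITS = {"efficiency", "profitability", "risk"}

# Hypothetical project list: names and claimed benefits are illustrative only.
projects = {
    "Customer data cleanup": {"efficiency", "profitability"},
    "Compliance reporting audit": {"risk"},
    "Supplier master consolidation": {"efficiency"},
    "Mainframe data migration": set(),  # benefit not yet established
}


def on_mission(claimed_benefits):
    """Return the mission benefits a project supports, sorted for stable output."""
    return sorted(claimed_benefits & MISSION_BENEFITS)


def triage(project_map):
    """Split projects into those that map to the mission and those needing review."""
    mapped, needs_review = {}, []
    for name, benefits in project_map.items():
        supported = on_mission(benefits)
        if supported:
            mapped[name] = supported
        else:
            needs_review.append(name)
    return mapped, needs_review


mapped, needs_review = triage(projects)
for name, supported in mapped.items():
    print(f"{name}: supports {', '.join(supported)}")
for name in needs_review:
    print(f"{name}: dig deeper before committing")
```

A project that lands in the review list is not automatically rejected; it is simply the one to ask harder questions about before the team commits to it.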

A Mission Never Ends
A mission statement is a written declaration of a data governance team's purpose and focus. This focus normally remains steady, while objectives may change often to adapt to changes in the business environment. A properly crafted mission statement will serve as a filter to separate what is important from what is not and to communicate your value to the entire organization.


Thursday, January 21, 2010

ETL, Data Quality and MDM for Mid-sized Business


Is data quality a luxury that only large companies should be able to afford? Of course the answer is no. Your company should be paying attention to data quality whether you are a Fortune 1000 company or a startup. Like a toothache, poor data quality will never get better on its own.

As a company naturally grows, the effects of poor data quality multiply.  When a small company expands, it naturally develops new IT systems. Mergers often bring in new IT systems, too. The impact of poor data quality slowly invades and hinders the company’s ability to service customers, keep the supply chain efficient and understand its own business. Paying attention to data quality early and often is a winning strategy for even the small and medium-sized enterprise (SME).

However, SMEs have challenges with the investment needed in enterprise-level software. While it’s true that the benefit often outweighs the costs, it is difficult for the typical SME to invest in the license, maintenance and services needed to implement a major data integration, data quality or MDM solution.

At the beginning of this year, I started with a new employer, Talend. I became interested in them because they were offering something completely different in our world – open source data integration, data quality and MDM.  If you go to the Talend Web site, you can download some amazing free software, like:
  • a fully functional, very cool data integration package (ETL) called Talend Open Studio
  • a data profiling tool, called Talend Open Profiler, providing charts and graphs and some very useful analytics on your data
The two packages sit on top of a database, typically MySQL – also an open source success.

For these solutions, Talend uses a business model similar to what my friend Jim Harris has just blogged about – Freemium. Under this new model, free open source content is made available to everyone—providing the opportunity to “up-sell” premium content to a percentage of the audience. Talend works like this.  You can enhance your experience from Talend Open Studio by purchasing Talend Integration Suite (in various flavors).  You can take your data quality initiative to the next level by upgrading Talend Open Profiler to Talend Data Quality.

If you want to take the combined data integration and data quality to an even higher level, Talend just announced a complete Master Data Management (MDM) solution, which you can use in a more enterprise-wide approach to data governance. There’s a very inexpensive place to start and an evolutionary path your company can take as it matures its data management strategy.

The solutions have been made possible by the combined efforts of the open source community and Talend, the corporation. If you’d like, you can take a peek at some source code, use the basic software and try your hand at coding an enhancement. Sharing that enhancement with the community will only lead to a world full of better data, and that’s a very good thing.

Monday, December 21, 2009

The World is Addicted to Data (and that's good for us)


In the famous book “The Transparent Society”, we are asked to consider some of the privacy ills we will be facing as technology improves and our society gains access to more data sets. The book was groundbreaking when it was written in 1999. It imagines the emergence of groups who are more powerful because they own the data. However, as we sit here ten years later with 20/20 hindsight, it’s clear that the existence of, and access to, specialized data sets make our lives better, not worse.

There are countless examples of this daily improvement in our lives, but some personal ones:
  • I was in the supermarket recently and per usual, there was a long line at the deli. On the other hand, there was no line at the “deli kiosk” so I gave it a try. Based on my frequent shopper card number and underlying database, the deli kiosk already knew my preferred brand and type of cheese and delicious deli meats. Ordering was a snap thanks to a database, and I didn’t even have to mispronounce “Deutschmacher” to the deli man, like I usually do.
  • For Thanksgiving, I visited some relatives that I don’t often see. My GPS led me there thanks to a geospatial database. It told me how long it was going to take based on traffic data, which is often aggregated from several sources, including road sensors and car and taxi fleets. I also was informed about all the coffee shops along the way, thanks to the data set provided by Dunkin’ Donuts. Before I left, I used Google Street View and Microsoft Bing’s Bird’s Eye view to see what the destination looked like. Ten years ago, all of this was pretty much unheard of, but thanks to the coming together of geospatial data, real-time traffic data, satellite and airplane imagery, street view imagery, Dunkin’ Donuts franchise data, and small, cheap processors, my trip was fantastic.
  • Fantasy Football is a new phenomenon, made possible by our addiction to data. We know exactly where we stand on any given Sunday as player stats are made available instantly during the games. When Wes Welker scores, I see the six points reflected on my score instantly. Companies like STATS cover not only football but, according to their web site, 234 sports.
  • For iPhone users, there are tons of data-centric applications. For example, Wait Watchers is an app that uses user submissions to generate and display a table of the current ride wait times at major theme parks throughout the world. As this information is updated by users, other users at Disney can make decisions about whether to go to Space Mountain or It’s a small world, for example.

In the corporate world, it’s much of the same and even more important to our society. Marketing teams are addicted to information from web analytics and use marketing automation tools to track the success of their programs. Operations teams track assets like computers, buildings, trucks and people with data. Sales has tracked, and will continue to track, customers with data. Finance relies on the collision of credit score data, invoice data and payment data, and on making sure there is enough money in reserves to meet regulations. Executives will continue to rely on business intelligence and data. In fact, it’s hard to find anyone in the business world who doesn’t rely on data.

Of course, much of this is anecdotal. I haven’t found any specific study on the increase in database use, but we do know from an old IDC study that the number of servers in use worldwide, presumably some of them used for databases, roughly doubled from 2000 to 2005. A doubling of servers, combined with typically bigger hard drive capacities, points to higher database use.

It was difficult to imagine us here ten years ago, and it’s even more difficult to imagine where we’ll be at the beginning of 2020.  It seems to me that we'll have more opportunity to create and use information with applications on our mobile devices. The collision of iPhone/Droid devices with increasing bandwidths of 3G and 4G networks on the major mobile phone carriers tells me that data in the future will let us do things we can only imagine today.

The world is addicted to data and that bodes well for anyone who helps the world manage it. In 2010, no matter if the economy turns up or down, our industry will continue to feed the addiction to good, clean data.

Tuesday, November 10, 2009

Overcoming Objections to a Data Governance Program


You’ve created a wonderful proposal for a comprehensive data governance program. You’ve brought it up to management, but the chiefs tell you there’s just no budget for data governance. Now what?

The best thing you can do is to keep at it. It often takes time to win the hearts and minds of your company. You know that any money spent on data governance will usually come back with multipliers. It just may take some time for others to get on board. Be patient and continue to promote your quest.

Here are some ideas for thinking about your next steps for your data governance program:

Corporate Revenue
Today, companies manage spending tightly, looking at the expenses and revenue each fiscal quarter and each month to optimize the all-important operating income (revenue minus expenses equals operating income). If sales and revenue are weak, management gets miserly. On the other hand, if revenue is high and expenses are low, your high-ROI proposal will have a better chance for approval.

For many people, this corporate reality is hard to deal with. Logical thinkers would suggest that if something is broken, it should be fixed, no matter how well the sales team is performing. The people who run your business have their first priorities set on stockholder value. You too should pay attention to your company’s sales figures as they are announced each quarter. If your company has a quarterly revenue call, use it to strike when the environment for spending is right.

Cheap Wins
If there is no money to spend on information quality, there still may be potential for information quality wins for you to exploit. For example, let’s say you were to profile or run some SQL queries against your company’s supply chain system database and you found a part that has a near duplicate. So, part number “21-998 Condenser” and part number “2-1-998 Cndsr” exist as duplicated parts in your supply chain.

After verifying the fairly obvious duplicate, you can ask your friend on the procurement side how much it costs to store and hold these condensers in inventory. Then use some guerrilla marketing techniques to extol the virtues of data governance. After all, if you could find this with just SQL queries, consider how much you could find with a data discovery/profiling tool. Better yet, consider how much you could find with a company-wide initiative. In a previous blog post, I referred to this as the low-hanging fruit.
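To make the condenser example concrete, here is a minimal sketch of the kind of SQL query that could surface it, wrapped in Python's built-in sqlite3 module so it runs on its own. The table name `parts`, its two columns, and the third sample row are assumptions for illustration; your supply chain system will have its own schema, and stripping hyphens is only one possible way to build a comparison key.

```python
import sqlite3


def find_near_duplicate_parts(rows):
    """Group part numbers that collide once hyphens are stripped.

    rows: iterable of (part_no, description) tuples.
    Returns a list of (part_key, variants, count) for keys seen more than once.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: a real parts table will look different.
    conn.execute("CREATE TABLE parts (part_no TEXT, description TEXT)")
    conn.executemany("INSERT INTO parts VALUES (?, ?)", rows)
    query = """
        SELECT REPLACE(part_no, '-', '') AS part_key,
               GROUP_CONCAT(part_no, ' | ') AS variants,
               COUNT(*) AS n
        FROM parts
        GROUP BY part_key
        HAVING COUNT(*) > 1
        ORDER BY part_key
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


sample_rows = [
    ("21-998", "Condenser"),
    ("2-1-998", "Cndsr"),
    ("30-114", "Gasket"),  # made-up row with no duplicate
]

duplicates = find_near_duplicate_parts(sample_rows)
for part_key, variants, count in duplicates:
    print(f"{part_key}: {count} variants ({variants})")
```

A hit from a query like this is a candidate, not a verdict. Someone who knows the parts still has to confirm that the two records really describe the same item before you take the number to procurement.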

Case Studies
Case studies are a great way to spread the word about data governance. They usually contain real-world examples, often of your competitors, who are finding gold with better attention to information quality. Vendors in the data governance space will have case studies on their websites, or you can get unpublished studies by asking your sales representative.

Consider that built-in desire of your company to be competitive, and keep your Google searches and alerts tuned to what data management projects are underway at your competitors.

Analysts
Analysts are another valuable source for proving your point about the virtues of data governance. Your boss may have installed his own custom spam filter against your cajoling on data governance. But he doesn’t have to take your word for it; he can listen to an industry expert.

If you own a subscription to an analyst firm, use it to sell the power of data governance. Analysts offer telephone consultations, reports and webinars to clients. These offerings may be useful to sway your team.  If you are not a client of these firms, go to the vendors. If there is a crucial report, they will often license it to offer on their website for download, particularly if it speaks well about their solution.

Data Governance Expert Sessions
This technique also falls within the category of “don’t just take my word for it.” You can find a data governance workshop from many vendors to assist your organization with developing your data quality strategies. Often conducted for a group, the session leader interacts with a group of your choosing and presents the potential for improving the efficiency of your business with data governance. As the meeting leader, you would invite both technologists and business users. Include those who are skeptical of the value a data-quality program will bring to their company; a third-party opinion may sway them. The cost is usually reasonable and it can help the group understand and share key concepts of data governance.

Guerrilla Marketing
Why not start your own personal crusade, your own marketing initiative to drive home the power of information quality? In my previous installment of the data governance blog, I offer graphics for use in your signature file to drive home the importance of IQ to your organization. Use the power of a newsletter, blog, or e-mail signature to get your message across.


Excerpt from Steve Sarsfield's book "The Data Governance Imperative"

Thursday, October 22, 2009

Book Review: Data Modeling for Business


A couple of weeks ago, I book-swapped with author Donna Burbank. She has a new book entitled Data Modeling for Business. Donna, an experienced consultant by trade, has teamed up with Steve Hoberman, a previously published author and technologist, and Chris Bradley, also a consultant, for an excellent exploration of the process of creating a data model. With a subtitle like “A handbook for Aligning the Business with IT using a High-Level Data Model” I knew I was going to find some value in the swap.

The book describes in plain English the proper way to create a data model, but that simple description doesn’t do it justice. The book is designed for those who are learning from scratch – those who only vaguely understand what a data model is. It uses commonly understood ideas to explain data modeling concepts. The book describes the impact of the data model on the project’s success and digs into setting up data definitions and the levels of detail necessary for them to be effective. All of this is accomplished in a very plain-talk, straightforward tone without the pretentiousness you sometimes get in books about data modeling.

We often talk about the need for business and IT to work together to build a data governance initiative. But many, including myself, have pointed to the communication gap that can exist in a cross-functional team. In order to bridge the gap, a couple of things need to happen. First, IT teams need to expand their knowledge of business processes, budgets and corporate politics. Second, business team members need to expand their knowledge of metadata and data modeling. This book provides an insightful education for the latter. In my book, The Data Governance Imperative, the goal was the former.

The book is well-written and complete. It’s a perfect companion for those who are trying to build a knowledgeable, cross-functional team for data warehouse, MDM or data governance projects. Therefore, I’ve added it to my recommended reading list on my blog.

Monday, October 12, 2009

Data May Require Unique Data Quality Processes


A few things in life have the same appearance, but the details can vary widely.  For example, planets and stars look the same in the night sky, but traveling to them and surviving once you get there are two completely different problems. It’s only when you get close to your destination that you can see the difference.

All data quality projects can appear the same from afar but ultimately can be as different as stars and planets. One of the biggest ways they vary is in the data itself and whether it is chiefly made up of name and address data or some other type of data.

Name and Address Data
A customer database or CRM system contains data that we know much about. We know that letters will be transposed, names will be comma reversed, postal codes will be missing and more.  There are millions of things that good data quality tools know about broken name and address data since so many name and address records have been processed over the years. Over time, business rules and processes are fine-tuned for name and address data.  Methods of matching up names and addresses become more and more powerful.

Data quality solutions also understand what names and addresses are supposed to look like since the postal authorities provide them with correct formatting. If we’re somewhat precise about following the rules of the postal authorities, most mail makes it to its destination. If we’re very precise, the postal services can offer discounts. The rules are clear in most parts of the civilized world. Everyone follows the same rules for name and address data because it makes for better efficiency.

So, if we know what the broken item looks like and we know what the fixed item is supposed to look like, we can design and develop processes that involve trained, knowledgeable workers and automated solutions to solve real business problems. There’s knowledge inherent in the system and you don’t have to start from scratch every time you want to cleanse it.
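Here is a tiny sketch of what "knowledge inherent in the system" can look like for names, using only the Python standard library. It undoes comma-reversed names and tolerates a transposed letter or two. The function names and the 0.85 similarity threshold are my own assumptions for illustration; commercial data quality tools rely on far richer rule sets and postal reference data.

```python
import re
from difflib import SequenceMatcher


def normalize_name(raw):
    """Un-reverse 'Last, First' names, collapse whitespace, and lowercase."""
    text = re.sub(r"\s+", " ", raw).strip()
    if text.count(",") == 1:
        last, first = (part.strip() for part in text.split(","))
        text = f"{first} {last}"
    return text.lower()


def is_likely_match(name_a, name_b, threshold=0.85):
    """Compare normalized names; the threshold is an arbitrary starting point."""
    a, b = normalize_name(name_a), normalize_name(name_b)
    return SequenceMatcher(None, a, b).ratio() >= threshold


print(normalize_name("Smith,  John"))
print(is_likely_match("Smith, John", "Jonh Smith"))
print(is_likely_match("Smith, John", "Mary Jones"))
```

Because the broken patterns are so well known, a couple of small rules already catch a useful share of the problems. That is exactly what is missing in the data domains discussed next.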

ERP, Supply Chain Data
However, when we take a look at other types of data domains, the picture is very different. There isn’t a clear body of knowledge about what is typically input and what is typically output, and therefore you must set up processes for establishing it. In supply chain data or ERP data, we can’t immediately see why the data is broken or what we need to do to fix it. ERP data is likely to be sort of a history lesson of your company’s origins, the acquisitions that were made, and the partnership changes throughout the years. We don’t immediately have an idea about how the data should ultimately look. The data that exists in this world is specific to one client or a single use scenario, which cannot be handled by existing out-of-the-box rules.

With this type of data you may find the need to collaborate more with the business users of the data, whose expertise lets you determine the correct context for the information more quickly and therefore effect change more rapidly. Because of the inherent unknowns about the data, few of the steps for fixing the data are done for you ahead of time. It then becomes critical to establish a methodology for:
  • Data profiling in order to understand the issues and challenges.
  • Discussions with the users of the data to understand context, how it’s used and the most desired representation. Since there are few governing bodies for ERP and supply chain data, the corporation and its partners must often come up with an agreed-upon standard.
  • Setting up business rules, usually from scratch, to transform the data.
  • Testing the data in the new systems.
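The profiling step in the list above can start very small. The following is a minimal Python sketch, not a substitute for a profiling tool: it counts blanks and distinct values in one column and summarizes the character patterns it sees (digits become 9, letters become A). The sample values are made up for illustration.

```python
from collections import Counter


def value_pattern(value):
    """Reduce a value to its shape: digits -> 9, letters -> A, the rest kept."""
    shape = []
    for ch in value:
        if ch.isdigit():
            shape.append("9")
        elif ch.isalpha():
            shape.append("A")
        else:
            shape.append(ch)
    return "".join(shape)


def profile_column(values):
    """Summarize one column: row count, blanks, distinct values, pattern counts."""
    non_blank = [v.strip() for v in values if v and v.strip()]
    return {
        "rows": len(values),
        "blank": len(values) - len(non_blank),
        "distinct": len(set(non_blank)),
        "patterns": Counter(value_pattern(v) for v in non_blank).most_common(),
    }


# Made-up part numbers, including a blank and a missing value.
part_numbers = ["21-998", "2-1-998", "", "30-114", None]
report = profile_column(part_numbers)
for key, value in report.items():
    print(f"{key}: {value}")
```

A pattern that appears only once or twice is a good conversation starter for the second step in the list: take it to the people who use the data and ask what it should look like.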
I write about this because I’ve read so much about this topic lately. As practitioners you should be aware that the problem is not the same across all domains. While you can generally solve name and address data problems with a technology focus, you will often rely more on collaboration with subject matter experts to solve issues in other data domains.

Monday, August 24, 2009

9 Questions CEOs Should Ask About Data Governance

When it comes to data governance, the single most influential power in an organization is the executive team (presidents, vice presidents, managing directors, and CxOs). Sure, business users control certain aspects of the initiative and may even want to hold them back to maintain data ownership. It’s also true that the technology team is influential, but may be short on staff, short on budget and busy with projects like software upgrades. So, it sometimes falls to executives to push data governance as a strategic initiative when the vision doesn’t come from elsewhere.

It makes sense. Executives have the most to gain from a data governance program. Data governance brings order to the business, offering the ability to make effective and timely decisions. By implementing a data governance program, you can make fewer decisions based on ‘gut’ and better decisions based on knowledge. It’s an executive’s job to strive for greater control and lower risk, and that can’t be achieved without some form of data governance.

Rather than issuing edicts, a tactic many smart executives use is to ask questions. Questioning your IT and business teams is a form of fact-checking your decisions, understanding shortcomings in skills and resources and empowering your people. It ultimately allows your people to come to the same decision at which you may have already arrived. It is a very gracious way to manage.

Therefore, asking questions about data governance is an important job of a CEO. Some of the questions you should be asking your technology leaders are as follows:

Question

Impact

Do we have a data management strategy?

Ask the question to understand if your people have considered data governance. If you have a strategy, you should know who the people are and how they are organized around providing information to the corporation. What are the processes for information in the organization?

Are we ahead or behind our competitors with regard to business intelligence and data governance?

Case studies on managing data are widely available on vendor web sites. It’s important to understand if any of your competitors are outflanking you on the efficiencies gained from data governance.

What is poor information quality costing us?

Has your technology team even considered the business impact of information quality on the bottom line, or are they just accepting these costs as standard operating procedure?

What confidence level do you have in my revenue reports?

Has your team considered the impact of information on the business intelligence and therefore the reports they are handing you?

Are we in compliance with all laws regarding our governance of data?

Executives are often culpable for non-compliance, so you should be concerned about any laws that govern the company’s industry. This holds especially true in banking and healthcare, but even in unregulated industries, organizations must comply with spam laws and “do not mail” laws for marketing, for example.

Are you working across business units toward data governance, or is data quality done in silos?

To provide the utmost efficiency, information quality processes should be reusable and implemented in a similar manner across business units. This is done for exactly the same reason you might standardize on a type of desktop computer or software package for your business – it’s more efficient to share training resources and support to work better as a team. Taking successful processes from one business unit and extending them to others is the best strategy.

Do you have the access to data you need?

The CEO should understand if any office politics are getting in the way of ensuring that the business has the information it needs. This question opens the door to that discussion.

How many people in your business unit are managing data?

To really understand if you need a unified process for managing data, it often helps to look at the organizational chart and try to figure out how many people already manage it. A centralized strategy for data governance may actually prove more efficient.

Who owns the information in your business unit? If something goes right, who should I praise, and if something is wrong, who should I reprimand?

The business should understand who is culpable for adverse events with regard to information. If, for example, you lose revenue by sending the wrong type of customer discount offers, or if you can’t deliver your product because of problems with inventory data, there should be someone responsible. Take action if the answer cannot easily be given.




By asking these questions, you’ll open up the door to some great discussions about data governance. It should allow you to be a maverick for all of your company’s data needs. Thanks to Ajay Ohri for posing this question to me in last week’s interview; it’s something every executive should consider.

Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.