One of my first jobs was assistant cook at a summer camp. (In this case, the term ‘cook’ was loosely applied; it meant scrubbing pots and pans for the head cook.) It was there I learned that most cooks have favorite ingredients they tend to use more often than others. The cook at Camp Marlin tended to use honey where applicable. Food TV star Emeril likes to use garlic and pork fat. Some cooks add a little hot pepper to their chocolate recipes – it is said to bring out the flavor of the chocolate. Definitely a secret ingredient.
For head chefs taking on major IT initiatives, the secret ingredient is always data quality technology. Attention to data quality doesn’t make an IT initiative on its own so much as it makes an IT initiative better. Let’s take a look at how this happens.
Profiling
No matter what the project, data profiling provides a complete understanding of the data before the project team attempts to migrate it, helping the team create a more accurate plan for integration. By contrast, migrating data to your new solution as-is can lead to major cost overruns and project delays as you load and reload it.
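As a rough sketch of what a first profiling pass looks for (the column data and the pure-Python approach here are hypothetical illustrations; real profiling tools do far more), you can count nulls, distinct values and value patterns per column:

```python
import re
from collections import Counter

def profile_column(values):
    """Basic profiling stats for one column: null count, distinct count,
    and the frequency of each character pattern (digits -> 9, letters -> A),
    so that '02139' profiles as '99999' and '3011 AB' as '9999 AA'."""
    nulls = sum(1 for v in values if v is None or str(v).strip() == "")
    non_null = [str(v) for v in values if v is not None and str(v).strip() != ""]
    patterns = Counter(
        re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", v)) for v in non_null
    )
    return {
        "nulls": nulls,
        "distinct": len(set(non_null)),
        "patterns": patterns.most_common(3),
    }

# Hypothetical postal-code column pulled from a legacy system
zips = ["02139", "02139", "3011 AB", None, "021", ""]
print(profile_column(zips))
```

Even this toy pass surfaces the questions a migration plan must answer: why are some postal codes empty, and why do three different formats coexist in one column?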
Customer Relationship Management (CRM)
By using data quality technology in CRM, the organization will benefit from a cleaner customer list with fewer duplicate records. Data quality technology can work as a real-time process, limiting the number of typos and duplicates in the system and thus improving call center efficiency. Data profiling can also help an organization understand and monitor the quality of a purchased list before integration, avoiding issues with third-party data.
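To illustrate the duplicate-detection idea (a deliberately crude sketch using Python’s standard-library difflib; commercial matching engines use far more sophisticated parsing, standardization and phonetic techniques), two customer strings can be compared after normalization:

```python
from difflib import SequenceMatcher

def normalize(record):
    """Crude normalization: lowercase and keep only letters and digits."""
    return "".join(ch for ch in record.lower() if ch.isalnum())

def likely_duplicate(a, b, threshold=0.8):
    """Flag two customer strings as probable duplicates when their
    normalized similarity ratio meets the threshold."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

# True: same customer despite the typo and the abbreviated street
print(likely_duplicate("John Q. Smith, 12 Main St.", "Jon Smith, 12 Main Street"))
```

Run as a real-time check at the point of entry, a test like this is what keeps the second “Jon Smith” record from ever being created.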
Enterprise Resource Planning (ERP) and Supply Chain Management (SCM)
If data is accurate, you will have a more complete picture of the supply chain. Data quality technology can be used to report inventory levels more accurately, lowering inventory costs. When you make it part of your ERP project, you may also be able to improve bargaining power with suppliers by gaining better intelligence about your corporate buying power.
Data Warehouse and Business Intelligence
Data quality helps disparate data sources act as one when migrated to a data warehouse; by standardizing disparate data, it makes the data warehouse possible. You will be able to generate more accurate reports when trying to understand sales patterns, revenue, customer demographics and more.
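A minimal sketch of that standardization step, assuming two hypothetical source systems that spell the same state value differently before it is conformed for the warehouse:

```python
# Hypothetical source-to-warehouse standardization: each source system
# encodes the same fact differently; a mapping table conforms them.
STATE_MAP = {"mass": "MA", "massachusetts": "MA", "ma": "MA",
             "calif": "CA", "california": "CA", "ca": "CA"}

def standardize_state(raw):
    """Conform a free-form state value to its two-letter code; return
    None when the value is unrecognized (route it to a data steward)."""
    key = raw.strip().lower().rstrip(".")
    return STATE_MAP.get(key)

crm_row = {"state": "Mass."}   # how a hypothetical CRM spells it
erp_row = {"state": "MASSACHUSETTS"}  # how a hypothetical ERP spells it
print(standardize_state(crm_row["state"]), standardize_state(erp_row["state"]))
```

Once both sources conform to “MA”, a report grouped by state finally adds up.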
Master Data Management (MDM)
Data quality is a key component of master data management. An integral part of making applications communicate and share data is having standardized data. MDM enhances the basic premise of data quality with additional features like persistent keys, a graphical user interface to manage matching, the ability to publish and subscribe to enterprise applications, and more.
So keep in mind that when you decide to improve data quality, it is often because of your need to make a major IT initiative even stronger. In most projects, data quality is the secret ingredient that makes your IT projects extraordinary. Share the recipe.
Tuesday, February 16, 2010
Monday, February 1, 2010
A Data Governance Mission Statement
Every organization, including your data governance team, has a purpose and a mission. It can be very effective to communicate that mission in a mission statement to show the company that you mean business. When you show the value of your team, it can change your relationship with management for the better.
The mission statement should pay tribute to the mission of the organization with regard to values, while defining why the data governance organization exists and setting a big picture goal for the future.
The data governance mission statement could revolve around any of the following key components:
- increasing revenue
- lowering costs
- reducing risks (compliance)
- meeting any of the organization’s other policies such as being green or socially responsible
The most popular format seems to follow:
Our mission is to [purpose] by doing [high level initiatives] to achieve [business benefits]
So, let’s try one:
Our mission is to ensure that the highest quality data is delivered via a company-wide data governance strategy for the purpose of improving the efficiency, increasing the profitability and lowering the risk of the business units we serve.
Flopped around:
Our mission is to improve the efficiency, increase the profitability and lower the business risks of Acme’s business units by ensuring that the highest quality data is delivered via a company-wide data governance strategy.
Not bad, but a mission statement should be inspiring to the team and to management. Since the passions of the company described above are unknown, it’s difficult for a generic mission statement to be inspirational about the data governance program. That’s up to you.
Goals & Objectives
There are mission statements and there are objectives. While every mission statement should say who you are and why you exist, every objective should specify what you’re going to do and the results you expect. Objectives include activities that can be easily tracked, measured and achieved, and that, of course, serve the mission. When you start data governance projects, you can look back at the mission statement to make sure you’re on track. Are you using your people and technology in a way that will benefit the company?
Staying On Mission
When you take on a new project, the mission statement can help protect you and ensure that the project is worthwhile for both the team and the company. The mission statement should be considered a way to block busy-work and unimportant projects. In the mission statement example above, if a project doesn’t improve efficiency, lower costs or lower business risk, it should not be considered.
In this case, you can clearly map three projects to the mission, but the fourth project is not as clear. Dig deeper into the mainframe project to see if any efficiency will come out of the migration. Is the data being used by anyone for a business purpose?
A Mission Never Ends
A mission statement is a written declaration of a data governance team's purpose and focus. This focus normally remains steady, while objectives may change often to adapt to changes in the business environment. A properly crafted mission statement will serve as a filter to separate what is important from what is not and to communicate your value to the entire organization.
Thursday, January 21, 2010
ETL, Data Quality and MDM for Mid-sized Business
Is data quality a luxury that only large companies can afford? Of course the answer is no. Your company should be paying attention to data quality whether you are a Fortune 1000 company or a startup. Like a toothache, poor data quality will never get better on its own.
As a company naturally grows, the effects of poor data quality multiply. When a small company expands, it naturally develops new IT systems. Mergers often bring in new IT systems, too. The impact of poor data quality slowly invades and hinders the company’s ability to service customers, keep the supply chain efficient and understand its own business. Paying attention to data quality early and often is a winning strategy for even the small and medium-sized enterprise (SME).
However, SMEs face challenges with the investment needed in enterprise-level software. While it’s true that the benefit often outweighs the cost, it is difficult for the typical SME to invest in the license, maintenance and services needed to implement a major data integration, data quality or MDM solution.
At the beginning of this year, I started with a new employer, Talend. I became interested in them because they were offering something completely different in our world – open source data integration, data quality and MDM. If you go to the Talend Web site, you can download some amazing free software, like:
- a fully functional, very cool data integration package (ETL) called Talend Open Studio
- a data profiling tool, called Talend Open Profiler, providing charts and graphs and some very useful analytics on your data
For these solutions, Talend uses a business model similar to what my friend Jim Harris has just blogged about – Freemium. Under this model, free open source content is made available to everyone—providing the opportunity to “up-sell” premium content to a percentage of the audience. Talend works like this. You can enhance your experience with Talend Open Studio by purchasing Talend Integration Suite (in various flavors). You can take your data quality initiative to the next level by upgrading Talend Open Profiler to Talend Data Quality.
If you want to take the combined data integration and data quality to an even higher level, Talend just announced a complete Master Data Management (MDM) solution, which you can use in a more enterprise-wide approach to data governance. There’s a very inexpensive place to start and an evolutionary path your company can take as it matures its data management strategy.
The solutions have been made possible by the combined efforts of the open source community and Talend, the corporation. If you’d like, you can take a peek at some source code, use the basic software and try your hand at coding an enhancement. Sharing that enhancement with the community will only lead to a world full of better data, and that’s a very good thing.
Monday, December 21, 2009
The World is Addicted to Data (and that's good for us)
In the famous book “The Transparent Society”, we are asked to consider some of the privacy ills we will face as technology improves and our society gains access to more data sets. The book was groundbreaking when it was written in 1999. It imagines the emergence of groups who are more powerful because they own the data. However, as we sit here ten years later with 20/20 hindsight, it’s clear that the existence of, and access to, specialized data sets makes our lives better, not worse.
There are countless examples of this daily improvement in our lives, but some personal ones:
- I was in the supermarket recently and, per usual, there was a long line at the deli. On the other hand, there was no line at the “deli kiosk” so I gave it a try. Based on my frequent shopper card number and the underlying database, the deli kiosk already knew my preferred brand and type of cheese and delicious deli meats. Ordering was a snap thanks to a database, and I didn’t even have to mispronounce “Deutschmacher” to the deli man, like I usually do.
- For Thanksgiving, I visited some relatives that I don’t often see. My GPS led me there thanks to a geospatial database. It told me how long the trip was going to take based on traffic data, which is often aggregated from several sources, including road sensors and car and taxi fleets. I was also informed about all the coffee shops along the way, thanks to the data set provided by Dunkin’ Donuts. Before I left, I used Google Street View and Microsoft Bing’s Bird’s Eye view to see what the destination looked like. Ten years ago, all of this was pretty much unheard of, but thanks to the coming together of geospatial data, real-time traffic data, satellite and airplane imagery, street view imagery, Dunkin’ Donuts franchise data, and small, cheap processors, my trip was fantastic.
- Fantasy Football is a new phenomenon, made possible by our addiction to data. We know exactly where we stand on any given Sunday as player stats are made available instantly during the games. When Wes Welker scores, I see the six points reflected in my score instantly. Companies like STATS not only cover football but, according to their web site, 234 sports.
- For iPhone users, there are tons of data-centric applications. For example, Wait Watchers is an app that uses user submissions to generate and display a table of the current ride wait times at major theme parks throughout the world. As this information is updated by users, other users at Disney can make decisions about whether to go to Space Mountain or It’s a Small World, for example.
In the corporate world, it’s much the same and even more important to our society. Marketing teams are addicted to information from web analytics and use marketing automation tools to track the success of their programs. Operations teams track assets like computers, buildings, trucks and people with data. Sales has been and will continue to track customers with data. Finance relies on the collision of credit score data, invoice and payment data, as well as on making sure there is enough money in reserve to meet regulations. Executives will continue to rely on business intelligence and data. In fact, it’s hard to find anyone in the business world who doesn’t rely on data.
Of course, much of this is anecdotal. I haven’t found any specific study on the increase in database use, but we do know from an old IDC study that the number of servers in use worldwide, presumably some used for databases, roughly doubled from 2000 to 2005. A doubling of servers, combined with typically bigger hard drive capacities, points to higher database use.
It was difficult to imagine us here ten years ago, and it’s even more difficult to imagine where we’ll be at the beginning of 2020. It seems to me that we'll have more opportunity to create and use information with applications on our mobile devices. The collision of iPhone/Droid devices with the increasing bandwidth of 3G and 4G networks on the major mobile phone carriers tells me that data in the future will let us do things we can only imagine today.
The world is addicted to data, and that bodes well for anyone who helps the world manage it. In 2010, no matter if the economy turns up or down, our industry will continue to feed the addiction to good, clean data.
Tuesday, November 10, 2009
Overcoming Objections to a Data Governance Program
You’ve created a wonderful proposal for a comprehensive data governance program. You’ve brought it up to management, but the chiefs tell you there’s just no budget for data governance. Now what?
The best thing you can do is to keep at it. It often takes time to win the hearts and minds of your company. You know that any money spent on data governance will usually come back with multipliers. It just may take some time for others to get on board. Be patient and continue to promote your quest.
Here are some ideas for thinking about your next steps for your data governance program:
Corporate Revenue
Today, companies manage spending tightly, looking at the expenses and revenue each fiscal quarter and each month to optimize the all-important operating income (revenue minus expenses equals operating income). If sales and revenue are weak, management gets miserly. On the other hand, if revenue is high and expenses are low, your high-ROI proposal will have a better chance for approval.
For many people, this corporate reality is hard to deal with. Logical thinkers would suggest that if something is broken, it should be fixed, no matter how well the sales team is performing. But the people who run your business have their first priority set on stockholder value. You too should pay attention to your company’s sales figures as they are announced each quarter. If your company has a quarterly revenue call, use it to strike when the environment for spending is right.
Cheap Wins
If there is no money to spend on information quality, there still may be potential for information quality wins for you to exploit. For example, let’s say you were to profile or make some SQL queries into your company’s supply chain system database and you found a part that has a near duplicate. So, part number “21-998 Condenser” and part number “2-1-998 Cndsr” exist as duplicated parts in your supply chain.
After verifying the fairly obvious duplicate, you can ask your friend on the procurement side how much it costs to store and hold these condensers in inventory. Then use some guerrilla marketing techniques to extol the virtues of data governance. After all, if you could find this with just SQL queries, consider how much you could find with a data discovery/profiling tool. Better yet, consider how much you could find with a company-wide initiative. In a previous blog post, I referred to this as the low-hanging fruit.
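The kind of query described above can be sketched like this (the table, column names and part data are hypothetical, and an in-memory SQLite database stands in for the real supply chain system): normalizing the part number makes the near-duplicates collide onto one key.

```python
import re
import sqlite3

# Hypothetical parts table; in practice you would query the supply-chain DB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_no TEXT, descr TEXT)")
conn.executemany("INSERT INTO parts VALUES (?, ?)",
                 [("21-998", "Condenser"), ("2-1-998", "Cndsr"), ("30-555", "Fan")])

def canon(part_no):
    """Strip punctuation so '21-998' and '2-1-998' collide."""
    return re.sub(r"[^A-Za-z0-9]", "", part_no)

conn.create_function("canon", 1, canon)
dupes = conn.execute("""
    SELECT canon(part_no) AS k, COUNT(*) AS n
    FROM parts GROUP BY k HAVING n > 1
""").fetchall()
print(dupes)  # the two condenser records collapse to one key
```

One query, one duplicate part found; multiply that by the carrying cost per part and you have the first number for your guerrilla marketing campaign.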
Case Studies
Case studies are a great way to spread the word about data governance. They usually contain real-world examples, often of your competitors, who are finding gold with better attention to information quality. Vendors in the data governance space will have case studies on their websites, or you can get unpublished studies by asking your sales representative.
Consider that built-in desire of your company to be competitive, and keep your Google searches and alerts tuned to what data management projects are underway at your competitors.
Analysts
Analysts are another valuable source for proving your point about the virtues of data governance. Your boss may have installed his own custom spam filter against your cajoling on data governance. But he doesn’t have to take your word for it; he can listen to an industry expert.
If you own a subscription to an analyst firm, use it to sell the power of data governance. Analysts offer telephone consultations, reports and webinars to clients. These offerings may be useful to sway your team. If you are not a client of these firms, go to the vendors. If there is a crucial report, they will often license it to offer on their website for download, particularly if it speaks well about their solution.
Data Governance Expert Sessions
This technique also falls within the category of “don’t just take my word for it.” You can find a data governance workshop from many vendors to assist your organization with developing your data quality strategies. Often conducted for a group, the session leader interacts with a group of your choosing and presents the potential for improving the efficiency of your business with data governance. As the meeting leader, you would invite both technologists and business users. Include those who are skeptical of the value a data-quality program will bring to their company; a third-party opinion may sway them. The cost is usually reasonable and it can help the group understand and share key concepts of data governance.
Guerrilla Marketing
Why not start your own personal crusade, your own marketing initiative to drive home the power of information quality? In my previous installment of the data governance blog, I offer graphics for use in your signature file to drive home the importance of IQ to your organization. Use the power of a newsletter, blog, or e-mail signature to get your message across.
Excerpt from Steve Sarsfield's book "The Data Governance Imperative"
Thursday, October 22, 2009
Book Review: Data Modeling for Business
A couple of weeks ago, I book-swapped with author Donna Burbank. She has a new book entitled Data Modeling for Business. Donna, an experienced consultant by trade, has teamed up with Steve Hoberman, a previously published author and technologist, and Chris Bradley, also a consultant, for an excellent exploration of the process of creating a data model. With a subtitle like “A Handbook for Aligning the Business with IT using a High-Level Data Model,” I knew I was going to find some value in the swap.
The book describes in plain English the proper way to create a data model, but that simple description doesn’t do it justice. The book is designed for those who are learning from scratch – those who only vaguely understand what a data model is – and uses commonly understood concepts to explain data modeling. It describes the impact of the data model on the project’s success and digs into setting up data definitions and the levels of detail necessary for them to be effective. All of this is accomplished in a very plain-talk, straightforward tone without the pretentiousness you sometimes get in books about data modeling.
We often talk about the need for business and IT to work together to build a data governance initiative. But many, including myself, have pointed to the communication gap that can exist in a cross-functional team. In order to bridge the gap, a couple of things need to happen. First, IT teams need to expand their knowledge of business processes, budgets and corporate politics. Second, business team members need to expand their knowledge of metadata and data modeling. This book provides an insightful education for the latter. In my book, the Data Governance Imperative, the goal was the former.
The book is well-written and complete. It’s a perfect companion for those who are trying to build a knowledgeable, cross-functional team for data warehouse, MDM or data governance projects. Therefore, I’ve added it to the recommended reading list on my blog.
Monday, October 12, 2009
Data May Require Unique Data Quality Processes
A few things in life have the same appearance, but the details can vary widely. For example, planets and stars look the same in the night sky, but traveling to them and surviving once you get there are two completely different problems. It’s only when you get close to your destination that you can see the difference.
All data quality projects can appear the same from afar but ultimately can be as different as stars and planets. One of the biggest ways they vary is in the data itself and whether it is chiefly made up of name and address data or some other type of data.
Name and Address Data
A customer database or CRM system contains data that we know much about. We know that letters will be transposed, names will be comma-reversed, postal codes will be missing and more. Good data quality tools know millions of things about broken name and address data, since so many name and address records have been processed over the years. Over time, business rules and processes have been fine-tuned for name and address data, and methods of matching names and addresses have become more and more powerful.
Data quality solutions also understand what names and addresses are supposed to look like, since the postal authorities provide the correct formatting. If you’re somewhat precise about following the rules of the postal authorities, most mail makes it to its destination. If you’re very precise, the postal services can offer discounts. The rules are clear in most of the world, and everyone follows the same rules for name and address data because it makes for better efficiency.
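As a rough sketch of what following the postal authorities’ formatting rules can look like in code (the pattern below covers only the US ZIP and ZIP+4 formats; real tools validate against full postal reference files, not a regex):

```python
import re

# US ZIP or ZIP+4, e.g. "02110" or "02110-1234" (US-only, illustrative)
ZIP5 = re.compile(r"^\d{5}(-\d{4})?$")

def is_valid_zip(code: str) -> bool:
    """Return True when the code matches the US ZIP / ZIP+4 format."""
    return bool(ZIP5.match(code.strip()))

print(is_valid_zip("02110-1234"))  # True
print(is_valid_zip("2110"))        # False
```

A real solution would go further and confirm the code actually exists and matches the city and state on the record.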
So, if we know what the broken item looks like and we know what the fixed item is supposed to look like, we can design and develop processes that involve trained, knowledgeable workers and automated solutions to solve real business problems. There’s knowledge inherent in the system, and you don’t have to start from scratch every time you want to cleanse it.
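To make the “known broken, known fixed” idea concrete, here is a toy Python sketch — my own illustration, not any vendor’s algorithm — that repairs comma-reversed names and builds a crude match key for duplicate detection:

```python
import re

def fix_comma_reversal(name: str) -> str:
    """Turn 'Smith, John' into 'John Smith' (a common entry pattern)."""
    if "," in name:
        last, _, first = name.partition(",")
        return f"{first.strip()} {last.strip()}"
    return name.strip()

def match_key(name: str, postal: str) -> str:
    """Build a crude match key: normalized name plus postal code."""
    clean = fix_comma_reversal(name).lower()
    clean = re.sub(r"[^a-z ]", "", clean)        # drop punctuation
    clean = re.sub(r"\s+", " ", clean).strip()   # collapse whitespace
    return f"{clean}|{postal.strip().upper()}"

records = [
    {"name": "Smith, John", "postal": "02110"},
    {"name": "John  Smith", "postal": "02110 "},
]
# Both records collapse to the same key, so they'd be flagged as duplicates.
keys = {match_key(r["name"], r["postal"]) for r in records}
print(len(keys))  # 1
```

Production matching engines layer on far more — nickname tables, phonetic codes, address standardization — but the principle is the same: known error patterns feed known repairs.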
ERP, Supply Chain Data
However, when we take a look at other types of data domains, the picture is very different. There isn’t an established body of knowledge about what the input typically looks like or what the output should be, so you must build those processes yourself. In supply chain or ERP data, we can’t immediately see why the data is broken or what we need to do to fix it. ERP data is likely to be something of a history lesson on your company’s origins, the acquisitions that were made, and the partnership changes over the years. We don’t immediately have an idea of how the data should ultimately look. The data in this world is specific to one client or a single use scenario, and it cannot be handled by existing out-of-the-box rules.
With this type of data, you may find the need to collaborate more with the business users of the data, whose expertise in determining the correct context for the information comes more quickly, enabling you to effect change more rapidly. Because of the inherent unknowns about the data, few of the steps for fixing the data are done for you ahead of time. It then becomes critical to establish a methodology for:
- Data profiling in order to understand the issues and challenges in the data.
- Discussions with the users of the data to understand context, how it’s used and the most desired representation. Since there are few governing bodies for ERP and supply chain data, the corporation and its partners must often come up with an agreed-upon standard.
- Setting up business rules, usually from scratch, to transform the data.
- Testing the data in the new systems.
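As a minimal illustration of that first step, profiling, here is a toy Python sketch — the field names part_no and uom are invented for the example — that reports fill rate and value spread per column, the kind of evidence you bring to those discussions with business users:

```python
from collections import Counter

def profile(rows, columns):
    """Summarize completeness and value spread per column --
    the first step before writing any cleansing rules."""
    report = {}
    for col in columns:
        values = [r.get(col) for r in rows]
        filled = [v for v in values if v not in (None, "")]
        report[col] = {
            "fill_rate": len(filled) / len(rows),  # share of populated rows
            "distinct": len(set(filled)),          # spread of values
            "top": Counter(filled).most_common(1), # dominant value
        }
    return report

parts = [
    {"part_no": "A-100", "uom": "EA"},
    {"part_no": "A100",  "uom": "each"},
    {"part_no": "B-200", "uom": ""},
]
report = profile(parts, ["part_no", "uom"])
print(report["uom"])
```

Even this tiny sample surfaces the typical ERP findings: a missing unit of measure, and two spellings (“EA” versus “each”) that the organization has never standardized.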
Labels: data quality, erp, supply chain, tools
Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.