
Saturday, November 12, 2011

The ‘Time’ Factor in Data Management

I've been thinking about how many ways time influences the data management world. When it comes to managing data, we think about improving processes, balancing the needs and desires of people, and how technology helps us manage it all. However, an often overlooked aspect of data management is time. Time impacts data management from many different directions.

Time Means Technology Will Improve
As time marches on, technology offers twists and turns to the data steward through innovation. Twenty years ago, mainframes ruled the world. We've migrated through relational databases on powerful servers to a place where we see our immediate future in cloud and big data. As technology shifts, you must consider the impact on your data.

The good news is that with these huge challenges, you also get access to new tools.  In general, tools have become less arcane and more business-user focused as time marches on. 

Time Causes People to Change

Just as technology changes, people also mature, change careers and retire. With regard to data management, the corporation must think about the expertise needed to complete the data mission. Data management must pass the “hit by a bus” test, where the company would not suffer if one or more key people were to be hit by a Greyhound traveling from Newark to Richmond.

Here, time requires us to be more diligent in documenting our processes. It requires us to avoid undocumented hand-coding and pick a reproducible data management platform. It also helps to have third-party continuity, like consultants who, although they experience personnel changes of their own, change on a different schedule than their clients.

Time Leads to Clarity in the Imperative of Data Management

With regard to data management, corporations have a maturity process they go through. They often start as chaotic immature organizations and realize the power of data management in a tactical maturity stage. Finally, they realize data management is a strategic initiative when they begin to govern the data.  Throughout it all, people, process and technologies change.

Knowing where you are in this maturity cycle can help you plan where you want to go from here and what tactics you need to put in place to get there. For example, very few companies go from chaotic, ad hoc data management to full-blown MDM. For the most part, they get there by making little changes, seeing the positive impact of those changes and wanting more. A chaotic organization, for instance, might be more apt to evolve its data management maturity by consolidating two or more ERP systems and then reveling in the efficiency.

Time Prevents Us from Achieving Successful Projects
When it comes to specific projects, taking too much time can lead to failure. In the not-so-distant past, circa 2007, the industry commonly took on massive, multi-year, multimillion-dollar MDM projects. We now know that these projects are not the best way to manage data. Why? Think about how much your own company has changed in the last two years. If it is a dynamic, growing company, it likely has different goals, different markets, different partners and new leadership. The world has changed significantly, too. Today's worldwide economy is so much different than it was even one year ago. (Have you heard about the recession and European debt crisis?) A project built on goals you set two years ago will not succeed today.

Time makes us take an agile approach to data management. It requires that we pick off small portions of our problems, solve them, prove value and re-use what we’ve learned on the next agile project.  Limit and hold scope to achieve success.

Time Achieves Corporate Growth (which is counter to data management)
Companies that are just starting out generally have fewer data management problems than those that are mature. Time pushes our data complexity deeper and deeper. Therefore, time dictates that even small companies should have some sort of data management strategy. The good news is that this is now achievable with help from open source and lower-cost data management solutions. Proper data management tools are affordable for both the Fortune 1000 and small to medium-sized enterprises.

Time Holds Us Responsible
That said, the longer a corporation is in business, the longer it can be held responsible for lower revenue, decreased efficiency and lack of compliance due to poor data management. The company decides how it is going to govern (or not govern) data, what data is acceptable in the CRM and who is responsible for the mistakes that happen due to poor data management. The longer the business operates, the more responsible it becomes for its governance. Time holds us responsible if the problems aren't solved.

Time and Success Lead to Apathy

Finally, time often brings us success in data management. With success, there is a propensity for corporations to take their eye off the prize and spend money on more pressing issues. Time and success can lead to a certain apathy, a belief that the data management problem is solved. But as time marches on, new partners, new data sources and new business processes arrive. Time requires us to be ever vigilant in our efforts to manage data.

Monday, May 19, 2008

Unusual Data Quality Problems

When I talk to folks who are struggling with data quality issues, there are some who are worried that they have data unlike any data anyone has ever seen. Often there's a nervous laugh in their voice, as if the data is so unusual and so poor that an automated solution can't possibly help.

Yes, there are wide variations in data quality and consistency and it might be unlike any we’ve seen. On the other hand, we’ve seen a lot of unusual data over the years. For example:

  • A major motorcycle manufacturer used data quality tools to pull out nicknames from their customer records. Many of the names they had acquired for their prospect list were from motorcycle events and contests where the entries were, shall we say, colorful. The name fields contained data like “John the Mad Dog Smith” or “Frank Motor-head Jones”. The client used the tool to separate the name from the nickname, making it a more valuable marketing list.
  • One major utility company used our data quality tools to identify and record notations on meter-reader records that were important to keep for operational uses, but not in the customer billing record. Upon analysis of the data, the company noticed random text like “LDIY” and “MOR” along with the customer records. After some investigation, they figured out that LDIY meant “Large Dog in Yard”, which was particularly important for meter readers. MOR meant “Meter on Right”, which was also valuable. The readers were given their own notes field, so that they could maintain the integrity of the name and address while also keeping this valuable data. It probably saved a lot of meter readers from dog-bite situations.
  • Banks have used our data quality tools to separate items like "John and Judy Smith/221453789 ITF George Smith". The organization wanted to treat this type of record as three separate records, "John Smith", "Judy Smith" and "George Smith", with obvious linkage between the individuals; a sketch of that kind of parsing follows this list. This type of data is actually quite common in mainframe migrations.
  • A food manufacturer standardizes and cleanses ingredient names to get better control of manufacturing costs. In data from their worldwide manufacturing plants, an ingredient might be “carrots”, “chopped frozen carrots”, “frozen carrots, chopped”, “chopped carrots, frozen” and so on. (Not to mention all the possible abbreviations for the words carrots, chopped and frozen.) Without standardization of these ingredients, there was really no way to tell how many carrots the company purchased worldwide. There was no bargaining leverage with the carrot supplier, or any of the other ingredient suppliers, until the data was fixed.
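
To make the bank example concrete, here is a minimal sketch of the kind of rule-based parsing involved. It is illustrative Python, not the Trillium Software System's actual rules engine, and it assumes the simple layout shown above: joint holders sharing a surname, an account number after a slash, and an "ITF" (in trust for) beneficiary.

    import re

    def split_bank_record(raw):
        """Split a record like 'John and Judy Smith/221453789 ITF George Smith'
        into one record per person, linked by the shared account number."""
        # Separate the name portion from the account portion at the slash.
        name_part, _, account_part = raw.partition("/")

        # The account number is the leading run of digits; "ITF" (in trust for)
        # introduces a beneficiary name.
        account_match = re.match(r"\s*(\d+)", account_part)
        account = account_match.group(1) if account_match else None

        itf_match = re.search(r"\bITF\b\s+(.+)", account_part)
        beneficiary = itf_match.group(1).strip() if itf_match else None

        # "John and Judy Smith" -> ["John Smith", "Judy Smith"]: the surname is
        # the last token and is shared by the joint holders.
        tokens = name_part.strip().split()
        surname = tokens[-1]
        given_names = [t for t in tokens[:-1] if t.lower() != "and"]
        people = [f"{given} {surname}" for given in given_names]
        if beneficiary:
            people.append(beneficiary)

        return [{"name": person, "account": account} for person in people]

    print(split_bank_record("John and Judy Smith/221453789 ITF George Smith"))
    # [{'name': 'John Smith', 'account': '221453789'},
    #  {'name': 'Judy Smith', 'account': '221453789'},
    #  {'name': 'George Smith', 'account': '221453789'}]

Real records are messier than this, of course, which is exactly why a rules engine you can tune to your own data beats a pile of one-off scripts.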

Not all data quality solutions can handle these types of anomalies. Many will simply pass these "odd" values through without attempting to cleanse them. It's key to have a system that will learn from your data and allow you to develop business rules that meet the organization's needs.

Now there are times, quite frankly, when data gets so bad, that automated tools can do nothing about it, but that’s where data profiling comes in. Before you attempt to cleanse or migrate data, you should profile it to have a complete understanding of it. This will let you weigh the cost of fixing very poor data against the value that it will bring to the organization.
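
As a rough illustration of what a first profiling pass looks like (a minimal sketch, not any particular profiling product; the function and sample records are made up), you can learn a lot just from fill rates, distinct counts and character patterns per column:

    import re
    from collections import Counter

    def profile_column(records, column):
        """Summarize one column: fill rate, distinct values, top character patterns."""
        values = [r.get(column) for r in records]
        filled = [v for v in values if v not in (None, "")]

        # Reduce each value to a shape: digits -> 9, letters -> A, keep punctuation.
        def pattern(value):
            return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", str(value)))

        return {
            "count": len(values),
            "fill_rate": len(filled) / len(values) if values else 0.0,
            "distinct": len(set(filled)),
            "top_patterns": Counter(pattern(v) for v in filled).most_common(3),
        }

    customers = [
        {"name": "John the Mad Dog Smith", "zip": "01824"},
        {"name": "Frank Motor-head Jones", "zip": "1824"},       # leading zero lost?
        {"name": "", "zip": "01824-2345"},
    ]
    print(profile_column(customers, "zip"))
    # {'count': 3, 'fill_rate': 1.0, 'distinct': 3,
    #  'top_patterns': [('99999', 1), ('9999', 1), ('99999-9999', 1)]}

Pattern counts like these make truncated ZIP codes or stray meter-reader notes jump out before you commit to a cleansing or migration plan.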

Sunday, February 10, 2008

Mainframe Computing and Information Quality

Looking for new ways to use the power of your mainframe? My friend Wally called me the other day and was talking about moving applications off the mainframe to the Unix platform and cleansing data during the migration. “Sure, we can help you with that.” I said. But he was surprised to hear that there is a version of the Trillium Software System that is optimized for the Mainframe (z/OS server). We’ve continually updated our mainframe data quality solution and we have no plans to stop.

Mainframe computers still play a central role in the daily operations of many large companies. Mainframes are designed to allow many simultaneous users and applications access to the same data without interfering with one another. Security, scalability, and reliability are key factors in the mainframe's power in mission-critical applications. These applications typically include customer order processing, financial transactions, production and inventory control, payroll, and others.

While others have abandoned the mainframe platform, the Trillium Software System supports the z/OS (formerly known as OS/390) environment. Batch data standardization executes on either a 64-bit or 31-bit system. It also supports CICS, the transactional-based processing system designed for real time processing. z/OS and CICS easily support thousands of transactions per second, making it a very powerful data quality platform. The Trillium Software System can power your mainframe with an outstanding data quality engine, no matter if your data is stored in DB2, text files, COBOL copybooks, or XML.

The Trillium Software System will standardize, cleanse and match data using our proprietary rules engine. You can remove duplicates, ensure that your name and address data will mail properly, CASS certify data and more. It’s a great way to get your data ready for SOA on the mainframe, too.
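
To illustrate the matching idea in general terms (this is a generic sketch, not the Trillium engine's proprietary matching logic; the nickname table and field names are invented for the example), a crude match key built from a normalized name plus postal code is enough to surface likely duplicates:

    import re
    from collections import defaultdict

    # A few common nickname normalizations; purely illustrative.
    NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}

    def match_key(record):
        """Build a crude match key from a normalized name plus 5-digit ZIP."""
        name = re.sub(r"[^a-z ]", "", record["name"].lower())
        tokens = [NICKNAMES.get(t, t) for t in name.split()]
        return " ".join(sorted(tokens)) + "|" + record["zip"][:5]

    def find_duplicates(records):
        """Group records sharing a match key; groups of 2+ are duplicate candidates."""
        groups = defaultdict(list)
        for rec in records:
            groups[match_key(rec)].append(rec)
        return [g for g in groups.values() if len(g) > 1]

    records = [
        {"name": "Robert Smith", "zip": "01824"},
        {"name": "Bob Smith", "zip": "01824-2345"},
        {"name": "Judy Smith", "zip": "01824"},
    ]
    print(find_duplicates(records))
    # [[{'name': 'Robert Smith', 'zip': '01824'},
    #   {'name': 'Bob Smith', 'zip': '01824-2345'}]]

Production matching also has to weigh addresses, transpositions and typos, which is where a dedicated engine earns its keep on the mainframe just as it does anywhere else.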

My hat's off to Clara C. on our development team, who heads up the project for maintaining the mainframe version of the Trillium Software System. She's well-known at Trillium Software for her mainframe acumen and for hosting the annual pot-luck lunch around the holidays. (She makes an excellent mini hot dog in Jack Daniels sauce.)

I’m not sure whether Wally will stick with his mainframe or migrate the whole thing to UNIX servers, but he was happy to know he has an option. With an open data quality platform, like the Trillium Software System, it’s not a huge job to move the whole process from the mainframe to UNIX by leveraging the business rules developed on one platform and copying them to the other.

Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.