Sunday, November 23, 2008

Picking the Boardwalk and Park Place DQ Projects

This weekend, I was playing a game of Monopoly with my kids. Monopoly is the ultimate game of capitalism, and a great way to teach a young one about money. (Given the length of the game, a single session can be a weekend-long lesson.) The companies that we work for are also playing the capitalism game. So, it’s not a stretch that there are lessons to be learned while playing this game.

As I took in hefty rents from Pacific Avenue, I could see that my daughter was beginning to realize that it’s really tough to win if you buy low-end properties like Baltic and Mediterranean, or any of the properties on that side of the board. Even with a hotel, Baltic will only get you $450. It’s only with the yellow, green and blue properties that you can really make an impression on your fellow players. She got excited when she finally got hold of Boardwalk and Park Place.

Likewise, it’s difficult to win at the data governance game if you pick projects that have limited upside. The tendency might be to fix the data of the business users who complain the most, or of those the CEO tells you to fix. The key is to keep capitalism and the game of Monopoly in mind when you pick projects.

When you begin picking high-value targets with huge upside potential, you’ll begin to win at the data governance game. People will stand up and take notice when you bring in the high-end returns that Boardwalk and Park Place can deliver. You’ll get better traction in the organization. You’ll be able to expand your domain across Ventnor Avenue and St. James Place, gathering up other clean-data monopolies.

This is the tactic that I’ve seen so many successful data governance initiatives take at Trillium Software. The most successful project managers are also good marketers, promoting their successes inside the company. And if no one will listen inside the company, they promote them to trade journals, analysts and industry awards. There’s nothing like a little press to make the company sit up and take notice.

So take the $200 you get from passing GO and focus on high value, high impact projects. When you land on Baltic, pass it by, at least at first. By focusing on the high impact data properties, you’ll get a better payoff in the end.

For a few more tips, I recommend the webinar by my friend Jim Orr at Trillium Software. You can listen to it here.

Wednesday, November 19, 2008

What is DIG?

In case you haven’t heard, financial services companies are in a crunch time right now. Some say the current stormy conditions are unprecedented. Some say it’s a rocky time, but certainly manageable. Either way, financial service companies have to be smarter than ever in managing risk.

That’s what DIG is all about, helping financial services companies manage risk from their data. It's a new solution set from Trillium Software.

In Europe, Basel II is standard operating procedure at many financial services companies, and the US is starting to come on board. Basel II is complex, but it includes mandates for increased transparency of key risk indicators, such as probability of default (PD) and loss given default (LGD), which combine with exposure at default (EAD) to determine expected losses. Strict rules on capital risk reserve provisions penalize those institutions highly exposed to risk and those unable to provide ‘provably correct’ analysis of their risk position.

Clearly, the lack of risk calculations had something to do with the situation that banks are in today. Consider all the data that it takes to make a risk compliance calculation: customer credit quality measurements, agency debt ratings, accounts receivables, and current market exposures. When this type of data is spread out over multiple systems, it introduces risk that can shake the financial world.

To comply with Basel II, financial services companies and those who issue credit have to be smarter than ever in managing data. Data drives decision-making and risk calculation models. For example, let’s say you’re a bank calculating the risk of your debtors. You enrich your data with Standard & Poor's ratings to understand the risk. But if the data is non-standardized, you may have a hard time matching the Standard & Poor's data to your customer records. If no match is found, a company with an AA- bond rating might be defaulted to BB- in the database; after all, it is prudent to be conservative when you don’t know the risk. But that error can cause thousands, even millions, to be set aside unnecessarily. These additional capital reserves can be a major drag on the company.
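To make the matching problem concrete, here is a minimal sketch in Python. The company names, ratings and standardization rules are all made up for illustration; real matching engines use far more sophisticated standardization and fuzzy matching. The point is the fallback: when a name fails to match, the conservative rating kicks in and reserves get inflated.

```python
# A toy illustration of the matching problem described above.
# Company names and ratings are made up; real matching engines
# use far more sophisticated standardization and fuzzy matching.

STOP_WORDS = {"INC", "INCORPORATED", "CORP", "CO", "LLC", "LTD"}

def normalize(name: str) -> str:
    """Standardize a company name so equivalent spellings compare equal."""
    name = name.upper()
    for ch in ".,":
        name = name.replace(ch, "")
    return " ".join(w for w in name.split() if w not in STOP_WORDS)

ratings = {"ACME WIDGETS": "AA-"}   # keyed by standardized name
CONSERVATIVE_FALLBACK = "BB-"       # assumed when no match is found

def lookup(customer_name: str) -> str:
    """Return the agency rating, or a conservative default on a miss."""
    return ratings.get(normalize(customer_name), CONSERVATIVE_FALLBACK)
```

Without the `normalize` step, "Acme Widgets, Inc." would miss the "ACME WIDGETS" key entirely and be booked at BB-, which is exactly the kind of silent over-reserving described above.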

With the Data Intelligence and Governance (DIG) announcement from Trillium Software, we’re leveraging our enterprise technology platform to improve the risk rating process, making companies proactive participants in the validation, measurement, and management of all data fed into risk models. The key is to establish a framework for the context of data and best practices for enterprise governance. When we apply our software and services to key data attributes and set up rules to ensure the accuracy of data, it can save financial services companies a great deal of money.

To support DIG, we’ve brought on board some additional financial services expertise. We’ve revamped our professional services and are working closely with some of our partners on the DIG initiative. We’ve also been updating our software, like our data quality dashboard, TS Insight, to help meet financial services challenges. For more information, see the DIG page on the Trillium Software web site.

Wednesday, November 12, 2008

The Data Governance Insider - Year in Review

Today is the one-year anniversary of this blog. We’ve covered some interesting ground this year. It’s great to look back and see whether the thoughts in my 48 blog entries made any sense at all. For the most part, I’m proud of what I said this year.


Probably the most controversial entries this year were the ones on probabilistic matching, where I pointed out some of the shortcomings of the probabilistic technique for matching data. Some people read and agreed. Others voiced their dissent.


Visitors seemed to like the entry on approaching data intensive projects with data quality in mind. This is a popular white paper on Trilliumsoftware.com, too. We'll have to do more of those nuts and bolts articles in the year ahead.


As a data guy, I like reviewing the stats from Google Analytics. In terms of traffic, it was very slow going at first, but as things started to build, we were able to eke out 3,506 visits, with 2,327 of those visits unique. That means that either someone came back 1,179 times or 1,179 people came back… or some combination of the two. Maybe my mother just loves reading my stuff.


The visitors came from the places you’d expect. The top ten were the United States, United Kingdom, Canada, Australia, India, Germany, France, Netherlands, Belgium, and Israel. We had a few visitors from unexpected places: one visitor from Kazakhstan apparently liked my entry on the Trillium Software integration with Oracle, but not enough to come back. A visitor from the Cayman Islands took a break from scuba diving to read my story on the successes Trillium Software has had with SAP implementations; there's a nice recorded webinar available there, too. A visitor from Croatia took time to read my story about data quality on the mainframe. Even outside Croatia, the mainframe is still a viable platform for data management.


I’m looking forward to another year of writing about data governance and data quality. Thanks for all your visits!

Tuesday, October 21, 2008

Financial Service Companies Need to Prepare for New Regulation

We’re in the midst of a mortgage crisis. Call it a natural extension of capitalism, where greed can inspire unregulated “innovation”. That greed is now coming home to roost.

This problem has many moving pieces, and it's difficult to describe in an elevator pitch. Through the actions of our leaders and bankers, mortgage lenders were inspired to write dubious mortgages, and the US population was encouraged to apply for them. At first, these unchecked mortgages led to more free cash, more spending, and a boom in the economy. Unfortunately, the boom was built on a foundation of quicksand, forcing us to take drastic measures to bring balance back to the system. The $700 billion bill already passed is an example of those measures. Hopefully, we won’t need too many more.

So where do we go from here? The experts say that the best-case scenario would be for the world economy to do well - unemployment stays low, personal income keeps pace with inflation and real estate prices find a bottom. I'm optimistic that we'll see that day soon. Many of the experts aren't so sure.

One thing that history teaches us is that regulatory oversight is bound to get stiffer after this fiasco. We had similar “innovations” in capitalism with the savings and loan scandal, the artificial dot-com boom, Enron, Tyco and WorldCom. Those scandals were followed by new worldwide regulations like Sarbanes-Oxley, Bill 198 (Canada), JSOX (Japan) and the Deutscher Corporate Governance Kodex (Germany), to name just a few. These laws tightened oversight of the accounting industry and toughened corporate disclosure rules. They also moved to make the leaders of corporations more personally liable for reporting irregularities.

The same should be true after the mortgage crisis. The types of loans that brought us to this situation may exist only in tightly regulated form in the future. In the coming months, we should see a renewed emphasis on detecting fraud at every step of the process. For the financial services industry especially, it will be more important than ever to have good clean data, accurate business intelligence and holistic data governance to meet the regulations to come.

If you’re running a company that still can’t get a handle on its customers, has a hard time detecting fraud, has a lot of missing and outlier values in its data, and has many systems with duplicated forms of data values, you’ll want to get started now on governing your data. Go now, run, since data governance and building intelligence can take years of hard work. The goal is to begin mitigating the potential risk you face in meeting regulatory edicts. If you get going now, you’ll not only beat the rush to comply, but you'll also reap the real and immediate benefits of data governance.

Thursday, October 9, 2008

Teradata Partners User Group

Road trip! Next week, I’m heading to the Teradata Partners User Group and Conference in Las Vegas, and I’m looking forward to it. The event should be a fantastic opportunity to take a peek inside the Teradata world.

The event is a way for Trillium Software to celebrate its partnership with Teradata. This partnership has always made a lot of sense to me. Teradata and Trillium Software have had similar game-plans throughout the years – focus on your core competency, be the best you can be at it, but maintain an open and connectible architecture that allows in other high-end technologies. There are many similarities in the philosophies of the two companies.

Both companies have architectures that work well in particularly large organizations with vast amounts of data. One key feature of Teradata, for example, is that you can expand database capacity linearly, while maintaining response time, by adding more nodes to the existing database. Similarly, with Trillium Software, you can expand the number of records cleansed in real time by adding more nodes to the cleansing system. Trillium Software uses a load balancing technology called the Director to manage cleansing and matching on multiple servers. In short, both technologies will scale to support very large volumes of complex, global data.
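The Director itself is proprietary, but the underlying idea of spreading cleansing requests across interchangeable nodes can be sketched with a simple round-robin dispatcher. This is a toy illustration under my own assumptions; the node names are hypothetical and the real product's behavior is far richer:

```python
import itertools

class RoundRobinDispatcher:
    """Toy stand-in for a Director-style load balancer that spreads
    cleansing requests evenly across identical worker nodes."""

    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def dispatch(self, record):
        # In a real system this would send the record to the chosen
        # node and await the cleansed result.
        return next(self._cycle)

nodes = ["cleanse-node-1", "cleanse-node-2", "cleanse-node-3"]
balancer = RoundRobinDispatcher(nodes)
assignments = [balancer.dispatch(record) for record in range(9)]
```

Adding a fourth name to the `nodes` list is all it takes to add capacity, which is the sense in which both platforms scale by adding nodes.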

The estimate is for about 4000 Teradata enthusiasts to show up and participate in the event. So, if you’re among them, please come by the Trillium Software exhibit and say hello.

Monday, October 6, 2008

Data Governance and Chicken Parmesan

With the tough economy and shrinking 401(k)s, some of my co-workers at Trillium are starting to cut back a bit on personal spending. They talk about how expensive everything is, and speak with regret when they buy a lunch at the Trillium cafeteria instead of bringing one. Until now, I’ve kept quiet about this topic and waited politely until the conversation turned to, say, fantasy football. But between you and me, I don’t agree that there are huge cost savings in making your own.


Case in point: this weekend, I got the urge to cook a nice chicken parmesan dinner for my family. This time, I was going to make the tomato sauce from scratch and not use the jarred stuff. A trip to the local market to buy all the ingredients for the meal cost about $12 for the tomatoes, pasta, chicken, breading and spices. There were transportation costs; I spent about $1.50 on half a gallon of gas. Then, I came home at 2:30 PM and cooked until 5:30, slowly simmering the sauce and using electricity on my stove, so let’s say I used an extra $.50 in electricity. It’s difficult to account for my time. I could have been working on the honey-do list, saving me from having to pay someone to do it. (Hopefully, that plumbing problem will keep until next weekend.) I could even have worked those three hours at a minimum-wage job at $8/hour and earned about $24.

When I add up all the hidden costs, the chicken parmesan easily cost me $35-40. Meanwhile, a local restaurant has chicken parmesan for $15.99… and it’s pretty darn good… and it comes with a loaf of homemade bread.
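For the data-inclined, the tally works out like this (with my three hours valued at the $8/hour minimum wage mentioned above):

```python
# Back-of-the-envelope tally of the hidden costs described above.
ingredients = 12.00
gas = 1.50            # half a gallon for the trip to the market
electricity = 0.50    # three hours of slow simmering
labor = 3 * 8.00      # three hours valued at $8/hour
home_cooked = ingredients + gas + electricity + labor
restaurant = 15.99

print(home_cooked)    # 38.0, squarely in the $35-40 range
```

Even before accounting for the dishes, the home-cooked version costs more than twice the restaurant price.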

Now I’m sure many of you are asking: what does this have to do with data governance? Everything! When you begin to develop your plan and strategies for your data governance initiative, you have to think about cooking your own or ordering out. Does it make sense to build your own data profiling and data quality processes, or does it make sense to buy them? Will you be pushing off the plumbing to make the meal; in other words, will a more manual, home-grown data governance initiative take your team away from necessary tasks that will require emergency service later? Does it make sense to have a team hand-code database extractions and transformations, or would the total economics be better if you bought an ETL tool and focused their time on other pursuits?

Restaurants can sell chicken parmesan for $15.99 and still make a profit because they have a system for making it that exploits economies of scale. They buy ingredients more cheaply, and because they use the sauce in other dishes, they have ‘reusability’ working for them, too. The sauce goes into their eggplant parmesan, spaghetti with meatballs, and many other dishes, and that reuse is powerful. The high-end technologies you choose for your company need the same reusability as the sauce for maximum benefit. Using data quality technologies that only plug into SAP, for example, when your future data governance projects may lead you to Oracle and Tibco and Siperian, just doesn’t make sense.

One other consideration: what if something goes wrong with my homemade chicken parmesan? I have little recourse if my home-cooked solution goes up in flames, except to incur even more expense and order out. But if the restaurant chicken parmesan is bad, I can call them and they’ll make me another one at no charge. Likewise, you have contractual recourse when a vendor solution doesn’t do what the vendor says it will.

If you’re thinking of cooking up your own technical solutions for data governance hoping to save a ton of money, think again. Your most economical solution might just be to order out.

Monday, September 29, 2008

The Data Intelligence Gap: Part Two

In part one, I wrote about the evolution of a corporation and how rapid growth leads to a data intelligence gap. It makes sense that a combination of people, process and technology combine to close the gap, but just what kind of technology can be used to help you cross the divide and connect the needs of business with the data available in the corporation?

Of course, the technology needed depends on the company’s needs and how mature it is in managing its data. Many technologies exist to help close the gap, improve information quality and meet the business needs of the organization. Let’s look at them:



Preventative: Type-Ahead Technology

This technology watches as the user types and completes the data entry in real time. For example, products like Harte-Hanks Global Address help call center staff and others who enter address data into your system by speeding up the process and ensuring the data is correct.
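As a simplified sketch of the idea (this is not how Global Address itself works), a type-ahead can be as basic as prefix completion against a sorted reference list; the city names here are just sample data:

```python
import bisect

class TypeAhead:
    """Minimal prefix completion against a sorted reference list."""

    def __init__(self, values):
        self._values = sorted(v.upper() for v in values)

    def suggest(self, prefix, limit=5):
        """Return up to `limit` entries that start with `prefix`."""
        prefix = prefix.upper()
        i = bisect.bisect_left(self._values, prefix)
        matches = []
        while i < len(self._values) and self._values[i].startswith(prefix):
            matches.append(self._values[i])
            i += 1
            if len(matches) == limit:
                break
        return matches

cities = TypeAhead(["Boston", "Boise", "Bozeman", "Austin"])
```

Because the user is steered toward values that already exist in the reference data, bad entries are prevented before they ever land in the database.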

Preventative: Data Quality Dashboard

Dashboards allow business users and IT users to keep an eye on data anomalies by constantly checking if the data meets business specifications. Products like TS Insight even give you some attractive charts and graphs on the status of data compliance and the trend of its conformity. Dashboards are also a great way to communicate the importance of closing the data intelligence gap. When your people get smarter about it, they will help you achieve cleaner, more useful information.

Diagnostic and Health: Data Profiling

Not sure about the health and suitability of any given data set? Profile it with products like TS Discovery, and you’ll begin to understand how much data is missing, which values are outliers, and many other anomalies. Only then will you be able to understand the scope of your data quality project.
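The kinds of checks a profiler automates can be sketched in a few lines of Python. This is a deliberately simplified illustration with made-up data; products like TS Discovery go far beyond counting nulls and flagging two-sigma outliers:

```python
def profile(rows, column):
    """Report missing values and crude outliers for one numeric column."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    std = (sum((v - mean) ** 2 for v in present) / len(present)) ** 0.5
    # Flag anything more than two standard deviations from the mean.
    outliers = [v for v in present if std and abs(v - mean) > 2 * std]
    return {"missing": values.count(None), "mean": mean, "outliers": outliers}

rows = [{"age": 30}, {"age": 35}, {"age": 40}, {"age": 38},
        {"age": 42}, {"age": None}, {"age": 500}]
report = profile(rows, "age")
```

Even this crude pass surfaces the two findings that matter most at the start of a project: one record is missing the value entirely, and one value (an age of 500) is clearly implausible.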

Diagnostic and Health: Batch Data Quality

Once the anomalies are discovered, a batch cleansing process can solve many problems with name and address data, supply chain data and more. Some solutions are batch-centric, while others can do both batch cleansing and scalable enterprise-class data quality (see below).
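A toy batch pass might look like the following; the standardization rules are hypothetical and far simpler than what a real cleansing engine applies, but they show why standardizing before deduplicating matters:

```python
# Hypothetical standardization rules; real cleansing engines apply
# thousands of such rules, tuned per country and data domain.
ABBREVIATIONS = {"ST": "STREET", "AVE": "AVENUE", "RD": "ROAD"}

def cleanse(address: str) -> str:
    """Uppercase, strip punctuation, expand street-type abbreviations."""
    words = address.upper().replace(".", "").replace(",", "").split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)

batch = ["12 Main St.", "12 MAIN STREET", "9 Elm Ave"]
cleansed = sorted({cleanse(address) for address in batch})
```

"12 Main St." and "12 MAIN STREET" only collapse into one record because they were standardized first; run the dedupe before the cleanse and the duplicate survives.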

Infrastructure: Master Data Management (MDM)

Products from mega-vendors like SAP and Oracle, or from smaller specialists like Siperian and Tibco, provide master data management technology. Its features include, for example, data connectivity between applications and the ability to create a “gold” customer or supply chain record that can be shared between applications in a publish-and-subscribe model.

Infrastructure: Enterprise-Class Data Quality

Products like the Trillium Software System provide real-time data quality to any application in the enterprise, including the MDM solution. Beyond the desktop data quality system, an enterprise-class system should be fast enough and scalable enough to provide an instant check of information quality in almost any application, with any number of users.

Infrastructure: Data Monitoring

You can often use the same technology to monitor data as you do to profile it. These tools keep track of the quality of the data over time. Unlike data quality dashboards, they let the IT staff dig into the nitty-gritty when necessary.

Enrichment: Services and Data Sources

Companies like Harte-Hanks offer data sources that can help fill the gaps when mission-critical data is missing. You can buy data and services to segment your database, check customer lists for changes of address, screen for customers on the do-not-call list, do reverse phone number lookups, and more.


These are just some of the technologies involved in closing the data intelligence gap. In my next installment of this series, I’ll look at people and process. Stay tuned.

Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.