Tuesday, November 20, 2007

Creating Structure from Unstructured Data


A lot of focus in the data quality industry has turned to cleansing and standardizing unstructured data. An example of this is shown above.

At Trillium Software, we continue to teach the engine more and more about supply chain and ERP data as well. We can take description data that would otherwise be very difficult to use and put it into buckets. The Trillium Software System understands the distinction between an item name, a size, and packaging, and it is able to standardize that information into the proper fields.
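To make the idea of "buckets" concrete, here is a minimal Python sketch of rule-based field extraction. It is not how the Trillium Software System works internally. The function name `parse_description`, the packaging vocabulary, and the size pattern are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical vocabularies for illustration only; a real engine would rely on
# much larger, domain-tuned tables.
PACKAGING = {"BAG", "BOX", "DRUM", "PAIL", "CASE"}
SIZE_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)(KG|G|LB|L|ML)$")

def parse_description(text):
    """Split a free-text item description into name, size, and packaging."""
    name_tokens = []
    size = None
    packaging = None
    # Upper-case the text and treat commas as spaces before tokenizing.
    for token in text.upper().replace(",", " ").split():
        if size is None and SIZE_PATTERN.match(token):
            size = token
        elif packaging is None and token in PACKAGING:
            packaging = token
        else:
            # Anything not recognized as size or packaging is part of the name.
            name_tokens.append(token)
    return {"name": " ".join(name_tokens), "size": size, "packaging": packaging}

record = parse_description("Polypropylene resin, 25kg bag")
print(record)
```

The first token that looks like a size and the first token that names a packaging type each get their own field, and everything else is treated as the item name. Real description data is far messier than this, which is why a trained engine is needed.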

The benefit is clear if you want to understand how much polypropylene you have in inventory. You can't easily do that with the data at the top of the diagram, but you can get a complete picture once the data has been put into its proper buckets (the lower part of the diagram). That comes in handy for the meeting with the polypropylene sales rep, since you can now fully understand the volume of your purchases.
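Once item data sits in proper fields, the inventory question becomes a simple aggregation. The Python sketch below assumes records that have already been standardized. The field names (`name`, `size_kg`, `units`) and the sample rows are made up for illustration and do not come from any real system.

```python
from collections import defaultdict

# Hypothetical, already-standardized inventory records.
records = [
    {"name": "POLYPROPYLENE", "size_kg": 25.0, "units": 40},
    {"name": "POLYPROPYLENE", "size_kg": 500.0, "units": 2},
    {"name": "POLYETHYLENE", "size_kg": 25.0, "units": 10},
]

def total_kg_by_item(rows):
    """Sum total weight per item name (package size times unit count)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["name"]] += row["size_kg"] * row["units"]
    return dict(totals)

totals = total_kg_by_item(records)
print(totals["POLYPROPYLENE"])
```

With free-text descriptions, the same question would require guessing which strings refer to polypropylene and reading sizes out of the text by hand.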

One of the first times Trillium accomplished such a task was back in 1996, for a major food manufacturing company. The company had ingredient descriptions such as “Frozen Carrots”, “Carrots, Frozen”, and “Frz Car” in its supply chain systems, and Trillium was able to sort them out.
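As a rough illustration of how the three carrot variants can be recognized as the same ingredient, the Python sketch below expands abbreviations and ignores word order. The abbreviation table and the `standardize` function are assumptions made for this example, not a description of Trillium's actual approach.

```python
import re

# Hypothetical abbreviation table; a production system would need a much
# larger one, built and tuned for the domain.
ABBREVIATIONS = {"FRZ": "FROZEN", "CAR": "CARROTS"}

def standardize(description):
    """Return an order-insensitive, abbreviation-expanded key for a description."""
    # Keep alphabetic tokens only, which drops commas and other punctuation.
    tokens = re.findall(r"[A-Za-z]+", description.upper())
    expanded = [ABBREVIATIONS.get(token, token) for token in tokens]
    # Sorting makes "Frozen Carrots" and "Carrots, Frozen" produce the same key.
    return " ".join(sorted(expanded))

variants = ["Frozen Carrots", "Carrots, Frozen", "Frz Car"]
keys = {standardize(v) for v in variants}
print(keys)
```

All three variants collapse to a single key, so they can be counted and reported as one ingredient. Sorting tokens is a blunt instrument: it would also merge descriptions where word order carries meaning, so treat it as a starting point only.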

More recently, there was Bombardier, whose story is available as a case study. It took only three months for a small team of engineers to design, develop, and implement a new process for standardizing 2.9 million inventory items using the Trillium Software System. Now, reports that once took months to generate are created weekly, providing high-quality information for streamlining procurement, reducing inventory, increasing on-time delivery, and boosting sales.

This is my last web log entry before the Thanksgiving break. Have a happy and safe holiday!


Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.