Tuesday, January 24, 2012

Big Data, Enterprise Data and Discrete Data

Total Data Management©
The data management world is buzzing about big data.  Countless blog posts, articles, and white papers cover this new area. Just about every data management vendor is scrambling to build tools to meet the needs of big data.

The world is right to take notice. Companies’ ability to handle big data represents exciting innovation: large relational databases with high price tags are sometimes replaced by flat files, technologies like Hadoop, and intelligent parsers that create analytics from massive amounts of data.  It’s a game-changer for those in the Business Intelligence and relational database business.  It’s about managing an increasingly common huge-data problem more effectively and at lower cost.

However, where there is big data, there is also enterprise (medium) data and discrete (small) data. With each size of data come very specific challenges.   



| | BIG DATA | ENTERPRISE DATA | DISCRETE DATA |
| --- | --- | --- | --- |
| Technologies | Hadoop and flat files, to reduce costs and avoid relational database costs. | Relational databases | Spreadsheets, flat files, and flat databases. May come from other non-relational sources, such as e-mail attachments, social media JSON, and XML data. |
| Use Cases | Real-time analytics of a large number of transactions, including web analytics, SaaS up-time optimization, and mission-critical analysis of transactions. | Just about every business application today, including CRM, ERP, Data Warehouse, and MDM. | Companies with little or no data management strategy, or those dealing with an immature data architecture. Companies that receive mission-critical data via e-mail. Companies that need to closely follow social media streams. |
| Innovation | Handles huge amounts of data that is predominantly used for business analytics and operational BI. | Provides a powerful data management architecture that can be accessed by a common language (SQL). | Handles more diverse and more dynamic sources. |
| Positives | Replaces high-cost multi-server relational databases with lower-cost flat files and Hadoop server farms. | Provides a scalable, reproducible environment in which database applications and solutions can be developed. Replaces unwieldy human-intensive data processes with a streamlined central repository of information. Used in many businesses in day-to-day operations. | ‘Simplifies’ the data management process to the point of being completely within the grasp of the business users without too much complicated technology. In the long run, however, data management is more costly and unwieldy when it is in spreadmarts. |
| Negatives | Relatively new technology with a limited pool of Big Data experts. Legacy medium-sized systems can sometimes scale. | Can be costly when data volumes become high, as new servers and new enterprise licenses get more common. Cost also rises with the number of sources and the diversity of data types. | Error-prone and labor-intensive. |
| Cost Focus | Expertise | Servers and licenses; connectors and database technology | Efficiency and productivity |

Growing Up
An organization’s data management maturity plays a role in how it handles both big and little data.  If you’re still managing your customer list in a spreadsheet, it’s probably something you started when your company was fairly young.  Now the uses for that data should be expanding, but you are still stuck in the young company’s process. Something that was agile when you were young is inefficient today.

Your pain may also have something to do with your partners’ data management maturity.  While the other companies you do business with are good at what they do, supplying products and services to your company, they may not be as good at data management. The new parts catalog comes every so often as an e-mail attachment.  You need an efficient process to update whoever uses it.
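To make the parts catalog case concrete, here is a minimal sketch of one way to fold each new attachment into a central table instead of passing the file around. It assumes the catalog arrives as CSV text; the column names (part_no, description, price), the table name, and the sample rows are all hypothetical, and SQLite stands in for whatever relational database you actually run.

```python
import csv
import io
import sqlite3


def load_catalog(conn, csv_text):
    """Upsert rows from a CSV parts catalog into a central table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parts ("
        "part_no TEXT PRIMARY KEY, description TEXT, price REAL)"
    )
    rows = [
        (r["part_no"].strip(), r["description"].strip(), float(r["price"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    # INSERT OR REPLACE overwrites a part that already exists and adds new ones.
    conn.executemany(
        "INSERT OR REPLACE INTO parts (part_no, description, price) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


# Hypothetical catalog contents, standing in for two successive e-mail attachments.
JANUARY = "part_no,description,price\nA-100,Widget,2.50\nA-200,Gasket,0.75\n"
FEBRUARY = "part_no,description,price\nA-100,Widget,2.75\nA-300,Bracket,4.10\n"

conn = sqlite3.connect(":memory:")
load_catalog(conn, JANUARY)
load_catalog(conn, FEBRUARY)
catalog = conn.execute("SELECT part_no, price FROM parts ORDER BY part_no").fetchall()
```

After both loads, everyone who queries the table sees the February price for A-100 alongside parts that appeared in only one of the two files, so nobody has to track which attachment is the latest.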

No matter how mature you are, it is likely that you will have to deal with all types of data. When selecting tools, make sure you examine the cost and efficiency of all of these types, not just big data.


Tuesday, January 10, 2012

What is Data Governance?

I recently did a quick movie for a Talend promotion to define data governance. It turns out that defining data governance is trickier than you might think. Here, I examine the characteristics of a data management initiative and how they define data governance.

Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.