Tuesday, November 16, 2010

Ideas Having Sex: The Path to Innovation in Data Management

I read a recent analyst report on the data quality market and “enterprise-class” data quality solutions. Per usual, the open source solutions were mentioned in passing while the data quality solutions of the past were given high marks. Some of the solutions ranked at the top originated in the days when the mainframe was king. Some of the top contenders still contained cobbled-together applications from ill-conceived acquisitions. It got me thinking about the way we do business today and how much of it is changing.

Back in the 1990s or earlier, if you had an idea for a new product, you’d work with an internal team of engineers and build the individual parts. This innovation took time, as you might not always have exactly the right people working on the job. It was slow and tedious. The product was always confined by its own lineage.

The Android phone market is a perfect example of the modern way to innovate. Today, when you want to build something groundbreaking like an Android, you pull in expertise from all around the world. Sure, Samsung might make the CPU and video processing chips, but Primax Electronics in Taiwan might make the digital camera and Broadcom in the US makes the touch screen, plus many others. Software vendors push the platform further with their cool apps. Innovation happens at breakneck speed because the Android is a collection of ideas that have sex and produce incredible offspring.

Isn’t that really the model of a modern company? You have ideas getting together and making new ideas. When you have free exchange between people, there is no need to re-invent something that has already been invented. See the TED talk for more on this concept, where British author Matt Ridley argues that, throughout history, the engine of human progress and prosperity has been “ideas having sex.”

The business model behind open source has a similar mission.  Open source simply creates better software. Everyone collaborates, not just within one company, but among an Internet-connected, worldwide community. As a result, the open source model often builds higher quality, more secure, more easily integrated software. It does so at a vastly accelerated pace and often at a lower cost.

So why do some industry analysts ignore it? There’s no denying that there are financial reasons. I think if an industry analyst were to actually come out and say that the open source solution is the best, it would be career suicide. The old school would shun the analyst, making him less relevant. The way the industry pays and promotes analysts, and vice versa, seems to favor enterprise application vendors.

Yet the open source community, along with Talend, has developed a very strong data management offering that should be considered at the top of its class. The solution leverages other cutting-edge solutions. To name just a few examples:
  • If you want to scale up, you can use distributed platform technology from Hadoop, which enables the solution to work with thousands of nodes and petabytes of data.
  • Very strong enterprise-class data profiling.
  • Matching that users can actually use and tune without having to jump between multiple applications.
  • A platform that grows with your data management strategy so that if your future is MDM, you can seamlessly move there without having to learn a new GUI.
The way we do business today has changed. Innovation can only happen when ideas have sex, as Matt Ridley puts it. As long as we’re engaged in exchange and specialization, we will achieve those new levels of innovation.


Renat Zubairov said...

Interesting blog post, Steve, though you forgot to mention the "other side" of the open source coin: fragmentation, for example.

Returning to your example of the Android platform: despite the fact that Mr. Schmidt denies it, fragmentation does exist there. Another good example is Linux: the incompatibilities between different Linux distributions show very well that fragmentation can dramatically decrease the innovation power of open source communities.

Another interesting point is "commercial" open source, or more precisely, open core. For example, a very well-known "open-source" BPM vendor that stands behind the best open-source BPEL runtime also delivers an open-core product based on open source. To make the "commercial" version more attractive, not all fixes and/or patches are immediately contributed back to the open-source trunk.

I really like your approach to doing open source, though. Open and clear communication, public bug tracking and roadmap, and public source repositories make the community feel welcome. However, I'm sure it's a hard task to balance commercially attractive enterprise sales with free distribution of the community version.

Steve Sarsfield said...

Thanks for the kind words on the Talend community.
Regarding releases, it makes no sense to deliver an inferior or buggy product in an open source version, while fixing bugs in a commercial version, Renat. Many people are trying out the open source version in preparation for the commercial version and if it doesn't work, it's bad business. On the contrary, I've seen it to be very good business to provide a really strong open source product to build goodwill.
Regarding fragmentation, the innovation model that I discussed also works for the company that is undoubtedly perpetuating the fragmentation FUD: Apple. They rely on specialists to build the iPhone, just as the Android vendors do. It's only the software that's locked down.

Naiem Yeganeh said...

Absolutely! I looked into Talend a while ago when I was compiling a report on DQ software. It did not catch my eye then for two reasons: first, because I could not make it run on my machine, and second, because it was not named in Gartner's report on Data Quality.
I always wanted to contribute to a DQ open source project, but being a C# developer, programming with Java or C++ feels like going back to the age of steam engines. Do you have a plan to open some space for people like me in Talend? Like DI extensions for SQL Server, etc.

Steve Sarsfield said...

There is some beauty in using Java, Naiem. I believe the pool of technicians who know Java is a big one, so modifying or customizing the code is relatively easy.
Having said that, I don't think most of Talend's users even bother with the Java. They have a tool that generates the transformation code and runs it... and that's enough.

Disclaimer: The opinions expressed here are my own and don't necessarily reflect the opinion of my employer. The material written here is copyright (c) 2010 by Steve Sarsfield. To request permission to reuse, please e-mail me.