Much has been said about businesses adopting open source software for data management, supposedly in place of Informatica PowerCenter, Business Objects Data Integrator, Trillium Data Quality and the like. In our experience so far, few companies have actually made the jump. We thought it would be helpful to share some of our practical experience of working with Talend Open Studio, so below are a few snippets.

What is Talend Open Studio and when might you use it?

Talend Open Studio is a data management platform providing data integration, data quality and master data management capabilities. Typical uses include connecting to many data sources and providing access to their data from one place; discovering and then cleansing data that is not of the required quality; migrating data, for example as part of a business change initiative; and integrating data, such as that needed to feed a data warehouse.
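
To make the "discover, then cleanse" idea concrete, here is a minimal sketch in plain Python. It is not Talend code and does not use any Talend API. The sample records, the field names and the `profile` and `cleanse` helpers are all hypothetical, chosen only to illustrate the kind of work a data quality tool automates.

```python
from collections import Counter

# Hypothetical customer records, as might arrive from two source systems.
records = [
    {"id": "1", "name": " Alice Smith ", "country": "uk"},
    {"id": "2", "name": "Bob Jones", "country": "UK"},
    {"id": "3", "name": "", "country": "United Kingdom"},
    {"id": "2", "name": "Bob Jones", "country": "UK"},
]

# Assumed mapping of country spellings to one standard value.
COUNTRY_ALIASES = {"uk": "UK", "united kingdom": "UK"}


def profile(rows):
    """Discovery step: count blank values per field and find duplicate ids."""
    blanks = Counter()
    for row in rows:
        for field, value in row.items():
            if not value.strip():
                blanks[field] += 1
    id_counts = Counter(row["id"] for row in rows)
    duplicates = sorted(i for i, n in id_counts.items() if n > 1)
    return {"blanks": dict(blanks), "duplicate_ids": duplicates}


def cleanse(rows):
    """Cleansing step: trim whitespace, standardise country, keep first row per id."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen:
            continue  # drop later rows that repeat an id already kept
        seen.add(row["id"])
        country = row["country"].strip()
        cleaned.append({
            "id": row["id"],
            "name": row["name"].strip(),
            "country": COUNTRY_ALIASES.get(country.lower(), country),
        })
    return cleaned


if __name__ == "__main__":
    print(profile(records))
    for row in cleanse(records):
        print(row)
```

In a real project the profiling rules and the standardisation mapping would come from the business, and a tool such as Talend Open Studio would let you build the equivalent steps graphically rather than by hand.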

Benefits of Talend Open Studio

Several large software companies have grown up as a result of selling this kind of software to businesses for whom data management is otherwise a tricky and costly essential activity.  On the face of it the chance to have those benefits for free is very attractive.  But is that borne out in reality?

Red Olive has worked with several organisations that already have an investment in software for, say, data integration and now want to begin more rigorous data cleansing, and so want to start using a data quality tool as well. One of the benefits of Talend Open Studio is that it provides exactly that: a wide range of data management capabilities. Because the “community edition” is free, it provides a very cost-effective way to begin a data quality or master data management initiative without the need for any significant investment. This approach also fits well with the current preference for iterative and incremental projects, which reduce business risk by delivering a small but valuable piece of business functionality in a short timeframe.

Because Talend is committed to open source, it also seems to invest as an organisation in making sure that its products support other open-source products. For example, Talend’s data integration tool transparently supports the use of Hadoop clusters and of the Hive data warehousing environment. Finally, Red Olive’s experience of calling on Talend through its public forum has also been pretty good. Remember that we’re talking about the open-source version here, so no support fees are being paid, but where a genuine bug has been spotted we have generally had a response from a member of Talend’s technical staff within a few days.

Cautions regarding the use of Talend Open Studio

As you’d expect, there are some limitations to the community edition of Talend Open Studio. Firstly, it is positioned as a product for use by one individual only, and so it is only possible to have one user. I don’t just mean one user at a time, I mean one user. This once caused us a problem when we had carried out an installation and configured it, and then came to hand over the system to the client. It was not possible for them to use a different user id on the system at all. The advice we received from Talend was that, if we wanted a different user name, we would have to re-install the software from scratch under that name.

Another limitation is automation.  Automated scheduling of activities such as routine data loading is disabled in the community version, which means if it is used for production purposes then someone has to manually run jobs as they are needed.

Things to bear in mind if considering buying the Enterprise edition of Talend

Another limitation worth mentioning is of course the lack of any commercial support. If you run into difficulties you are on your own, apart from the public forums, where there is no guarantee of any response. That changes if an Enterprise edition subscription is taken out.

If you have purchased business software before, you may be familiar with the idea of a perpetual licence with additional annual support and maintenance fees, which entitle you to call on support from the vendor at will and to upgrade to later versions. This is not the model applied by Talend. Instead, Enterprise is licensed on an annual subscription basis. New in 2013, it is no longer possible to license individual modules such as MDM or data quality; a licence agreement must be taken out for the whole platform. This increases the cost if you are only interested in one part of the integrated functionality, and it looks to me as though this has been done with the aim of selling Talend to satisfy a company’s total data management needs rather than just one portion of them.

[Update on 23rd Oct 2013: Red Olive attended the Talend Connect event in London recently, to see a review of the event click here.]

[Update on 1st Nov 2013: To read about Talend’s product roadmap click here.]

As a consultancy we are often looking to recruit skilled practitioners. So far in the UK we have found very few people with Talend skills, and this should be weighed carefully when deciding whether to invest. We have been forced to look overseas to find good candidates, and scarcity has also pushed up day rates, so you would be wise to think about training your own staff. Red Olive can help you to develop your internal skills as well as working with you as a consulting partner. Please call us on 01256 831100 or contact us via e-mail for more information.