Recently there has been a good Kimball/Inmon debate on the TDWI forum on LinkedIn.
It was sparked off by Bill Inmon publishing a white paper. The white paper asserted that while the Inmon architecture could be used to address all types of data in an enterprise, the Kimball dimensional approach was limited to so-called “structured” data, which represents only about 20% of business data. Never without an opinion, I chewed it over with a couple of colleagues and below is my take on the eternal Kimball/Inmon question.
Our main interest at Red Olive is always in extracting business value from data rather than architecture per se. I have personally worked with both Inmon- and Kimball-based implementations. My original experience at Unilever was all based on Inmon design principles and then I joined a BI consultancy who favoured Kimball’s approach.
Kimball’s book “The Data Warehouse Toolkit” is a practical “how to…” book. Its focus is really on providing useful information to business people. It helps you decide which data and information will be of most value to your organisation and therefore where you should start. It explains how to come up with a data model in business language that your business users will understand. It does talk about architecture, but that’s largely as a by-product of focusing on business issues.
Inmon’s book “Corporate Information Factory” (CIF) is much more conceptual. It starts off by defining a set of terms for components of an information architecture: Data Warehouse, Data Mart, Operational Data Store etc. These are defined both in terms of their role in the architecture and the nature of the data they contain. The vision laid out for handling all kinds of data certainly seems all-encompassing.
We tried to implement the CIF principles over a number of years at Unilever but we found that hard. I felt that CIF was short of information on how to make the concepts work in practice. Not all the issues were technical. For example, we struggled with where the data warehouse ends and where a data mart starts. Inmon’s later book, DW 2.0, seems to have more practical information in it, although I think it’s still conceptual and that probably reflects the author’s own style.
I wasn’t wholly convinced by the white paper. It starts by saying that only 20% of data is structured and so 80% can’t be addressed by dimensional modelling. But data that starts out unstructured does not have to stay that way. For example, the point of text mining algorithms is that they convert text into a form which can be loaded into a relational database and then usefully queried. That means the text has become structured, so its data model can be dimensional. I take my hat off to Bill Inmon because he was gracious enough to jump into the debate himself, acknowledge the comments above and give some ideas about where to find more hands-on help.
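To make the text mining point concrete, here is a minimal sketch of free text ending up in a dimensional model. It is only an illustration under some loud assumptions: the sample documents, the list of terms and the table names (`dim_date`, `dim_term`, `fact_term_mention`) are all invented for this example, and simple keyword counting stands in for a real text mining algorithm. It uses Python’s built-in `sqlite3` module as the relational database.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical sample documents, invented for illustration only.
DOCUMENTS = [
    ("2010-03-01", "Customer complained about late delivery and damaged packaging."),
    ("2010-03-01", "Delivery was late again; customer requested a refund."),
    ("2010-03-02", "Customer praised the quick refund."),
]

# Hypothetical list of terms the business cares about.
TERMS = ["delivery", "refund", "late", "damaged"]


def build_star_schema(documents, terms):
    """Count term mentions in free text and load them into a tiny star schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT UNIQUE);
        CREATE TABLE dim_term (term_key INTEGER PRIMARY KEY, term TEXT UNIQUE);
        CREATE TABLE fact_term_mention (
            date_key INTEGER REFERENCES dim_date(date_key),
            term_key INTEGER REFERENCES dim_term(term_key),
            document_id INTEGER,
            mention_count INTEGER
        );
        """
    )
    conn.executemany("INSERT INTO dim_term (term) VALUES (?)", [(t,) for t in terms])
    term_keys = dict(conn.execute("SELECT term, term_key FROM dim_term"))

    for doc_id, (date, text) in enumerate(documents, start=1):
        # Add the date to its dimension once, then look up its surrogate key.
        conn.execute("INSERT OR IGNORE INTO dim_date (calendar_date) VALUES (?)", (date,))
        (date_key,) = conn.execute(
            "SELECT date_key FROM dim_date WHERE calendar_date = ?", (date,)
        ).fetchone()

        # Stand-in for text mining: lower-case the text and count whole words.
        counts = Counter(re.findall(r"[a-z]+", text.lower()))
        for term, term_key in term_keys.items():
            if counts[term]:
                conn.execute(
                    "INSERT INTO fact_term_mention VALUES (?, ?, ?, ?)",
                    (date_key, term_key, doc_id, counts[term]),
                )
    return conn


def mentions_by_date(conn):
    """A typical dimensional query: total mentions sliced by date and term."""
    return conn.execute(
        """
        SELECT d.calendar_date, t.term, SUM(f.mention_count)
        FROM fact_term_mention f
        JOIN dim_date d ON d.date_key = f.date_key
        JOIN dim_term t ON t.term_key = f.term_key
        GROUP BY d.calendar_date, t.term
        ORDER BY d.calendar_date, t.term
        """
    ).fetchall()


if __name__ == "__main__":
    for row in mentions_by_date(build_star_schema(DOCUMENTS, TERMS)):
        print(row)
```

Once the text has been reduced to rows like these, business users can slice mentions by date or by term in the same way they would slice sales by product, which is all the dimensional approach needs.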
My measure of success for a project is whether business people use a new system for decision making once it’s live. On balance, I’ve seen more successful projects using Kimball techniques. The key things to my mind are less about the architecture of the system and more to do with selecting the right information in the first place and making it accessible to business people in terms they can understand.
Jefferson Lynch, Red Olive