Friday, November 2, 2012

Housebuilding, Data Architecture and Data warehousing




Introduction

Traditional Data warehousing focuses on designing and developing information systems to support an organization's BI initiatives and BI tooling. With the advent of new techniques, technologies, architectures and approaches, the line between Data warehousing and other classes of information systems (transaction processing, interfacing, analysis) is blurring more and more.

Building a House

Building up your IT structure is a lot like building a house. The same goes for your data architecture. However, a data architecture is usually not a building built from scratch by an architect, but a building made of a lot of prefab elements like rooms (applications), plumbing (interfaces) and foundations (IT infrastructure). The end result varies, but it is usually more comparable to the Winchester Mystery House or an Escher drawing than to a regular building we work or live in.
The elements we buy are usually standardized by the supplier or builder, but they still need serious adaptation before they will fit in your data 'house'. Even then, the result won't look pretty, since size, look and makeup will usually differ considerably between elements from different suppliers (and even from the same supplier). 'Window dressing' will only make it palatable to the casual observer, not to the people living inside.

Data Warehousing

While a Data warehouse can be seen as just another room (application), there are some special considerations. Data warehouses are supporting information systems: more a support and bridging structure for your house than a room in it. They are usually custom built, bridging gaps between standard elements and supporting new types of rooms. This sets them apart from most of the other elements/rooms of your 'data' house (even if the Data warehouse itself is made up of standard elements). On top of this new support structure your BI applications (new rooms!) can now be built.
If our other (prefab) elements were well designed and structured to connect seamlessly, we might have no need for this new type of custom support structure, but in practice this level of design is difficult and rarely achieved.

But why focus on a support structure for certain types of rooms (applications)? A normal house has just one integrated support structure that supports all the rooms and plumbing. Given the current state of IT, this overall support structure is usually built around the (IT) organization and is almost always an exercise in custom building, not an element of your 'data house' you can just buy off the shelf.

Building Better 'Data (Ware)houses'

Instead of looking directly for a Data warehouse as a supporting structure, we should look at supporting our total Data Architecture, which could include subjects like Data Warehousing, Federation/Virtualization, Data storage, Business Intelligence, Data Migration or data interfacing. We should focus on building a whole 'Data house' with just one 'supporting structure', of which our familiar 'Data warehouse' can be an integrated part instead of just another element grafted onto our already baroque 'Data house' landscape.