Tuesday, February 19, 2013

Data Vault: An excuse for not modeling?

Introduction


I do sometimes hear about Data Vault initiatives with questionable directives, architectures or data model schemas. I think this is usually due to a lack of serious data modeling (skills). An argument I see a lot is that Data Vault allows you to be flexible modeling-wise, so serious, well-thought-out (big, enterprise) data models are not required (they were too expensive and difficult anyway ;). Nothing could be further from the truth. Data Vault is a technique that allows you to implement a schema that is conducive to postponing certain modeling choices and activities, but that is no excuse to abolish them outright! All data modeling tools (ER tooling), skills, techniques (NIAM/FCO-IM, ER diagramming) and formalizations (Relational Model) are still extremely relevant when creating a DWH architecture with a Data Vault (even more so, although practitioners may sometimes find them hard to correlate to Data Vault). Data Vault does allow you to align your data model schema better with an Agile approach to DWH development, but that's about it; modeling is still required. In fact, since we want automation and also data model schema transformation/derivation, good modeling is even more relevant than with Kimball or classic Inmon style data warehousing, where we could push modeling issues to the underlying (complex) ETL layer. With Data Vault we should never shove complexities under the "ETL carpet", but model them explicitly.

Personally, I can teach a lot of Data Vault and its variations, implementations and architectures to someone very well versed in serious data modeling with e.g. NIAM/FCO-IM. The other way around is definitely harder. Data Vault (modeling) is NOT an island, even if its fairly 'stand-alone' definition might give people that impression.
