Tuesday, February 19, 2013

Data Vault: An excuse for not modeling?


I sometimes hear about Data Vault initiatives with questionable directives, architectures or data model schemas. I think this is usually due to a lack of serious data modeling (skills). An argument I see a lot is that Data Vault allows you to be flexible modeling-wise, so serious, well thought out (big, enterprise) data models are not required (they were too expensive and difficult anyway ;). Nothing could be further from the truth. Data Vault is a technique that allows you to implement a schema that is conducive to postponing certain modeling choices and activities, but that is no excuse to abolish them outright! All data modeling tools (ER tooling), skills, techniques (NIAM/FCO-IM, ER diagramming) and formalization (Relational Model) are still extremely relevant when creating a DWH architecture with a Data Vault (even more so, although practitioners may sometimes find them hard to correlate to Data Vault). Data Vault does allow you to align the data model schema better with an Agile approach to DWH development, but that's about it; modeling is still required. In fact, since we want automation and also data model schema transformation/derivation, good modeling is even more relevant than with Kimball or classic Inmon style data warehousing, where we could push modeling issues to the underlying (complex) ETL layer. With Data Vault we should never shove complexities under the "ETL carpet", but model them explicitly.

Personally, I can teach a lot of Data Vault and its variations, implementations and architectures to someone very well versed in serious data modeling with e.g. NIAM/FCO-IM. The other way around is definitely harder. Data Vault (modeling) is NOT an island, even if its fairly 'stand alone' definition might give people that impression.

Finding time for understanding


This post is not about Data Vault, data modeling or data warehouse architecture. It is about how I look at learning and training from the standpoint of an experienced professional (which we eventually all become, don't we?).


Most knowledge workers, like IT professionals (whether employees or self-employed), either have time allotted to them to go on training or reserve time for self-study and training. Most of them try to separate the time for doing from the time for learning (this is usually related to the way their work is financed and organized). They follow courses and/or read books, and after that go back to the work at hand. In my opinion this approach seldom reaps great benefits. You usually forget what you just learned because you don't have the opportunity to directly use your newly gained insights and skills. There is usually something getting in the way of applying your newly gained knowledge on the spot.

In my view we should therefore focus on time for understanding during our regular work-related activities and try to make the most of that. It should be an integral part of our work, and not a separate activity. So study, self-study or training should ideally be done in the flow of work-related activities. And it is important that you increase your understanding of what you do, because if you don't understand something work-related you might jeopardize your personal conviction (you get frustrated, bored or disillusioned) or your professional commitment (you might not get, or care to get, an optimal solution).

We humans do need LOTS of time to understand things. We are prone to look for well-known patterns instead of really understanding what we do. We need time to grasp new ideas. Don't see your training time as the only time to try to understand things; see it more as a time to trigger better understanding. You should be training and learning all the time.

Before looking around for all kinds of courses, first look at yourself. You might just have taken too little time to understand and spent too much time on just doing. Fix this first and then look for complementary learning or training opportunities.