Friday, January 25, 2013

Metadata as a perspective on data.


Introduction

When discussing metadata, I often see two types of reaction. The first assigns special meaning to metadata and treats it as something different from plain data. The other says that metadata is just data. As is often the case, both reactions are 'true' for a given value of 'true'.

A perspective on data

For me, it is all about perspective. From one perspective we might talk about the same content as data (the data that describes a database schema), and from another perspective as metadata (the metadata that defines the database).

Example

Let's look at data from the perspective of an information system and its proper functioning:

From the system's perspective, the test is what happens when the data is removed. If the system keeps working, it was simply data. If the system itself fails, it was (operational) metadata for that system.

Example: Deleting records does not usually make a COTS OLTP system fail. Dropping tables (that is, deleting entries in the DBMS catalog table that lists tables) usually does. So the DBMS catalog is considered metadata here.
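
The OLTP example above can be tried out in a few lines. This is only a minimal sketch, assuming an in-memory SQLite database stands in for the DBMS; the `orders` table and the `system_works` check are hypothetical names made up for illustration, and "the system" is reduced to a single query.

```python
import sqlite3


def system_works(conn):
    """Hypothetical stand-in for 'the system': it only needs to count orders."""
    try:
        conn.execute("SELECT COUNT(*) FROM orders").fetchone()
        return True
    except sqlite3.OperationalError:
        # SQLite raises OperationalError when the table no longer exists.
        return False


def catalog_entries(conn):
    """Number of rows in SQLite's catalog (sqlite_master) that describe 'orders'."""
    query = "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'orders'"
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10.0,), (25.5,)])

# Removing the records: the system still answers, so this was simple data.
conn.execute("DELETE FROM orders")
works_after_delete = system_works(conn)
entries_after_delete = catalog_entries(conn)

# Removing the table removes its catalog entry: the system now fails,
# so from the system's perspective the catalog entry was metadata.
conn.execute("DROP TABLE orders")
works_after_drop = system_works(conn)
entries_after_drop = catalog_entries(conn)

print(works_after_delete, entries_after_delete)
print(works_after_drop, entries_after_drop)
```

The catalog is itself queried like any other table here, which fits the point about perspective: the catalog row is ordinary data to the query that counts it, and metadata to the system that depends on it.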

Example: Dropping reporting tables in a data warehouse *usually* does not stop the whole data warehouse from running (although some reports might no longer work). Deleting the data required for running the ETL processes, however, will stop the data warehouse from functioning correctly, so that data should be considered operational metadata from the perspective of the data warehouse system.
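
The data warehouse example can be sketched in the same way. Everything here is an assumption made for illustration: the table names (`source_sales`, `fact_sales`, `report_sales_total`, `etl_control`), the job name `load_sales`, and the idea that the ETL process keeps a "last loaded id" watermark in a control table. A real warehouse will store its operational metadata differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_sales (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE fact_sales (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE report_sales_total (total REAL);
    CREATE TABLE etl_control (job TEXT PRIMARY KEY, last_loaded_id INTEGER);
    INSERT INTO source_sales (id, amount) VALUES (1, 10.0), (2, 25.5);
    INSERT INTO etl_control (job, last_loaded_id) VALUES ('load_sales', 0);
""")


def run_etl(conn):
    """Hypothetical incremental load driven by a watermark in etl_control."""
    row = conn.execute(
        "SELECT last_loaded_id FROM etl_control WHERE job = 'load_sales'"
    ).fetchone()
    if row is None:
        # Without the watermark the job cannot know where to resume.
        raise RuntimeError("no watermark found for job 'load_sales'")
    conn.execute(
        "INSERT INTO fact_sales SELECT id, amount FROM source_sales WHERE id > ?",
        (row[0],),
    )
    conn.execute(
        "UPDATE etl_control "
        "SET last_loaded_id = (SELECT COALESCE(MAX(id), 0) FROM fact_sales) "
        "WHERE job = 'load_sales'"
    )


def etl_works(conn):
    try:
        run_etl(conn)
        return True
    except (RuntimeError, sqlite3.OperationalError):
        return False


# Dropping a reporting table: the ETL process keeps running.
conn.execute("DROP TABLE report_sales_total")
works_after_report_drop = etl_works(conn)

# Deleting the control records: the ETL process can no longer run.
conn.execute("DELETE FROM etl_control")
works_after_control_delete = etl_works(conn)

print(works_after_report_drop, works_after_control_delete)
```

In this sketch a whole table can be dropped without harm, while deleting a single record stops the process. Whether something counts as operational metadata depends on what the system needs in order to function, not on whether it is a record or a table.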