Surrogate keys

June 17 - I was not too happy about the way this post was written - I will be revising parts of this - I will use italics to highlight changes.

I must admit my view on surrogate keys in a data warehouse is far from the orthodox. For me a surrogate key on a dimension is the exception, not the norm. If an item has an immutable natural single column key then why not use it? For most businesses, the renumbering all customers or all products is a big change; it runs across the whole business and all systems – logistics, ERP, finance, and data warehouse will all be affected. This is the type of change that does not happen often; it may, therefore, be pragmatic to consider customer and product to immutable keys.

I certainly would not consider a surrogate numeric key to replace a natural date key on the time dimension – date has ‘real’ meaning, it can easily be used in a partition key, it also facilitates ‘between’ type queries; numeric surrogates on the other hand can make partitioning schemes more complex (perhaps list instead of range partitioning) and ‘between dates’ queries become ‘in list’ ones.

In a comment to this post, Tom Kyte referred me to a recent thread on AskTom where an improvement to query plan through the use of a surrogate key joining to the dimension table over the plan obtained for a direct select from a fact table using a natural key was discussed. To me, the key point here is the join between fact and dimension tables (on the dimension surrogate key) however I feel that similar (if not identical) plans would result by joining on natural single column keys.

It could be argued that surrogate keys ‘force’ a casual DW user to create queries that join the fact tables to its dimension tables and hence improve query optimisation; however, if you consider that most casual users will use query tools and do not write SQL directly against the database that point becomes less compelling; the same joins to the dimension tables will be required to resolve the key codes into more friendly descriptions.