Dimensions on the side
It is quite a while since I blogged the mini-series on data warehouse design (it started here) and was proceeded by a series with the grand title of DW Wisdom - where I strayed into talking about parallel queries, RAID5 and massive disks. But I did not really write much about dimensions. So to make up for it:
Thinking in dimensional terms is (fortunately) an easy thing for BI users to do; for example looking at sales data there is typically a "What" was sold, a "Where" was sold, a "When" and a often a to "Who". Knowing the key values for these four Ws will enable a single transaction to be found, but it also gives the prospect of being able to aggregate on one or more of the dimensions to provide summary information that is useful to the business. Often these dimensions have a hierarchical structure - a store is on a street, in a city, in a state, in a region, in a country in a continent (and for Star Wars fans and others with special needs) on a planet, in a solar system, in a galaxy...
Sometimes there are more than one hierarchy within a dimension - date for example could have both a fiscal hierarchy and a calendar year hierarchy, sharing a common point of 'DAY'. Other dimensions could have alternate navigation paths that re-meet at the total level - for example a store can be part of hierarchy that goes store -> state -> region -> all stores and also in a store -> store size -> all stores hierarchy.
But real data warehouse often have more than four dimensions. I'll write about that next time (after I have thought up some more "W" words to name them)