Hybrid SCDs using OWB and OBIEE
If you attended my ODTUG Kaleidoscope presentation on Oracle Warehouse Builder (feel free to view the presentation), then you know that certain aspects of the product leave me scratching my head, especially as a follower of the dimensional modeling approach popularized by Ralph Kimball. One of the subjects covered was dimensional operators and the lack of support for hybrid slowly-changing dimensions (SCDs), meaning dimensions that mix Type 1 and Type 2 attributes in the same table. For an enterprise data warehouse, this is a real deal-breaker. When you look at the Customer Dimension, for instance, why would we ever want to track historical changes to attributes such as Birthdate or Ethnicity? Changes to these attributes can only be seen as corrections, and these corrections would need to be made to ALL rows for that particular customer: the current row, and all other rows inserted as a result of Type 2 changes through the life of that customer. There are other examples that might not be so black and white: sometimes, the end user simply needs to see both Type 1 and Type 2 changes in the same table.
So what are we left with? We have two choices, really. First, we can choose an alternative for our SCD processing. This could entail custom coding our SCD handling so that we can represent both Type 1 and Type 2 changes, or possibly using a third-party add-in, such as the Transcend Framework, which I developed to handle situations such as these (Transcend will hopefully be available as an option from Rittman Mead, so watch the blog for news). The other, more interesting option would be to go ahead and use OWB for our SCD processing and then attempt to represent some of our Type 2 changes as Type 1s in the reporting layer. With the flexibility that OBIEE provides, I should be able to make a go of it.
First, I'll use OWB to create a fact table based on SH.SALES called SALES_FACT, a dimension table based on SH.PRODUCTS called PRODUCT_DIM, and the necessary operators and mappings to load the two. For the PRODUCT_DIM table, I used a standard dimension operator with two levels.
You can see that I configured all my attributes as Type 2 attributes, including the VALID column. For the purposes of this example, imagine this to be a kind of Discontinued Flag, set to (A)vailable or (N)ot Available. Currently, all the products in my warehouse are available for purchase... at least they are right now. The business informed me that they would like this attribute represented as a Type 1 column. When forecasting sales for the coming year, they'd like to see how their discontinued products performed in the last year or two to make sure adjustments for these products are made.
I created a mapping to load the data, pulling all the rows from SH.PRODUCTS initially... using the PROD_EFF_FROM and PROD_EFF_TO dates to populate the effective dates in our dimension.
After running this mapping, we can see a small data set from the PRODUCT_DIM table, specifically, the Photo and Hardware categories.
Suppose that our client decides to get out of the Hardware business (did you look at the specs for the PCs they are selling... no wonder!), discontinuing all the products in the Hardware category. So this change from the inventory system makes its way to our source table in the form of the following two rows:
Now, I run the PRODUCT_DIM mapping again, and have the following rows in the PRODUCT_DIM table for the Hardware category:
If we are triggering history for all the attributes in the PRODUCT_DIM table, what value will these two new rows provide? Currently, no rows from the SALES_FACT table reference them, and seeing as this change designates that these products are no longer available, it's unlikely that any new sales are going to be associated with them either. Furthermore, we want to see how our discontinued products stack up against our current product line... and the only way to address this kind of reporting is with Type 1 attributes.
So now I have to try and address this issue using OBIEE. When I create the Physical Layer, I bring the SALES_FACT table in verbatim from the database, but I adjust the PRODUCT_DIM slightly, using a Table Type of "Select" instead of "Physical". Basically, this means that the table in the Physical Model will actually be the results of a SELECT statement from the database:
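The exact statement from the repository is not reproduced here, so what follows is only a minimal sketch of the kind of SELECT that could back such a table. It runs against an in-memory SQLite database (3.25 or later, for window functions) rather than Oracle, and the surrogate key PRODUCT_KEY, the CURRENT_VALID alias, and the sample rows are assumptions made for illustration. Only PRODUCT_SOURCE_ID, PRODUCT_EFFECTIVE_DATE, and VALID come from the dimension described in this post.

```python
import sqlite3

# Hypothetical miniature of PRODUCT_DIM. PRODUCT_KEY and the sample data
# are assumptions; the other column names are taken from the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (
    product_key            INTEGER PRIMARY KEY,
    product_source_id      INTEGER,
    valid                  TEXT,
    product_effective_date TEXT
);
INSERT INTO product_dim VALUES
    (1, 15, 'A', '1998-01-01'),  -- original Type 2 row
    (2, 15, 'N', '2008-07-01'),  -- new Type 2 row: product discontinued
    (3, 18, 'A', '1998-01-01');  -- product that never changed
""")

# Spread the most recent VALID value across every row that shares a
# natural key, which makes the column behave like a Type 1 attribute.
TYPE1_SELECT = """
SELECT product_key,
       product_source_id,
       valid,
       LAST_VALUE(valid) OVER (
           PARTITION BY product_source_id
           ORDER BY product_effective_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS current_valid
FROM product_dim
ORDER BY product_key
"""

def type1_rows(connection):
    """Return (product_key, product_source_id, valid, current_valid) rows."""
    return connection.execute(TYPE1_SELECT).fetchall()

for row in type1_rows(conn):
    print(row)
```

The explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clause matters in this sketch: with an ORDER BY and no frame, the default window ends at the current row, so LAST_VALUE would simply echo each row's own VALID value.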
I'm using the Oracle analytic function LAST_VALUE to represent all values for the VALID attribute across each natural key (PRODUCT_SOURCE_ID) according to the most recent value, using PRODUCT_EFFECTIVE_DATE as the ordering mechanism. Now, whenever we update our product table to discontinue a particular product, we can use this attribute to provide impact reports on how those products affected our sales in the past. Below is a quick Answers report that demonstrates the behavior of this new Type 1 attribute, starting first with the criteria:
and then the results:
When you consider the modeling effort that goes into assigning Type 1 and Type 2 attributes, you might say that this approach is not merely a replacement for a feature that OWB lacks, but might actually be the preferred approach. Consider the case when the business decides, after analyzing their reports for several months, that they would like to change the SCD Type of a particular column. While it's relatively easy to go from a Type 2 to a Type 1, the reverse is nearly impossible.
Depending on the environment, there would be rows and rows of historical changes never inserted. We would have to tell the business that we can effect that change going forward, but the historical data of all those Type 2 changes simply doesn't exist. But with our new OBIEE approach... it's as simple as making a few changes to the Physical Model in our repository. You can also see that I brought the VALID column in as well, and mapped it in the Business Model to an attribute called Historical Valid. This gives me the ability to report on it as a Type 1 when necessary, and in the standard Type 2 manner in other reports.
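To make the difference between the two attributes concrete, here is a small sketch of the same sales data summarized both ways. It again uses an in-memory SQLite database (3.25 or later) in place of Oracle and OBIEE, and the table contents, the PRODUCT_KEY and AMOUNT_SOLD columns, and the CURRENT_VALID and HISTORICAL_VALID aliases are assumptions for illustration, not the actual repository definitions.

```python
import sqlite3

# Hypothetical data: product 15 was discontinued after its sales were
# recorded, product 18 is still available.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (
    product_key            INTEGER PRIMARY KEY,
    product_source_id      INTEGER,
    valid                  TEXT,
    product_effective_date TEXT
);
CREATE TABLE sales_fact (
    product_key INTEGER,
    amount_sold INTEGER
);
INSERT INTO product_dim VALUES
    (1, 15, 'A', '1998-01-01'),
    (2, 15, 'N', '2008-07-01'),
    (3, 18, 'A', '1998-01-01');
INSERT INTO sales_fact VALUES (1, 100), (1, 250), (3, 75);
""")

def sales_by(connection, flag_column):
    """Total sales grouped by either flavor of the VALID attribute."""
    if flag_column not in ("historical_valid", "current_valid"):
        raise ValueError("unknown attribute: " + flag_column)
    query = f"""
    WITH product_dim_v AS (
        SELECT product_key,
               valid AS historical_valid,          -- Type 2 behavior
               LAST_VALUE(valid) OVER (
                   PARTITION BY product_source_id
                   ORDER BY product_effective_date
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               ) AS current_valid                  -- Type 1 behavior
        FROM product_dim
    )
    SELECT d.{flag_column}, SUM(f.amount_sold)
    FROM sales_fact f
    JOIN product_dim_v d ON d.product_key = f.product_key
    GROUP BY d.{flag_column}
    ORDER BY d.{flag_column}
    """
    return connection.execute(query).fetchall()

print(sales_by(conn, "historical_valid"))  # every sale happened while 'A'
print(sales_by(conn, "current_valid"))     # discontinued products split out
```

Grouped by the historical attribute, every sale falls under (A)vailable, because no fact rows reference the newer dimension rows. Grouped by the Type 1 version, the sales of the now-discontinued product move under (N)ot Available, which is the kind of impact report the business asked for.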