Data Warehousing

Accidents waiting to happen

Peter Scott

Dec 28, 2005 • 1 min read

A while back I blogged about a data warehouse review we are doing for a customer. At the time I was less than impressed by the consistency of the approach to DW design and the lack of knowledge of how Oracle works and what is ‘built-in’ as functionality. I have now moved on to another of their legacy systems. On the surface things look better. Source code has version control headers, they are using a source control tool (OK, it is not one that I like, but if it works for them, so what?), even the variable names are intelligible (self documenting) and there are comments in the code. Lots of ticks in the good practice checklist, or is this just too good to be true?
One of the modules investigated populates staging tables from a source (non-Oracle) database and then applies this to the ‘published’ layers of the data warehouse (I would have probably specified two modules, one to stage the data and one to publish – but I am not here to split hairs). Looking at the code in more detail I see variable names such as reject_count, commit_frequency. Could this mean that someone has written set-based code that also can handle data exceptions efficiently? Sadly, no; all of these carefully crafted, well named variables appear only once in the procedure, just before the ‘BEGIN’ statement!
Worse still was the technique used to populate the staging table – the developers used a cursor loop to identify the tables to populate but held the source table names in a hard coded array accessed by loop counter. An accident waiting to happen!

Sign up for more like this.