Capturing change

Dominic Brooks posts a link to a completely fabricated tale that must be true or at least it rings true for those of use who design applications (of any kind)

I have been looking at how we can interface to a legacy Oracle 8 based CRM application and extract data to populate a new corporate data model (with surrounding ODS) and the obligatory data warehouse to give some much needed reporting. Then comes the customer's curved balls - the ODS must be updated every 30 minutes and our extracts should be non-invasive on the production system; helpfully it is suggested that we source our data from the warm-standby database. Sadly, the "standby" option is not going to help me; it is really just a cloned set of data files and redo logs that can be recovered on to another instance if the source system fails.

That leaves me with using the live system as my data source and the need to develop a low-invasive way to extract changed data. If this was Oracle 10g the first choice answer would be to mine the logs for data changes and apply them to my target system. But this is old Oracle, and the only options open to me will result in the database having to do extra work to give me my data. I have two options: extract everything and calculate the changes by comparing them to what I saw last time, or to use triggers to capture the values of every insert or update (deletes don't often worry me in data warehousing).

One way to use 'triggers' without the hassle of writing the code is to create materialized view logs on the source tables. These can be used to populate fast-refresh materialized views (which should minimise the amount of data moving about) But another option may be to use the Mview logs as the source of my extract data - it identifies exactly the data I need to extract. This is, of course, a completely unsupported way to do things - it may be worth a bit of exploration time though as it does seem to have some strong plus points (and some big minus ones too!)