Smoke Testing with OWB
I have been putting together some thoughts on Agile Development techniques for a paper for the BIWA event later in the year. I have seen a couple of articles on Agile development in the Oracle space, notably on the Amis blog, here, so I thought I would add my tuppence worth.
One of the principles of Agile Development is continuous and/or frequent delivery. To achieve this you must be able to test frequently, and for testing not to take up all of your time it really should be automated. Smoke testing is a way of achieving this. I first came across the term reading Steve McConnell (a very good author, and an excellent read on software engineering and projects), and first used it with OWB when I was working with a very good Project Manager, Gerry Williams, which enabled us to develop a very portable and robust ETL suite. Steve McConnell's definition (full version here) is as follows:
Every file is compiled, linked, and combined into an executable program every day, and the program is then put through a "smoke test," a relatively simple check to see whether the product "smokes" when it runs.
Why do we want to do this? It seems like a lot of overhead and possibly a lot of unnecessary work. ETL code is typically not executed frequently enough during the development process, especially against realistic sets of data. This can lead to problems with the quality and robustness of the code, and to a whole load of data quality issues. Something I read recently (sorry, I can't remember where) said that 50% of ETL projects overrun due to data quality issues. If we only see the data we are loading once the system is live, there is a high risk of problems. So the answer is that we want our code to work, and to keep on working once it goes live. Admittedly that is no great revelation, but you'd be surprised...
So how can we use this in OWB? First, we need to develop within a framework. We need to be able to execute the code before it is all built, so we start with the shell of our ETL process and gradually fill in the blanks. One of the ideas of Smoke Testing is Don't Break the Build. There is nothing worse than a project team coding and coding and coding to meet deadlines, only to find that it takes another week to integrate the code once it is all put together. I will write another posting detailing how to put such a framework together in OWB, but it is based on a number of process flows executing template mappings, gradually developing these mappings, and providing a mechanism to selectively execute various areas of the process.
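To make the "shell first, fill in the blanks" idea concrete, here is a minimal sketch in Python. It is not OWB code: the names ProcessFlow, template_mapping, register and run are hypothetical, invented purely for illustration. It only shows the shape of the idea, where every mapping starts life as a template that does nothing, real mappings replace the templates over time, and a selection mechanism lets you run just some areas of the process.

```python
from typing import Callable, Dict, Iterable, List, Optional


def template_mapping(name: str) -> Callable[[], str]:
    """Return a placeholder mapping: it loads nothing, but proves the flow runs."""
    def run() -> str:
        return f"{name}: stub (nothing loaded yet)"
    return run


class ProcessFlow:
    """Hypothetical stand-in for a process flow that executes mappings by area."""

    def __init__(self) -> None:
        # area name -> {mapping name -> callable}, kept in registration order
        self._mappings: Dict[str, Dict[str, Callable[[], str]]] = {}

    def register(self, area: str, name: str,
                 mapping: Optional[Callable[[], str]] = None) -> None:
        """Add a mapping to an area; with no implementation, a template is used."""
        self._mappings.setdefault(area, {})[name] = mapping or template_mapping(name)

    def run(self, areas: Optional[Iterable[str]] = None) -> List[str]:
        """Execute every mapping, or only those in the selected areas."""
        wanted = None if areas is None else set(areas)
        results: List[str] = []
        for area, mappings in self._mappings.items():
            if wanted is not None and area not in wanted:
                continue
            for mapping in mappings.values():
                results.append(mapping())
        return results


if __name__ == "__main__":
    flow = ProcessFlow()
    flow.register("staging", "load_customers")       # still a template
    flow.register("warehouse", "load_sales",         # a developed mapping
                  lambda: "load_sales: loaded")
    for line in flow.run():
        print(line)
```

The point of the design is that the whole flow is executable from day one, so a mapping that breaks the build shows up the night it is added rather than at integration time.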
Second, we need an automated way to promote and execute the code we are building on a daily basis. What we are looking to do is replicate Ant for OWB. This means automation and scripting, and it is centred on OMBPlus. We can use a script that literally destroys (cleans) the existing environment, recreates the repository (including the latest delta of the OWB components) and the target users, deploys the code, grants permissions and executes the ETL process. If we build from scratch each time, we protect ourselves from any kind of teething (read: permission) problems when promoting the code. We know that when we install the code in a new environment it will work.
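The steps described above can be sketched as a simple ordered build. In practice each step would be an OMBPlus script; I have deliberately not shown any OMBPlus commands here, and the Python below only illustrates the control flow. The names run_smoke_build, StepResult and placeholder are hypothetical, and the step bodies are placeholders that just print what a real step would do.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool
    detail: str = ""


def run_smoke_build(steps: List[Tuple[str, Callable[[], None]]]) -> List[StepResult]:
    """Run the build steps in order and stop at the first one that 'smokes'."""
    results: List[StepResult] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            # Record the failure and stop: later steps depend on this one.
            results.append(StepResult(name, False, str(exc)))
            break
        results.append(StepResult(name, True))
    return results


def placeholder(message: str) -> Callable[[], None]:
    """Hypothetical step body; a real one would call out to an OMBPlus script."""
    return lambda: print(message)


if __name__ == "__main__":
    nightly = [
        ("clean", placeholder("drop the existing environment")),
        ("repository", placeholder("recreate repository with latest OWB delta")),
        ("users", placeholder("recreate target users")),
        ("deploy", placeholder("deploy the code")),
        ("grants", placeholder("grant permissions")),
        ("execute", placeholder("run the ETL process")),
    ]
    for result in run_smoke_build(nightly):
        print(result.name, "OK" if result.ok else f"FAILED: {result.detail}")
```

Stopping at the first failure is a choice rather than a requirement: since every step builds on the one before it, anything reported after a failed step would mostly be noise.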
Once we have these components we can automate the whole process. For example, developers can put any completed work into a predetermined collection. This is then automatically promoted, built and executed every night. By the time you come back into work the next morning you will know a lot about your ETL code and the data it has run against.
This can be used as a regression testing tool, to examine data quality (the business can be involved in the process with a review of the data every morning), or just to ensure the quality of the code. The real payoff is the automation: once the framework exists, there is no extra effort required to meet this principle of continuous software delivery.
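As one example of what a morning regression check might look like, here is a minimal sketch that compares the row counts from last night's run with the run before. Everything in it is an assumption for illustration: the function name compare_row_counts, the idea of keeping row counts per table, and the 10% tolerance are mine, not something OWB provides.

```python
from typing import Dict, List


def compare_row_counts(previous: Dict[str, int],
                       current: Dict[str, int],
                       tolerance: float = 0.10) -> List[str]:
    """List tables whose row counts appeared, vanished or moved beyond tolerance."""
    findings: List[str] = []
    for table in sorted(set(previous) | set(current)):
        if table not in current:
            findings.append(f"{table}: missing from tonight's run")
        elif table not in previous:
            findings.append(f"{table}: new table with {current[table]} rows")
        else:
            before, after = previous[table], current[table]
            if before == 0:
                # Avoid dividing by zero: any rows at all count as a change.
                changed = after != 0
            else:
                changed = abs(after - before) / before > tolerance
            if changed:
                findings.append(
                    f"{table}: row count changed from {before} to {after}")
    return findings


if __name__ == "__main__":
    last_night = {"CUSTOMERS": 1000, "ORDERS": 5000}
    tonight = {"CUSTOMERS": 1005, "ORDERS": 2500}
    for finding in compare_row_counts(last_night, tonight):
        print(finding)
```

A check like this is crude, but it is the sort of thing the business can read over a coffee, and it costs nothing to run once the nightly build is in place.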