OBIEE performance tuning myth: BI Server logging
One of the frequent recommendations around performance in OBIEE that one hears is a blanket insistence on disabling the BI Server log. It is a line repeated by Oracle Support, propagated in "Best Practice" guides, and recycled throughout blog posts on the subject. Antony Heljula did a talk on the subject at the recent RittmanMead BI Forum in Brighton, and I would like to echo and expand on it here.
The Myth:
If you are having performance problems in OBIEE, you should switch off BI Server logging
The arguments for:
- Instinct would tell us that writing to a log is going to take longer than not writing to one
- On a system with high user concurrency, we would expect to see contention for writing to the log file
- Usage Tracking records report response times, so why do we also need server logging?
- Log files will cause the disk to fill up, which left uncontrolled could cause system instability
The arguments against:
- If you have performance problems in OBIEE, then you need logging in place to be able to trace and diagnose them. The BI Server log gives us vital information such as what physical SQL results from a logical query from the front end. If you turn off logging, you lose all visibility of query behaviour, timings, and row counts.
- OBIEE writes lots of logs, more so now in 11g. Why only disable one of them? Why not all logs?
- If a query takes 30 seconds to run, how much of that 30 seconds is actually going to be in log overhead? You disable logging and now your query runs in 29.999 seconds. It's still slow, it's still a performance problem - and now you don't have the data available with which to diagnose the problem!
- Usage Tracking doesn't record the same level of detail around a query's behaviour (response time profile, row counts per physical query) that the server log does (see the sketch after this list)
- By default, Usage Tracking chops off Logical SQL above 1024 characters in length.
- Sometimes you need the log file to confirm that Usage Tracking is reporting correctly (especially in circumstances where report run times seem unusually high)
- Error messages returned from the database are not captured in Usage Tracking
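To make the Usage Tracking point concrete, here is a minimal sketch of the kind of per-request summary you can pull from the S_NQ_ACCT table. Column names are as in the standard 11g table definition but may differ in your version, and the syntax assumed here is Oracle; treat it as illustrative rather than a definitive script. Note that what you get is a single summary row per request - the per-physical-query timings and row counts that nqquery.log shows are simply not there.

    -- Sketch: what Usage Tracking (S_NQ_ACCT) can tell you about recent requests.
    -- Column names as per the default 11g table definition; check your own version.
    SELECT start_ts,               -- when the request started
           user_name,              -- who ran it
           saw_src_path,           -- catalog path of the analysis
           total_time_sec,         -- total BI Server response time
           num_db_query,           -- how many physical queries were generated
           cum_db_time_sec,        -- cumulative time spent in the database
           row_count,              -- rows returned to the client
           query_text              -- Logical SQL, truncated at 1024 characters by default
    FROM   s_nq_acct
    WHERE  start_ts > SYSDATE - 1  -- last 24 hours (Oracle date arithmetic)
    ORDER  BY total_time_sec DESC;

Useful as far as it goes - but if you want to know which of those physical queries was the slow one, or what SQL was actually sent to the database, you are back to the server log.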
It Depends
To a point, I am being contrary in arguing this specific issue, but it is important that this and other broad-stroke pronouncements around performance, which get regurgitated without context or caveat, are properly understood. In particular, labelling it a "Best Practice" is a dangerous fallacy, as it implies that it should be done without further thought or consideration of its consequences.
If the NFR for a report's performance is sub-second and it is not being met, then the end-to-end response time should be profiled, and that profiling might well demonstrate that logging is impeding performance. But the point is that this is proven, rather than logging being switched off blindly - one way of proving it is sketched below.
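A minimal sketch of such a test: run the same Logical SQL with the query log level varied per request (via the LOGLEVEL system session variable as a request variable) and compare the timings. This assumes your security settings permit the variable to be set at request level, and the subject area and columns below are purely illustrative.

    -- Sketch: compare the same logical query with query logging off (0) and on (2).
    -- Run each via nqcmd or the 'Issue SQL' page; subject area/columns here are hypothetical.

    -- Logging disabled for this request only:
    SET VARIABLE LOGLEVEL=0;
    SELECT "Time"."Month", "Measures"."Revenue" FROM "Sales Analysis";

    -- Standard query logging for this request only:
    SET VARIABLE LOGLEVEL=2;
    SELECT "Time"."Month", "Measures"."Revenue" FROM "Sales Analysis";

If the difference between the two runs is negligible against the NFR, logging is not your problem; if it is significant, you now have evidence rather than folklore.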
Further reading
Cary Millsap's paper, Thinking Clearly About Performance, is an excellent starting point for developing an understanding of a logical and methodical approach to performance problem solving.
James Morle wrote a great blog post on the subject of "Best Practice" and why it is dangerous terminology, entitled "Right Practice"
Thanks to Tony for reviewing & making further suggestions for this article.