Digging into the Oracle BI Server Query Log File

As a follow-up to the posting I did last week on Oracle BI Enterprise Edition performance tuning, I thought I'd take a closer look a the query log file generated by the Oracle BI Server, the "virtual database engine" that ships with Oracle BI EE. For some time now I've been looking to see if there is any diagnostic data produced by the BI Server so that, for example, I can see the query plans put together when a query comes in, particularly in the situation where, for example, the BI Server joins together (or "stitches", in Oracle BI Server terminology) data from two different database queries into one logical dataset. So, let's take a simple example, where we have a fact table containing 14m rows of sales data and a product table containing 72 rows of product data, with the data all being stored in a single Oracle database and a single physical model describing the source tables and their relationships.

Running a simple query like this, against the source data:

Would generate the following log entries, at $ORACLEBI_HOME/server/Log/NQQuery.log, with query logging set to 3:
##############################################

-------------------- SQL Request:

SET VARIABLE QUERY_SRC_CD='Report';SELECT PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC saw_0, SALES_BIG_LOCAL.QUANTITY_SOLD saw_1 FROM "Log & Federated Query Test" ORDER BY saw_0

+++Administrator:2c0000:2c0003:----2008/10/02 13:29:24

-------------------- General Query Info:

Repository: Star, Subject Area: Log & Federated Query Test, Presentation: Log & Federated Query Test

+++Administrator:2c0000:2c0003:----2008/10/02 13:29:24

-------------------- Logical Request (before navigation):

RqList
PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC as c1 GB,
QUANTITY_SOLD:[DAggr(SALES_BIG_LOCAL.QUANTITY_SOLD by [ PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC] )] as c2 GB
OrderBy: c1 asc

+++Administrator:2c0000:2c0003:----2008/10/02 13:29:24

-------------------- Sending query to database named ora11g (id: <<11172>>):

select T6303.PROD_SUBCATEGORY_DESC as c1,
sum(T6826.QUANTITY_SOLD) as c2
from
SH_COPY.PRODUCTS T6303,
SH_COPY.SALES_BIG T6826
where ( T6303.PROD_ID = T6826.PROD_ID )
group by T6303.PROD_SUBCATEGORY_DESC
order by c1

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Query Status: Successful Completion

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Rows 21, bytes 84504 retrieved from database query id: <<11172>>

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Physical query response time 194 (seconds), id <<11172>>

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Physical Query Summary Stats: Number of physical queries 1, Cumulative time 194, DB-connect time 0 (seconds)

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Rows returned to Client 21

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Logical Query Summary Stats: Elapsed time 195, Response time 194, Compilation time 0 (seconds)

So let's step through this log entry and look at each part in detail.

The first section sets out the general details of the query, and contains details of the QUERY_SRC_CD variable containing your query, the name of the Oracle BI Server repository and subject area used for the query.

##############################################
-------------------- SQL Request:
SET VARIABLE QUERY_SRC_CD='Report';SELECT PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC saw_0, SALES_BIG_LOCAL.QUANTITY_SOLD saw_1 FROM "Log & Federated Query Test" ORDER BY saw_0

+++Administrator:2c0000:2c0003:----2008/10/02 13:29:24

-------------------- General Query Info:
Repository: Star, Subject Area: Log & Federated Query Test, Presentation: Log & Federated Query Test

+++Administrator:2c0000:2c0003:----2008/10/02 13:29:24
Notice that each entry is the log is also timestamped with the username of the BI Server user that was responsible for the query, and the timestamp for the log entry.

Next is the logical, "Request List" that sets out the columns that form the criteria for the report, and the grouping and aggregation applied to the data.

-------------------- Logical Request (before navigation):

RqList
    PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC as c1 GB,
    QUANTITY_SOLD:[DAggr(SALES_BIG_LOCAL.QUANTITY_SOLD by [ PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC] )] as c2 GB
OrderBy: c1 asc

+++Administrator:2c0000:2c0003:----2008/10/02 13:29:24
If my report criteria included a filter, the filter columns would appear in the request list as well, like this:
-------------------- Logical Request (before navigation):

RqList
    PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC as c1 GB,
    QUANTITY_SOLD:[DAggr(SALES_BIG_LOCAL.QUANTITY_SOLD by [ PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC] )] as c2 GB
DetailFilter: PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC in ([ 'Camcorders', 'Camera Batteries', 'Camera Media'] )
OrderBy: c1 asc

+++Administrator:2c0000:2c0005:----2008/10/02 13:42:09
So at this point we're dealing with the logical view of the data, which relates back to the Presentation Layer of the Oracle BI Server metadata layer. The next part of the log file shows the physical SQL that was generated to correspond with the logical request list of columns specified by the user. In this case, both logicals tables have logical table sources that come from the same physical Oracle database, and so a single physical SQL query is generated.
-------------------- Sending query to database named ora11g (id: <<11172>>):

select T6303.PROD_SUBCATEGORY_DESC as c1,
     sum(T6826.QUANTITY_SOLD) as c2
from
     SH_COPY.PRODUCTS T6303,
     SH_COPY.SALES_BIG T6826
where  ( T6303.PROD_ID = T6826.PROD_ID )
group by T6303.PROD_SUBCATEGORY_DESC
order by c1

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39
Notice the fact that Oracle BI Server automatically aggregates your data (this is behavoir specified in the default aggregation settings in the BI Server metadata layer, you can specify non-grouping, i.e. detail display of data instead), and applies a GROUP BY on the dimension data.

The next part of the log file details whether the query completed successfully, how long it took to return data, how many rows were returned and the response time of the underlying database.

-------------------- Query Status: Successful Completion

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Rows 21, bytes 84504 retrieved from database query id: <<11172>>

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Physical query response time 194 (seconds), id <<11172>>

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Physical Query Summary Stats: Number of physical queries 1, Cumulative time 194, DB-connect time 0 (seconds)

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Rows returned to Client 21

+++Administrator:2c0000:2c0003:----2008/10/02 13:32:39

-------------------- Logical Query Summary Stats: Elapsed time 195, Response time 194, Compilation time 0 (seconds)
So what we've got here is a fairly straightforward record of the logical query requested by the user (the request list of columns, the criteria for your request), the physical SQL generated to return the required list of columns, any filters, aggregation or grouping applied to the data, and statistics on how long the query took to run. Putting my Oracle Database hat on, we haven't got anything equivalent to the "wait events" you get when tuning the Oracle database, where it tells you where the time has been spent when executing a query (waiting for disk I/O, waiting for network I/O, etc) but then again, in this case it's the database that's doing the vast majority of the work and there's not really much overhead added by the BI Server here. The database took 194 seconds to return data, and the BI server only added a second to this, giving a total response time of 195.

There are actually several levels of logging that you can turn on for users and queries, with levels going from 0 (no logging) to 5 (lots and lots of logging), with the general advice being to only really turn this on when you're debugging a query, as the logs are written to disk and can slow down a real-life system if left on. In a bit more detail, the logging levels currently available are as follows:

Level 0 = no logging, the default setting for a user or query

Level 1 = logs the logical SQL (the request list), the elapsed time, whether the cache was used, query status, user name and so on.

Level 2 = Everything in level 1, plus repository name, subject area name, physical SQL generated, number of rows returned and so on

Level 3 = Everything in level 2 plus some additional data on cache usage. This is the level I usually set, though the docs suggest you should just use level 2

Level 4 = Everything in level 3 plus the physical SQL execution plan

Level 5 = Everything in level 4 plus intermediate row counts at various stages, particularly useful when the BI Server is stitching data sets together.

So how does the above query look when run at level 5 logging? Let's take a look, this time with the filter added in that I showed a minute ago. I've bolded the areas that are new in this level of logging:

-------------------- SQL Request:

SET VARIABLE QUERY_SRC_CD='Report',SAW_SRC_PATH='/users/administrator/Simple Log Test Query';SELECT PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC saw_0, SALES_BIG_LOCAL.QUANTITY_SOLD saw_1 FROM "Log & Federated Query Test" WHERE PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC IN ('Camcorders', 'Camera Batteries', 'Camera Media') ORDER BY saw_0

+++Administrator:2b0000:2b0001:----2008/10/02 14:04:10

-------------------- General Query Info:

Repository: Star, Subject Area: Log & Federated Query Test, Presentation: Log & Federated Query Test

+++Administrator:2b0000:2b0001:----2008/10/02 14:04:10

-------------------- Logical Request (before navigation):

RqList
    PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC as c1 GB,
    QUANTITY_SOLD:[DAggr(SALES_BIG_LOCAL.QUANTITY_SOLD by [ PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC] )] as c2 GB
DetailFilter: PRODUCTS_LOCAL.PROD_SUBCATEGORY_DESC in ([ 'Camcorders', 'Camera Batteries', 'Camera Media'] )
OrderBy: c1 asc

+++Administrator:2b0000:2b0001:----2008/10/02 14:04:10

-------------------- Execution plan:

RqList <<9790>> [for database 3023:2350:ora11g,44]
    PRODUCTS.PROD_SUBCATEGORY_DESC as c1 GB [for database 3023:2350,44],
    sum(SALES_BIG.QUANTITY_SOLD by [ PRODUCTS.PROD_SUBCATEGORY_DESC] ) as c2 GB [for database 3023:2350,44]
Child Nodes (RqJoinSpec): <<9851>> [for database 3023:2350:ora11g,44]
    PRODUCTS T6303
    SALES_BIG T6826
DetailFilter: PRODUCTS.PROD_ID = SALES_BIG.PROD_ID and (PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camcorders' or PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camera Batteries' or PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camera Media') [for database 0:0]
GroupBy: [ PRODUCTS.PROD_SUBCATEGORY_DESC]  [for database 3023:2350,44]
OrderBy: c1 asc [for database 3023:2350,44]

+++Administrator:2b0000:2b0001:----2008/10/02 14:04:10

-------------------- Sending query to database named ora11g (id: <<9790>>):

select T6303.PROD_SUBCATEGORY_DESC as c1,
     sum(T6826.QUANTITY_SOLD) as c2
from
     SH_COPY.PRODUCTS T6303,
     SH_COPY.SALES_BIG T6826
where  ( T6303.PROD_ID = T6826.PROD_ID and (T6303.PROD_SUBCATEGORY_DESC in ('Camcorders', 'Camera Batteries', 'Camera Media')) )
group by T6303.PROD_SUBCATEGORY_DESC
order by c1

+++Administrator:2b0000:2b0001:----2008/10/02 14:07:25

-------------------- Execution Node: <<9790>>, Close Row Count = 3, Row Width = 4024 bytes

+++Administrator:2b0000:2b0001:----2008/10/02 14:07:25

-------------------- Execution Node: <<9790>> DbGateway Exchange, Close Row Count = 3, Row Width = 4024 bytes

+++Administrator:2b0000:2b0001:----2008/10/02 14:07:25

-------------------- Execution Node: <<9790>> DbGateway Exchange, Close Row Count = 3, Row Width = 4024 bytes

+++Administrator:2b0000:2b0001:----2008/10/02 14:07:25

-------------------- Query Status: Successful Completion

+++Administrator:2b0000:2b0001:----2008/10/02 14:07:25

-------------------- Rows 3, bytes 12072 retrieved from database query id: <<9790>>

+++Administrator:2b0000:2b0001:----2008/10/02 14:07:25

-------------------- Physical query response time 195 (seconds), id <<9790>>

+++Administrator:2b0000:2b0001:----2008/10/02 14:07:25

-------------------- Physical Query Summary Stats: Number of physical queries 1, Cumulative time 195, DB-connect time 0 (seconds)

+++Administrator:2b0000:2b0001:----2008/10/02 14:07:25

-------------------- Rows returned to Client 3

+++Administrator:2b0000:2b0001:----2008/10/02 14:07:25

-------------------- Logical Query Summary Stats: Elapsed time 195, Response time 195, Compilation time 0 (seconds)
So the execution plan that the log file contains is the logical execution plan, not the Oracle execution plan corresponding to the physical SQL that is generated. Let's take a look at this logical execution plan in more details.
-------------------- Execution plan:

RqList <<9790>> [for database 3023:2350:ora11g,44]
    PRODUCTS.PROD_SUBCATEGORY_DESC as c1 GB [for database 3023:2350,44],
    sum(SALES_BIG.QUANTITY_SOLD by [ PRODUCTS.PROD_SUBCATEGORY_DESC] ) as c2 GB [for database 3023:2350,44]
Child Nodes (RqJoinSpec): <<9851>> [for database 3023:2350:ora11g,44]
    PRODUCTS T6303
    SALES_BIG T6826
DetailFilter: PRODUCTS.PROD_ID = SALES_BIG.PROD_ID and (PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camcorders' or PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camera Batteries' or PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camera Media') [for database 0:0]
GroupBy: [ PRODUCTS.PROD_SUBCATEGORY_DESC]  [for database 3023:2350,44]
OrderBy: c1 asc [for database 3023:2350,44]
So what this contains is the request list as before, and then a list of "child nodes" with each node corresponding to a step in the execution plan (I think). Then we've got the filter applied to the nodes, the group by and the order by, and the query again takes 195 seconds, with the BI Server adding virtually nothing to the response time.

So far it all looks straightforward, but what happens if instead of both logical table sources pointing to the same one physical database, we source the dimension table from a second physical database, so that the BI Server generates a federated query and stitches the result set together in the BI Server's memory. How would this extended query log look then? Let's take a look.

-------------------- SQL Request:

SET VARIABLE QUERY_SRC_CD='Report';SELECT SALES_BIG_LOCAL.QUANTITY_SOLD saw_0, PRODUCTS_REMOTE.PROD_SUBCATEGORY_DESC saw_1 FROM "Log & Federated Query Test" WHERE PRODUCTS_REMOTE.PROD_SUBCATEGORY_DESC IN ('Camcorders', 'Camera Batteries', 'Camera Media') ORDER BY saw_1

+++Administrator:2a0000:2a0002:----2008/10/02 15:22:38

-------------------- General Query Info:

Repository: Star, Subject Area: Log & Federated Query Test, Presentation: Log & Federated Query Test

+++Administrator:2a0000:2a0002:----2008/10/02 15:22:38

-------------------- Logical Request (before navigation):

RqList
    QUANTITY_SOLD:[DAggr(SALES_BIG_LOCAL.QUANTITY_SOLD by [ PRODUCTS_REMOTE.PROD_SUBCATEGORY_DESC] )] as c1 GB,
    PRODUCTS_REMOTE.PROD_SUBCATEGORY_DESC as c2 GB
DetailFilter: PRODUCTS_REMOTE.PROD_SUBCATEGORY_DESC in ([ 'Camcorders', 'Camera Batteries', 'Camera Media'] )
OrderBy: c2 asc

+++Administrator:2a0000:2a0002:----2008/10/02 15:22:38

-------------------- Execution plan:

RqBreakFilter <<9948>>[2] [for database 0:0,0]
    RqList <<10056>> [for database 0:0,0]
        sum(D1.c5 by [ D1.c2]  at_distinct [ D1.c2, D1.c3] ) as c1 [for database 0:0,0],
        D1.c2 as c2 [for database 3023:6422,44]
    Child Nodes (RqJoinSpec): <<10068>> [for database 0:0,0]
        (
            RqList <<9865>> [for database 0:0,0]
                D902.c1 as c2 GB [for database 3023:6422,44],
                D901.c2 as c3 [for database 3023:2350,44],
                D901.c3 as c5 [for database 3023:2350,44]
            Child Nodes (RqJoinSpec): <<9950>> [for database 0:0,0]
                    (
                        RqList <<9964>> [for database 3023:2350:ora11g,44]
                            SALES_BIG.PROD_ID as c2 [for database 3023:2350,44],
                            sum(SALES_BIG.QUANTITY_SOLD by [ SALES_BIG.PROD_ID] ) as c3 [for database 3023:2350,44]
                        Child Nodes (RqJoinSpec): <<9967>> [for database 3023:2350:ora11g,44]
                            SALES_BIG T6826
                        GroupBy: [ SALES_BIG.PROD_ID]  [for database 3023:2350,44]
                        OrderBy: c2 asc [for database 3023:2350,44]
                    ) as D901
                InnerJoin <<10001>> On D901.c2 = D902.c2; actual join vectors:  [ 0 ] =  [ 1 ]
                    (
                        RqList <<9981>> [for database 3023:6422:remote_ora11g,44]
                            PRODUCTS.PROD_SUBCATEGORY_DESC as c1 GB [for database 3023:6422,44],
                            PRODUCTS.PROD_ID as c2 [for database 3023:6422,44]
                        Child Nodes (RqJoinSpec): <<9984>> [for database 3023:6422:remote_ora11g,44]
                            PRODUCTS T6864
                        DetailFilter: PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camcorders' or PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camera Batteries' or PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camera Media' [for database 0:0]
                        OrderBy: c2 asc [for database 3023:6422,44]
                    ) as D902
            OrderBy: c2, c3 [for database 0:0,0]
        ) as D1
    OrderBy: c2 asc [for database 0:0,0]

+++Administrator:2a0000:2a0002:----2008/10/02 15:22:38

-------------------- Sending query to database named ora11g (id: <<9964>>):

select T6826.PROD_ID as c2,
     sum(T6826.QUANTITY_SOLD) as c3
from
     SH_COPY.SALES_BIG T6826
group by T6826.PROD_ID
order by c2

+++Administrator:2a0000:2a0002:----2008/10/02 15:22:38

-------------------- Sending query to database named remote_ora11g (id: <<9981>>):

select T6864.PROD_SUBCATEGORY_DESC as c1,
     T6864.PROD_ID as c2
from
     PRODUCTS T6864
where  ( T6864.PROD_SUBCATEGORY_DESC in ('Camcorders', 'Camera Batteries', 'Camera Media') )
order by c2

+++Administrator:2a0000:2a0002:----2008/10/02 15:22:38

-------------------- Execution Node: <<9981>>, Close Row Count = 9, Row Width = 4012 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9964>>, Close Row Count = 72, Row Width = 24 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Cancel initiated for Exchange Producer: <<9964>> DbGateway Exchange

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9964>> DbGateway Exchange, Close Row Count = 63, Row Width = 24 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9981>> DbGateway Exchange, Close Row Count = 9, Row Width = 4012 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10001>> Binary Join, Close Row Count = 9, Row Width = 4032 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9865>> Projection, Close Row Count = 9, Row Width = 4032 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9865>> Sort, Close Row Count = 9, Row Width = 4032 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10056>> Pre-aggr Projection, Close Row Count = 9, Row Width = 4032 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10056>> Aggregation, Close Row Count = 9, Row Width = 4040 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10056>> Post-aggr Projection, Close Row Count = 9, Row Width = 4016 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10056>> Sort, Close Row Count = 9, Row Width = 4016 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9948>> GroupByNoSort, Close Row Count = 3, Row Width = 4016 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9964>> DbGateway Exchange, Close Row Count = 63, Row Width = 24 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9981>> DbGateway Exchange, Close Row Count = 9, Row Width = 4012 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Query Status: Successful Completion

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Rows 72, bytes 1728 retrieved from database query id: <<9964>>

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Physical query response time 172 (seconds), id <<9964>>

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Rows 9, bytes 36108 retrieved from database query id: <<9981>>

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Physical query response time 0 (seconds), id <<9981>>

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Physical Query Summary Stats: Number of physical queries 2, Cumulative time 172, DB-connect time 0 (seconds)

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Rows returned to Client 3

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Logical Query Summary Stats: Elapsed time 173, Response time 172, Compilation time 0 (seconds)
So this looks a bit more interesting, as we can see that the BI Server is doing some of the work now, generating an SQL query for each physical data source and then stitching the data together. There's also lot more nodes, which suggests that the BI Server is going through a lot more steps to return the data. Let's a take a look at the more interesting parts of the log, starting with the logical execution plan, which now contains a number of child nodes; ones for each of the two database queries, further ones for the stitched-together data set and a node for the join between the two result sets .
-------------------- Execution plan:

RqBreakFilter <<9948>>[2] [for database 0:0,0]
    RqList <<10056>> [for database 0:0,0]
        sum(D1.c5 by [ D1.c2]  at_distinct [ D1.c2, D1.c3] ) as c1 [for database 0:0,0],
        D1.c2 as c2 [for database 3023:6422,44]
    Child Nodes (RqJoinSpec): <<10068>> [for database 0:0,0]
        (
            RqList <<9865>> [for database 0:0,0]
                D902.c1 as c2 GB [for database 3023:6422,44],
                D901.c2 as c3 [for database 3023:2350,44],
                D901.c3 as c5 [for database 3023:2350,44]
            Child Nodes (RqJoinSpec): <<9950>> [for database 0:0,0]
                    (
                        RqList <<9964>> [for database 3023:2350:ora11g,44]
                            SALES_BIG.PROD_ID as c2 [for database 3023:2350,44],
                            sum(SALES_BIG.QUANTITY_SOLD by [ SALES_BIG.PROD_ID] ) as c3 [for database 3023:2350,44]
                        Child Nodes (RqJoinSpec): <<9967>> [for database 3023:2350:ora11g,44]
                            SALES_BIG T6826
                        GroupBy: [ SALES_BIG.PROD_ID]  [for database 3023:2350,44]
                        OrderBy: c2 asc [for database 3023:2350,44]
                    ) as D901
                InnerJoin <<10001>> On D901.c2 = D902.c2; actual join vectors:  [ 0 ] =  [ 1 ]
                    (
                        RqList <<9981>> [for database 3023:6422:remote_ora11g,44]
                            PRODUCTS.PROD_SUBCATEGORY_DESC as c1 GB [for database 3023:6422,44],
                            PRODUCTS.PROD_ID as c2 [for database 3023:6422,44]
                        Child Nodes (RqJoinSpec): <<9984>> [for database 3023:6422:remote_ora11g,44]
                            PRODUCTS T6864
                        DetailFilter: PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camcorders' or PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camera Batteries' or PRODUCTS.PROD_SUBCATEGORY_DESC = 'Camera Media' [for database 0:0]
                        OrderBy: c2 asc [for database 3023:6422,44]
                    ) as D902
            OrderBy: c2, c3 [for database 0:0,0]
        ) as D1
    OrderBy: c2 asc [for database 0:0,0]
Notice that each of the main nodes has IDs assigned to them (9981, 9964 and 9865) so that you can follow them through the rest of the log file.

Later on down the log file we can see the physical SQL generated for each node, and the row count resulting from each physical SQL execution. Then, after the data is retrieved from the various physical database connection, you can see how the BI Server then goes on to inner join the data (the binary join on node 10001), remove unneeded columns from the initial result set (projection), sort the data, again remove unneeded columns and then aggregate the data. You can also see what operations are carried out on the data before and after aggregation, where sorting is carried out and so on.

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10001>> Binary Join, Close Row Count = 9, Row Width = 4032 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9865>> Projection, Close Row Count = 9, Row Width = 4032 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9865>> Sort, Close Row Count = 9, Row Width = 4032 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10056>> Pre-aggr Projection, Close Row Count = 9, Row Width = 4032 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10056>> Aggregation, Close Row Count = 9, Row Width = 4040 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10056>> Post-aggr Projection, Close Row Count = 9, Row Width = 4016 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<10056>> Sort, Close Row Count = 9, Row Width = 4016 bytes

+++Administrator:2a0000:2a0002:----2008/10/02 15:25:31

-------------------- Execution Node: <<9948>> GroupByNoSort, Close Row Count = 3, Row Width = 4016 bytes
From looking this log file, you can pretty much determine the steps the BI Server is going through to add aggregations, sorting, result set stitching and so on to your data, if you're interested in seeing how data from multiple results sets is brought together, I don't know if this is publicly documented anywhere and whether other join types are possible (I've heard of nested loop joins also being performed), but this certainly looks interesting if you're into Oracle BI Server internals. Something I'm certainly looking to do is to document these steps properly and come up with the complete set of steps that can be carried out, if anyone's interested in helping then let me know as I said, I don't think this is publicly documented.

So looking at the log file output, this query has actually taken just 173 seconds rather than the 195 required for the single-source query, the BI Server only adding a second to this. I wouldn't place too much emphasis on the actual times taken for each variation on the query as I'm only running them once and on a single laptop, but you can see from the timings that Oracle BI Server doesn't seem to add much overhead, even when you're using the federated query feature (at least on my dataset, which isn't all that big).

Now one of the things you can do in the BI Server metadata layer, is select the "driving table" for these types of cross-database joins. This is particularly useful if one of your tables is small (the suggested limit is 1000 rows, in my case it's 70, and it tells the BI Server to firstly query the driving table to return a small amount of interesting rows, then it applies these these rows to the other table in batches so that only a much smaller set of the big table rows are then loaded into the BI Server memory.

The rationale here is that a normal federated query join would require both tables to be loaded into the BI Server memory, and if one of them is big, then this will require a full table scan of the physical source table and potentially take up a lot of memory in the BI Server. Using this driving table technique you can fairly quickly come up with a much smaller set of initial rows and row ids, this can then be used to filter the second table and therefore reduce the I/O usage and memory usage required by the federated query join. There are a couple of parameters you can set in the NQSCONFIG.INI file to determine the size of the batches that are applied to the larger table, and you can switch this feature off completely, but how does this look in the query log file, and does it make my query run faster? Let's take a look.

The first difference in this "driving table" query log entry is the logical execution plan, than now retrieves data from the second, non-driving table via a set of parameter values.

RqBreakFilter <<9961>>[2] [for database 0:0,0]
    RqList <<10027>> [for database 0:0,0]
        sum(D1.c5 by [ D1.c2]  at_distinct [ D1.c2, D1.c3] ) as c1 [for database 0:0,0],
        D1.c2 as c2 [for database 3023:6422,44]
    Child Nodes (RqJoinSpec): <<10039>> [for database 0:0,0]
        (
            RqList <<9918>> [for database 0:0,0]
                D901.c1 as c2 GB [for database 3023:6422,44],
                D902.c2 as c3 [for database 3023:2350,44],
                D902.c3 as c5 [for database 3023:2350,44]
            Child Nodes (RqJoinSpec): <<9963>> [for database 0:0,0]
                    (
                        RqList <<9966>> [for database 3023:6422:remote_ora11g,44]
                            PRODUCTS.PROD_SUBCATEGORY_DESC as c1 GB [for database 3023:6422,44],
                            PRODUCTS.PROD_ID as c2 [for database 3023:6422,44]
                        Child Nodes (RqJoinSpec): <<9969>> [for database 3023:6422:remote_ora11g,44]
                            PRODUCTS T6864
                        OrderBy: c2 asc [for database 3023:6422,44]
                    ) as D901
                InnerJoin  (left drive) <<10000>> On D902.c2 = D901.c2; actual join vectors:  [ 1 ] =  [ 0 ]
                    (
                        RqList <<9984>> [for database 3023:2350:ora11g,44]
                            SALES_BIG.PROD_ID as c2 [for database 3023:2350,44],
                            sum(SALES_BIG.QUANTITY_SOLD by [ SALES_BIG.PROD_ID] ) as c3 [for database 3023:2350,44]
                        Child Nodes (RqJoinSpec): <<9987>> [for database 3023:2350:ora11g,44]
                            SALES_BIG T6826
                        DetailFilter: SALES_BIG.PROD_ID = ?1 or SALES_BIG.PROD_ID = ?2 or SALES_BIG.PROD_ID = ?3 or SALES_BIG.PROD_ID = ?4 or SALES_BIG.PROD_ID = ?5 or SALES_BIG.PROD_ID = ?6 or SALES_BIG.PROD_ID = ?7 or SALES_BIG.PROD_ID = ?8 or SALES_BIG.PROD_ID = ?9 or SALES_BIG.PROD_ID = ?10 or SALES_BIG.PROD_ID = ?11 or SALES_BIG.PROD_ID = ?12 or SALES_BIG.PROD_ID = ?13 or SALES_BIG.PROD_ID = ?14 or SALES_BIG.PROD_ID = ?15 or SALES_BIG.PROD_ID = ?16 or SALES_BIG.PROD_ID = ?17 or SALES_BIG.PROD_ID = ?18 or SALES_BIG.PROD_ID = ?19 or SALES_BIG.PROD_ID = ?20 [for database 0:0]
                        GroupBy: [ SALES_BIG.PROD_ID]  [for database 3023:2350,44]
                        OrderBy: c2 asc [for database 3023:2350,44]
                    ) as D902
            OrderBy: c2, c3 [for database 0:0,0]
        ) as D1
    OrderBy: c2 asc [for database 0:0,0]

+++Administrator:2a0000:2a0004:----2008/10/02 15:39:08
You can specify the maximum amount of parameters in the table filter through the MAX_PARAMETERS_PER_DRIVE_JOIN parameter in the NQSCONFIG.INI file, with the limit really being down to how many filter values the back-end database can handle. The MAX_QUERIES_PER_DRIVE_JOIN parameter thereafter limits the total number of separate calls to the non-driving table that can take place (a function of the total number of rows returned by the initial driving table query divided by the MAX_PARAMETERS_PER_DRIVE_JOIN parameter setting), to stop run-away queries occuring that generate hundreds of individual calls to the non-driving table.

The other interesting part is the actual physical SQL sent to the non-driving table, which looks like this:

-------------------- Sending query to database named ora11g (id: <<9984>>):

select T6826.PROD_ID as c2,
     sum(T6826.QUANTITY_SOLD) as c3
from
     SH_COPY.SALES_BIG T6826
where  ( T6826.PROD_ID in (:PARAM1, :PARAM2, :PARAM3, :PARAM4, :PARAM5, :PARAM6, :PARAM7, :PARAM8, :PARAM9, :PARAM10, :PARAM11, :PARAM12, :PARAM13, :PARAM14, :PARAM15, :PARAM16, :PARAM17, :PARAM18, :PARAM19, :PARAM20) )
group by T6826.PROD_ID
order by c2

+++Administrator:2a0000:2a0004:----2008/10/02 15:39:08
So in this case, the driving table feature will initially run the query against the smaller (dimension) table and return the rows specified by the filter on the query. Then, in batches of twenty, individual selects will be made against the larger, non-driving table, until all the interested rows are returned. These will then be joined in memory in the BI Server and the results returned to the dashboard, in theory faster than if we'd tried to load both tables into the BI Server memory and join them both there.

So now, to take this one step further, what does the query log file look like if we run the Aggregate Persistence Wizard on this data set, generate and aggregate at the PROD_SUBCATEGORY_DESC, plug the aggregates into the logical model and then run a query? Where (if anywhere) does aggregate navigation future in the logical execution plan and query steps?

Well, after running the wizard and registering the aggregates, I run the query again and here's the output:

-------------------- Execution plan:

RqBreakFilter <<10079>>[2] [for database 0:0,0]
    RqList <<10037>> [for database 3023:2350:ora11g,44]
        ag_SALES_BIG_LOCAL.QUANTITY_S0001562C as c1 GB [for database 3023:2350,44],
        SA_Subcate0001593D.PROD_SUBCA0001561B as c2 GB [for database 3023:2350,44]
    Child Nodes (RqJoinSpec): <<10067>> [for database 3023:2350:ora11g,44]
        SA_Subcate0001593D T7688
        ag_SALES_BIG_LOCAL T7695
    DetailFilter: SA_Subcate0001593D.PROD_SUBCA0001561B = ag_SALES_BIG_LOCAL.PROD_SUBCA0001561B [for database 0:0]
    OrderBy: c2 asc [for database 3023:2350,44]

+++Administrator:2a0000:2a0005:----2008/10/02 16:46:18

-------------------- Sending query to database named ora11g (id: <<10037>>):

select distinct T7695.QUANTITY_S0001562C as c1,
     T7688.PROD_SUBCA0001561B as c2
from
     SH_COPY.SA_Subcate0001593D T7688,
     SH_COPY.ag_SALES_BIG_LOCAL T7695
where  ( T7688.PROD_SUBCA0001561B = T7695.PROD_SUBCA0001561B )
order by c2
So, you can see the aggregate table being used, but there's nothing in the log to say that aggregates were considered, the query is just generated against the logical table source that returns the smallest amount of rows.

Anyway, that's a quick trip around the query log file, with a short diversion into driving tables and aggregates. Of course the $64,000 questions is "what do you do with the log file, in particular the knowledge of the various query execution steps", and in reality if you're getting to the point where federated queries are running slowly, your most obvious and productive means of sorting this out is to physically integrate your data into a single data mart or data warehouse rather than mucking around trying to decipher undocumented log files. Still, if you're stuck with a federated query or just interested in how the internals work, it's an interesting set of diagnostics you can generate, I'll certainly be blogging more on this over time and if you've got anything to add, just let me know in the article comments.