Using the OBIEE Aggregate Persistence Wizard
One of the interesting new features that came with OBIEE 10.1.3.2 was the Aggregate Persistence Wizard. This utility automates the creation and initial population of aggregates, persists them in a back-end database and configures the BI Server metadata layer so that they're used when appropriate. What's particularly interesting about this feature is that you can create aggregates for, say, an Oracle database and store them in an SQL Server, DB2 or Teradata database. It's one of those OBIEE "rapid application development" features that lets you tackle underlying database performance issues in a pragmatic way - you can't create summaries in the backend database, so find any old database and create them in there, the BI Server will remember where it's stored them - and it's also a useful feature when your metadata layer maps to multiple back-end databases, and there isn't an obvious single place to create your aggregates as your facts span multiple databases.
So, how you you use this feature? Well, it's accessed from the BI Administrator tool, and relies on you first having set up dimensions in your logical model to describe the hierarchies and levels within your data. In this example, I'm looking at a simple dataset with three hierarchies; a customer hierarchy that goes from All Customers to Countries, then State, then City and then Customer, a product hierarchy that goes from All Products through Categories to Product, and a time hierarchy that goes from All Times through Years, Quarters, Months and Days.
In this scenario, I'm interested in aggregating the Quantity measure within the Items fact table by Product Category, Year and State. To kick things off, I select Tools > Utilities from the BI Administrator application menu, and then select Aggregate Persistance Wizard from the list of available utilities.
The way the wizard works is to let you select one or more fact table measures, then select the dimension hierarchy levels on which you'll aggregate, wherapon it then generates a script that tells the BI Server to create the aggregates. You can run this script either through the scheduler (the method the docs recommend) or through the nqcmd.exe command-line utility found in the /OracleBI/server/bin directory (my recommended method, as this seem to be the only way I can get it to work).
To start off then, the first screen displayed by the wizard prompts you to enter a filename for the script it's going to generate. Note the option underneath to generate a DDL file as well - this is a bit of a red herring, as the main script that it generates already tells the BI Server to generate back-end database tables anyway, I'm not quite sure what the DDL file is for but it's definately not needed for what we're doing.
You then select the business model, then the fact table and finally the measure(s) that you wish to aggregate. To keep things simple, I'm just aggregating the quantity measure in my Items fact table.
Then you select the dimension hierarchy levels on which you'll aggregate the data.
Now you choose the database in which you'll store the aggregates. In this initial example, I'll stored the aggregate tables in the same Oracle schema as the source data, but I could just as well store them in an Access database, or an SQL Server database, if I had one handy.
The wizard then shows me a preview of the script it's going to generate, and asks me if I want to create any more summaries. Here's how the preview script looks - note that it's not SQL, it's a script language that the BI Server understands and uses as the specification for an aggregation job, which it then turns into separate CREATE TABLE, INSERT ... SELECT and so on SQL statements, "optimized" for the particular back-end database platform.
"ag_Items" for "Seminar - Complete"."Items"("Quantity") at levels ("Seminar - Complete"."Customerdim"."State", "Seminar - Complete"."Productdim"."Category", "Seminar - Complete"."Timedim"."Year") using connection pool "custdw"."CUSTDW_Pool" in "custdw".."CUSTDW";
I then close the wizard down and take a look at the script that it's generated.
create aggregates"ag_Items"
for "Seminar - Complete"."Items"("Quantity")
at levels ("Seminar - Complete"."Customerdim"."State", "Seminar - Complete"."Productdim"."Category", "Seminar - Complete"."Timedim"."Year")
using connection pool "custdw"."CUSTDW_Pool"
in "custdw".."CUSTDW";
Before I run the script, I insert another command into the script before the "create aggregates" command, to clear down any aggregates that might already exist:
delete aggregates;create aggregates
"ag_Items"
for "Seminar - Complete"."Items"("Quantity")
at levels ("Seminar - Complete"."Customerdim"."State", "Seminar - Complete"."Productdim"."Category", "Seminar - Complete"."Timedim"."Year")
using connection pool "aggregates"."Connection Pool"
in "aggregates"."E:\product\OracleBI\server\src\aggregates";
Now the script is ready, I run it using the "nqcmd.exe" utility in the /OracleBI/server/bin directory.
E:\product\OracleBI\server\Bin>nqcmd.exe -u Administrator -p Administrator -d an alyticsweb -s c:\agg_wiz_script.sql
Oracle BI Server Copyright (c) 1997-2006 Oracle Corporation, All rights reserved
delete aggregates
delete aggregatesStatement execute succeeded
create aggregates
"ag_Items"
for "Seminar - Complete"."Items"("Quantity")
at levels ("Seminar - Complete"."Customerdim"."State", "Seminar - Complete"."Pr
oductdim"."Category", "Seminar - Complete"."Timedim"."Year")
using connection pool "custdw"."CUSTDW_Pool"
in "custdw".."CUSTDW"
create aggregates"ag_Items"
for "Seminar - Complete"."Items"("Quantity")
at levels ("Seminar - Complete"."Customerdim"."State", "Seminar - Complete"."Pr
oductdim"."Category", "Seminar - Complete"."Timedim"."Year")
using connection pool "custdw"."CUSTDW_Pool"
in "custdw".."CUSTDW"Statement execute succeeded
Processed: 2 queries
If I take a look in the schema in which it's created the aggregates, I see that some system-generated tables are now there:
E:\product\OracleBI\server\Bin>sqlplus custdw/password@ora10gSQL*Plus: Release 10.2.0.3.0 - Production on Wed Oct 24 19:03:47 2007
Copyright (c) 1982, 2006, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining optionsSQL> select table_name from user_tables;
TABLE_NAME
SNP_PLAN_TABLE
SNP_CHECK_TAB
E$_CUSTOMER
ADDRESS
CUSTOMER
CUSTOMER_ADDRESS
ITEMS
ORDERS
ORDER_SALESPERSON_ALLOC
PRODUCT
SALESPERSONTABLE_NAME
SALESPERSON_TREE
SALESPERSON_TREE_COPY
TIMES
AG_ITEMS
SA_STATE000062A6
SA_CATEGOR000062AA
SA_YEAR000062AD18 rows selected.
SQL> select count(*) from ag_items;
COUNT(*)
14
SQL> select count(*) from items;
COUNT(*)
19
SQL> select count(*) from address;
COUNT(*)
10
SQL> select count(*) from sa_state000062A6;
COUNT(*)
7
Looking at the Physical Layer in the BI Administration tool, I can see the new aggregate tables created and registered, and shown in red to show they're aggregates.
Similarly, new Logical Table Sources have been created in the business model to link these aggregate tables to the existing logical model.
Now, if I run a query through Answers that selects the Quantity measure at this level of aggregation, and then check the NQQuery.log file to check that the new aggregate tables are being used, I see it's all working ok.
-------------------- Logical Request (before navigation):RqList
Quantity:[DAggr(Items.Quantity by [ Product.Category, Times.Year_yyyy, Customer.State] )] as c1 GB,
Customer.State as c2 GB,
Product.Category as c3 GB,
Times.Year_yyyy as c4 GB
OrderBy: c2 asc, c3 asc, c4 asc+++Administrator:460000:46000e:----2007/10/24 19:11:11
-------------------- Sending query to database named custdw (id: <<27483>>):
select distinct T18312.Quantity00006299 as c1,
T18293.State00006290 as c2,
T18299.Category00006285 as c3,
T18304.Year_yyyy0000628D as c4
from
CUSTDW.SA_Year000062AD T18304,
CUSTDW.SA_Categor000062AA T18299,
CUSTDW.SA_State000062A6 T18293,
CUSTDW.ag_Items T18312
where ( T18293.State00006290 = T18312.State00006290 and T18299.Category00006285 = T18312.Category00006285 and T18304.Year_yyyy0000628D = T18312.Year_yyyy0000628D )
order by c2, c3, c4
Not bad. Of course, if you're running solely against an Oracle database, creating and maintaining your aggregates using materialized views probably makes more sense, but if you're running in a mixed environment, or for whatever reason you either can't create materialized views or you only have an DB/2, SQL Server or other database lying around to create aggregates in, it's a useful feature to be aware of.
Thanks to Christopher Bridge for reminding me of this feature, and for a few tips on getting it working. Cheers Chris.