Oracle Exalytics Week : Exalytics & TimesTen - Under the Covers
So in the first two postings in our Exalytics week series, we looked at why Oracle created Exalytics, and what Exalytics actually comprises of. If you missed the first two postings in the series, here's the week's agenda including links that I'll complete once all of the series goes online:
- Oracle Exalytics Week : The Case for Exalytics
- Oracle Exalytics Week : Introducing Oracle Exalytics In-Memory Machine
- Oracle Exalytics Week : Exalytics & TimesTen - Under the Covers
- Oracle Exalytics Week : The Summary Advisor
- Oracle Exalytics Week : How Does it Perform, and How can I Test It?
So how does Exalytics actually work, and where does the special engineering and in-memory features come in? Exalytics brings together different technologies and places them in a high-performance server, and aims to harmonize and optimize their use for this particular hardware platform. The three engineering features that make Exalytics possible are:
- An In-Memory Adaptive Data Mart
- An In-Memory Intelligent Results Cache, and
- In-Memory MOLAP cubes
Unlike tools such as SAP HANA that require you to re-engineer your system to work with a new database technology, or in-memory desktop tools such as Qlikview or Microsoft PowerPivot, Exalytics applications need no re-engineering and are designed to scale across the whole enterprise. So how do these features work, and in what sort of situations might they improve query performance?
Oracle Exalytics uses a specially-enhanced version of Oracle TimesTen, Oracle's in-memory database, to cache commonly-used aggregates used in dashboards, analyses and other BI objects. By using an in-memory store to hold these aggregates, results are returned to the user with sub-second response times, making it possible to create dashboards that remove the Apply and Reset buttons on prompts to instantaneously respond to parameter changes. Because it's a cache, and only aggregates are loaded in, this feature lends itself better to data warehouse data sources with a regular ETL process, and to BI systems that aggregate and analyze data. To work out what aggregates to create, usage tracking has been enhanced and now also includes a summary statistics table that tracks not only the logical SQL generated by queries, but also the physical SQL, query response time and query granularity, making it possible to now use a new Oracle BI Administration tool feature, the Summary Advisor, to recommend and create aggregates to place into TimesTen.
As well as the in-memory database cache, you also still have the results cache that OBIEE's always have, stored in files on the filesystem. This still exists with Exalytics, and should be part of any caching and performance improvement strategy. With the high amount of RAM that's on an Exalytics server, typically these cache files will end-up being held in memory whilst the OBIEE server is being used, or you can specifically place them on a RAM disk if required. Typically, that last step isn't needed for the results cache files, which are written once and then read many times, but this approach might be useful for the working area for the BI and Presentation Servers, which typically write lots of data to disk as queries are merged and analysis views constructed.
Essbase, also, has had some in-memory optimizations and additional support for up to 128 CPUs, some of which will be available generally for all users, and some of which will be Exalytics-only. With Essbase applications, the idea is that Exalytics will hold the Essbase database, whilst you'd host the applications - Planning, Financial Reporting, Financial Management etc - on separate servers, some of which might need to be Windows-only. Check back on the blog once this release is out, for more details on what's involved.
So what is Oracle TimesTen, and how does it work internally? TimesTen is a product that Oracle acquired several years ago, and is an in-memory database with data being persisted in the background to disk, so you don't lose everything if the plug is pulled on the server. TimesTen loads the whole database into memory, and allocates disk as "perm" space, which is where your data gets stored, and "temp" space, which is working space for TimesTen. Generally, for larger deployments of TimesTen, the recommendation is to allocate the same amount of temp space as perm space, which if you consider that with the 1TB of RAM in Exalytics you might want to allocate, say, 800GB of that to TimesTen, then only half of that (400GB) would be useable space for the TimesTen databases (you can of course vary all of this depending on your particular workload). Overall not bad, but you're not going to be able to take a 1TB data warehouse and load that all into memory, if that's what you were thinking. The diagram below shows TimesTen as relating to OBIEE and how it uses disk-based checkpoint and other logging to ensure data isn't lost if memory is powered-off.
TimesTen was originally created for the OLTP market, and was used in applications such as market trading, airline bookings and other scenarios where low-latency, fast single-row read and writes were required. In its previous form it was actually a pretty poor fit for OBIEE, as whilst it connected via ODBC, analytic and aggregation queries were actually slower than with regular disk-stored database, as it was optimized for these single-row reads and writes. With Exalytics though, extensions have been added to TimesTen which makes it more suited to the type of analytics, aggregated queries issued by the BI Server including:
- Columnar compression
- Support for the AVG, SUM, COUNT, MAX, MIN, DENSE_RANK, RANK, ROW_NUMBER, FIRST_VALUE and LAST_VALUE analytic functions
- Support for the OVER PARTITION BY and OVER ORDER BY analytic clauses
- Support for GROUP BY CUBE, GROUP BY ROLLUP, GROUP BY GROUPING SETS grouping clauses
- Support for the GROUP, GROUPING_ID, GROUP_ID grouping functions
- Support for the WITH clause (though in this initial release it's best used only for reading data, not writing to it
- and support for aggregate expressions over DISTINCT expression
So the idea here is that TimesTen should be able to support the same type of SQL functions and clauses as regular on-disk databases, but with the aggregates that are stored within it held in memory, and column-compressed. The Oracle BI Server will then use it's internal aggregate navigation feature to refer to the TimesTen in-memory aggregates when queries request that level of aggregation, switching back to the regular data source when more detailed queries are requested. Unfortunately, these new TimesTen features can only be used in conjunction with Exalytics, which is a shame as it'd be nice to try them as a general-purpose in-memory analytic database, but that's Oracle's call I guess.
So how does data get into the TimesTen in-memory database, and how do you know what data to cache in there? We'll see tomorrow, when we look in detail at the new Summary Advisor feature within the Oracle BI Administration tool.