Oracle OpenWorld 2015 Roundup Part 3 : Oracle 12cR2 Database Sharding, Analytic Views and Essbase 12c
With the UKOUG conference starting tomorrow I thought it about time I finished off my three-part post-OOW 2015 blog series, with a final post on some interesting announcements around Oracle Database and Essbase. As a reminder the other two posts were on OBIEE12c and the new Data Visualisation Cloud Service, and Data Integration and Big Data. For now though let’s look first at two very significant announcements about future 12cR2 functionality - database sharding and Analytic Views.
Anyone who’s been involved in Oracle Data Warehousing over the years will probably be aware of the shared-everything vs. shared-nothing architecture debate. Databases like Oracle Database were originally designed for OLTP workloads with the optimal way to increase capacity being to buy a bigger server. When RAC (Real Application Clusters) came along the big selling point was a single shared database instance spread over multiple nodes, making application development easy (no real changes) but with practical limits as to how big that cluster can get - due to the need to synchronise shared memory across all nodes in the cluster, and network bottleneck caused by compute and storage being spread across the whole cluster, not co-located as we get with Hadoop and HDFS, for example.
Shared-nothing databases such as Netezza, for example, take a different approach and “shard” the database instance over multiple nodes in the cluster so that processing and storage are co-located on the same node for particular ranges of data. This gives the advantage of much greater scalability that a shared-nothing approach (again, this is why Hadoop uses a similar approach for its massively-clustered distributed compute approach) but with the drawback of having to consider data locality when writing ETL and other code; at worst it means data loading and processing needs to be rewritten when you add more nodes and re-shard the database, and it also generally precludes OLTP work and consequently mixed-workloads on the same platform.
But if it’s just data warehousing you want to do, you don’t really care about mixed workloads and its generally considered that shared-nothing and sharding is what you need if you want to get to very-large scale data warehousing, such that Oracle went partly down the shared-nothing route with Exadata and push-down of filtering, projection and other operations to storage nodes thereby adding an element of data locality and reducing the network throughput between storage and compute.
But both types of database are loosing out to Hadoop for very, very large datasets with Hadoop distributed compute approach designed right from the start for large distributed workloads at the expense of not supporting OLTP at all and, at least initially, all intermediate resultsets being written to disk. For those types of workloads and database size Oracle just wasn’t an option, but a certain top their of Oracle’s data warehousing customers wanted to be able to scale to hundreds or thousands of nodes and most of them have ULAs, so cost isn’t really a limiting factor; for those customers, Oracle announced that the 12c Release 2 version of Oracle Database would support sharding … but with warnings it’s for sophisticated and experienced customers only.
Oracle are positioning what they’re referring to as “Oracle Elastic Sharding” as for both scaling and fault-tolerance, with up to 1,000 nodes supported and with data routed to particular shards through use of a “sharding key” passed the connection pool.
Sharding in 12c Release 2 was described to me as a featured aimed to the “top 5%” of Oracle customers where price isn’t the issue but they want Oracle to scale to the size of cluster supported by Hadoop and NoSQL. Time will tell how well it’ll work and what it’ll cost, but it certainly completes Oracle’s journey from strict shared-everything for data warehousing to more-or-less shared nothing, if you want to go down that extreme-scalability route.
The other announcement from the Oracle Database side was the even-more-unexpected “Analytic Views”. A clue came from who was running the session - Bud Endress, of Oracle Express / Oracle OLAP fame and more recently, the Vector Group By feature in the In-Memory Option - but what we got was a lot more than Oracle OLAP re-imagined for in-memory; instead what Oracle are looking to do is bring the business metadata and calculation layers that BI tools use right into the database, provide an MDX query interface over it, simplify SQL so that you just select measures, attributes and hierarchies - and then optimise the whole thing so it runs in-memory if you have that option licensed.
Its certainly an "interesting” goal with considerable overlap with OBIEE’s BI Server and Essbase Server, but the goal of bringing all this functionality closer to the data and available to all tools is certainly ambitious, if it gets traction it should bring business metadata layers and simpler queries to a wider audience - but the fact that it seems to be being developed separately to Oracle’s BI and Essbase teams means it probably won’t be subsuming Essbase or the BI Server’s functionaliy.
The last area I wanted to look at was Essbase. Essbase Cloud Service was launched at this event with it positioned as a return to Essbase’s roots as a tool you could use in the finance department without requiring IT’s help, except this time it’s because Essbase is running as a service in the cloud rather than on an old PC under your desk. What was particularly interesting though is that the version of Essbase being used in the cloud is the new 12c version, that replaces some of the server components (the Essbase Agent, but not the core Essbase Server part) with new Java components that presumably fit better with Oracle’s cloud infrastructure and also support greater levels of concurrency
Apart from the announcement of a future ability to link to R libraries, the other really interesting part of Essbase 12c is that for now the only on-premise version of it is as part of OBIEE12c, and it’ll have a very fixed role there as a pure query accelerator for OBIEE’s BI Server - perhaps the answer to Qlikview and Tableau’s in-memory column-store caches. Essbase as part of an OBIEE12c store doesn’t work with Essbase Studio or any of the other standard Essbase tools, but instead has a new Essbase Business Intelligence Acceleration Wizard that deploys Hybrid ASO/BSO Essbase cubes directly from the OBIEE BI Server and RPD.
Coupled with the changes to Essbase announced a couple of years ago at Openworld 2013 designed to improve compatibility with OBIEE, this co-located version of Essbase seems to have completed it’s transformation into the BI Server mid-tier aggregate cache layer of choice that started back with the 11.1.1.6.2 BP1 version of OBIEE - but it does mean this version can’t be used for anything else, even custom Essbase cubes you load and design yourself. Interesting developments across both database server products though, and that wraps up my overview of OOW2015 announcements. Next stop - UKOUG Tech’15 in Birmingham, where I’ve just arrived ready for my masterclass session in tomorrow’s Super Sunday event - on data reservoirs and Customer 360 on Oracle Big Data Appliance.