Analytic Views, Oracle Database 12.2

An interesting new feature called Analytic Views came out with the 12.2 release of Oracle Database Enterprise Edition that looks, at first sight, to be an extension of the existing dimensions and materialized views features the Oracle database has had since the 8i release. For anyone who remembers those features being added to the Oracle Database back in the late '90s (and if you don't, here's a youthful-looking blogger writing about that topic on a soon-to-be-rather-controversial DBA website), the CREATE DIMENSION statement allowed you to define an aggregation rule for use by the database's query optimizer, like this:

CREATE DIMENSION products
LEVEL product IS (products.prod_id, products.prod_desc, products.prod_list_price, products.prod_min_price, products.prod_name, products.prod_pack_size, products.prod_status, products.supplier_id, products.prod_unit_of_measure, products.prod_weight_class)
LEVEL subcategory IS (products.prod_subcategory, products.prod_subcat_desc)
LEVEL category IS (products.prod_category, products.prod_cat_desc)
HIERARCHY prod_hier (product CHILD OF subcategory CHILD OF category)
ATTRIBUTE product DETERMINES (products.prod_category, products.prod_id)
ATTRIBUTE subcategory DETERMINES (products.prod_category, products.prod_subcategory)
ATTRIBUTE category DETERMINES products.prod_category;

These aggregation rules could thereafter be used by the optimizer to work out whether to rewrite a user's query to use a pre-computed aggregate instead of rolling up from the detail level specified in the query. Since then, subsequent releases of the Oracle Database have introduced a separate but related feature, the OLAP Option, which used technology originating from the old Express Server to create what were called Analytic Workspaces: fully-featured multi-dimensional OLAP caches together with their own programming language and a suite of applications for sales analysis and budgeting/planning.
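To make that rewrite mechanism concrete, here's a hypothetical sketch using the familiar SH sample schema (table and column names are assumptions, and the materialized view is my own illustration, not from the original post):

```sql
-- A materialized view at the subcategory level, registered for query rewrite.
CREATE MATERIALIZED VIEW sales_by_subcat_mv
ENABLE QUERY REWRITE
AS
SELECT p.prod_subcategory,
       SUM(s.amount_sold) AS amount_sold
FROM   sales s
JOIN   products p ON s.prod_id = p.prod_id
GROUP BY p.prod_subcategory;

-- Because the PRODUCTS dimension declares that subcategories roll up to
-- categories, the optimizer can answer this category-level query by rolling
-- up the materialized view rather than scanning the detail-level SALES table:
SELECT p.prod_category,
       SUM(s.amount_sold)
FROM   sales s
JOIN   products p ON s.prod_id = p.prod_id
GROUP BY p.prod_category;
```

Without the CREATE DIMENSION metadata the optimizer has no way of knowing that the subcategory-level aggregate can safely be rolled up to categories, so the rewrite wouldn't happen.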

Over the years Oracle introduced features to integrate analytic workspaces with the SQL query language used by the core database, rather than requiring developers to learn and use OLAP DML, Oracle OLAP's proprietary programming language. The dimensional model provided by the OLAP Option, and by all OLAP servers, is just as relevant today as it was back in the 1970s when that technology first became available. And until recently the other innovation provided by dedicated multidimensional OLAP servers (ultra-fast queries made possible by a separate pre-computed and highly-indexed mid-tier data cache) was so valuable to users that the overhead of maintaining those caches, and the query latency they introduced because of the time it took to refresh and recompute the aggregates, made sense.

Until recently, that is, when the hardware used to host database servers suddenly became able to hold the entire database in-memory, disk I/O was no longer a problem, and fast CPUs with multiple cores and large amounts of on-board cache RAM meant pre-computing calculated measures was no longer needed. So along came Oracle Database 12c and the In-Memory Option which, until the incident with the WiFi kettle, was my claim to fame as the person Larry Ellison quoted on stage at Oracle when I said it was so easy to use it was "almost boring".

Increasingly, it no longer made any sense to define and then maintain a separate multi-dimensional OLAP cache in a sort-of-embedded server like Oracle OLAP, when the so-easy-to-use-it's-boring Database In-Memory Option, along with materialized views and bitmap indexes, was just as fast at returning query results to users. Especially when that in-memory option also came with another feature, vector transformation plans, which enabled a special type of in-memory aggregation using the vector processing features of modern CPUs and had been developed, rather interestingly, by the OLAP Option development team headed up by none other than Bud Endress, the long-term development lead for Oracle OLAP and, before that, Express Server.

So what's going on here then? Well, the addition of these special aggregation capabilities to the in-memory database feature by the old OLAP server team clearly signalled that this was where Oracle intended to take the technology legacy of their standalone, then partly-integrated, multidimensional OLAP server cache. But the other shoe dropped back at Oracle Openworld two years ago, when Analytic Views were first launched in a presentation by … Bud Endress.

[embed]https://twitter.com/markrittman/status/659776557523869696[/embed]

Analytic Views then came along with the 12.2 release of Oracle Database and were fairly obviously the other half of the puzzle of where Oracle were planning to take the people, ideas and inspiration that came out of Oracle OLAP, along with its twin core user benefits: a dimensional model, and fast response to user queries involving aggregation and slice-and-dice of structured datasets. Not only that, they also included a new take on the simplified SQL query format that Oracle OLAP introduced a few years ago, gave users the ability to build out a complete Oracle BI-style business model right over the data in the database and, as if that wasn't enough surprises, added support (via third-party Simba ODBC drivers) for MDX.


Taking the IoT, wearables and social media data I've been collecting and storing up until now in Google BigQuery, transferring it across to an Oracle Database 12.2 database with Analytic Views and the In-Memory Option enabled, and using Oracle SQL Developer 4.2 with its new Analytic Views functionality included, you can see that there is a new dedicated section for Analytic Views in the main interface, not unlike the old standalone Analytic Workspace Manager tool from the Oracle 10g/11g days.

Creating a new analytic view involves one of two routes in SQL Developer 4.2 (or you could of course script it all using DDL statements). You can create a quick-and-dirty analytic view over a single table using the Quick Analytic View menu option, as per the screenshot below, which introspects your data to determine candidate measure and dimension columns; this doesn't seem to handle the sort of wide, denormalized tables with multiple levels per dimension that are common in distributed query engines like Google BigQuery, which prefer them over joining lots of separate tables.

Or, as I did for my testing, you can normalize your fact and dimension tables so they're separate and joined via foreign-key constraints, and then create in turn your attribute dimensions, hierarchies and analytic views containing measures and aggregation rules, as shown in the screenshots below.
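As a rough sketch, the star schema underneath might look something like this (datatypes and constraint names are my assumptions; the column names match the generated DDL later in the post):

```sql
-- Device dimension: one row per device, with type and room attributes.
CREATE TABLE device_dim (
  device     VARCHAR2(50) PRIMARY KEY,
  devicetype VARCHAR2(50),
  room       VARCHAR2(50)
);

-- Date dimension: one row per day, with year and year-month rollup columns.
CREATE TABLE dates_dim (
  date_date        DATE PRIMARY KEY,
  day_name         VARCHAR2(20),
  day_number       NUMBER,
  year             NUMBER,
  year_day_of_year NUMBER,
  year_mm          VARCHAR2(7)
);

-- Fact table of sensor readings, joined to both dimensions via foreign keys.
CREATE TABLE iot_devices (
  date_time DATE         REFERENCES dates_dim (date_date),
  device    VARCHAR2(50) REFERENCES device_dim (device),
  value     NUMBER
);
```

The foreign-key constraints matter here: they're what lets SQL Developer work out how the fact table joins to each dimension when it generates the analytic view DDL.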

So far this is very much in the same vein as creating relational and OLAP Option cube structures in previous database versions and developer tools, and you can view the commands that SQL Developer will then send to the database using one of the dialog tabs, giving you a set of DDL statements looking like this:

CREATE ATTRIBUTE DIMENSION DEVICE_DIMENSION 
DIMENSION TYPE STANDARD
CAPTION 'DEVICE DIMENSION'
DESCRIPTION 'DEVICE DIMENSION'
USING DEVICE_DIM
ATTRIBUTES (
DEVICE AS DEVICE
CAPTION 'DEVICE'
DESCRIPTION 'DEVICE' ,
DEVICETYPE AS DEVICETYPE
CAPTION 'DEVICETYPE'
DESCRIPTION 'DEVICETYPE' ,
ROOM AS ROOM
CAPTION 'ROOM'
DESCRIPTION 'ROOM'
)
LEVEL DEVICE_LEVEL
LEVEL TYPE STANDARD
CAPTION 'DEVICE LEVEL'
DESCRIPTION 'DEVICE LEVEL'
KEY ( "DEVICE")
MEMBER NAME DEVICE
MEMBER CAPTION DEVICE
MEMBER DESCRIPTION DEVICE
ORDER BY MIN DEVICE ASC NULLS FIRST
DETERMINES ( "DEVICE")
LEVEL ROOM_LEVEL
LEVEL TYPE STANDARD
CAPTION 'ROOM LEVEL'
DESCRIPTION 'ROOM LEVEL'
KEY ( "ROOM")
MEMBER NAME ROOM
MEMBER CAPTION ROOM
MEMBER DESCRIPTION ROOM
ORDER BY MIN ROOM ASC NULLS FIRST
DETERMINES ( "ROOM")
LEVEL DEVICETYPE
LEVEL TYPE STANDARD
CAPTION 'Devicetype'
DESCRIPTION 'Devicetype'
KEY ( "DEVICETYPE")
MEMBER NAME TO_CHAR("DEVICETYPE")
ORDER BY MIN DEVICETYPE ASC NULLS LAST
DETERMINES ( "DEVICETYPE")
ALL
MEMBER NAME 'ALL'
MEMBER CAPTION 'ALL'
MEMBER DESCRIPTION 'ALL';
CREATE ATTRIBUTE DIMENSION DATES_DIMENSION 
CAPTION 'DATES DIMENSION'
DESCRIPTION 'DATES DIMENSION'
USING DATES_DIM
ATTRIBUTES (
DATE_DATE AS DATE_DATE
CAPTION 'DATE_DATE'
DESCRIPTION 'DATE_DATE' ,
DAY_NAME AS DAY_NAME
CAPTION 'DAY_NAME'
DESCRIPTION 'DAY_NAME' ,
DAY_NUMBER AS DAY_NUMBER
CAPTION 'DAY_NUMBER'
DESCRIPTION 'DAY_NUMBER' ,
YEAR AS YEAR
CAPTION 'YEAR'
DESCRIPTION 'YEAR' ,
YEAR_DAY_OF_YEAR AS YEAR_DAY_OF_YEAR
CAPTION 'YEAR_DAY_OF_YEAR'
DESCRIPTION 'YEAR_DAY_OF_YEAR' ,
YEAR_MM AS YEAR_MM
CAPTION 'YEAR_MM'
DESCRIPTION 'YEAR_MM'
)
LEVEL DATE_DATE
CAPTION 'Date Date'
DESCRIPTION 'Date Date'
KEY ( "DATE_DATE")
MEMBER NAME TO_CHAR("DATE_DATE")
MEMBER CAPTION TO_CHAR("DATE_DATE")
MEMBER DESCRIPTION TO_CHAR("DATE_DATE")
ORDER BY MIN DATE_DATE ASC NULLS FIRST
DETERMINES ( "DAY_NAME" , "DAY_NUMBER" , "YEAR_DAY_OF_YEAR")
LEVEL YEAR
CAPTION 'Year'
DESCRIPTION 'Year'
KEY ( "YEAR")
ALTERNATE KEY YEAR
MEMBER NAME YEAR
MEMBER CAPTION YEAR
MEMBER DESCRIPTION YEAR
ORDER BY MIN YEAR ASC NULLS FIRST
LEVEL YEAR_MM
CAPTION 'Year Mm'
DESCRIPTION 'Year Mm'
KEY ( "YEAR_MM")
MEMBER NAME YEAR_MM
MEMBER CAPTION YEAR_MM
MEMBER DESCRIPTION YEAR_MM
ORDER BY MIN YEAR_MM ASC NULLS FIRST
ALL
MEMBER NAME 'ALL'
MEMBER CAPTION 'ALL'
MEMBER DESCRIPTION 'ALL';
CREATE HIERARCHY DATES_HIERARCHY 
CAPTION 'DATES_HIERARCHY'
DESCRIPTION 'DATES_HIERARCHY'
USING DATES_DIMENSION
( "DATE_DATE" CHILD OF "YEAR_MM" CHILD OF "YEAR" );
CREATE HIERARCHY DEVICE_TYPE_HIERARCHY 
CAPTION 'DEVICE_TYPE_HIERARCHY'
DESCRIPTION 'DEVICE_TYPE_HIERARCHY'
USING DEVICE_DIMENSION
( "DEVICE_LEVEL" CHILD OF "DEVICETYPE" );
CREATE HIERARCHY ROOM_HIERARCHY 
CAPTION 'ROOM_HIERARCHY'
DESCRIPTION 'ROOM_HIERARCHY'
USING DEVICE_DIMENSION
( "DEVICE_LEVEL" CHILD OF "ROOM_LEVEL" );
CREATE ANALYTIC VIEW IOT_DEVICES_AV 
CAPTION 'IOT_DEVICE_READINGS'
DESCRIPTION 'IOT_DEVICE_READINGS'
USING IOT_DEVICES
DIMENSION BY (
DEVICE_DIMENSION AS DEVICE_DIMENSION KEY ( "DEVICE") REFERENCES ( "DEVICE")
HIERARCHIES
(
DEVICE_TYPE_HIERARCHY AS DEVICE_TYPE_HIERARCHY DEFAULT,
ROOM_HIERARCHY AS ROOM_HIERARCHY
),
DATES_DIMENSION AS DATES_DIMENSION KEY ( "DATE_TIME") REFERENCES ( "DATE_DATE")
HIERARCHIES
(
DATES_HIERARCHY AS DATES_HIERARCHY DEFAULT
)
)
MEASURES (
VALUE FACT VALUE CAPTION 'VALUE'
DESCRIPTION 'VALUE'
)
DEFAULT MEASURE VALUE
DEFAULT AGGREGATE BY AVG;

But the bit of the Analytic View creation dialog that's most interesting, to my mind, is the Cache Measure Groups option in the left-hand menu of the SQL Developer screen.

This, and the vector transformation feature introduced with Oracle Database 12.1, are the key to how Analytic Views will return aggregated query results as fast as, if not faster than, a dedicated multidimensional OLAP cache layer. Optimisations in the SQL parser for analytic views on the one hand enable simplified SQL queries without the need for GROUP BY or join clauses, like this:

SELECT time_hier.member_name as TIME,
sales,
sales_prior_period
FROM
sales_av HIERARCHIES(time_hier)
WHERE time_hier.level_name IN ('YEAR','QUARTER')
ORDER BY time_hier.hier_order;

and on the other hand use these cache measure group settings to hard-wire the analytic view into any pre-created materialized views containing aggregations of your measures, side-stepping the usual query-rewrite mechanism used by the database query optimizer. Together with vector transformations, this ensures users get split-second response times to their queries, whether at detail level or aggregated across one or more of the analytic view hierarchies. You can try this feature out for yourself using Oracle's Live SQL demo environment here, and it's important to understand what this feature and the integration of analytic views with the in-memory database option with its vector transformation feature really means: OLAP aggregation rethought for today's in-memory database servers.
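Applied to the IOT_DEVICES_AV view created earlier, the same simplified query style would look something like this (an untested sketch, using the hierarchy and level names from the DDL above):

```sql
-- Query the analytic view at two levels of the dates hierarchy at once:
-- no GROUP BY, no joins, and VALUE is averaged per the view's
-- DEFAULT AGGREGATE BY AVG rule.
SELECT dates_hierarchy.member_name AS period,
       value
FROM   iot_devices_av HIERARCHIES (dates_hierarchy)
WHERE  dates_hierarchy.level_name IN ('YEAR', 'YEAR_MM')
ORDER  BY dates_hierarchy.hier_order;
```

Note how the one statement returns both yearly and monthly averages in hierarchy order, something that would take a UNION ALL of two GROUP BY queries in plain SQL.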

So this, in my opinion, is where Oracle are headed in terms of meeting the needs of database customers who want fast, easy-to-maintain OLAP query functionality: the response times of a dedicated OLAP server, delivered by leveraging the in-memory capabilities of Oracle Database 12c, the people and ideas from the OLAP Option product, and the huge amounts of memory and compute capacity you get in modern servers. All very interesting, and interesting as well to see where the high-end traditional relational database vendors are taking OLAP analysis now that old-school OLAP servers have had their day but dimensional analysis and dimensional business models are still just as relevant.
