Oracle OpenWorld may have San Francisco and the Moscone Center, but you can’t beat Liverpool and the wind whipping in off the Mersey in early December. So we’re very pleased to be presenting once again at the UK Oracle User Group Technology Conference & Exhibition, running next week at the ACC Liverpool, December 3rd-5th 2018.
Mark Rittman is presenting two sessions on Oracle’s new autonomous analytics and data warehousing platforms:
“As Oracle Analytics and Data Warehousing becomes self-driving and autonomous, the need for a strategy within your BI function becomes all the more important. How you deliver BI content to your users, the skills your developers now need and the most efficient way to manage your cloud estate are vital components of an autonomous cloud analytics strategy; this session will explain what’s changed, what’s significant and what are the implications of that change.
By reference to today’s Oracle BI technology and applications, and explaining how they now run autonomously as services in Oracle Cloud, we’ll cover the key changes and their implications, what this means in terms of budgets and planning your investment, the key technology and process changes, what this means for team resourcing and roles - and just as importantly, what can be safely ignored for now.”
“In this session, we'll look at the role of the data engineer in designing, provisioning, and enabling an Oracle Cloud data lake using Oracle Analytics Cloud Data Lake Edition. We’ll also examine the use of data flow and data pipeline authoring tools and how machine learning and AI can be applied to this task. Furthermore, we’ll explore connecting to database and SaaS sources along with sources of external data via Oracle Data-as-a-Service. Finally we’ll delve into how traditional Oracle Analytics developers can transition their skills into this role and start working as data engineers on Oracle Public Cloud data lake projects.”
Mark Rittman will be at the event all week, so if you’ve got any questions about the topics we’re covering in the two sessions, or any aspect of Oracle Analytics, big data or Autonomous Data Warehouse, either drop us an email beforehand, give us a call on +44 7866 568246, or just stop us in the corridor and we’ll be pleased to chat.
One of the sessions we delivered at Oracle OpenWorld 2018 was “Data Warehouse Like a Tech Startup with Oracle Autonomous Data Warehouse Cloud”, a presentation on the agility that Oracle Autonomous Data Warehouse Cloud (ADWC) can provide for teams within big, traditional businesses that would otherwise have to wait months to provision a physical Oracle data warehousing database. Moreover, ADWC instances can scale up and scale down as demand grows over time and hits short-term peaks such as Black Friday, which in the past would have required companies to pay for and provision that level of capacity all year round even though it sat idle most of the time.
The story behind the presentation was also written-up as a profile article in Forbes magazine by Oracle’s Jeff Erickson, with a great description of how MJR Analytics works on client projects.
But don’t just take our word for it, or even Forbes Magazine’s - our objective is for every client engagement to be referenceable where client confidentiality allows, and you can check out our first two testimonials, from Colourpop and from Florence, on our new Customers page … with a couple more to be added in the next week or so!
In this series of three blogs on Oracle Analytics Cloud Data Lake Edition I’ve set up an object store data lake in Oracle Cloud using Oracle Big Data Cloud and Oracle Storage Cloud, and ingested streams of real-time event data from IoT and social media sources into Oracle Cloud’s object storage service using Oracle Event Hub Cloud Service.
The event-stream data I staged into Storage Cloud was then copied into parquet files on HDFS and then presented out to BI and ETL tools through Big Data Cloud’s Thrift Server interface, so that now I’m ready, after a short diversion into defining the data engineer role that would typically work with this new product edition, to start exploring some of Oracle Analytics Cloud Data Lake Edition’s new data flow and predictive model preparation features.
The diagram below shows where OAC Data Lake Edition fits into my project architecture, performing the tasks of transforming and enriching the incoming dataset and then presenting my at-scale data out to end-users for analysis using OAC Data Lake Edition’s Data Visualization features.
Looking at the homepage within OAC Data Lake Edition I can see my two Hive tables listed within the dataset catalog, alongside other datasets I’d uploaded directly into OAC. This visual catalog of available datasets is also the new homepage interface that OBIEE12c 12.2.1.4.0 now adopts, with both cloud and on-premises versions of Oracle’s BI tools now relegating the old “Answers” homepage to something you have to dig around and specifically look for, in favour of this more self-service Data Visualization starting page.
I’ll have to write a blog post sometime in the future on Answers, how powerful its interface is and the full dimensional model it exposes from the Oracle BI Repository, as it’s almost in danger of being forgotten about.
Moving on though, the first transformation I need to do on all the incoming datasets is to take the timestamp column in each table and convert it to a format that OAC recognises as a valid TIMESTAMP datatype format, then convert those columns to TIMESTAMPs so that DV can automatically enable time-series analysis by day, month, quarter, hour and so on. I do that using a feature that’s also present in OAC Standard Edition, the lightweight data preparation interface that’s presented to users when they first add a new data source into OAC’s dataset catalog, shown in the screenshots below.
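To make the idea concrete, here’s a minimal Python sketch of that kind of conversion, using invented event timestamps rather than my actual data; in OAC the equivalent prep step is done through the UI rather than code:

```python
from datetime import datetime

# Invented event timestamps stored as strings, in a format a BI tool
# wouldn't automatically recognise as a TIMESTAMP datatype
raw_events = ["2018-06-01T09:15:30Z", "2018-06-01T17:42:05Z"]

def to_timestamp(s):
    # Strip the trailing "Z" and parse the remainder into a proper datetime,
    # enabling time-series analysis by day, month, quarter, hour and so on
    return datetime.strptime(s.rstrip("Z"), "%Y-%m-%dT%H:%M:%S")

events = [to_timestamp(s) for s in raw_events]
print(events[0].month, events[0].hour)  # 6 9
```

Once the column is a real timestamp, grouping by any calendar grain becomes a simple attribute lookup rather than string manipulation.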
Where OAC Data Lake Edition gets really interesting right now, both in terms of differences vs. the on-premises versions of OBIEE I used to use and in terms of its “data engineering” potential, is with a feature called Data Flows.
Most self-service BI tools now have a basic data loading and data preparation capability, with Tableau Data Prep being one of the latest examples. Designed to handle more complex data prep use-cases than basic datatype changes and field-splitting, these tools give end-users the ability to do this type of work themselves rather than trying to do it in Excel, or handing the work off to the IT department and waiting days or weeks to get that data back.
Data Flows are a feature introduced since the original on-premises version of OBIEE12c that I last used when working out in consulting. They provide you with what’s effectively a lightweight, multi-step ETL tool that executes transformations using the BI Server’s Model Extension feature, introduced back when OBIEE12c first came out as the mechanism to enable on-the-fly data mashups between server-side and user-uploaded datasets.
Looking at the transformation operators available in OAC Data Lake Edition v4 there are quite a few that apply to data lake and data engineering-type workloads, including running Python statistical analysis scripts and predictive model training and model build; there’s also an operator for creating an Essbase cube, with Essbase in this instance positioned as a fast ad-hoc analysis back-end for use with the data visualization part of OAC.
For now though there are two transformation tasks I want to perform on my Hive datasets: first, enriching the incoming social media data by analyzing the sentiment in each tweet and then writing the data, plus this sentiment tagging, back to the Oracle Big Data Cloud environment, so that I can turn those sentiment tags into a score and create a data visualization showing who sends me the most tweets and how crazy they are overall.
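As a rough sketch of that tag-to-score step in Python (the tag names, scoring and sample tweets are all my own assumptions, not OAC’s actual sentiment output):

```python
from collections import defaultdict

# Invented sentiment tags and scoring; OAC's sentiment output will differ
scores = {"positive": 1, "neutral": 0, "negative": -1}

tweets = [
    {"sender": "alice", "sentiment": "positive"},
    {"sender": "alice", "sentiment": "negative"},
    {"sender": "bob",   "sentiment": "neutral"},
]

# Aggregate per sender: how many tweets, and how positive overall
totals = defaultdict(lambda: {"count": 0, "score": 0})
for t in tweets:
    totals[t["sender"]]["count"] += 1
    totals[t["sender"]]["score"] += scores[t["sentiment"]]

print(dict(totals))
```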
The second data enrichment I wanted was on some Strava cycling workout data I’d uploaded directly into OAC using the CSV file upload facility; using the model train and build Data Flow operators I defined a model to predict how many “kudos”, the Strava equivalent to Facebook “likes”, I’d get for a given cycle workout with a number of different variables available to the model in order to make the prediction — for example, distance and elevation gain, map location, effort expended and so on.
Then, after running the model build step and looking at the predicted values and the actual ones for the remainder of the dataset not used for model training, you can see the predicted kudos values are fairly in-line with the ones I actually recorded for those rides.
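The train-then-predict pattern behind this can be illustrated with a toy one-variable least-squares model; the numbers below are invented and this is purely illustrative, not what OAC’s model build operator actually runs:

```python
# Toy "model train" step: fit kudos = slope * distance + intercept by
# least squares over four invented training rides
distances = [20.0, 45.0, 60.0, 100.0]   # ride distance in km
kudos     = [5, 11, 14, 24]             # kudos received for each ride

n = len(distances)
mean_x = sum(distances) / n
mean_y = sum(kudos) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, kudos)) \
        / sum((x - mean_x) ** 2 for x in distances)
intercept = mean_y - slope * mean_x

# Toy "model apply" step: predict kudos for an unseen ride
def predict(distance_km):
    return slope * distance_km + intercept

print(round(predict(80.0), 1))  # predicted kudos for an 80km ride
```

A real model would use many variables (elevation gain, map location, effort and so on) and hold back a test set, as OAC’s operators do, but the fit-then-score shape is the same.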
Another feature that’s now in Oracle Analytics Cloud is automated data diagnostics, or Explain. Explain uses machine-learning libraries and that same model extension/XSA BI Server framework to help users quickly understand the value distribution and statistically correlated driving factors for a particular dataset, and learn which segments or cohorts have the highest predictive significance. Enabled by a number of extensions to BI Server logical SQL, I used the feature first on the sentiment scoring I’d performed earlier on, and then on the steps data I’d brought into Oracle Big Data Cloud from my Fitbit device, after converting the numeric step counts into a text attribute by bucketing its values into low, medium and extreme buckets.
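That numeric-to-text bucketing step is easy to sketch; the thresholds below are my own invention rather than anything Explain prescribes:

```python
# Convert a numeric step-count measure into a text attribute by bucketing,
# the same shape of transformation I applied before running Explain;
# the thresholds here are invented for illustration
def step_bucket(steps):
    if steps < 5000:
        return "low"
    elif steps < 12000:
        return "medium"
    return "extreme"

daily_steps = [3200, 8500, 15600]
print([step_bucket(s) for s in daily_steps])  # ['low', 'medium', 'extreme']
```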
This is pretty powerful stuff: automated understanding and context-gaining about new datasets is one of the most user-enabling features I’ve seen arrive recently in BI tools, with the best example being BeyondCore, now part of Salesforce Einstein. OAC lets the user pick the most useful of the Explain facts and driver insights and publish them to a Data Visualization dashboard like the one below, showing the most predictive and significant variables in my dataset that influence the steps I take each day.
Which leads neatly to the final “data at-scale” feature in OAC, the Data Visualization feature that in my case is querying the ingested, transformed and now enriched datasets I’ve got running on my Oracle Big Data Cloud instance alongside Oracle Event Hub Cloud and Oracle Analytics Cloud Data Lake Edition.
Wrapping up this three-part series on Oracle Analytics Cloud Data Lake Edition and Oracle Big Data Cloud, I’d like to go back to the two (serious) questions I asked myself at the end of the previous post:
Has OAC Data Lake Edition got anything actually to do with data lakes, and is it a useful tool for aspiring Oracle technology data engineers?
How does it compare to my old favourite Oracle big data product, Oracle Big Data Discovery, officially still available and not quite dead yet but existing in some strange zone where the on-premises version stopped getting updates a while ago, and the cloud version is for sale but you can’t buy it unless you know the right person to ask (and he’s actually gone to Cloudera)?
So has Oracle Analytics Cloud Data Lake Edition got much to do with actual “data lakes”? Well … it integrates with Oracle Big Data Cloud and apparently comes with an option to run those data flow transformations in Big Data Cloud’s Apache Spark environment, though to be fully transparent I didn’t see that as an option when doing my evaluation so can’t comment on how well or not that works.
Like Oracle Big Data Discovery before it, OAC Data Lake Edition makes you structure your incoming event stream data with Hive table metadata before you can work with it, but that’s actually fairly standard practice with most data visualization tools that work with Hadoop and data lake environments.
Having Essbase in this product package alongside the data lake functionality did make me scratch my head a bit and wonder “why?”; data lakes and Essbase are about as opposite as you can get in terms of target users and use-cases, and I think this Data Lake Edition is as much about creating a product package and price point mid-way between the OAC Standard and Enterprise Editions as anything else.
But there is some logic to having Essbase in this edition; it provides a set of easy-to-use loading and preparation tools for Essbase making it easier for customers new to that product to start using it, and Essbase with its latest hybrid ASO/BSO storage format is surprisingly scalable and blindingly-fast to query, a great potential back-end for enabling data analysis “at-scale” using Oracle Analytics Cloud’s data visualization features.
I also get the feeling that this initial v4 version of OAC Data Lake Edition is more of an initial first-cut release to get something out to customers, establish the product package and validate the roadmap and market assumptions. Oracle Analytics Cloud v5 isn’t too far off and I’d expect incremental improvements and new features in areas such as natural language processing and machine learning built-into the developer experience; I wouldn’t be surprised to see Oracle Big Data Preparation Cloud making its way into the product given its obvious fit and overlap with Data Lake Edition’s data prep features.
I came across SNAP, Sparkline’s platform for building OLAP-style dimensional models over data lakes and cloud object storage layers, about a year ago when researching analytics platforms at Qubit, and quite frankly it’s as revolutionary in terms of today’s data lake analytics market as OBIEE (or the nQuire Server) was back in 1995 with its virtual data warehouse over application and data warehouse sources.
Take these two slides from the Sparkline website and imagine them as the future of Oracle Analytics Cloud analyzing event-streams and big data cloud-hosted datasets…
and you can see why I’m keen to see where Oracle Analytics Cloud Data Lake Edition goes over the next couple of years. I’m speaking on OAC Data Lake Edition at ODTUG KScope’18 in Orlando in just a couple of weeks’ time, so come along if you’re there; it should be an interesting talk.
Late last year I covered the v3 release of Oracle Analytics Cloud (OAC) on this blog; since then the product has had a major UI refresh with OAC v4, so I thought it’d be worth covering this new release, along with updates to OBIEE12c and DV Desktop, in a follow-up to last year’s post.
OAC’s original look and feel was based on OBIEE12c v1, which itself contained the first iteration of Oracle Data Visualization (DV). Whilst the new modern flat UI that release introduced was an improvement on the 11g release before it, Oracle have since iterated rapidly with Oracle DV Desktop and DV Cloud Service, and the UI from those products is now the primary user interface for Oracle Analytics Cloud v4.
If you’ve not used Oracle Analytics since the OAC v3 and the initial OBIEE12c versions you’ll be surprised how much the user experience has changed since then; when OBIEE12c first came out most of us still considered Answers and Dashboards to be the primary reporting UI with the BI Repository being the central carefully governed and dimensionally-modelled source of all reporting data.
Since then the world has moved on, and analytics is now all about empowering the end-user through self-service, easy-to-use tools that enable you to do most tasks without having to involve the IT department or do lots of up-front data modeling and data onboarding. Oracle DV Desktop started meeting this need by introducing basic data preparation features into the product, so users could upload spreadsheets and other data and do some basic formatting and tidying before analyzing it on their desktop, and those data upload and preparation features are now available for OAC users in the v4 release. To upload a spreadsheet file into OAC you now just drop it onto the web page, and you’re then given a preview and the option to upload its contents to the analytics instance.
After that you can trim, split, derive and change the datatypes of data you upload so that it works in the most optimal way with the analysis features of the product, for example by turning dates in string formats not recognized by Oracle DV into proper date datatypes that can then be used in time-series analysis.
For data transformation needs that go beyond basic formatting and datatype changes you can now build ETL-style routines that filter, aggregate, transform and add external datasets all using simple point-and-click user tools, and if you’ve licensed the new OAC Data Lake Edition these transformations then extend to include model training, forecasting and sentiment analysis. In the example below I’m using OAC Data Lake Edition to forecast weight forward from a set of smart scale readings that’s then executed at the BI Server level using Python maths and statistics libraries.
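As a rough idea of what such a forecast step does under the covers, here’s a plain linear-trend projection in Python over some invented smart-scale readings; OAC’s actual implementation will be more sophisticated:

```python
from statistics import mean

# Invented smart-scale readings, one per day, most recent last
weights = [82.4, 82.1, 81.9, 81.6, 81.5]  # kg

# Fit a straight line (day number -> weight) by least squares
days = list(range(len(weights)))
mx, my = mean(days), mean(weights)
slope = sum((x - mx) * (y - my) for x, y in zip(days, weights)) \
        / sum((x - mx) ** 2 for x in days)
intercept = my - slope * mx

# Project the trend line forward three days to get the forecast
forecast = [round(slope * d + intercept, 2)
            for d in range(len(weights), len(weights) + 3)]
print(forecast)  # roughly [81.21, 80.98, 80.75]
```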
Cloud of course is great, but not every Oracle Analytics customer has made the move yet; or, like myself, you might have a cloud instance you can spin up as-and-when you need it but then use regular on-premises OBIEE12c in a VM for your day-to-day learning and experimentation.
With Oracle’s cloud-first strategy and focus on Oracle DV as the main innovation path for analytics features, this meant that my OBIEE12c instance running in a Google Compute Engine VM was starting to lag behind OAC v4 and DV Desktop in terms of look-and-feel and all of these new features, so I was pleased to note the other day that the 12.2.1.4.0 (aka v4) release of OBIEE12c is now available for download on OTN, and includes many of the features currently available in OAC v4 and DV Desktop (but not the advanced data flow features in OAC Data Lake Edition, for example).
Oracle Data Visualization Desktop has also been updated recently to connect to Oracle Autonomous Data Warehouse Cloud and I’ll be covering that new capability in a blog post in the near future; but more importantly for me at least….
… it also now supports macOS 10.13 High Sierra, great news for us Mac users who had to stop using DV Desktop on our laptops late last year when that most recent update to macOS came out but DV Desktop didn’t yet support it.
Earlier in the week I wrote-up my first impressions of Oracle Autonomous Data Warehouse Cloud (ADWC) on this blog, and said at the end I’d follow with another post on how to connect ADWC to Looker, the cloud-based BI tool I usually query Google BigQuery with in my role as Analytics Product Manager at Qubit.
For this initial example I uploaded a few tables of workout and other data into ADWC using SQL Developer, exporting the relevant tables out of BigQuery in CSV format and then uploading them into a user account I created for the purpose in ADWC. The table I’ll concentrate on for this example is the STRAVA_RIDES table, each row detailing an individual cycle workout as recorded through the Strava smartphone app.
The first step in connecting-up Looker to ADWC, or indeed any Oracle Database, is to run some scripts that set up Oracle as Looker expects to find it. Unlike most other BI tools I’ve used with Oracle, Looker expects to connect through a specific user login (“LOOKER”) that is then granted SELECT access to any tables and views you want to report on in other schemas. This user login also needs to have some views and synonyms created to give it access to the V$ system views within Oracle that report on active sessions, and an ability to kill long-running sessions through a PL/SQL package that calls ALTER SYSTEM … KILL SESSION.
The commands to run for regular Oracle databases are detailed on the Looker website, but you need to alter them slightly to use ADWC’s superuser account name (“ADMIN”) instead of SYS when initially connecting and when creating the LOOKER_HASH function for symmetric aggregate handling, along with various other changes due to differences in how various objects are named in ADWC vs. regular Oracle Database.
I’ve listed the commands I ran on my ADWC instance below; they should work for you, but if not, check out the “Autonomous Data Warehouse Cloud for Experienced Oracle Database Users” section in Using Autonomous Data Warehouse Cloud, which explains the differences between the new autonomous and regular Oracle Database server versions.
First create the Looker account and grant the relevant roles and privileges:
create user LOOKER identified by <<new_looker_account_password>>;
alter user LOOKER default tablespace DATA temporary tablespace TEMP account unlock;
alter user LOOKER quota unlimited on DATA;
alter user LOOKER default role RESOURCE;
grant CREATE SESSION to LOOKER;
GRANT UNLIMITED TABLESPACE TO LOOKER;
GRANT CREATE TABLE TO LOOKER;
grant select on <<table>> to LOOKER; -- repeat for all tables that will be used by Looker
Now create the views that Looker uses to understand what sessions are active and the SQL that’s currently being executed to provide data for looks and dashboard tiles:
create or replace view LOOKER_SQL as
select SQL.SQL_ID, SQL.SQL_TEXT
from V$SQL sql, V$SESSION sess
where SESS.SQL_ADDRESS = SQL.ADDRESS
and SESS.USERNAME = 'LOOKER';
create or replace synonym LOOKER.LOOKER_SQL for LOOKER_SQL;
grant select ON LOOKER.LOOKER_SQL to LOOKER;
create or replace view LOOKER_SESSION as
select SID, USERNAME, TYPE, STATUS, SQL_ID, "SERIAL#", AUDSID
from V$SESSION
where USERNAME = 'LOOKER';
create or replace synonym LOOKER.LOOKER_SESSION FOR LOOKER_SESSION;
GRANT SELECT ON LOOKER.LOOKER_SESSION TO LOOKER;
Next, create the Oracle PL/SQL function that Looker uses as part of symmetric aggregate handling, and a function that Looker can use to “kill” runaway database queries that are taking too long to return results back to you.
create or replace function LOOKER_HASH(bytes raw, prec number) return raw as
begin
  return(DBMS_CRYPTO.HASH(bytes, prec));
end;
/
create or replace synonym LOOKER.LOOKER_HASH for LOOKER_HASH;
grant execute on LOOKER.LOOKER_HASH to LOOKER;
grant execute on ADMIN.LOOKER_HASH to LOOKER;
create or replace procedure LOOKER_KILL_QUERY(P_SID in VARCHAR2, P_SERIAL# in VARCHAR2) is
  CURSOR_NAME pls_integer default dbms_sql.open_cursor;
  IGNORE pls_integer;
begin
  select COUNT(*) into IGNORE
  from V$SESSION
  where USERNAME = USER
  and SID = P_SID
  and SERIAL# = P_SERIAL#;
  if (IGNORE = 1) then
    dbms_sql.parse(CURSOR_NAME,
      'alter system kill session ''' || P_SID || ',' || P_SERIAL# || '''',
      dbms_sql.native);
    IGNORE := dbms_sql.execute(CURSOR_NAME);
  else
    raise_application_error(-20001,
      'You do not own session ''' || P_SID || ',' || P_SERIAL# || '''');
  end if;
end;
/
create or replace synonym LOOKER.LOOKER_KILL_QUERY for ADMIN.LOOKER_KILL_QUERY;
grant execute on ADMIN.LOOKER_KILL_QUERY to LOOKER;
Next, over to the Looker configuration. As of the time of writing you’ll need to be on Looker 5.12.12 or higher, with an instance hosted in the US, to get the integration working, so that “ADWC” is listed as a connection type and the ADWC wallet integration works; if you’re running Looker as a hosted instance you’ll also need to speak with support to have them copy the wallet files across to the correct location on the Looker server.
To create the connection, enter the following details:
Name : Name of your connection, e.g. “rittman_adwc”
Dialect : Oracle ADWC (only appears with Looker 5.12.12+)
Host:Port : from the TNSNAMES.ORA file in your ADWC wallet zip file
Username : LOOKER (as per the account setup in previous steps)
Password : password of LOOKER account
Temp Database : LOOKER (as per previous steps)
Persistent Derived Tables : checked
Service Name : From TNSNAMES.ORA in your ADWC wallet zip file
Additional Database Params : TNSNAMES.ORA SSL Server Cert DN
To show the Service Name and Additional Database Params fields you first have to save the connection, then tick the “Use TNS” checkbox to reveal the fields. To find your host:port, service name and SSL Server Cert DN values first download the wallet zip file for your ADWC instance from the ADWC Service Console, unzip the archive and then locate the details you need in the TNSNAMES.ORA file as shown below. In my case I chose to use the “medium” ADWC instance type for my connection settings.
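If you’d rather script pulling those values out of TNSNAMES.ORA, here’s a quick sketch against an invented wallet entry (your real file will have different hostnames and service names):

```python
import re

# An invented TNSNAMES.ORA entry in the shape the ADWC wallet uses;
# your real wallet zip contains your own hostnames and service names
tns_entry = '''
adwc_medium = (description=
  (address=(protocol=tcps)(port=1522)(host=adwc.example.oraclecloud.com))
  (connect_data=(service_name=example_medium.adwc.oraclecloud.com))
  (security=(ssl_server_cert_dn="CN=adwc.example.oraclecloud.com")))
'''

host = re.search(r"host=([^)\s]+)", tns_entry).group(1)
port = re.search(r"port=(\d+)", tns_entry).group(1)
service = re.search(r"service_name=([^)\s]+)", tns_entry).group(1)
cert_dn = re.search(r'ssl_server_cert_dn="([^"]+)"', tns_entry).group(1)

# host:port, service name and cert DN map onto the Looker connection fields
print(f"{host}:{port}", service, cert_dn)
```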
Then, save and test your connection. The step that checks persistent derived tables will probably fail if you try this around the time of writing, as there’s a known bug in that particular check; it’ll no doubt be fixed soon, and if the rest of the checks pass you should be good.
Finally, it’s just then a case of importing your table metadata into Looker and creating explores and a model as you’d do with any other data source, like this:
In this instance I’ve updated the Strava Rides LookML view to turn the relevant metric fields into measures, define a primary key for the view and remove or hide fields that aren’t relevant to my analysis, like this:
Now I can start to analyze my Strava workout data I previously uploaded to Oracle Autonomous Data Warehouse, starting with average cadence and speed along with total distance and Strava’s “suffer score” for each of my workouts:
and then looking to see how much correlation there is between distance and personal Strava records being broken on my five longest rides.
In the background, Looker is sending Oracle and ADWC-specific SQL to Oracle Autonomous Data Warehouse Cloud, with the SQL tab in the Explore interface showing me the actual SQL for each query as it’s sent.
Should I wish to check how much of the storage and CPU capacity available for my ADWC instance is being used, I can do this from ADWC’s Service Console.
So there you have it — Looker powered by an Oracle Autonomous Data Warehouse Cloud, and no need for an Oracle DBA to get it all running for you.
Regular readers of this blog will know that my day-to-day work involves ingesting, storing and then analyzing petabytes of data using Google BigQuery, one of a new generation of fully-managed, cloud-hosted data warehouse platforms-as-a-service. Services like BigQuery take care of all the database infrastructure management and capacity planning leaving you just to define tables, load data into them and then run queries using SQL or your favourite BI tool.
Oracle recently announced Oracle Autonomous Data Warehouse Cloud (ADWC) that competes in this market and given my background in Oracle technology I was interested to see how it stacked-up compared to BigQuery and similar services from the likes of Amazon AWS and Snowflake. To do this I took advantage of Oracle’s free $300 service credit offer and signed-up for an ADWC instance – note that at the time of writing the ADWC service is only available in Oracle Public Cloud’s North American datacenter region as opposed to the European ones so make sure you choose the former when creating your Oracle Cloud account.
Setting up an Oracle on-premises database was something I’d once spent months, and thousands in course fees, studying for certifications to be able to do confidently, so it was nice to see how much simpler things have become: the first step in setting up an ADWC instance is simply to select ADWC from the Oracle Cloud Services menu, like this:
Then answer a few simple questions: the name to give the instance, number of CPUs and amount of storage to set aside initially, and the password for the administrator account.
Setup of an instance takes a few minutes, then it’s ready. Not bad.
The next big change you notice compared to Oracle’s customer-managed Database Cloud Service is the replacement of Enterprise Manager Database Express 12c and DBaaS Console with a single Service Console, accessed not through the usual SYS and SYSTEM superuser accounts but through a new superuser account called ADMIN, presumably adopting a naming standard more familiar to users of other SaaS and PaaS platforms.
The Service Console contains a couple of performance-tracking pages with the emphasis on read-only viewing of those stats vs. giving you controls to fine-tune database parameters or table structures, and a page where you can download a zip file containing a wallet directory with TNSNAMES.ORA and encrypted connection details to your ADWC instance.
I seem to remember at the time of Autonomous Data Warehouse Cloud’s launch that many in the Oracle community (myself included, most probably) said that all that the “autonomous” part of Autonomous Data Warehouse Cloud did was automate the Oracle Database maintenance tasks that more recent data warehousing platforms-as-a-service handled for you automatically. Another common critique at that time was that under the covers ADWC was more scripting and use of existing “advisors” than true autonomous artificial intelligence-style database management.
To be honest though, it doesn’t matter; compared to running a regular Oracle database you just fill in a short form and the platform runs without any further intervention from you. You don’t need to know how it works; job done in terms of competing with BigQuery and Redshift for ease-of-use and maintenance.
Connecting to ADWC is a bit different to regular Oracle Database connections in that it’s easier (no need to specify host, port and SID/service name), but it involves downloading a wallet ZIP file, and the tool you wish to connect with needs support for that wallet. Oracle SQL Developer in its most recent 18c version has that support, making it the obvious tool to define new users, create tables, upload data into them and then run simple SQL queries against your data.
Oracle ADWC is based on Oracle Exadata Database Server technology and supports PL/SQL, full INSERT/UPDATE/DELETE support in SQL as well as SELECT, parallel query (enabled by default) and hybrid columnar compression.
What you don’t get are indexes and materialized views, as these generally aren’t considered necessary when running on Exadata hardware that’s optimized for full table scans of data arranged in compressed columnar format; and as ADWC charges per CPU per hour, there’s no penalty for scanning lots of detail-level data unnecessarily, as there is with BigQuery.
Also missing are features such as the Oracle OLAP Option, Oracle Spatial and Graph and Oracle Text that I’m guessing might get implemented at a later date, along with management features such as Oracle Tuning Pack that are no longer needed when the autonomous features of the platform do the database tuning for you.
Price-wise, although as I said a moment ago you do get charged by the amount of data you scan (and a bit for storage) with Google BigQuery, Oracle ADWC seems to be more expensive to run, at least at the data volumes I’d be using as a developer initially evaluating the product.
If you commit to monthly charging then ADWC costs around $1200/cpu/month, whereas Google charge $5/TB for queries with the first 1TB each month free, plus a token amount for storage and streaming inserts; that meant my bill last month for BigQuery came to just $1.53, and my total bill for all Google Cloud Platform services, including multiple hosted VMs and extensive usage of their NLP and Geocoding APIs, came to just over $150. Clearly Google are aiming more at the startup market whereas Oracle are serving enterprise customers, but don’t make the assumption that just because ADWC is elastic it’s also super-cheap for hobbyist use, long-term.
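Putting rough numbers on that comparison, using just the prices quoted above, as a back-of-the-envelope sketch rather than a pricing calculator:

```python
# Prices as quoted above: ADWC monthly-commit vs BigQuery on-demand
ADWC_PER_CPU_MONTH = 1200.0   # USD per CPU per month
BQ_PER_TB = 5.0               # USD per TB of data scanned by queries
BQ_FREE_TB = 1.0              # first 1TB of queries per month is free

def bigquery_query_cost(tb_scanned):
    # Storage and streaming-insert charges ignored for simplicity
    return max(tb_scanned - BQ_FREE_TB, 0) * BQ_PER_TB

# A hobbyist scanning ~1.3TB/month pays pennies on BigQuery...
print(round(bigquery_query_cost(1.3), 2))  # 1.5

# ...and would need to scan ~241TB/month before query charges
# caught up with one ADWC CPU on monthly commit
print(ADWC_PER_CPU_MONTH / BQ_PER_TB + BQ_FREE_TB)  # 241.0
```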
So now it’s a case of loading some sample data into ADWC using SQL Developer and then running a couple of test queries to see it working in action with my data.
Tomorrow, I’ll show how Looker can be connected using the newly announced support it has for Oracle ADWC, and later on in the week do the same for Oracle Analytics Cloud and DV Desktop.
Oracle Analytics Cloud, the “full” implementation of Oracle’s business analytics platform tools in Oracle Public Cloud, was released back in May 2017, and I covered its basic product proposition in a Drill to Detail podcast episode with Oracle’s Vasu Murthy a few weeks before then.
Up until now I’d not really had a chance to properly review Oracle Analytics Cloud, as the products I work with right now are mainly based around Google’s Cloud Platform and the Looker BI tool, but Oracle kindly made available some cloud promotion credits through their Oracle ACE Director community program and so I gave it, along with a few other supporting cloud services, a spin over the Christmas break.
In the end it was only really a high-level evaluation, but with my previous experience with Oracle’s BI Cloud Service, Data Visualization Desktop and Cloud Services and the on-premises versions of their BI Enterprise Edition product line I was able to get up and running fairly quickly. More interesting for me was trying to work out who the target market is for Oracle Analytics Cloud and what particular use-cases it tries to address; why Oracle released Oracle Analytics Cloud when they already have Oracle BI Cloud Service, and how they differentiate the two products and the two markets they presumably serve; and to speculate on why Oracle made the choices it made when coming up with Oracle Analytics Cloud, given the competition it faces in the market.
For anyone not so familiar with Oracle’s business analytics products: you’ve basically got a classic end-to-end enterprise BI platform, Oracle Business Intelligence 12c, that covers everything from reporting tools through business metadata modeling to application and database server integration, and that was typically sold to the IT function within corporate and government customers. It was recently joined by Oracle Data Visualization, which turned all this on its head and sold to the sorts of buyers who were now buying Tableau on their company credit cards and bypassing the IT department so that they could actually get something done.
Oracle’s first forays into moving all this into the cloud were focused again on the needs of these non-technical business buyers starting with Oracle Business Intelligence Cloud Service (“BICS”), a departmental cut-down version of Oracle Business Intelligence that I covered in an article for Oracle Magazine when the product first came out. Over time BICS was extended to include data visualization capabilities that were being introduced with the Data Visualization Option for the new Oracle Business Intelligence 12c release and then just those visualization features were packaged up into another end-user focused cloud service, Oracle Data Visualization Cloud Service, that became available around two years ago.
One thing you couldn’t do with any of those cloud tools, though, was migrate your on-premises Oracle BI deployments into Oracle’s Public Cloud, as BICS made you use a cut-down version of the Oracle Database along with a much-simplified data modeling tool to store and organize your data for reporting. Around the same time as Data Visualization Cloud Service came out, however, it became possible to “lift-and-shift” your full enterprise business metadata models into BICS and run them against a full Oracle Database instance running in the cloud, or against on-premises databases using a special secure connector between Oracle’s cloud data centres and the one running your Oracle database servers.
What some customers still wanted though was more than this; BICS, DVCS and all the other BI products Oracle offered were fully-managed services, not unlike Looker and Google’s BigQuery distributed database service that I use day-to-day in my current product management role at Qubit. What these customers wanted was full on-premise Oracle Business Intelligence running in Oracle Public Cloud that they could then connect into securely and manage the configuration settings to suit just their particular needs, and choose when to apply patches and when to run backups to suit their standard operating model.
What they also wanted was full unfettered access to Oracle’s BI Server metadata layer, so they could not only upload their on-premises business metadata models but then extend and update them, and at the same time incorporate elements of functionality from Oracle’s Essbase Server, which had also recently made its own transition into Oracle Public Cloud. Crucially, these customers still wanted to pay for all of these software costs monthly out of OpEx, but also wanted the license metric to move back to per-processor rather than the named-user basis BICS and DVCS used, so they could roll-out cloud analytics to everyone in the organization rather than just a few users, and also spin-up test, development and pre-production environments if they’d licensed enough processor capacity to run them all. This, then, was the set of requirements Oracle Analytics Cloud was put together to meet, and so I took a look at what v1 of this new product looked like over the time between Christmas and New Year.
Installing Oracle Analytics Cloud is more like installing Tableau Server, or one of the other pre-rolled BI server software VMs you find on services such as AWS Marketplace, than getting a new account on BICS or one of the other Oracle fully-managed services running on Oracle Public Cloud. You first have to set up some object storage space using Oracle Cloud Storage to hold the various configuration and log files OAC will require, then set up a database instance using Oracle Database Cloud Service to hold the RCU schema that OAC will require.
Oh, and you almost definitely need some prior knowledge of how to set up and configure Oracle software, and the previous on-premises version of Oracle Business Intelligence, before any of this makes sense; Oracle Analytics Cloud is definitely aimed at the IT department rather than casual business users, and moreover at those IT departments already invested in Oracle’s analytics, database and cloud technologies, for whom this will seem very familiar and in fact very reassuring: this is Oracle’s full on-premises functionality ported up into the cloud, not some cut-down managed-for-you version that hides all the controls and is impossible to manage via your own DevOps scripting.
Truth be told, I found the installation process baffling when I first ran through it. Being more used these days to services such as Looker and Google’s BigQuery, Cloud Dataprep and Data Studio products, being presented with software where you actually need to understand the installation process and what it’s doing, and where that process requires mandatory knowledge of other current and historic products related to the one I was installing, seemed crazy in comparison to the startup-friendly SaaS products I’m more familiar with these days.
But not every organization is a startup, and not every user of an installer utility is a non-technical business user; complex enterprise software is complex because enterprises and large government customers have gone beyond simple customer success departments, growth hackers and biz ops teams to needing to support operations across multiple countries, integrate with legacy systems you’ve never heard of and hopefully never will, and do everything according to strict sets of rules such as Sarbanes-Oxley and HIPAA that can land you in jail if you don’t comply with them. These types of organization have very strict rules and procedures around how software is hosted, accessed and managed, and vendors like Oracle know these customers’ needs well: in most cases they would prefer a complex but controllable installation process over one that hides all the details but potentially leaves them exposed to regulatory issues thereafter, meaning they can’t ever sign-off the provisioning process as complete and fully-compliant.
And one of the first lessons I learnt in product management is that whilst engineers and product managers prefer packages that give end-users all available functionality for one simple price, enterprise salespeople much prefer options.
With packages sold to enterprises with a single list price advertised — say, $1m all-in for the right to run everything on a single high-powered server — what happens in reality is that one or two customers pay that price, but the vast majority get some sort of discount so that everybody pays something and no sales are lost just because the customer couldn’t afford the list price.
What works far better when selling enterprise deals is when everything beyond a basic core set of functionality becomes an option and customers can then align what they want with what they can afford without the vendor giving everything away each time and massively-discounting deals to make them affordable.
Hence Oracle Analytics Cloud having both Standard and Enterprise Editions, and the Standard Edition having two variants based around just data visualization or just Essbase, and there being an upcoming Data Lake Edition that will include other net-new functionality, and presumably some time in the future other functionality being added as additional options that will need further license spend by the customer before becoming available to their end-users.
Just remember, when you’re sitting in the engineering department mocking the salespeople, that extracting the right price from the right customers whilst leaving everyone feeling they’ve got a good deal is what B2B selling is all about, which is why those salespeople are about the only people who are paid more than the engineers in most startups.
So what does Oracle Analytics Cloud look like once you’ve uploaded or pointed it towards some data, and started to visualize and analyze it using the various tools available to you in the platform? Well it’s effectively Oracle Business Intelligence 12c with the Data Visualization Option, at least in the configuration I’d chosen; “Self-Service Data Visualization, Preparation and Smart Discovery” along with “Enterprise Data Models” which translates to Visual Analyzer plus Answers, Dashboards and a full RPD in old on-premise terms. There’s the new home page inherited from Visual Analyzer and a new console that allows you to perform admin functions, define and work with data sources and download the BI Administration tool if the online modeler is too simplistic for your project.
I was able to upload and then analyze some of my cycling stats from Strava, working out that Saturdays in September were my best time for logging the miles and working off some calories in the last twelve months.
I was also pleased to see that the classic Answers interface is still available, along with the original home page, recent and popular content and task menus.
And this gets to what the central appeal of Oracle Analytics Cloud really is; it’s full Oracle Business Intelligence 12c, Oracle Data Visualization and Oracle Essbase with all the configuration options, IT appeal and product capabilities that were built-up over the years based on the requirements of large-scale, sophisticated and demanding enterprise and government customers. It’s not Superset or Looker, it makes no sense to anyone who’s not already invested in the Oracle analytics and database ecosystems and it can meet just about any analytics requirement, and then some … and now it all runs natively in the cloud.
Sadly my trial account and promotional credits ran out after a couple of weeks but if Oracle are ever kind enough to extend that access sometime in the New Year I’d be glad to roll-up my sleeves, dust-down my old book on how it all works and give this new cloud incarnation of the product that defined my career for the best part of ten years a further look.
I’m particularly pleased therefore to be travelling up to Birmingham this week to attend the UK Oracle User Group Tech’17 Conference, where I’ll take part in a few Oracle ACE Director briefings and social events, catch up with some old friends and past colleagues and speak on the Tuesday about something that’s fascinated, newly-motivated and quite honestly surprised me in this year away from Oracle BI consulting: the world of high-growth eCommerce and tech SaaS startups and how they intensively use analytics, stats models and now AI to out-think, out-manoeuvre and out-compete their old-world brick-and-mortar competition … and how most of us have never even heard of the tools and analysis techniques they use to do it.
In my previous life the BI tool of choice was Oracle BI Enterprise Edition, or maybe Tableau, Qlik or Oracle DV if you wanted something self-service or more visualization-focused. In the eCommerce world it’s Google Analytics — not the free version you’ve used in the past to see how popular your blog is, but the full Google Analytics 360 Suite which goes well beyond simple counts of page hits and returning visitors. Google Analytics 360 provides digital marketers with tools to measure retention and churn, goals achieved and attribution of revenue based on first click, last click and behavioural rules, segment their customers into simple cohorts of similar customers and advertise to them using Google’s AdWords service in order to land them as prospects, move them rapidly through the purchase funnel to buy something, and with a bit of luck come back later and buy something else again and again.
Google Analytics 360 is the de-facto standard for web analytics, but other specialist tools and vendors exist alongside it or as competition: MixPanel and Amplitude Analytics for product analysts, and tech startups such as Qubit, whose Live Tap product I’m product manager for, which marketers use along with the Looker BI tool to make their use of Qubit’s personalization platform more data-driven and productive.
In the presentation next week I’ll talk about how A/B testing, multivariate testing and more recently contextual multi-armed bandit models are used to optimize eCommerce sales operations and mobile app experiences to maximize conversions and select what features to offer, and how Bayesian statistical models are commonly employed against raw transaction data to predict with confidence the actual uplift and revenue from a campaign by considering the probability of those transactions based on prior knowledge, thereby filtering out the one or two outlier big-spenders who only appear once in a while … except if you’re in the eGaming vertical, where those are exactly the customers you’re all fighting over and you certainly don’t want to ignore their activity.
And, as Qubit found out a while ago and the most forward-looking brands who work with them are also discovering, just randomly flipping the colour of a button or blindly discounting doesn’t increase customer lifetime value over the long-term; for that you need to move to truly 1:1, personalized marketing and communications driven by individual-level behavioural event data, powered by next-generation big data cloud platforms that are engineered from the outset for this purpose, not hacked together from a bunch of old acquisitions.
More importantly though, the next problem has come along: everyone’s now using their smartphones to access those retailers’ mobile websites but then transacting far less than they did when using desktop browsers, because mobile doesn’t work as well for browsing those retailers’ product catalogs; have you ever found something you like, purely by chance, when trying to buy something new to wear on a mobile phone?
Qubit Aura recommends new and relevant items to visitors based on their preferences and behavior and then presents it all using an Instagram/Tinder-style interface designed specifically for mobile devices, and in-particular the way we “window-shop” on our phone before transacting either on that same phone at a later date, or back at our desk once we’re in the office or at one of that brand’s stores on the way home from work or at the mall at the weekend.
Interestingly though and bringing it back to Oracle and the audience for the presentation I’m giving on Tuesday, Oracle are also now adding AI and data-driven features into their HR, CX, CRM and other line-of-business SaaS apps in an initiative called Oracle Adaptive Intelligent Applications that’s headed up by Jack Berkowitz, who many of you will know from the Oracle BI Applications days and who recently came on the Drill to Detail Podcast to talk about Oracle’s investments in this area; and also explained by Oracle’s Mark Hurd at the product announcement session at this year’s Oracle Openworld shown in the video below.
Oracle are solving a different problem to Qubit and like Salesforce with their Einstein initiative the aim here is to enable AI for the traditional, non-tech and established business sector who need all of this to work with their payroll, database and manufacturing systems and want Oracle to leverage all the financial and operational data it holds for them to increase their efficiency and make more effective decisions.
And in the meantime, if you see me around in one of the other conference sessions or at the Tap & Spile with the other Oracle ACEs and ACE Directors, say hello; I’m always up for a chat about this stuff and how it applies to Oracle customers and partners.
An interesting new feature came out with the 12c Release of Oracle Database Enterprise Edition called Analytic Views that looks, at first sight, to be an extension of the existing dimensions and materialized views features the Oracle database has had since the 8i release. For anyone who remembers those features being added to the Oracle Database back in the late 90s (and if you don’t, here’s a youthful-looking blogger writing about that topic on a soon-to-be-rather-controversial DBA website), the CREATE DIMENSION statement allowed you to define an aggregation rule for use by the database’s query optimizer, like this:
CREATE DIMENSION products
  LEVEL product IS (products.prod_id, products.prod_desc, products.prod_list_price,
                    products.prod_min_price, products.prod_name, products.prod_pack_size,
                    products.prod_status, products.supplier_id,
                    products.prod_unit_of_measure, products.prod_weight_class)
  LEVEL subcategory IS (products.prod_subcategory, products.prod_subcat_desc)
  LEVEL category IS (products.prod_category, products.prod_cat_desc)
  HIERARCHY prod_hier (product CHILD OF subcategory CHILD OF category)
  ATTRIBUTE product DETERMINES (products.prod_category, products.prod_id)
  ATTRIBUTE subcategory DETERMINES (products.prod_category, products.prod_subcategory)
  ATTRIBUTE category DETERMINES products.prod_category;
These aggregation rules could thereafter be used by the optimizer to work out whether to rewrite the user’s query to use a pre-computed aggregate to answer that query, instead of rolling up from the detail level the user specified in their query. Since then, subsequent releases of the Oracle Database introduced a separate but related feature, the OLAP Option, which used technology originating from the old Express Server to create what were called Analytic Workspaces: fully-featured multi-dimensional OLAP caches together with their own programming language and a suite of applications for sales analysis and budgeting/planning.
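As a sketch of how those rules get used, assuming the SH sample schema tables referenced by the CREATE DIMENSION statement above: a materialized view aggregated to subcategory level can answer a category-level query, because the dimension tells the optimizer that subcategories roll up to categories.

```sql
-- Materialized view at subcategory level, eligible for query rewrite
CREATE MATERIALIZED VIEW sales_by_subcat_mv
  ENABLE QUERY REWRITE
AS
SELECT p.prod_subcategory,
       SUM(s.amount_sold) AS amount_sold
FROM   sales s
JOIN   products p ON s.prod_id = p.prod_id
GROUP  BY p.prod_subcategory;

-- A category-level query: because the PRODUCTS dimension declares that
-- subcategory rolls up to category, the optimizer can answer this by
-- rolling up sales_by_subcat_mv rather than re-scanning the SALES table
SELECT p.prod_category,
       SUM(s.amount_sold) AS amount_sold
FROM   sales s
JOIN   products p ON s.prod_id = p.prod_id
GROUP  BY p.prod_category;
```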
Over the years Oracle introduced features to integrate analytic workspaces with the SQL query language used by the core database, rather than developers having to learn and use OLAP DML, Oracle OLAP’s proprietary programming language. The dimensional model provided by the OLAP Option and all OLAP servers is just as relevant today as it was back in the 1970s when that technology first became available. And until recently the other innovation provided by dedicated multidimensional OLAP servers, ultra-fast queries made possible by the use of a separate pre-computed and highly-indexed mid-tier data cache, was so valuable to users that the overhead of maintaining these caches, and the latency they introduced because of the time it took to refresh and recompute the aggregates, made sense.
Until recently, that is, when the hardware used to host database servers suddenly became able to hold the entire database in-memory so that disk I/O was no longer a problem, and fast CPUs with multiple cores and large amounts of on-board cache RAM meant pre-computing calculated measures was no longer needed. So along came Oracle Database 12c and the In-Memory Option, which until the incident with the WiFi kettle was my claim to fame as the person Larry Ellison quoted on stage when I said it was so easy to use it was “almost boring”.
Increasingly, it no longer made any sense to define and then maintain a separate multi-dimensional OLAP cache in a sort-of-embedded server like Oracle OLAP, when the so-easy-to-use-it’s-boring Database In-Memory Option along with materialized views and bitmap indexes was just as fast at returning query results to users … especially when this in-memory option also came with another feature, vector transformation plans, that enabled a special type of in-memory aggregation using the vector processing features of modern CPUs and had been developed, rather interestingly, by the OLAP Option development team headed up by none other than Bud Endress, the long-term development lead for Oracle OLAP and, before that, Express Server.
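As a minimal sketch of what this looks like in practice, again assuming the SH sample schema for illustration: populate the fact table into the in-memory column store, and the optimizer can then consider a vector transformation plan for aggregation queries.

```sql
-- Populate the fact table into the in-memory column store
ALTER TABLE sales INMEMORY PRIORITY HIGH;

-- An aggregation query of the kind eligible for a vector transformation;
-- when the transformation is chosen, the execution plan shows
-- VECTOR GROUP BY and KEY VECTOR operations
EXPLAIN PLAN FOR
SELECT p.prod_category,
       SUM(s.amount_sold)
FROM   sales s
JOIN   products p ON s.prod_id = p.prod_id
GROUP  BY p.prod_category;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```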
So what’s going on here then? Well, back then this addition of special aggregation capabilities to the in-memory database feature by the old OLAP Server team clearly signalled that this was where Oracle intended to take the technology legacy from their standalone, and then partly-integrated, multidimensional OLAP server cache. But then the second shoe dropped back at Oracle Openworld two years ago, when Analytic Views were first launched in a presentation by … Bud Endress.
Analytic Views then came along with the 12.2 release of Oracle Database and fairly obviously were the other half of the puzzle of where Oracle were planning to take the people, ideas and inspiration that came out of Oracle OLAP, and the twin core user benefits of a dimensional model and fast response to user queries involving aggregation and slice-and-dice of structured datasets. Not only that: they also included a new take on the simplified SQL query format that Oracle OLAP introduced a few years ago, gave users the ability to build out a complete Oracle BI-style business model right over the data in the database … and, as if that wasn’t enough surprises, added support (via third-party Simba ODBC drivers) for MDX.
Taking the IoT, wearables and social media data I’ve been collecting and storing up until now in Google BigQuery and transferring it across to an Oracle Database 12.2 instance with Analytic Views and the In-Memory Option enabled, and using Oracle SQL Developer 4.2 with its new Analytic Views functionality included, you can see that there is a new dedicated section for Analytic Views in the main interface, not unlike the old Analytic Workspace Manager standalone tool from the Oracle 10g/11g days.
Creating a new analytic view involves one of two routes in SQL Developer 4.2 (or you could of course script it all using DDL statements). You can create a quick-and-dirty analytic view over a single table using the Quick Analytic View menu option, as per the screenshot below, which introspects your data to determine candidate measure and dimension columns; this doesn’t seem to handle, though, the sort of wide, denormalized tables with multiple dimension levels per row that are common in distributed query engines like Google BigQuery, which prefer those over joins across lots of separate tables.
Or, as I did for my testing, you can normalize your fact and dimension tables so they’re separate and joined via foreign key constraints, and then create in turn your attribute dimensions, hierarchies and analytic views containing measures and aggregation rules, as shown in the screenshots below.
So far this is very much in the same vein as creating relational and OLAP Option cube structures in previous database versions and developer tools, and you can view the commands that SQL Developer will send to the database using one of the dialog tabs, giving you a set of DDL statements looking like this:
CREATE ATTRIBUTE DIMENSION DEVICE_DIMENSION
  DIMENSION TYPE STANDARD
  CAPTION 'DEVICE DIMENSION' DESCRIPTION 'DEVICE DIMENSION'
  USING DEVICE_DIM
  ATTRIBUTES (
    DEVICE     AS DEVICE     CAPTION 'DEVICE'     DESCRIPTION 'DEVICE',
    DEVICETYPE AS DEVICETYPE CAPTION 'DEVICETYPE' DESCRIPTION 'DEVICETYPE',
    ROOM       AS ROOM       CAPTION 'ROOM'       DESCRIPTION 'ROOM' )
  LEVEL DEVICE_LEVEL
    LEVEL TYPE STANDARD
    CAPTION 'DEVICE LEVEL' DESCRIPTION 'DEVICE LEVEL'
    KEY ( "DEVICE" )
    MEMBER NAME DEVICE
    MEMBER CAPTION DEVICE
    MEMBER DESCRIPTION DEVICE
    ORDER BY MIN DEVICE ASC NULLS FIRST
    DETERMINES ( "DEVICE" )
  LEVEL ROOM_LEVEL
    LEVEL TYPE STANDARD
    CAPTION 'ROOM LEVEL' DESCRIPTION 'ROOM LEVEL'
    KEY ( "ROOM" )
    MEMBER NAME ROOM
    MEMBER CAPTION ROOM
    MEMBER DESCRIPTION ROOM
    ORDER BY MIN ROOM ASC NULLS FIRST
    DETERMINES ( "ROOM" )
  LEVEL DEVICETYPE
    LEVEL TYPE STANDARD
    CAPTION 'Devicetype' DESCRIPTION 'Devicetype'
    KEY ( "DEVICETYPE" )
    MEMBER NAME TO_CHAR("DEVICETYPE")
    ORDER BY MIN DEVICETYPE ASC NULLS LAST
    DETERMINES ( "DEVICETYPE" )
  ALL MEMBER NAME 'ALL'
  MEMBER CAPTION 'ALL'
  MEMBER DESCRIPTION 'ALL';
CREATE ATTRIBUTE DIMENSION DATES_DIMENSION
  CAPTION 'DATES DIMENSION' DESCRIPTION 'DATES DIMENSION'
  USING DATES_DIM
  ATTRIBUTES (
    DATE_DATE        AS DATE_DATE        CAPTION 'DATE_DATE'        DESCRIPTION 'DATE_DATE',
    DAY_NAME         AS DAY_NAME         CAPTION 'DAY_NAME'         DESCRIPTION 'DAY_NAME',
    DAY_NUMBER       AS DAY_NUMBER       CAPTION 'DAY_NUMBER'       DESCRIPTION 'DAY_NUMBER',
    YEAR             AS YEAR             CAPTION 'YEAR'             DESCRIPTION 'YEAR',
    YEAR_DAY_OF_YEAR AS YEAR_DAY_OF_YEAR CAPTION 'YEAR_DAY_OF_YEAR' DESCRIPTION 'YEAR_DAY_OF_YEAR',
    YEAR_MM          AS YEAR_MM          CAPTION 'YEAR_MM'          DESCRIPTION 'YEAR_MM' )
  LEVEL DATE_DATE
    CAPTION 'Date Date' DESCRIPTION 'Date Date'
    KEY ( "DATE_DATE" )
    MEMBER NAME TO_CHAR("DATE_DATE")
    MEMBER CAPTION TO_CHAR("DATE_DATE")
    MEMBER DESCRIPTION TO_CHAR("DATE_DATE")
    ORDER BY MIN DATE_DATE ASC NULLS FIRST
    DETERMINES ( "DAY_NAME", "DAY_NUMBER", "YEAR_DAY_OF_YEAR" )
  LEVEL YEAR
    CAPTION 'Year' DESCRIPTION 'Year'
    KEY ( "YEAR" )
    ALTERNATE KEY YEAR
    MEMBER NAME YEAR
    MEMBER CAPTION YEAR
    MEMBER DESCRIPTION YEAR
    ORDER BY MIN YEAR ASC NULLS FIRST
  LEVEL YEAR_MM
    CAPTION 'Year Mm' DESCRIPTION 'Year Mm'
    KEY ( "YEAR_MM" )
    MEMBER NAME YEAR_MM
    MEMBER CAPTION YEAR_MM
    MEMBER DESCRIPTION YEAR_MM
    ORDER BY MIN YEAR_MM ASC NULLS FIRST
  ALL MEMBER NAME 'ALL'
  MEMBER CAPTION 'ALL'
  MEMBER DESCRIPTION 'ALL';
CREATE HIERARCHY DATES_HIERARCHY CAPTION 'DATES_HIERARCHY' DESCRIPTION 'DATES_HIERARCHY' USING DATES_DIMENSION ( "DATE_DATE" CHILD OF "YEAR_MM" CHILD OF "YEAR" );
CREATE HIERARCHY DEVICE_TYPE_HIERARCHY CAPTION 'DEVICE_TYPE_HIERARCHY' DESCRIPTION 'DEVICE_TYPE_HIERARCHY' USING DEVICE_DIMENSION ( "DEVICE_LEVEL" CHILD OF "DEVICETYPE" );
CREATE HIERARCHY ROOM_HIERARCHY CAPTION 'ROOM_HIERARCHY' DESCRIPTION 'ROOM_HIERARCHY' USING DEVICE_DIMENSION ( "DEVICE_LEVEL" CHILD OF "ROOM_LEVEL" );
CREATE ANALYTIC VIEW IOT_DEVICES_AV
  CAPTION 'IOT_DEVICE_READINGS' DESCRIPTION 'IOT_DEVICE_READINGS'
  USING IOT_DEVICES
  DIMENSION BY (
    DEVICE_DIMENSION AS DEVICE_DIMENSION
      KEY ( "DEVICE" ) REFERENCES ( "DEVICE" )
      HIERARCHIES (
        DEVICE_TYPE_HIERARCHY AS DEVICE_TYPE_HIERARCHY DEFAULT,
        ROOM_HIERARCHY AS ROOM_HIERARCHY ),
    DATES_DIMENSION AS DATES_DIMENSION
      KEY ( "DATE_TIME" ) REFERENCES ( "DATE_DATE" )
      HIERARCHIES (
        DATES_HIERARCHY AS DATES_HIERARCHY DEFAULT ) )
  MEASURES (
    VALUE FACT VALUE CAPTION 'VALUE' DESCRIPTION 'VALUE' )
  DEFAULT MEASURE VALUE
  DEFAULT AGGREGATE BY AVG;
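With the analytic view in place, a query along the lines below (my assumption of the shape, based on the 12.2 analytic view query syntax) returns the average sensor reading per room using the hierarchy rather than explicit joins and a GROUP BY:

```sql
-- Average reading per room from the IOT_DEVICES_AV analytic view above;
-- VALUE aggregates with AVG as per the DEFAULT AGGREGATE BY clause
SELECT room_hierarchy.member_name AS room,
       value
FROM   iot_devices_av HIERARCHIES (room_hierarchy)
WHERE  room_hierarchy.level_name = 'ROOM_LEVEL'
ORDER  BY room_hierarchy.hier_order;
```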
But the bit that’s most interesting to my mind in the Analytic View creation dialog is the Cache Measure Groups option in the left-hand menu of the SQL Developer screen.
This, together with the vector transformation feature introduced with Oracle Database 12.1, is the key to how Analytic Views return aggregated query results as fast as, if not faster than, a dedicated multidimensional OLAP cache layer. Optimisations in the SQL parser for analytic views on the one hand enable simplified SQL queries without the need for GROUP BY or join clauses, like this:
SELECT time_hier.member_name AS time,
       sales,
       sales_prior_period
FROM   sales_av HIERARCHIES (time_hier)
WHERE  time_hier.level_name IN ('YEAR', 'QUARTER')
ORDER  BY time_hier.hier_order;
These optimisations also use the cache measure group settings to hard-wire the analytic view to any pre-created materialized views containing aggregations of your measures, side-stepping the usual query rewrite mechanism used by the database query optimizer to ensure, along with vector transformations, that users get split-second response times to their queries, whether at detail level or aggregated across one or more of the analytic view hierarchies. You can try this feature out for yourself using Oracle’s Live SQL demo environment here, and it’s important to understand what this feature, and the integration of analytic views with the In-Memory Option and its vector transformation feature, really means: OLAP aggregation rethought for today’s in-memory database servers.
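For comparison, here’s roughly what the same result set looks like in traditional SQL against the underlying tables (assumed here to be SH-style SALES and TIMES tables, since the analytic view’s sources aren’t shown): explicit joins, GROUP BY clauses, and a union of the two levels.

```sql
-- Year-level and quarter-level totals without the analytic view:
-- two aggregation queries unioned together, with explicit joins
SELECT TO_CHAR(t.calendar_year) AS time_member,
       SUM(s.amount_sold)       AS sales
FROM   sales s
JOIN   times t ON s.time_id = t.time_id
GROUP  BY t.calendar_year
UNION ALL
SELECT t.calendar_quarter_desc,
       SUM(s.amount_sold)
FROM   sales s
JOIN   times t ON s.time_id = t.time_id
GROUP  BY t.calendar_quarter_desc;
```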
So this, in my opinion, is where Oracle are headed in terms of meeting the needs of database customers who want fast, easy-to-maintain OLAP query functionality: the response times of a dedicated OLAP server, delivered by leveraging the in-memory capabilities of Oracle Database 12c, the people and ideas from the OLAP Option product and the huge amounts of memory and compute capacity you get in modern servers. All very interesting, and telling to see where the high-end traditional relational database vendors are taking OLAP analysis now that old-school OLAP servers have had their day but dimensional analysis and dimensional business models are still just as relevant.