Customer 360 Data Warehousing and Sync to Hubspot using BigQuery, dbt, Looker and Hightouch

In this blog I’ll show how we’ve used Google BigQuery together with Looker, Segment and our RA Warehouse for dbt to do just this, connecting offline and online data together for both clients and contacts and syncing that data to our Hubspot CRM system using a new “reverse ETL” service, Hightouch.

Mark Rittman

Rittman Analytics are a Segment Implementation Partner and we use Segment Personas together with Segment Connections to connect all of our digital marketing touchpoints and create a single view of client and visitor interactions across those digital channels.

Customer Data Platform services such as Segment Personas are great for collecting these granular interaction histories for clients and prospects, data that can be combined with transaction and other information we hold for those clients when they identify themselves to us when they login to our site or register their interest in a new product or service we’re offering.

We’ve used customer data from our Segment Personas customer data to provide our Intercom chatbot service with information on the project we’re currently delivering for a client, their preferences and interests, making those interactions more relevant, more efficient and more likely to result in a successful client outcome.

1_uRNydys81LcioCfyLOSHOQ.png1_uRNydys81LcioCfyLOSHOQ.png

Why Do We Need a Customer Data Warehouse?

Whilst Personas and customer data platforms in-general do a great job of understanding and better serving the needs of our clients and potential clients when they interact with the real-time, digital aspects of our business, by definition they only really understand and serve those clients and prospects who visit our website and other marketing channels.

The majority of our client and prospect interactions happen offline with the sales, delivery and finance parts of our business and the data these interactions generate is much richer, diverse and wide ranging in-terms of scope and the years we’ve been in-business.

The best place to hold this sort of “customer 360-degree” historical interaction data is a cloud data warehouse and in this blog I’ll show how we’ve used Google BigQuery together with Looker, Segment and our RA Warehouse for dbt to do just this, connecting offline and online data together for both clients and contacts and syncing that data to our Hubspot CRM system using a new “reverse ETL” service, Hightouch.

Creating a 360-Degree View of all Customer and Client Interactions

I’ve blogged in the past about how we’ve built our in-house data warehouse using Google BigQuery, dbt and our open-source RA Warehouse for dbt package, now extended to provide compatibility with Snowflake Data Warehouse and covering finance, projects, marketing, customer and product analytics.

Data generated by our operations is centralized into a single integrated view of the business using dbt (“Data Build Tool”) and our RA Warehouse for dbt package to connect, transform and orchestrate the various steps in the integration process; the screenshot below shows the number of sources and steps in the process currently taken to build-up our customer view with light-green steps being data sources and light-blue ones as transformations.

1_1zLMDIcQrz2mhmNw3CTv_w.png1_1zLMDIcQrz2mhmNw3CTv_w.png

The final output of this process is a BigQuery table that has one row per customer and then details of their billing, invoicing and associated contacts along with those contacts involvement in any marketing or sales activity, with the data structured as nested repeating BigQuery columns. Data from this table is then also used to create an event timeline table that shows the complete end-to-end history of our interactions with that client and all of their contacts, as shown in the table screenshots below.

1_nky3pyMrd2O-z7SUPqJsMQ.png1_nky3pyMrd2O-z7SUPqJsMQ.png

Another set of steps in our dbt package combines and integrates data for individual contacts, combining their activity across all of the companies in our database they may have been associated with.

1_4I_TwAMbRhCAHk_Bbbz43w.png1_4I_TwAMbRhCAHk_Bbbz43w.png

We then take these customer and contact datasets and use them to create customer and contact-centred explores in Looker; in the latest release of Looker there’s a new app you can install from the marketplace that lets you to create interactive diagrams of your Looker explores to show how views and models are linked together.

1_L-rrejHUGE5IuH9ETqTUrQ.png1_L-rrejHUGE5IuH9ETqTUrQ.png

The diagram below was created using this new feature and shows how we’ve configured the customer explore and how the various views over our warehouse tables connect to the central customer view.

1_C1RkpsKlIEmf0rM3sBcHkQ.png1_C1RkpsKlIEmf0rM3sBcHkQ.png

Similarly our contact-centred Looker explore connects all of our contacts’ activities together in one place, as opposed to being split amongst the multiple companies that contact might have been associated with over their time with us.

1_Yyb87IakfJzXQ_BucBhFBg.png1_Yyb87IakfJzXQ_BucBhFBg.png

As well as structuring and connecting our customer and contact data for analysis in Looker we also derive various analytical insights from these datasets, for example calculating the gross and net revenue for a client across all service channels or determining how influential a particular contact might be when they’re involved in a sales opportunity, something we do by quantifying their engagement with our team, use of Looker (via Stitch’s new Looker metadata and usage data integrations) and how many open and closed won opportunities they’ve been associated with.

1_Ro7OeJj0vZEgfoT-ozVeAA.png1_Ro7OeJj0vZEgfoT-ozVeAA.png

Insights such as these are useful to see in a Looker dashboard but become even more useful when synced into tools such as Hubspot used by our customer and sales team.

“Reverse ETL” using Hightouch

Whilst Hubspot and services such as Salesforce have public APIs that allow customer and contact records to be updated with data from the warehouse, doing this typically involves custom integration work and keeping up-to-date with (in Hubspot’s case, frequent) API spec changes.

Services such as Stitch and Fivetran make setting up “ETL” data extract pipelines from SaaS applications like these easy and straightforward; a new category of “reverse ETL” services such as Hightouch and Census do this the other way around and make it easy to feed customer and other insights from your warehouse into downstream SaaS applications such as Hubspot , Facebook Audiences and Salesforce.

1_LYiSeL9lG2dq88DdZhPO7g.png1_LYiSeL9lG2dq88DdZhPO7g.png

Setting up a sync of our contact influencer properties into Hubspot starts with my defining a query against our BigQuery warehouse to return the set of contacts, their unique Hubspot ID and the property values I want to sync into Hubspot CRM.

1_X2eKw5iDA6-eJAkfS4luBw.png1_X2eKw5iDA6-eJAkfS4luBw.png

Prior to this I’d set-up connections to BigQuery and our Google Cloud Platform data warehouse, with options also to connect to Snowflake, Teradata, mySQL and other popular cloud data warehouses.

For destinations there’s the usual options for ad networks (Google Ads, Facebook Custom Audiences etc) and marketing technology SaaS apps (Marketo, Salesforce Marketing Cloud, Hubspot and so on).

1_EvC4G7vJ9xJW36S6aKUxqA.png1_EvC4G7vJ9xJW36S6aKUxqA.png

If you’ve tried to feed new customer properties data into Salesforce or Hubspot from services such as Segment Personas you’ll know it’s a pretty hit-and-miss affair with docs out of date and APIs always changing. In contrast, reverse ETL services such as Hightouch make this task refreshingly simple and straightforward giving you flexibility over what contact property you use for merging incoming contact records and drop-down selectors to choose which query output columns are used to populate your standard and (pre-created) custom contact properties.

1_onYJJF1z0CxwqBeRTk2IyA.png1_onYJJF1z0CxwqBeRTk2IyA.png

Then I define what triggers the sync that Hightouch runs between my warehouse customer data and Hubspot; in my case it makes sense to have the completion of our dbt job run trigger the sync so I first go and register the API key for our dbtCloud account and then configure the sync job to trigger on completion of a specific dbtCloud job run.

1_3B1Vdzrf8Vwef2NdQZFD4g.png1_3B1Vdzrf8Vwef2NdQZFD4g.png

I then trigger a test run of the dbt job and see that Hightouch then replicated those influencer scores from the warehouse tables that dbt populated for us.

1_W9sWwsgECeKpwjRo_zNW3Q.png1_W9sWwsgECeKpwjRo_zNW3Q.png

Finally when I check Hubspot I can see the two new contact properties added to the relevant contact records.

1_L7zVCN2HSaeombvzqAtkRQ.png1_L7zVCN2HSaeombvzqAtkRQ.png

Similarly I can set up syncs of company-level data such as the gross and net revenue numbers for client billing, not normally available to Hubspot as a property value for company records.

1_5ibyfKEwSmeB6bogUr8lxA.png1_5ibyfKEwSmeB6bogUr8lxA.png

Interested? Find out More

If the idea of creating a complete, 360-degree view of your customers’ and contacts’ interactions with your business both online and offline and having that information synced to your downstream marketing and customer applications, contact us now to organise a 100% free, no-obligation 30 minute call to discuss your data needs — we’d love to hear from you.