Adding Looker Regression Tests to the dbtCloud CI/CD Pipeline using Spectacles

We’ve been users of the Looker testing tool Spectacles since its days as an open-source project started by Josh Temple and Dylan Baker, using it to validate not just the LookML, dashboards and looks in the Looker model updates and new releases we push into production, but the SQL content behind them as well.

Most Looker developers will be familiar with the Validate LookML button that appears when you’re editing the LookML definition of your project, but this only checks that you’ve used the right syntax and property definitions for your dimensions and measures. 

What it doesn’t check is whether the SQL you’ve used is actually valid, or whether the database objects that SQL relies on are still there and named as expected. Before Spectacles we’d approximate this by scheduling reports that selected against every field in our explores, but these only reported that a view had failed, not which field within the view; it was an inelegant solution.

Spectacles lets you define Suites of regression tests that run on-demand, to a schedule or, if you’ve licensed the Advanced or Elite editions of Looker, when one of your developers submits a pull request against the git repo that hosts your LookML project.

As well as testing that the SQL used in your Looker model is valid and that all of the tables and columns it references are still present and correct, test suites can also run the Content Validator, which checks that your dashboards and looks aren’t referring to obsolete model content, and any Data Tests you’ve added to validate the model’s business logic.

We use dbt (“Data Build Tool”) to build the underlying data warehouses for our client Looker projects, and use dbtCloud’s CI/CD (Continuous Integration/Continuous Deployment) feature to automatically stand up a temporary BigQuery dataset (the equivalent of a schema in Snowflake or Redshift) when a developer submits a pull request. The package is then test-deployed into that dataset to check that it compiles, runs and passes its internal checks before the changes are allowed into production.
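As an aside, dbtCloud derives the temporary dataset’s name from the CI job and pull request IDs; a minimal sketch of that naming convention (the exact format may vary between dbtCloud versions, so treat this as illustrative):

```python
def ci_schema_name(job_id: int, pr_id: int) -> str:
    """Build the name dbt Cloud gives to the temporary CI schema/dataset
    it creates for a pull request run."""
    return f"dbt_cloud_pr_{job_id}_{pr_id}"


# For CI job 1234 running against pull request 56:
dataset = ci_schema_name(1234, 56)
```

Because the name is deterministic, anything else in the pipeline (such as a Looker user attribute, as described below) can be pointed at the same dataset.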

The catch, of course, is that whilst those dbt models and transformations may have passed our CI/CD pipeline checks and deployed successfully into the production BigQuery dataset, the changes can still break Looker content that relied on tables or columns that are no longer present.

Running a Spectacles test suite overnight each day helps identify those breaking changes after the event; the latest release of Spectacles goes one step further and lets us extend the CI/CD test pipeline initiated by dbtCloud to cover any dependent Looker content.

Now we can test both the upstream database changes and the downstream BI content in a single test pipeline, and because the tests run in a temporary dataset, users of our Looker dashboards aren’t affected by any last-minute LookML changes we find we have to make to ensure the new release goes smoothly.

Spectacles’ integration with dbtCloud’s CI/CD pipeline works by leveraging the same GitHub Actions feature that dbt uses to call an external web service; Spectacles adds a further step to the test pipeline that runs your regression test suite, so both steps need to complete successfully for your pull request’s merge checks to pass.
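Under the hood, services like these report their pass/fail outcome back to the pull request as a commit status via GitHub’s REST API (POST /repos/{owner}/{repo}/statuses/{sha}). A minimal sketch of what such a status update looks like; the owner, repo, SHA and `context` name here are all hypothetical:

```python
def commit_status_request(owner: str, repo: str, sha: str,
                          state: str, context: str, description: str):
    """Build the URL and JSON body for GitHub's 'create commit status'
    endpoint (POST /repos/{owner}/{repo}/statuses/{sha})."""
    url = f"https://api.github.com/repos/{owner}/{repo}/statuses/{sha}"
    body = {"state": state, "context": context, "description": description}
    return url, body


# Example payload a CI step might send once a Spectacles suite passes.
url, body = commit_status_request(
    "OWNER", "REPO", "abc123",
    state="success",
    context="spectacles/sql-validator",
    description="All explores passed SQL validation",
)
```

A branch protection rule on the repo can then require this status context, alongside dbtCloud’s, before the merge button is enabled.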

Testing all of your Looker content in the same temporary dataset that dbtCloud uses for its test pipeline is achieved by passing the name of that dataset to Looker as a user attribute value using Looker’s API, and then using that user attribute to dynamically set the SQL data source for each of your LookML views.
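As an illustration of the API side, Looker’s API 4.0 exposes an endpoint for setting a user attribute value for a user (PATCH /users/{user_id}/attribute_values/{user_attribute_id}). The sketch below builds that request without sending it; the attribute name `dbt_ci_dataset`, the Looker host, the user ID and the attribute ID are all hypothetical:

```python
def attribute_id_by_name(attributes, name):
    """Find the id of a user attribute by name. `attributes` mimics the
    list of dicts returned by Looker's GET /user_attributes endpoint."""
    for attr in attributes:
        if attr["name"] == name:
            return attr["id"]
    raise KeyError(f"no user attribute named {name!r}")


def set_attribute_request(base_url, user_id, attribute_id, value):
    """Build the URL and body for Looker API 4.0's
    PATCH /users/{user_id}/attribute_values/{user_attribute_id} call."""
    url = f"{base_url}/api/4.0/users/{user_id}/attribute_values/{attribute_id}"
    return url, {"value": value}


# Point the hypothetical 'dbt_ci_dataset' attribute at the temporary
# dataset that dbtCloud created for this pull request.
attrs = [{"id": 7, "name": "dbt_ci_dataset"}]
attr_id = attribute_id_by_name(attrs, "dbt_ci_dataset")
url, body = set_attribute_request(
    "https://mycompany.looker.com", user_id=42,
    attribute_id=attr_id, value="dbt_cloud_pr_1234_56",
)
```

On the LookML side, the same attribute can then be referenced with the `_user_attributes` Liquid variable, for example `sql_table_name: {{ _user_attributes['dbt_ci_dataset'] }}.orders ;;`, so each view resolves to the temporary dataset for the test user and to the production dataset for everyone else.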

Test Suites in Spectacles now have an additional dbt button in their configuration settings for integration with dbtCloud pull requests.

These new configuration settings let you tell Spectacles the URL of the dbt git repo that dbtCloud uses when deploying your test pipeline, and the name of the user attribute you’re going to use to pass across the temporary dataset name.

Now your dbt deployment test pipelines have to check whether they’ve broken any downstream Looker explores, looks, dashboards or data test assertions before the changes are considered ready for deployment.

Extending your dbt CI/CD pipeline in this way benefits your end-users, because releases won’t break the Looker content they’ve come to rely on, and the analytics engineers in your data team, who’ll spend less time fixing breaking changes they’ve inadvertently introduced into the data pipeline.

Rittman Analytics is a Spectacles implementation partner along with partnerships with Looker, Fishtown Analytics (maintainers of dbt) and other modern data stack technology vendors. We help businesses adopt a modern data stack, build their analytics capability and deliver analytics projects that increase revenue and improve their operational efficiency. 

If you’re interested in Spectacles, your analytics development workflow or any aspect of moving to a modern, flexible and modular data stack, contact us now to organise a 100% free, no-obligation 30 minute call — we’d love to hear from you.
