Forecasting HubSpot Deal Revenue and Project Resourcing using dbt, Google BigQuery and Looker

Mark Rittman

Last week in our blog on using dbt and Looker to analyze Hubspot CRM data we talked about metrics that let us look back at the history of a deal to understand deal velocity and measure open and active deals at any particular point in the past.

In this blog I wanted to share how we use that same data to forecast our likely revenue in the coming months, and how we use this forecasting technique along with some custom Hubspot deal properties to predict what resources and skills we’ll need to deliver the projects we’re forecasting.

In out Hubspot deals table, as processed previously in dbt tidy-up and cleanse the raw data coming from Hubspot via Stitch, a typical row of deal data contains these columns relevant to a revenue forecast:

  • The predicted start and end date of the deal, typically a consulting engagement of two or three two-week duration development sprints, along with the duration of the engagement in days

  • The sales funnel stage for the deal as of now; we’ve excluded those deals either lost, or won and already delivered

  • The predicted deal value, and a probability percentage based on current deal stage that when used as a modifier (multiplier) with the deal amount gives us a weighted pipeline deal value


Screenshot 2020-01-06 at 23.38.32.pngScreenshot 2020-01-06 at 23.38.32.png

What we’d like to do is forecast, based on the estimated start and end dates of our open deals and the current weighted pipeline value of that deal, what our revenue numbers are likely to look like over the total duration of those deals.

In dbt we can do that by taking this deals data and using it to work out what revenue each deal brings in over the time it’s active, then joining those daily revenue amounts to a date array table we create on the fly to first aggregate each deal’s total revenue over those days, and then calculate the total revenue per month from all of those forecasted deals.

{{
    config(
        materialized='table'
    )
}}
WITH daily_weighted_revenue as (
  SELECT
    *,
    (amount * probability) / nullif(contract_days,0) AS contract_daily_weighted_revenue,
    amount / contract_days AS contract_daily_full_revenue,
    (amount / contract_days) - ((amount * probability) / contract_days) AS contract_diff_daily_revenue
  FROM (
    SELECT
      *,
      TIMESTAMP_DIFF(end_date_ts,start_date_ts,DAY) AS contract_days
    FROM (
      SELECT
        dealname,
        deal_id,
        amount,
        probability,
        start_date_ts,
        end_date_ts
      FROM
        {  }
      WHERE
        stage_label not in ('Closed Lost','Closed Won and Delivered')
      GROUP BY
        1,2,3,4,5,6,7))
),
months as (
  SELECT *
  FROM UNNEST(GENERATE_DATE_ARRAY('2019-01-10', '2024-01-01', INTERVAL 1 DAY)) day_ts
)
SELECT deal_id,
       date_trunc(day_ts,MONTH) as month_ts,
       sum(revenue_days) as revenue_days,
       sum(daily_weighted_revenue) as weighted_amount_monthly_forecast,
       sum(daily_full_revenue) as full_amount_monthly_forecast,
       sum(daily_diff_revenue) as diff_amount_monthly_forecast
from (
  SELECT deal_id,
       day_ts,count(*) as revenue_days,
       sum(contract_daily_weighted_revenue) daily_weighted_revenue,
       sum(contract_daily_full_revenue) daily_full_revenue,
       sum(contract_diff_daily_revenue) daily_diff_revenue
  FROM months m
  JOIN daily_weighted_revenue d
  ON TIMESTAMP(m.day_ts) between d.start_date_ts and timestamp_sub(d.end_date_ts, interval 1 day)
  GROUP BY 1,2)
GROUP BY  1,2

Bringing the table that dbt creates for this SQL query into Looker and joining it to our main Hubspot-sourced deals table, we can now predict revenue over the coming few months based on start date and length of deal with each deal value weighted based on its current deal stage to allow for the fact that not all deals will close.


In addition to recording revenue for our deals in Hubspot we also use custom deal properties to record aspects of the deal such as what products are likely to be part of the solution, how many sprints and of what type will be needed and who within the team we’re thinking of assigning the project to.


Screenshot 2020-01-07 at 00.20.07.pngScreenshot 2020-01-07 at 00.20.07.png

Now I can predict overall and individual workload for each of our consulting team, like this:


Screenshot 2020-01-07 at 00.24.43.pngScreenshot 2020-01-07 at 00.24.43.png

I can see how these forecasted projects are being priced:


Screenshot 2020-01-07 at 00.29.17.pngScreenshot 2020-01-07 at 00.29.17.png

And finally, useful for training and recruitment planning, I can see what technology skills our team need to have to service these deals and when.


Screenshot 2020-01-07 at 00.32.33.pngScreenshot 2020-01-07 at 00.32.33.png

For more details of what we can do for you with dbt, check the new dbt Solutions page and Data Centralization solution page on our website.