When building our internal data warehouse, we quickly noticed that people and companies sit at the centre of most business processes and that the same person or company often plays multiple roles over time.
A consultant might be a staff member one day, a project lead the next and a contract signatory when closing a deal. Marketing contacts become clients, clients become referral partners and candidates sometimes become employees.

While these roles share common attributes such as name, title and contact details, others are role-specific. It is tempting to model each role separately with dimensions for staff, contacts or candidates but this quickly leads to silos, duplication and the inevitable question of which table holds the correct email address.
Instead, we modelled people in BigQuery using a single unified person dimension that captures every role a person plays across the business.
Most analytics warehouses grow organically. You start with a CRM and build a contacts dimension. HR adds employee data so you create an employees dimension. Recruiting wants candidates, finance wants signatories and before long you are maintaining several tables that all store the same basic fields.
This fragmentation causes predictable problems:
Most importantly, it breaks the relationship history. When a prospect becomes a client, their marketing journey disappears. When a contractor becomes an employee, their project history is left behind.
We use Google BigQuery as our data warehousing managed service and instead created a single persons_dim table that accommodates every role a person could play in our business. To do this in an elegant and BigQuery-optimized way we used a design approach that:
Uses boolean flags for common filters. Queries like WHERE is_staff = TRUE are cleaner and faster than WHERE staff_attributes IS NOT NULL. The flags also work in Looker filters without requiring users to understand STRUCT syntax.
Stores current roles as an array. The current_roles field enables queries like WHERE 'PARTNER_CONTACT' IN UNNEST(current_roles) while also supporting aggregations on multi-role individuals.
Groups related attributes in STRUCTs. Rather than 20 columns prefixed with staff_, nest them in staff_attributes. This keeps the schema organised and makes it obvious which fields belong together.
CREATE TABLE `analytics.persons_dim` (
-- Core identity (common to all persons)
person_pk STRING NOT NULL,
person_name STRING,
person_phone STRING,
all_emails ARRAY<STRING>,
person_linkedin_url STRING,
-- Role flags for simple filtering
is_staff BOOL DEFAULT FALSE,
is_contractor BOOL DEFAULT FALSE,
is_client_contact BOOL DEFAULT FALSE,
is_partner_contact BOOL DEFAULT FALSE,
is_prospect BOOL DEFAULT FALSE,
is_candidate BOOL DEFAULT FALSE,
-- Current roles array for multi-role queries
current_roles ARRAY<STRING>,
primary_role STRING,
-- Staff-specific attributes (populated when is_staff = TRUE)
staff_attributes STRUCT<
job_title STRING,
team_name STRING,
employment_start_date DATE,
weekly_capacity INT64,
default_hourly_rate INT64,
cost_rate INT64,
manager_person_fk STRING
>,
-- CRM/Marketing attributes
crm_attributes STRUCT<
lifecycle_stage STRING,
lead_status STRING,
lead_source STRING,
conversion_event_name STRING
>,
-- Candidate attributes (populated when is_candidate = TRUE)
candidate_attributes STRUCT<
resume STRING,
current_application_stage STRING,
is_referred BOOL
>,
-- Partner attributes
partner_attributes STRUCT<
partner_company_fk STRING,
is_primary_contact BOOL,
has_signatory_authority BOOL
>
);Uses bridge tables for contextual relationships. A person’s role on a specific project or deal isn’t an attribute of the person — it’s a relationship. Bridge tables model this correctly and support temporal tracking of when assignments started and ended.
Maintains a primary role. When a person has multiple roles, which one matters most? We calculate primary_role based on a precedence hierarchy (staff > contractor > partner > client > prospect), giving reports a sensible default.
This type of table schema lets you use boolean flags, for example WHERE is_staff = TRUE to return all persons who are staff members and for more complex analysis you can refer to the STRUCT fields for role-specific attributes. A person with multiple roles simply has multiple flags set to TRUE, with their respective STRUCT fields populated.
With a unified person dimension, every fact table that involves people can join through a single, consistent foreign key.

This consistency transforms cross-functional analysis. Instead of building complex unions or maintaining role-specific join logic, every query follows the same pattern.
Some roles however require more context recorded for them in-order for them to be useful in making decisions; a person isn’t just “a project team member” in the abstract; they’re a team member on specific projects with specific allocations. We handle this with bridge tables that link the person to the relevant context.

These bridge tables let us track the same person across multiple projects or deals with different roles on each. A senior consultant might lead one project while providing support on another — the bridge captures both relationships without duplicating their core identity.
Looker, through its tight integration with Google BigQuery, makes it easy to import this unified person dimension table schema into your LookML project and connect it, either directly or through its bridge tables, to the activities that person has participated in across your business operations.

The look below, for example, is filtered to only show people where has_multiple_roles = Yes and sorted by role count descending, so the most interconnected relationships appear first.

This helps us spot people like:
and would be something we’d make use of before any outreach, in-order to personalise our messaging and avoid treating warm contacts like cold leads.
A unified person dimension such as this one transforms what’s possible with your analytics. Instead of “how many clients do we have?” you can ask “how many clients started as inbound leads versus referrals from existing partners?” Instead of “who’s on this project?” you can ask “which team members have worked with this client before, in any capacity?”
The investment in consolidating person data pays dividends across every function as sales gets complete contact history, delivery sees full project context and finance can track the same individual from prospect through to paid invoice. Building on this foundation, you can layer predictive models that identify which prospects are most likely to convert based on engagement patterns across all touchpoints.
If your warehouse has scattered person data across multiple dimensions, consider as we have whether a unified approach might simplify your architecture and unlock new analytical possibilities.
Rittman Analytics is a boutique data analytics consultancy that helps ambitious, digital-native businesses scale-up their approach to data, analytics and generative AI.
We’re authorised delivery partners for Google Cloud along with dbt Labs, Cube, Dagster, Embeddable, SnowCatCloud and Fivetran and are experts at helping you design an analytics solution that’s right for your organisation’s needs, use-cases and budget.
If you’re looking for some help and assistance with your data or AI initiative or would just like to talk shop and share ideas and thoughts on what’s going on in your organisation and the wider data analytics world, contact us now to organise a 100%-free, no-obligation call — we’d love to hear from you!



