Bridge tables resolve many-to-many relationships in dimensional models.
Without bridge tables, many-to-many joins can double count metrics.
Interview framing:
- Fact-to-dimension joins are usually many-to-one
- When a relationship is many-to-many (e.g., customer↔account, title↔genre), use a bridge table
- Include allocation weights when a measure must be distributed across the related members
dim_customer        bridge_customer_account        dim_account
------------        -----------------------        -----------
customer_key  <->   customer_key
                    account_key              <->   account_key
                    relationship_type
                    effective_date
                    end_date
                    allocation_pct (optional)
fact_transaction
----------------
transaction_id
account_key
amount
date_key
To analyze by customer:
fact_transaction -> dim_account -> bridge_customer_account -> dim_customer
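The join path above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production schema: the sample rows, the customer and account names, and the 50/50 allocation_pct split are all assumptions made up for the example. It compares a naive SUM(amount) through the bridge with a SUM(amount * allocation_pct).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE dim_account  (account_key  INTEGER PRIMARY KEY, account_name  TEXT);
CREATE TABLE bridge_customer_account (
    customer_key   INTEGER,
    account_key    INTEGER,
    allocation_pct REAL
);
CREATE TABLE fact_transaction (
    transaction_id INTEGER PRIMARY KEY,
    account_key    INTEGER,
    amount         REAL,
    date_key       INTEGER
);
-- Made-up sample data: one joint account, two holders, one $100 transaction.
INSERT INTO dim_customer VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO dim_account  VALUES (10, 'Joint checking');
INSERT INTO bridge_customer_account VALUES (1, 10, 0.5), (2, 10, 0.5);
INSERT INTO fact_transaction VALUES (1001, 10, 100.0, 20240115);
""")

# fact_transaction -> dim_account -> bridge_customer_account -> dim_customer
JOIN_PATH = """
FROM fact_transaction f
JOIN dim_account a             ON a.account_key  = f.account_key
JOIN bridge_customer_account b ON b.account_key  = a.account_key
JOIN dim_customer c            ON c.customer_key = b.customer_key
"""

# Naive: every bridge match receives the full amount.
naive = conn.execute(
    "SELECT c.customer_name, SUM(f.amount) " + JOIN_PATH +
    "GROUP BY c.customer_name ORDER BY c.customer_name"
).fetchall()

# Weighted: each holder receives amount * allocation_pct.
weighted = conn.execute(
    "SELECT c.customer_name, SUM(f.amount * b.allocation_pct) " + JOIN_PATH +
    "GROUP BY c.customer_name ORDER BY c.customer_name"
).fetchall()

source_total = conn.execute("SELECT SUM(amount) FROM fact_transaction").fetchone()[0]
naive_total = sum(amount for _, amount in naive)
weighted_total = sum(amount for _, amount in weighted)

print("source:", source_total)      # 100.0
print("naive:", naive_total)        # 200.0, the transaction is counted once per holder
print("weighted:", weighted_total)  # 100.0, reconciles with the source
```

With these sample rows the naive customer-level total is double the source total, while the weighted total reconciles. Whether an even split is the right business rule is a separate decision that the bridge only records.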
dim_title <-- bridge_title_genre --> dim_genre
One account can have multiple holders; one customer can have multiple accounts.
The bridge preserves ownership relationships and enables customer-level reporting.
A title can belong to multiple genres or subgenres. A bridge avoids storing repeated genre arrays in fact tables.
A corporate cost center can map to multiple departments, with allocation weights splitting its costs among them.
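As a rough sketch of the title↔genre case, the example below keeps the fact table at one row per title and resolves genres only through the bridge. The fact_view table, its hours_viewed column, and all sample rows are hypothetical and introduced only for illustration. No weights are used here, so the per-genre figures overlap by design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_title (title_key INTEGER PRIMARY KEY, title_name TEXT);
CREATE TABLE dim_genre (genre_key INTEGER PRIMARY KEY, genre_name TEXT);
CREATE TABLE bridge_title_genre (
    title_key INTEGER,
    genre_key INTEGER,
    PRIMARY KEY (title_key, genre_key)
);
-- Hypothetical fact table: no genre column or genre array, only title_key.
CREATE TABLE fact_view (title_key INTEGER, hours_viewed REAL);

INSERT INTO dim_title VALUES (1, 'Title A'), (2, 'Title B');
INSERT INTO dim_genre VALUES (1, 'Comedy'), (2, 'Drama');
-- Title A is both Comedy and Drama; Title B is Drama only.
INSERT INTO bridge_title_genre VALUES (1, 1), (1, 2), (2, 2);
INSERT INTO fact_view VALUES (1, 10.0), (2, 4.0);
""")

# Hours per genre, resolved through the bridge at query time.
by_genre = dict(conn.execute("""
SELECT g.genre_name, SUM(f.hours_viewed)
FROM fact_view f
JOIN bridge_title_genre b ON b.title_key = f.title_key
JOIN dim_genre g          ON g.genre_key = b.genre_key
GROUP BY g.genre_name
""").fetchall())

# The grand total comes from the fact table alone, never from the genre rows.
total_hours = conn.execute("SELECT SUM(hours_viewed) FROM fact_view").fetchone()[0]

print(by_genre)     # {'Comedy': 10.0, 'Drama': 14.0}
print(total_hours)  # 14.0
```

Each per-genre figure is a valid answer to "how many hours were spent on titles in this genre", but the genre rows sum to 24.0 against a true total of 14.0, so they should not be added together into a grand total.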
Use a bridge when:
- A genuine many-to-many relationship exists
- You need analytically correct aggregation across both sides
- You need temporal ownership history in the relationship
Avoid a bridge when:
- The relationship is actually one-to-many (a simple foreign key works)
- The bridge would only patch upstream data quality issues
- The team cannot maintain the weighting and SCD logic correctly
Pros:
- Models many-to-many relationships correctly
- Prevents schema hacks and repeated denormalized arrays
- Supports weighted attribution
Cons:
- More complex joins
- Risk of double counting when measures are not weighted
- Harder for analysts to use without semantic layer guidance
Common mistakes:
- No allocation logic for shared ownership
- Counting the full fact amount for each bridge match (inflates totals)
- Missing effective dates in the bridge when relationships change
- Using a bridge without a clear business definition
- Not documenting whether the bridge is exclusive or non-exclusive
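The effective-date mistake above can be illustrated with a small sqlite3 sketch. It assumes effective_date and end_date are stored as YYYYMMDD integers like date_key, that end_date is exclusive, and that 99991231 marks a current row; none of these conventions come from the notes, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bridge_customer_account (
    customer_key   INTEGER,
    account_key    INTEGER,
    effective_date INTEGER,  -- assumption: YYYYMMDD integer, same format as date_key
    end_date       INTEGER   -- assumption: exclusive bound, 99991231 for current rows
);
CREATE TABLE fact_transaction (
    transaction_id INTEGER PRIMARY KEY,
    account_key    INTEGER,
    amount         REAL,
    date_key       INTEGER
);
-- Made-up history: account 10 moves from customer 1 to customer 2 on 2024-01-01.
INSERT INTO bridge_customer_account VALUES
    (1, 10, 20230101, 20240101),
    (2, 10, 20240101, 99991231);
INSERT INTO fact_transaction VALUES
    (1001, 10, 40.0, 20231215),
    (1002, 10, 60.0, 20240210);
""")

# Without a date predicate, both owners match both transactions.
unfiltered = conn.execute("""
SELECT b.customer_key, SUM(f.amount)
FROM fact_transaction f
JOIN bridge_customer_account b ON b.account_key = f.account_key
GROUP BY b.customer_key
ORDER BY b.customer_key
""").fetchall()

# With the predicate, each transaction matches only the owner valid on its date.
as_of = conn.execute("""
SELECT b.customer_key, SUM(f.amount)
FROM fact_transaction f
JOIN bridge_customer_account b
  ON b.account_key = f.account_key
 AND f.date_key >= b.effective_date
 AND f.date_key <  b.end_date
GROUP BY b.customer_key
ORDER BY b.customer_key
""").fetchall()

print(unfiltered)  # [(1, 100.0), (2, 100.0)]
print(as_of)       # [(1, 40.0), (2, 60.0)]
```

The half-open interval (inclusive start, exclusive end) is one common choice because adjacent rows cannot both match the same date. A closed interval also works as long as consecutive rows never share a boundary date.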
Best practices:
- Keep the bridge narrow and indexed on both keys
- Add effective date filters for temporal joins
- Precompute customer-attributed facts for hot dashboards
- Use allocation_pct with clear default rules
- Validate bridge cardinality drift regularly
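Two of the checks above can be sketched as plain queries: one flags accounts whose allocation_pct values do not sum to 1, and one counts holders per account so the distribution can be tracked over time. The tolerance of 1e-9, the sample rows, and the choice to treat a missing (NULL) weight total as a failure are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bridge_customer_account (
    customer_key   INTEGER,
    account_key    INTEGER,
    allocation_pct REAL
);
-- Made-up data: account 10 is fully allocated, account 20 only sums to 0.9.
INSERT INTO bridge_customer_account VALUES
    (1, 10, 0.5), (2, 10, 0.5),
    (3, 20, 0.7), (4, 20, 0.2);
""")

TOLERANCE = 1e-9  # assumption: acceptable floating-point slack

# Accounts whose weights are missing or do not sum to 1 within the tolerance.
bad_accounts = conn.execute("""
SELECT account_key, ROUND(SUM(allocation_pct), 6) AS total_pct, COUNT(*) AS holders
FROM bridge_customer_account
GROUP BY account_key
HAVING SUM(allocation_pct) IS NULL
    OR ABS(SUM(allocation_pct) - 1.0) > ?
ORDER BY account_key
""", (TOLERANCE,)).fetchall()

# Holders per account: a simple cardinality snapshot to compare between loads.
holders_per_account = dict(conn.execute("""
SELECT account_key, COUNT(*)
FROM bridge_customer_account
GROUP BY account_key
""").fetchall())

print(bad_accounts)         # account 20 is flagged
print(holders_per_account)  # {10: 2, 20: 2}
```

A check like this could run after each bridge load. Comparing holders_per_account against the previous snapshot is one simple way to notice cardinality drift before it shows up as inflated dashboard totals.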
Interview questions:
- What problem does a bridge table solve?
- What is the difference between a bridge table and a factless fact table?
- Why can bridge tables cause double counting?
- One account has two owners and a $100 transaction. How do you report customer revenue?
- A relationship changes over time. How do you model historical ownership?
- Dashboard totals exceed the source by 40% after adding bridge joins. What are your debug steps?
- Design a customer-account bridge for Amazon co-branded credit products.
- Design a Netflix title-genre bridge with an evolving taxonomy.
- Design an Uber enterprise cost allocation bridge with weighted splits.
- When do you pre-aggregate instead of joining the bridge at query time?
- How do you test allocation correctness?
- Could this be solved in the semantic layer only?