Fix: Remove join fan-out bug in agg_monthly_loans #13
Open
bmiller-dh wants to merge 1 commit into main
Removes the problematic left join on the loans table, which was causing aggregated loan amounts to be multiplied by the number of loans per loan_type. The join fanned out each aggregated row to one row per matching loan, which inflated the amount_originated values. The customer_id field was not being used in the aggregation and was causing the data quality issue reported in the Risk Analytics Reporting Dashboard. Fixes the issue introduced in commit f9d33f8.
Problem
The Risk Analytics Reporting Dashboard is showing inflated loan values this month. The agg_monthly_loans model contains a join fan-out bug that causes aggregated loan amounts to be multiplied incorrectly.
Root Cause
The model joins the aggregated monthly_originations CTE back to the loans table using only loan_type_name.
monthly_originations is aggregated by month and loan_type, while the loans table contains many individual loan records per loan_type, so the join is one-to-many and fans out (in effect a Cartesian product within each loan_type). Each aggregated row is repeated once for every loan matching that loan_type.
Example: If you have 100 loans of type "Mortgage" and the aggregated row shows $5M originated, the join multiplies this by 100, showing $500M instead.
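To make the fan-out concrete, here is a minimal sketch that reproduces the shape of the bug in an in-memory SQLite database. It is not the model's actual SQL: the table layout, the origination_month and amount columns, and the sample rows are assumptions for illustration. Only loans, monthly_originations, loan_type_name, customer_id and amount_originated come from this PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE loans (
        loan_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        loan_type_name TEXT,
        origination_month TEXT,
        amount REAL
    );
    INSERT INTO loans VALUES
        (1, 101, 'Mortgage', '2024-01', 100.0),
        (2, 102, 'Mortgage', '2024-01', 200.0),
        (3, 103, 'Mortgage', '2024-01', 300.0),
        (4, 104, 'Auto',     '2024-01',  50.0);
""")

# One row per (month, loan type): the intended grain.
AGGREGATE = """
    SELECT origination_month, loan_type_name, SUM(amount) AS amount_originated
    FROM loans
    GROUP BY origination_month, loan_type_name
"""

# Same aggregate joined back to loans on loan_type_name only,
# which repeats each aggregated row once per matching loan.
FAN_OUT = f"""
    WITH monthly_originations AS ({AGGREGATE})
    SELECT m.origination_month, m.loan_type_name, m.amount_originated, l.customer_id
    FROM monthly_originations AS m
    LEFT JOIN loans AS l ON l.loan_type_name = m.loan_type_name
"""

def total_by_type(query):
    """Sum amount_originated per loan type, as a dashboard tile might."""
    totals = {}
    for _month, loan_type, amount, *_ in conn.execute(query):
        totals[loan_type] = totals.get(loan_type, 0.0) + amount
    return totals

correct = total_by_type(AGGREGATE)
inflated = total_by_type(FAN_OUT)
print(correct["Mortgage"], inflated["Mortgage"])  # 600.0 1800.0
```

With three Mortgage loans, the Mortgage total is tripled, while the single Auto loan is unaffected. That matches the pattern described above: the inflation factor equals the number of loans sharing the loan_type.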
Solution
Remove the problematic left join loans clause. The customer_id field that was added in the previous commit is not needed for the aggregation and was causing the data quality issue.
Impact
amount_originated values in the Risk Analytics Reporting Dashboard
Testing
After merge, please run a dbt test to verify the aggregated values match expected ranges for previous months.
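As a rough idea of what such a check could assert, the sketch below verifies two things against an in-memory SQLite copy of the data: the model output has one row per (month, loan_type_name), and its total reconciles with the source loans table. This is not a dbt test and not part of this PR. The function name check_monthly_model, the schema, and the sample rows are all hypothetical, and the same two assertions would need to be translated into the project's own dbt tests.

```python
import sqlite3

def check_monthly_model(conn, model_sql):
    """Return a list of problems found in the model output (empty if none)."""
    rows = conn.execute(model_sql).fetchall()
    problems = []

    # Grain check: a fan-out shows up as repeated (month, loan type) keys.
    keys = [(month, loan_type) for month, loan_type, *_ in rows]
    if len(keys) != len(set(keys)):
        problems.append("grain: duplicate (month, loan_type_name) rows")

    # Reconciliation check: the model total should equal the source total.
    model_total = sum(row[2] for row in rows)
    source_total = conn.execute("SELECT SUM(amount) FROM loans").fetchone()[0]
    if abs(model_total - source_total) > 1e-6:
        problems.append(f"total: model {model_total} != source {source_total}")
    return problems

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, for illustration only.
    CREATE TABLE loans (
        loan_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        loan_type_name TEXT,
        origination_month TEXT,
        amount REAL
    );
    INSERT INTO loans VALUES
        (1, 101, 'Mortgage', '2024-01', 100.0),
        (2, 102, 'Mortgage', '2024-01', 200.0),
        (3, 103, 'Mortgage', '2024-01', 300.0),
        (4, 104, 'Auto',     '2024-01',  50.0);
""")

FIXED = """
    SELECT origination_month, loan_type_name, SUM(amount) AS amount_originated
    FROM loans
    GROUP BY origination_month, loan_type_name
"""
BUGGY = f"""
    WITH monthly_originations AS ({FIXED})
    SELECT m.origination_month, m.loan_type_name, m.amount_originated, l.customer_id
    FROM monthly_originations AS m
    LEFT JOIN loans AS l ON l.loan_type_name = m.loan_type_name
"""

fixed_problems = check_monthly_model(conn, FIXED)
buggy_problems = check_monthly_model(conn, BUGGY)
print(fixed_problems)       # []
print(len(buggy_problems))  # 2
```

The grain check is the more useful of the two for catching a regression, because it fails on any reintroduced one-to-many join regardless of how the amounts happen to sum.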