The analytics team needs a monthly summary table showing Community Health Worker (CHW) performance. This table will power dashboards that track CHW productivity and help identify high/low performers.
Source Table: marts.fct_chv_activity
This fact table contains one row per CHW activity/visit with details about:
- Which CHW performed the activity
- When it occurred (date + timestamp)
- What type of activity (pregnancy visit, child assessment, family planning, etc.)
- Which household was visited
- Which patient was served
See schema_documentation.md for full column list.
metrics.chw_activity_monthly
One row per CHW per month
Composite: (chw_id, report_month)
For each CHW for each month, calculate:
- Definition: Count of all activities performed by the CHW in that month
- Notes: Include all activity types
- Definition: Number of distinct households the CHW visited
- Notes: One household may be visited multiple times, count it once
- Definition: Number of distinct patients/individuals served
- Notes: Same patient may be seen multiple times, count once
- Definition: Count of activities where
activity_type = 'pregnancy_visit'
- Definition: Count of activities where
activity_type = 'child_assessment'
- Definition: Count of activities where
activity_type = 'family_planning'
Rule: Activities are NOT always assigned to the month they occurred in.
Month Assignment Rule:
- If
activity_dateis on or after the 26th of the month → Assign to the NEXT month - If
activity_dateis before the 26th → Assign to the CURRENT month
Why?: Field data collection for some regions often continues into the first days of the next month for the previous month's activities. This ensures activities are attributed to the correct reporting period.
Examples:
- Activity on 2025-01-15 → Assigned to 2025-01-01 (January)
- Activity on 2025-01-26 → Assigned to 2025-02-01 (February)
- Activity on 2025-01-31 → Assigned to 2025-02-01 (February)
- Activity on 2025-12-26 → Assigned to 2026-01-01 (January of next year)
Implementation:
You must create a macro called month_assignment(date_column) that implements this logic.
- All activity types
- All CHWs (even if they had zero activities that month - handled by dbt incremental)
- Activities from all regions
- Rows where
activity_dateis NULL (these are invalid records) - Rows where
is_deleted = TRUE(soft-deleted records) - Rows where
chv_idis NULL (data quality issue)
This table will be queried frequently by dashboards, so:
- Materialization: Use incremental table (not view)
- Unique Key: Set to
['chv_id', 'report_month']to handle updates - Incremental Strategy: Use
delete+insertto allow reprocessing of historical months if late data arrives
The model must ensure:
- No Duplicates: Each CHW-month combination appears exactly once
- No NULLs in Key Fields:
chv_idandreport_monthmust never be NULL - Logical Metrics: All count metrics should be >= 0
Add appropriate dbt tests to verify these.
Scenario: CHW "CHV001" had the following activities in January 2025:
| activity_date | activity_type | household_id | patient_id |
|---|---|---|---|
| 2025-01-05 | pregnancy_visit | HH001 | PAT001 |
| 2025-01-12 | child_assessment | HH002 | PAT002 |
| 2025-01-15 | pregnancy_visit | HH001 | PAT001 |
| 2025-01-28 | family_planning | HH003 | PAT003 |
Expected Output Row:
| chv_id | report_month | total_activities | unique_households_visited | unique_patients_served | pregnancy_visits | child_assessments | family_planning_visits |
|---|---|---|---|---|---|---|---|
| CHV001 | 2025-01-01 | 3 | 2 | 2 | 2 | 1 | 0 |
| CHV001 | 2025-02-01 | 1 | 1 | 1 | 0 | 0 | 1 |
Note: The 2025-01-28 activity is assigned to February because it's on/after the 26th
Summary Checklist:
- Aggregate by CHW and month
- Use special month assignment logic (26th cutoff)
- Calculate 6 metrics
- Exclude NULL dates, deleted records, NULL CHV IDs
- Ensure no duplicates
- Use incremental + delete+insert strategy