Replace CD stacking with block-based county FIPS filtering for NYC.h5 #654
Open
Description
Problem
NYC.h5 is currently built using "congressional district stacking" — filtering to 13 CDs that overlap NYC, then probabilistically scaling weights by P(NYC county | CD). This approach is deprecated because:
- CDs are redrawn every decade: the hardcoded `NYC_CDS` list of 13 CDs is fragile
- NYC is not a collection of CDs: CDs straddle NYC boundaries, requiring probabilistic weight scaling
- We now have census blocks: `GeographyAssignment.county_fips` gives us 5-digit county FIPS derived from `block_geoid[:5]`
Solution
Replace CD stacking with a direct county FIPS filter:
    NYC_COUNTY_FIPS = {"36005", "36047", "36061", "36081", "36085"}

Each clone IS or IS NOT in NYC based on its assigned block's county, so no probabilistic scaling is needed. This is simpler, more correct, and doesn't depend on congressional district boundaries.
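A minimal sketch of the membership test described above, assuming block GEOIDs are 15-digit strings whose first five characters are the state and county FIPS. The helper names `county_fips_from_block` and `is_in_nyc` are hypothetical and not part of the codebase, and the block GEOIDs in the usage lines are made up for illustration.

```python
# The five NYC counties as 5-digit county FIPS codes (taken from the issue).
NYC_COUNTY_FIPS = {"36005", "36047", "36061", "36081", "36085"}


def county_fips_from_block(block_geoid: str) -> str:
    """Return the 5-digit state+county FIPS prefix of a block GEOID.

    Assumes a 15-digit census block GEOID (2 state + 3 county +
    6 tract + 4 block digits), kept as a string so leading zeros survive.
    """
    if len(block_geoid) != 15 or not block_geoid.isdigit():
        raise ValueError(f"expected a 15-digit block GEOID, got {block_geoid!r}")
    return block_geoid[:5]


def is_in_nyc(block_geoid: str) -> bool:
    """True if the block's county is one of the five NYC counties."""
    return county_fips_from_block(block_geoid) in NYC_COUNTY_FIPS


# Illustrative (made-up) block GEOIDs: one in county 36061, one in 36059.
inside = is_in_nyc("360610001001000")
outside = is_in_nyc("360590001001000")
```

The check is a plain set lookup, so each clone gets a yes/no answer with no dependence on district boundaries.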
Changes
- Add `county_fips_filter` parameter to `build_h5()` that zeros out weights for clones outside target counties
- Update `build_cities()` to use `county_fips_filter=NYC_COUNTY_FIPS` instead of `cd_subset` + `county_filter`
- Remove `NYC_COUNTIES` (enum name set) and `NYC_CDS` (13 hardcoded CD codes)
- Remove now-unused `get_county_filter_probability()` and `get_filtered_block_distribution()` from `block_assignment.py`
- Update `modal_app/worker_script.py` accordingly
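The weight-zeroing step that `county_fips_filter` is meant to perform could look roughly like the sketch below. The issue does not show the actual signature of `build_h5()`, so `apply_county_fips_filter` is a hypothetical standalone helper, and the weights and county codes in the usage lines are invented for illustration. It uses plain Python lists; the real implementation may well operate on arrays instead.

```python
from typing import Iterable, List, Sequence


def apply_county_fips_filter(
    weights: Sequence[float],
    county_fips: Sequence[str],
    county_fips_filter: Iterable[str],
) -> List[float]:
    """Zero out the weight of every clone whose county is not a target.

    weights and county_fips are parallel sequences, one entry per clone.
    Weights of clones inside the target counties are returned unchanged.
    """
    if len(weights) != len(county_fips):
        raise ValueError("weights and county_fips must have the same length")
    targets = set(county_fips_filter)
    return [w if fips in targets else 0.0 for w, fips in zip(weights, county_fips)]


# Hypothetical usage: three clones, the second assigned to a non-NYC county.
NYC_COUNTY_FIPS = {"36005", "36047", "36061", "36081", "36085"}
weights = [120.0, 80.0, 45.5]
county_fips = ["36061", "36059", "36047"]
filtered = apply_county_fips_filter(weights, county_fips, NYC_COUNTY_FIPS)
# filtered == [120.0, 0.0, 45.5]
```

Unlike the CD-stacking approach, in-NYC weights are never rescaled: a clone either keeps its full weight or contributes nothing.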