Description
Problem
DC (FIPS 11) has no rows in block_cd_distributions.csv.gz, so ExtendedCPS_2024 and EnhancedCPS_2024 contain no DC households. As a result, both test_aca_calibration and test_medicaid_calibration fail in CI (100% error for DC).
Root cause
In the us library v3.2.0, us.states.DC is not included in us.states.STATES_AND_TERRITORIES (it is neither a state nor a territory in that library's classification). The state loop in make_block_cd_distributions.py (lines 43-47) iterates over that collection, so DC's census block populations are never fetched:

    states_to_process = [
        s
        for s in us.states.STATES_AND_TERRITORIES
        if not s.is_territory and s.abbr not in ["ZZ"]
    ]

This produces a list of 50 states without DC. The downstream inner merge (line 70) then has no DC blocks, so block_cd_distributions.csv.gz has zero DC rows.
Note: the older build_crosswalk_cd116_to_cd119() in make_district_mapping.py (line 166) already knew about this quirk and manually excluded DC then added it back as a hardcoded row. The newer build_block_cd_distributions() missed this workaround.
Fix
Add DC to states_to_process in make_block_cd_distributions.py, e.g.:

    states_to_process = [
        s
        for s in us.states.STATES_AND_TERRITORIES
        if not s.is_territory and s.abbr not in ["ZZ"]
    ] + [us.states.DC]

Then regenerate block_cd_distributions.csv.gz and rebuild datasets.
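A coverage check on the regenerated file could catch this kind of omission before it reaches the calibration tests. The following is a minimal sketch, not code from the repository: the column name block_geoid and the helpers missing_state_fips and check_file are hypothetical, and it assumes each row carries a 15-digit census block GEOID, whose first two digits are the state FIPS code.

```python
import csv
import gzip

# 50 states plus DC (FIPS 11). State FIPS codes run from 01 to 56
# and skip 03, 07, 14, 43 and 52, which leaves 51 codes.
EXPECTED_STATE_FIPS = {
    f"{n:02d}" for n in range(1, 57) if n not in {3, 7, 14, 43, 52}
}


def missing_state_fips(rows, geoid_field="block_geoid"):
    """Return the expected state FIPS codes that no row belongs to.

    `geoid_field` is an assumed column name; adjust it to the real header.
    """
    # zfill restores a leading zero if the GEOID was ever parsed as an integer.
    seen = {str(row[geoid_field]).zfill(15)[:2] for row in rows}
    return sorted(EXPECTED_STATE_FIPS - seen)


def check_file(path, geoid_field="block_geoid"):
    """Run the coverage check against a gzipped CSV on disk."""
    with gzip.open(path, "rt", newline="") as fh:
        return missing_state_fips(csv.DictReader(fh), geoid_field)
```

Calling check_file("block_cd_distributions.csv.gz") on the current file would be expected to report "11" as missing, and an empty list once DC is added and the file is regenerated, provided the column name assumption holds.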
Discovered in
PR #537 (puf-impute-fix-530): CI failures in test_aca_calibration and test_medicaid_calibration.