DC missing from block_cd_distributions: us library excludes DC from STATES_AND_TERRITORIES #539

@baogorek

Problem

DC (FIPS 11) has zero representation in block_cd_distributions.csv.gz, which means ExtendedCPS_2024 and EnhancedCPS_2024 contain no DC households. This causes both test_aca_calibration and test_medicaid_calibration to fail in CI (100% error for DC).

Root cause

In us library v3.2.0, us.states.DC is not included in us.states.STATES_AND_TERRITORIES (that library classifies it as neither a state nor a territory). The state loop in make_block_cd_distributions.py (lines 43-47) iterates over that collection, so DC's census block populations are never fetched:

states_to_process = [
    s
    for s in us.states.STATES_AND_TERRITORIES
    if not s.is_territory and s.abbr not in ["ZZ"]
]

This produces a list of 50 states without DC. The downstream inner merge (line 70) then has no DC blocks, so block_cd_distributions.csv.gz has zero DC rows.

Note: the older build_crosswalk_cd116_to_cd119() in make_district_mapping.py (line 166) already accounts for this quirk: it manually excludes DC and then adds it back as a hardcoded row. The newer build_block_cd_distributions() does not carry over this workaround.

Fix

Add DC to states_to_process in make_block_cd_distributions.py, e.g.:

states_to_process = [
    s
    for s in us.states.STATES_AND_TERRITORIES
    if not s.is_territory and s.abbr not in ["ZZ"]
] + [us.states.DC]
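
A slightly more defensive variant, as a sketch only: as far as I know, the us library can be configured (via a DC_STATEHOOD environment variable) to include DC in its state collections, in which case an unconditional append would list DC twice. The helper name with_dc is hypothetical, and the stand-in objects below only mimic the .abbr and .fips attributes of us state objects so the snippet runs without the package installed.

```python
from types import SimpleNamespace


def with_dc(states, dc):
    """Return states as a list, appending dc only if its FIPS code is absent."""
    states = list(states)
    if any(s.fips == dc.fips for s in states):
        return states
    return states + [dc]


# Stand-in objects (assumption: real us state objects expose .abbr and .fips).
# In make_block_cd_distributions.py the call would be roughly:
#   states_to_process = with_dc(states_to_process, us.states.DC)
demo_states = [
    SimpleNamespace(abbr="AL", fips="01"),
    SimpleNamespace(abbr="WY", fips="56"),
]
demo_dc = SimpleNamespace(abbr="DC", fips="11")

once = with_dc(demo_states, demo_dc)
twice = with_dc(once, demo_dc)  # DC is already present, so nothing is added
```

Keying on the FIPS code rather than object identity keeps the check independent of how the library constructs its state objects.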

Then regenerate block_cd_distributions.csv.gz and rebuild datasets.
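
After regenerating, a quick coverage check could guard against this regressing. This is a minimal sketch using only the standard library. The column names (block_geoid, cd_geoid, probability) are assumptions about the file layout, not confirmed from the repo, and it assumes block GEOIDs are stored zero-padded so the first two characters are the state FIPS code.

```python
import csv
import gzip
import os
import tempfile


def state_fips_in_file(path, geoid_column="block_geoid"):
    """Collect the 2-character state FIPS prefixes of block GEOIDs in a gzipped CSV."""
    with gzip.open(path, "rt", newline="") as f:
        return {row[geoid_column][:2] for row in csv.DictReader(f)}


def missing_state_fips(path, expected, geoid_column="block_geoid"):
    """Return the expected FIPS codes that have no rows in the file, sorted."""
    return sorted(set(expected) - state_fips_in_file(path, geoid_column))


# Demo on a tiny synthetic file (hypothetical columns and values) with no DC rows.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv.gz")
    with gzip.open(demo_path, "wt", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["block_geoid", "cd_geoid", "probability"])
        writer.writerow(["010010201001000", "0102", "1.0"])
    demo_missing = missing_state_fips(demo_path, {"01", "11"})
```

Against the real file, something like missing_state_fips("block_cd_distributions.csv.gz", {"11"}) should come back empty once DC is included.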

Discovered in

PR #537 (puf-impute-fix-530) — CI failures in test_aca_calibration and test_medicaid_calibration.
