CPS top-coding caps AGI at $6.26M — zero observations above $10M in any state #530

@PavelMakarchuk

Summary

An audit of the Enhanced CPS state datasets reveals a hard top-coding ceiling at $6,263,051 AGI: no tax unit in any of the 51 state datasets exceeds it. 35 datasets top out at exactly that value, and the other 16 peak between roughly $5.57M and $5.79M. No state has a single observation above $10M, and records between $5M and $6.3M are sparse everywhere (2 to 304 raw records per state). This makes the data unreliable for modeling policies that target high-income brackets (e.g., NY A05435, which raises rates on the $5M-$25M and $25M+ brackets).

Impact

  • Bills targeting income above $10M or $25M have zero observations to model against
  • Revenue estimates for high-bracket policies are systematically understated
  • Example: for NY A05435, the model estimates $120M in revenue from the $5M+ bracket change but captures zero impact from the $25M+ bracket change, so the real revenue would be substantially higher
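
To make the zero-impact point concrete, here is a minimal sketch of how much weighted income falls inside a bracket when the data stops at the observed ceiling. The helper `bracket_base` and the three sample records are hypothetical, AGI stands in for the bill's actual tax base for simplicity, and no statutory rates from A05435 are used.

```python
import numpy as np

def bracket_base(agi, weight, lower, upper=np.inf):
    """Weighted dollars of income between `lower` and `upper` (hypothetical helper)."""
    agi = np.asarray(agi, dtype=float)
    weight = np.asarray(weight, dtype=float)
    # Dollars of each record that fall inside the bracket
    in_bracket = np.clip(agi, lower, upper) - lower
    return float((in_bracket * weight).sum())

# Made-up records; the largest sits at the observed ceiling
agi = np.array([2_000_000, 5_500_000, 6_263_051])
weight = np.array([40.0, 10.0, 5.0])

base_5m_25m = bracket_base(agi, weight, 5_000_000, 25_000_000)
base_25m_plus = bracket_base(agi, weight, 25_000_000)

print(f"$5M-$25M bracket base: ${base_5m_25m:,.0f}")
print(f"$25M+ bracket base:    ${base_25m_plus:,.0f}")
```

Because no record exceeds $6,263,051, the $25M+ base is zero regardless of the rate applied to it.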

Audit methodology

For each state, loaded the Enhanced CPS dataset via Microsimulation, calculated adjusted_gross_income at the tax unit level, and counted raw records and weighted totals above $1M, $5M, $10M, and $25M thresholds.

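from huggingface_hub import hf_hub_download
from policyengine_us import Microsimulation

state = "NY"  # example value; the audit repeated this for each of the 51 state datasets

# Download the state's Enhanced CPS file from the Hugging Face Hub
dataset_path = hf_hub_download(
    repo_id="policyengine/policyengine-us-data",
    filename=f"states/{state}.h5",
    repo_type="model",
)
sim = Microsimulation(dataset=dataset_path)

# Tax-unit-level AGI and weights for 2026, as NumPy arrays
agi = sim.calculate("adjusted_gross_income", 2026).values
weight = sim.calculate("tax_unit_weight", 2026).values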
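
The counting step is not shown above. A minimal sketch of it, assuming "above" means strictly greater than the threshold and that the "avg wt" columns are weighted total divided by raw count (which matches the table, e.g. AL: 19,593 / 513 ≈ 38). The function name `tail_counts` and the synthetic arrays are illustrative, not part of the audit code.

```python
import numpy as np

THRESHOLDS = (1_000_000, 5_000_000, 10_000_000, 25_000_000)

def tail_counts(agi, weight, thresholds=THRESHOLDS):
    """Return {threshold: (raw_count, weighted_count, avg_weight)}."""
    agi = np.asarray(agi, dtype=float)
    weight = np.asarray(weight, dtype=float)
    results = {}
    for t in thresholds:
        mask = agi > t                      # records strictly above the threshold
        raw = int(mask.sum())               # unweighted record count
        wtd = float(weight[mask].sum())     # weighted tax-unit count
        avg = wtd / raw if raw else 0.0     # average weight per record
        results[t] = (raw, wtd, avg)
    return results

# Synthetic stand-ins for the `agi` and `weight` arrays from the simulation
agi = np.array([500_000, 1_200_000, 5_500_000, 6_263_051])
weight = np.array([100.0, 40.0, 3.0, 2.0])

counts = tail_counts(agi, weight)
for t, (raw, wtd, avg) in counts.items():
    print(f"${t:,}+: raw={raw}, weighted={wtd:,.0f}, avg wt={avg:.1f}")
print(f"Max AGI: ${agi.max():,.0f}")
```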

Full results by state

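| State | $1M+ raw | $1M+ wtd | $1M+ avg wt | $5M+ raw | $5M+ wtd | $5M+ avg wt | $10M+ raw | $10M+ wtd | $25M+ raw | $25M+ wtd | Max AGI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AL | 513 | 19,593 | 38 | 22 | 76 | 3 | 0 | 0 | 0 | 0 | $6,263,051 |
| AK | 67 | 3,263 | 49 | 3 | 136 | 46 | 0 | 0 | 0 | 0 | $5,575,013 |
| AZ | 864 | 40,390 | 47 | 31 | 758 | 24 | 0 | 0 | 0 | 0 | $6,263,051 |
| AR | 325 | 11,072 | 34 | 12 | 366 | 31 | 0 | 0 | 0 | 0 | $6,263,051 |
| CA | 10,353 | 344,096 | 33 | 304 | 5,552 | 18 | 0 | 0 | 0 | 0 | $6,263,051 |
| CO | 1,251 | 45,100 | 36 | 37 | 866 | 23 | 0 | 0 | 0 | 0 | $6,263,051 |
| CT | 839 | 42,203 | 50 | 26 | 463 | 18 | 0 | 0 | 0 | 0 | $6,263,051 |
| DE | 97 | 5,182 | 53 | 2 | 71 | 36 | 0 | 0 | 0 | 0 | $5,571,654 |
| DC | 311 | 9,054 | 29 | 9 | 240 | 27 | 0 | 0 | 0 | 0 | $5,793,470 |
| FL | 1,971 | 208,750 | 106 | 89 | 8,724 | 98 | 0 | 0 | 0 | 0 | $6,263,051 |
| GA | 1,620 | 58,722 | 36 | 42 | 611 | 15 | 0 | 0 | 0 | 0 | $6,263,051 |
| HI | 286 | 5,341 | 19 | 10 | 81 | 8 | 0 | 0 | 0 | 0 | $5,793,470 |
| ID | 178 | 9,561 | 54 | 5 | 112 | 22 | 0 | 0 | 0 | 0 | $5,607,726 |
| IL | 1,818 | 93,636 | 52 | 57 | 1,313 | 23 | 0 | 0 | 0 | 0 | $6,263,051 |
| IN | 781 | 31,170 | 40 | 24 | 181 | 8 | 0 | 0 | 0 | 0 | $6,263,051 |
| IA | 269 | 15,641 | 58 | 7 | 106 | 15 | 0 | 0 | 0 | 0 | $6,263,051 |
| KS | 427 | 14,788 | 35 | 11 | 142 | 13 | 0 | 0 | 0 | 0 | $6,263,051 |
| KY | 475 | 15,236 | 32 | 21 | 133 | 6 | 0 | 0 | 0 | 0 | $6,263,051 |
| LA | 444 | 19,520 | 44 | 9 | 97 | 11 | 0 | 0 | 0 | 0 | $6,263,051 |
| ME | 158 | 6,247 | 40 | 4 | 125 | 31 | 0 | 0 | 0 | 0 | $5,699,046 |
| MD | 1,528 | 36,862 | 24 | 48 | 358 | 7 | 0 | 0 | 0 | 0 | $6,263,051 |
| MA | 1,839 | 78,200 | 43 | 53 | 1,321 | 25 | 0 | 0 | 0 | 0 | $6,263,051 |
| MI | 944 | 52,221 | 55 | 28 | 287 | 10 | 0 | 0 | 0 | 0 | $6,263,051 |
| MN | 1,116 | 35,256 | 32 | 33 | 403 | 12 | 0 | 0 | 0 | 0 | $6,263,051 |
| MS | 263 | 8,166 | 31 | 7 | 13 | 2 | 0 | 0 | 0 | 0 | $5,793,470 |
| MO | 604 | 30,117 | 50 | 23 | 157 | 7 | 0 | 0 | 0 | 0 | $6,263,051 |
| MT | 224 | 6,538 | 29 | 3 | 172 | 57 | 0 | 0 | 0 | 0 | $5,571,654 |
| NE | 189 | 11,803 | 62 | 9 | 62 | 7 | 0 | 0 | 0 | 0 | $6,263,051 |
| NV | 274 | 23,761 | 87 | 10 | 577 | 58 | 0 | 0 | 0 | 0 | $5,699,046 |
| NH | 156 | 11,986 | 77 | 5 | 144 | 29 | 0 | 0 | 0 | 0 | $5,586,008 |
| NJ | 2,275 | 88,779 | 39 | 64 | 958 | 15 | 0 | 0 | 0 | 0 | $6,263,051 |
| NM | 196 | 7,310 | 37 | 5 | 20 | 4 | 0 | 0 | 0 | 0 | $6,263,051 |
| NY | 4,730 | 187,147 | 40 | 153 | 4,335 | 28 | 0 | 0 | 0 | 0 | $6,263,051 |
| NC | 1,498 | 55,637 | 37 | 46 | 385 | 8 | 0 | 0 | 0 | 0 | $6,263,051 |
| ND | 81 | 5,731 | 71 | 5 | 38 | 8 | 0 | 0 | 0 | 0 | $6,263,051 |
| OH | 1,088 | 59,179 | 54 | 30 | 285 | 10 | 0 | 0 | 0 | 0 | $6,263,051 |
| OK | 342 | 16,165 | 47 | 6 | 208 | 35 | 0 | 0 | 0 | 0 | $5,686,040 |
| OR | 942 | 20,904 | 22 | 32 | 398 | 12 | 0 | 0 | 0 | 0 | $6,263,051 |
| PA | 1,598 | 81,179 | 51 | 46 | 815 | 18 | 0 | 0 | 0 | 0 | $6,263,051 |
| RI | 175 | 6,691 | 38 | 3 | 58 | 19 | 0 | 0 | 0 | 0 | $5,675,536 |
| SC | 641 | 23,356 | 36 | 18 | 275 | 15 | 0 | 0 | 0 | 0 | $6,263,051 |
| SD | 48 | 6,585 | 137 | 3 | 116 | 39 | 0 | 0 | 0 | 0 | $5,686,040 |
| TN | 518 | 39,970 | 77 | 18 | 166 | 9 | 0 | 0 | 0 | 0 | $6,263,051 |
| TX | 2,242 | 199,803 | 89 | 98 | 4,544 | 46 | 0 | 0 | 0 | 0 | $6,263,051 |
| UT | 618 | 18,401 | 30 | 18 | 89 | 5 | 0 | 0 | 0 | 0 | $6,263,051 |
| VT | 86 | 3,199 | 37 | 2 | 24 | 12 | 0 | 0 | 0 | 0 | $5,586,008 |
| VA | 1,481 | 56,697 | 38 | 42 | 621 | 15 | 0 | 0 | 0 | 0 | $6,263,051 |
| WA | 631 | 72,514 | 115 | 26 | 2,636 | 101 | 0 | 0 | 0 | 0 | $6,263,051 |
| WV | 114 | 4,967 | 44 | 3 | 11 | 4 | 0 | 0 | 0 | 0 | $5,607,726 |
| WI | 619 | 32,345 | 52 | 20 | 302 | 15 | 0 | 0 | 0 | 0 | $6,263,051 |
| WY | 63 | 4,636 | 74 | 3 | 357 | 119 | 0 | 0 | 0 | 0 | $5,572,205 |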

Key observations

  1. Hard ceiling at $6,263,051: this is the CPS top-coding/swapping cap. 35 of the 51 datasets hit this exact number; the other 16 top out between $5.57M and $5.79M.
  2. Zero observations above $10M in every state: policies targeting $10M+ or $25M+ brackets cannot be modeled.
  3. Sparse $5M+ data: even large states have very few records (NY: 153 raw / 4,335 weighted; CA: 304 raw / 5,552 weighted), and the smallest have as few as 2 (DE, VT). Many of these records have tiny weights (1-5), so the few records with larger weights can disproportionately drive results.
  4. Weight instability: average weights for $5M+ records vary wildly (FL: 98, WA: 101 vs MS: 2, KY: 6), suggesting calibration noise at the tail.
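
One way to quantify how much a few records dominate a weighted tail is Kish's effective sample size, (Σw)² / Σw². This was not part of the audit; the sketch below uses made-up weights purely to illustrate the measure, and `effective_sample_size` is a hypothetical helper name.

```python
import numpy as np

def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    w = np.asarray(weights, dtype=float)
    total = w.sum()
    if total == 0:
        return 0.0
    return float(total ** 2 / (w ** 2).sum())

# Four records with equal weights behave like four observations
equal = effective_sample_size([10, 10, 10, 10])

# Four records where one carries almost all the weight behave like roughly one
skewed = effective_sample_size([1, 1, 1, 97])

print(f"equal weights:  n_eff = {equal:.2f}")
print(f"skewed weights: n_eff = {skewed:.2f}")
```

Applied to the $5M+ records of a state, a value far below the raw count would indicate that the weighted total rests on very few records.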

Practical consequence

For the state legislative tracker, any bill targeting income above $5M will have unreliable estimates, and bills targeting $10M+ or $25M+ will show zero impact from those brackets. This needs to be disclosed as a data limitation or addressed via tail imputation (e.g., Pareto extrapolation from IRS SOI data).
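
As a rough illustration of what a Pareto extrapolation implies, the sketch below scales a weighted count above a base threshold to higher thresholds using the Pareto survival function, N(t) = N(base) · (base / t)^alpha. The shape parameter `ALPHA` is a hypothetical placeholder, not a value estimated from IRS SOI data, and `pareto_tail_count` is an illustrative helper rather than anything in policyengine-us-data. The NY count is the $1M+ weighted total from the table above.

```python
def pareto_tail_count(n_above_base, base, threshold, alpha):
    """Expected weighted count above `threshold` under a Pareto tail.

    `n_above_base` is the weighted count above `base`; `alpha` is the
    Pareto shape parameter (an assumption supplied by the caller).
    """
    if threshold < base:
        raise ValueError("threshold must be at or above base")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return n_above_base * (base / threshold) ** alpha

ALPHA = 1.5            # hypothetical placeholder; would need to be fit to SOI data
NY_ABOVE_1M = 187_147  # NY weighted $1M+ tax units, from the results table

for threshold in (10_000_000, 25_000_000):
    n = pareto_tail_count(NY_ABOVE_1M, 1_000_000, threshold, ALPHA)
    print(f"Implied NY tax units above ${threshold:,}: {n:,.0f}")
```

Any positive shape parameter gives a nonzero count above $10M and $25M, whereas the current datasets record zero. An actual imputation would also need to assign incomes to the synthetic records and recalibrate weights, which this sketch does not attempt.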
