This repo contains three completed experiment phases. The front-door claim is:
On a held-out procedural micro-world task, Unknown/non-entailment is recoverable from verdict-region hidden states, while decoder outputs under-express it.
The old “global topology of truth” headline did not survive.
- Paper source: paper/main.tex
- Paper PDF: paper/paper.pdf
- Claim-to-file mapping: CLAIMS_AND_EVIDENCE.md
- Repository file map: REPO_MAP.md
- Experiment index: docs/experiments/README.md
- Phase A (negative): global H0/H1 topology on GSM8K traces
  docs/experiments/01_global_h0_topology/README.md
- Phase B (positive): fixed-decoding convergence failure on GSM8K
  docs/experiments/02_convergence_failure/README.md
- Phase C (main positive): procedural micro-world representation–decoder dissociation
  docs/experiments/03_micro_world_semantics/README.md
- Phase A: corrected topology-only AUC = `0.20` (controls = `0.60`)
- Phase B: cap→wrong AUC = `0.839`; OR ≈ `26.40`
- Phase C: decoder Unknown recall (`Qwen2B=0.000`, `Qwen4B=0.000`, `Gemma-it=0.0125`) while verdict-token probe Unknown recall (`0.7375`, `0.1292`, `0.5625`)
- Phase C add-ons:
  - latent pre-readout steering (Gemma-it raw) improves macro-F1 `0.339→0.395` with Unknown recall `0.375→0.363`
  - shallow MLP sensitivity shows strong hidden Unknown signal on Qwen4B no-think (`verdict_token` recall `0.129→0.638`)
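For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how per-class recall and macro-F1 are conventionally computed. It is not the repo's evaluation code: the label set (`True`, `False`, `Unknown`), the helper names, and the toy predictions are all assumptions made for illustration.

```python
# Illustrative only: label names and toy data are assumptions, not repo outputs.
LABELS = ("True", "False", "Unknown")


def per_class_counts(gold, pred, label):
    """Return (true positives, false positives, false negatives) for one label."""
    pairs = list(zip(gold, pred))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    return tp, fp, fn


def recall(gold, pred, label):
    """Fraction of gold `label` items that were predicted as `label`."""
    tp, _, fn = per_class_counts(gold, pred, label)
    return tp / (tp + fn) if (tp + fn) else 0.0


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-class F1, so a rare class counts as much as a common one."""
    scores = []
    for label in labels:
        tp, fp, fn = per_class_counts(gold, pred, label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: a "decoder" that never outputs Unknown vs. a "probe" that sometimes does.
gold = ["True", "False", "Unknown", "Unknown", "True", "False"]
decoder_pred = ["True", "False", "True", "False", "True", "False"]
probe_pred = ["True", "False", "Unknown", "False", "True", "False"]

print("decoder Unknown recall:", recall(gold, decoder_pred, "Unknown"))
print("probe Unknown recall:  ", recall(gold, probe_pred, "Unknown"))
print("decoder macro-F1:      ", round(macro_f1(gold, decoder_pred), 3))
print("probe macro-F1:        ", round(macro_f1(gold, probe_pred), 3))
```

Because macro-F1 averages over classes without weighting, a system that never predicts Unknown is penalized by a full class's worth of F1, which is why the metric is reported alongside Unknown recall.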
- Legacy root Python scripts were moved to `phase_a_global_h0_topology/scripts/`.
- All committed artifacts are kept under `artifacts/` and `outputs/`. `.npz` artifacts are committed (not ignored).
- Locked deps: repro/requirements.lock.txt
- Rebuild guide: repro/README.md
- CI build/release: .github/workflows/paper.yml