Summary
We found that Persistent Homology (H0 total persistence) on class-mean direction vectors provides a real-time overfitting signal with r=0.998 correlation to the generalization gap — often detecting overfitting before the train/test accuracy gap becomes visible.
Method
- Extract direction vectors from model:
d = normalize(engine_A(x) - engine_G(x))
- Compute per-class mean directions
- Build cosine distance matrix between class centroids
- Run H0 persistent homology (via ripser)
- Compare H0_train vs H0_test — the gap predicts overfitting
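The steps above can be sketched in plain Python. This is a minimal illustration and not the repo's code: it assumes the direction vectors d have already been extracted, and instead of calling ripser it sums minimum-spanning-tree edge weights, which equal the finite H0 bar lengths of a Vietoris-Rips filtration. The function names and the toy vectors are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def class_mean_directions(directions, labels):
    """Average the direction vectors of each class; returns {label: mean vector}."""
    sums, counts = {}, {}
    for d, y in zip(directions, labels):
        if y not in sums:
            sums[y] = [0.0] * len(d)
            counts[y] = 0
        sums[y] = [s + x for s, x in zip(sums[y], d)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}

def cosine_distance_matrix(vectors):
    """Pairwise cosine distance (1 - cosine similarity) between vectors."""
    unit = [normalize(v) for v in vectors]
    n = len(unit)
    return [
        [0.0 if i == j else 1.0 - sum(a * b for a, b in zip(unit[i], unit[j]))
         for j in range(n)]
        for i in range(n)
    ]

def h0_total_persistence(dist):
    """Sum of finite H0 bars, computed as the MST weight (Kruskal + union-find).

    Assumption: this stands in for ripser; the infinite H0 bar is excluded.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    total = 0.0
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:          # edge merges two components: one H0 bar dies at w
            parent[ri] = rj
            total += w
    return total

# Toy example (made-up vectors, not data from the post).
toy_dirs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
toy_labels = [0, 0, 1, 2]
centroids = list(class_mean_directions(toy_dirs, toy_labels).values())
h0 = h0_total_persistence(cosine_distance_matrix(centroids))
```

Running the same function on train and test centroids gives H0_train and H0_test; how the gap between them is defined (signed or absolute) is not stated in the post.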
Also includes
- Automatic LR search: the LR whose H0 has the lowest coefficient of variation (CV) over 5 epochs is taken as the optimal LR
- 1-epoch difficulty prediction: H0 after 1 epoch predicts final accuracy (H0=4.38 → 98.3%, H0=2.02 → 52.0%)
- Confusion prediction: H0 merge order = confusion pairs (Spearman r=-0.97)
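The LR search in the first bullet can be sketched as follows. This is a rough illustration under stated assumptions: the H0 value for each epoch has already been computed for every candidate LR, CV is taken as population standard deviation divided by mean, and the numbers in the example are invented for demonstration. The names coefficient_of_variation and pick_lr are hypothetical and not taken from the repo.

```python
import statistics

def coefficient_of_variation(values):
    """CV = standard deviation / mean (population std is an assumption here)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.pstdev(values) / abs(mean)

def pick_lr(h0_by_lr):
    """Return the learning rate whose per-epoch H0 trace has the lowest CV."""
    return min(h0_by_lr, key=lambda lr: coefficient_of_variation(h0_by_lr[lr]))

# Invented H0 traces over 5 epochs for two candidate learning rates.
h0_traces = {
    1e-2: [1.0, 3.0, 1.0, 3.0, 1.0],   # unstable H0 -> high CV
    1e-3: [2.0, 2.1, 2.0, 2.1, 2.0],   # stable H0 -> low CV
}
best_lr = pick_lr(h0_traces)
```

With these made-up traces the more stable run (1e-3) is selected; real traces would come from short 5-epoch training runs at each candidate LR.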
Verified results
| Dataset | Accuracy | Best LR | Early Stop | Time |
|---|---|---|---|---|
| MNIST | 98.3% | 1e-03 | no | 2.2 min |
| Fashion | 87.4% | 3e-04 | no | 2.2 min |
| CIFAR-10 | 52.0% | 1e-03 | yes (ep 6) | 1.4 min |
CIFAR early-stopped at epoch 6 when H0_gap exceeded threshold — preventing wasted compute on a model that was already overfitting.
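An early-stopping rule of this kind could look like the sketch below. The post gives neither the threshold value nor the exact definition of H0_gap, so both are assumptions here: the gap is taken as the absolute difference between H0_train and H0_test, and the threshold and traces in the example are invented. The function name is hypothetical.

```python
def first_epoch_over_threshold(h0_train, h0_test, threshold):
    """Return the first 1-based epoch where |H0_train - H0_test| > threshold.

    Returns None if the gap never exceeds the threshold.
    Assumption: the gap is an absolute difference; the post does not specify.
    """
    for epoch, (tr, te) in enumerate(zip(h0_train, h0_test), start=1):
        if abs(tr - te) > threshold:
            return epoch
    return None

# Invented per-epoch H0 values, for illustration only.
stop_epoch = first_epoch_over_threshold(
    h0_train=[1.0, 1.2, 1.5],
    h0_test=[1.0, 1.1, 1.0],
    threshold=0.3,
)
```

In a training loop the same check would run once per epoch and break out of the loop when it fires.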
Repo: https://github.com/need-singularity/ph-training
Install: run pip install -e . from the repo root, then run ph-train --dataset cifar
Related projects
- logout — Consciousness Continuity Engine. The main research project with the dual-engine (PureFieldEngine) architecture that produces direction vectors analyzed by PH.
- Anima — Conversational consciousness agent with real-time PH overfitting detection integrated into the live inference loop.
- ph-training — Standalone training pipeline.
  Install with pip install -e . and then run ph-train --dataset cifar.