Emergent Self-Awareness as the Catalyst for Intrinsic Evolutionary Learning
A research project investigating whether explicit, computable self-awareness mechanisms can induce a phase transition from extrinsic optimization to open-ended, intrinsically motivated learning in artificial systems.
Current artificial neural networks optimize statistical patterns under external loss functions. Biological intelligence, at its most distinctive, generates self-authored meaning that drives learning beyond survival or external reward.
This project asks: Can we engineer the minimal dynamical conditions under which a self-model + intrinsic meaning pressure produces qualitatively new learning behaviors — even in small networks?
This is not a claim about phenomenal consciousness. It is a precise, falsifiable hypothesis about functional selfhood as a computational primitive.
Start with docs/README.md. The most important documents are:
- RESEARCH_VISION.md — concise project vision.
- docs/RESEARCH_ORIGIN_KO.md — original Korean research-origin conversation.
- docs/EXPERIMENT_PROTOCOL.md — binding protocol. All experiments must follow this.
- docs/MAIN_EXPERIMENT_MATRIX.md — locked conditions, seeds, lambda sweep, evaluation budgets, and result root.
- docs/MAIN_EXPERIMENT_PROTOCOL.md — current main experiment protocol after the post-matrix diagnosis.
- docs/PRELIMINARY_CAMPAIGN_AUDIT.md — transparent audit of the preliminary 200-run campaign.
- docs/TOP_TIER_COMPLETION_PLAN.md — remaining path to NeurIPS/ICLR-level submission quality.
Phase 0 — Foundation Locking: COMPLETE
All strategic, philosophical, and experimental design documents were written and cross-referenced before major implementation.
Phase 1 — Core Infrastructure + Rigorous Self-Audit: COMPLETE
We built a high-quality, protocol-aware experimental foundation (Neural ODE + recursive self-model + intrinsic term, full reproducibility stack, parameter matching, IAS/SND/Φ̂ metrics, etc.).
In May 2026 we executed a preliminary 200-run matrix campaign. A thorough post-execution artifact review revealed several material deviations from the original protocol intent, most critically in the GridWorld environment design and IAS measurement validity. These issues are documented transparently in:
→ docs/PRELIMINARY_CAMPAIGN_AUDIT.md (canonical reference)
Major Fidelity Remediation: COMPLETE (late May 2026)
Following that post-execution audit, the following critical protocol deviations were addressed in code and configuration:
- GridWorld stochastic goal support (stochastic_goal_prob: 0.15 default)
- IAS goal-position filtering
- Stochastic zero-reward evaluation (eval_temperature: 0.7 default)
- Intrinsic term per-trajectory entropy calculation (vectorized per_item_then_mean)
- Neural ODE adjoint method enabled by default for memory efficiency
- Full trajectory persistence wiring
- Comprehensive performance and public repository hygiene improvements
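As a rough illustration of two of the items above, the sketch below shows temperature-scaled action probabilities and a per-item-then-mean entropy. It assumes that eval_temperature divides the logits before a softmax, and that per_item_then_mean means computing Shannon entropy for each trajectory separately and then averaging. Both readings are inferred from the names only; the function names are hypothetical and are not taken from src/.

```python
import math
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], temperature: float = 0.7) -> List[float]:
    """Softmax over logits / temperature (assumed meaning of eval_temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def per_item_then_mean_entropy(batch: Sequence[Sequence[float]]) -> float:
    """Entropy of each trajectory's distribution, then the mean over the batch."""
    if not batch:
        raise ValueError("batch must be non-empty")
    return sum(shannon_entropy(p) for p in batch) / len(batch)


def pooled_entropy(batch: Sequence[Sequence[float]]) -> float:
    """Entropy of the batch-averaged distribution, shown only for contrast."""
    n = len(batch)
    mean_dist = [sum(column) / n for column in zip(*batch)]
    return shannon_entropy(mean_dist)
```

For two deterministic trajectories such as [1.0, 0.0] and [0.0, 1.0], the per-item mean is zero while the pooled value is ln 2. Pooling mixes between-trajectory diversity into the entropy, which is one plausible reason to compute it per trajectory.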
Current Focus: Run the locked main experiment matrix using the key/door GridWorld, strict null-adjusted IAS, episode-level GridWorld self/intrinsic coupling, and gradient diagnostics. Scientific integrity remains the top priority.
All major remediation items have been implemented. The codebase is now in a state capable of properly testing the original research hypothesis.
Hardware Target: Still designed for a single consumer GPU; CPU-only runs are sufficient for verification.
Important for contributors / future readers: This repository values radical honesty. The audit document above is required reading before interpreting any results or planning new experiments.
Public repository hygiene: Generated experiment outputs are not committed. results/ and analysis/ contain only README/.gitkeep placeholders; large or flawed historical artifacts are intentionally excluded from the public tree.
This project has strict reproducibility requirements. We have encountered NumPy 2.x compatibility issues with certain PyTorch versions.
Recommended Setup
# 1. Create a clean environment (strongly recommended)
python -m venv isadn-env
source isadn-env/bin/activate # Linux/Mac
# isadn-env\Scripts\activate # Windows
# 2. Install PyTorch first (CPU version is fine for Phase 1 verification)
pip install torch --index-url https://download.pytorch.org/whl/cpu
# 3. Install remaining dependencies
pip install -r requirements.txt
# 4. (Optional but recommended for full metrics)
pip install scikit-learn
Verified Local CUDA Setup
For the current local RTX 3060 Ti machine, use the pinned CUDA 12.1 environment:
bash scripts/setup_local_env.sh
See docs/LOCAL_ENVIRONMENT_SETUP.md for the exact verified package versions and smoke-test commands.
Known Issues
- NumPy ≥ 2.0 can cause warnings or crashes with some older PyTorch wheels. If you see "Failed to initialize NumPy", try pip install "numpy<2".
- torchdiffeq is required for the Neural ODE dynamics. The code uses lazy imports so basic model construction tests can run without it.
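Both issues can be detected before a run starts. The following is a minimal sketch, not the repository's code: the helper names are hypothetical, and the lazy-import pattern is one common way to defer torchdiffeq until an ODE solve is requested. The only third-party name used is torchdiffeq's odeint function.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def numpy_is_2_or_newer() -> bool:
    """True when NumPy is installed and its major version is at least 2."""
    version = installed_version("numpy")
    return version is not None and major_version(version) >= 2


def module_available(name: str) -> bool:
    """Check that a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def load_odeint():
    """Import torchdiffeq lazily, only when the ODE solver is needed."""
    if not module_available("torchdiffeq"):
        raise ImportError(
            "torchdiffeq is required for the Neural ODE dynamics; "
            "install it with: pip install torchdiffeq"
        )
    from torchdiffeq import odeint  # deferred import
    return odeint
```

Because load_odeint is only called on the solver path, code that merely constructs a model does not need torchdiffeq installed.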
Reproducibility
Every experiment must be run through ExperimentConfig. The ExperimentLogger automatically saves:
- Complete resolved config
- Environment snapshot (Python + key packages + git commit when available)
- Protocol Deviation Log (mandatory when deviating from EXPERIMENT_PROTOCOL.md)
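To make the environment snapshot concrete, here is a minimal sketch of how such a record could be collected with the standard library. It is an illustration under assumptions: the function names, the dictionary keys, and the default package list are hypothetical, and the real ExperimentLogger may store more fields or use a different layout.

```python
import json
import platform
import subprocess
from importlib import metadata
from typing import Dict, Optional, Sequence


def git_commit() -> Optional[str]:
    """Return the current commit hash, or None when git is unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git missing, or not inside a repository
    return result.stdout.strip() or None


def package_versions(packages: Sequence[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def environment_snapshot(
    packages: Sequence[str] = ("torch", "numpy", "torchdiffeq"),
) -> dict:
    """Collect Python version, key package versions, and the git commit."""
    return {
        "python": platform.python_version(),
        "packages": package_versions(packages),
        "git_commit": git_commit(),
    }


if __name__ == "__main__":
    # JSON keeps the snapshot diffable between runs.
    print(json.dumps(environment_snapshot(), indent=2))
```

Recording a missing package or commit as None, rather than raising, matches the "when available" wording above.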
After installation, the main entry points for validation are:
python -m pytest -q
python scripts/verify_fidelity_gates.py
python scripts/preflight_check.py --output-root results/main_experiment_2026-05
These checks cover deterministic unit guards, parameter/fidelity gates, output-root freshness, anti-dummy guards, parameter matching, and logger manifest writing.
Optional reduced smoke check:
mkdir -p results/smoke_check_20260531
python -u scripts/run_main_experiment_matrix.py \
--envs gridworld \
--conditions baseline,full \
--seeds 42 \
--lambdas 0.1 \
--grid-training-episodes 32 \
--grid-training-horizon 24 \
--grid-batch-size 4 \
--eval-episodes 4 \
--eval-episode-length 40 \
--ode-internal-steps 3 \
--output-root results/smoke_check_20260531
This is intentionally reduced-scale. It validates the full stack before the main multi-seed experiment; it is not paper-scale evidence.
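To estimate how many runs a set of flags implies, the sketch below expands comma-separated values into a full Cartesian product. This is an assumption about how the flags combine, not a description of scripts/run_main_experiment_matrix.py: the actual runner may, for example, ignore the lambda value for the baseline condition. The helper names are hypothetical.

```python
from itertools import product
from typing import Dict, List


def split_flag(value: str) -> List[str]:
    """Split a comma-separated flag value, dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def expand_matrix(envs: str, conditions: str, seeds: str, lambdas: str) -> List[Dict[str, object]]:
    """Enumerate every env x condition x seed x lambda combination."""
    combos = product(
        split_flag(envs),
        split_flag(conditions),
        split_flag(seeds),
        split_flag(lambdas),
    )
    return [
        {"env": env, "condition": cond, "seed": int(seed), "lambda": float(lam)}
        for env, cond, seed, lam in combos
    ]


# Flag values copied from the smoke check above.
runs = expand_matrix("gridworld", "baseline,full", "42", "0.1")
for run in runs:
    print(run)
```

Under the Cartesian-product assumption, the smoke check corresponds to two runs: one baseline and one full.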
Main experiment launch and result aggregation:
mkdir -p results/main_experiment_2026-05
python -u scripts/run_main_experiment_matrix.py \
--campaign-blocks core_evidence,grid_layout_generalization,mechanistic_negative_controls \
--enable-gradient-diagnostics \
--gradient-diagnostic-every 15 \
--main-campaign \
--output-root results/main_experiment_2026-05
python scripts/analyze_results.py \
--results-root results/main_experiment_2026-05 \
--output-dir analysis/main_experiment_2026-05
python scripts/analyze_gradient_diagnostics.py results/main_experiment_2026-05
scripts/preflight_check.py does not launch experiments. It verifies dependencies, matrix lock, anti-dummy guards, parameter matching, and logger manifest writing before paper-scale execution.
scripts/run_main_experiment_matrix.py is batch-optimized and reads the locked matrix by default.
ISADN/
├── README.md
├── RESEARCH_VISION.md
├── requirements.txt
├── .gitignore
├── docs/ # Research specification, protocol, audits, and roadmap
│ ├── README.md # Documentation index
│ ├── RESEARCH_ORIGIN_KO.md # Original Korean research-origin conversation
│ ├── RESEARCH_AMBITION_AND_STRATEGY.md # Core philosophy & success criteria (B-centric)
│ ├── RELATED_WORK_AND_GAP_ANALYSIS.md # Rigorous gap analysis vs 2024-2026 literature
│ ├── EXPERIMENTAL_DESIGN_PHILOSOPHY.md # Why and how we prioritize interpretive power (B)
│ ├── EXPERIMENT_PROTOCOL.md # Binding experimental protocol (must be followed)
│ ├── ISADN_ARCHITECTURE.md # Detailed model specification (<500k params)
│ ├── MATHEMATICAL_FRAMEWORK.md
│ ├── MAIN_EXPERIMENT_MATRIX.md # Human-readable main experiment lock
│ ├── MAIN_EXPERIMENT_PROTOCOL.md # Current protocol after the post-matrix diagnosis
│ ├── PRELIMINARY_CAMPAIGN_AUDIT.md # Transparent audit of the preliminary campaign
│ ├── FIDELITY_VERIFICATION_REPORT.md
│ ├── LANGUAGE_NARRATIVE_EVALUATION_PROTOCOL.md
│ ├── PAPER_DESIGN.md
│ ├── RESEARCH_ROADMAP.md
│ └── TOP_TIER_COMPLETION_PLAN.md # Living plan to reach NeurIPS/ICLR quality
│
├── src/ # Core implementation
│ ├── models/ # ISADN architecture (Neural ODE + g_φ + H)
│ │ ├── ode_core.py
│ │ ├── self_model.py
│ │ ├── intrinsic.py
│ │ └── isadn.py # Main model + 4-condition factory
│ ├── envs/ # Two diagnostic environments
│ │ ├── gridworld.py # 6×6 key/door Sparse-Reward GridWorld
│ │ └── language.py # 100-token Self-Referential Language (Env B)
│ ├── training/ # Training & reproducibility infrastructure
│ │ ├── config.py # ExperimentConfig (single source of truth)
│ │ ├── logger.py # Full reproducibility logging + Protocol Deviation Log
│ │ ├── lambda_scheduler.py
│ │ ├── loss_computer.py
│ │ ├── parameter_matcher.py
│ │ └── trainer.py
│ └── evaluation/ # Metrics & evaluation runners
│ ├── metrics.py # IAS, SND, Emergent Φ̂ (protocol-exact)
│ └── runner.py # Zero-reward evaluation runner
│
├── experiments/ # Unit and protocol guard tests
│ ├── test_model_construction.py
│ ├── test_parameter_matching.py
│ ├── test_env_b_mechanics.py
│ ├── test_gridworld_protocol.py
│ └── test_metric_guards.py
│
├── scripts/ # Preflight, main runner, and result aggregation
│ ├── preflight_check.py
│ ├── run_main_experiment_matrix.py
│ ├── verify_fidelity_gates.py
│ ├── analyze_results.py
│ └── analyze_gradient_diagnostics.py
│
├── results/ # Generated experiment outputs; ignored except .gitkeep
└── analysis/ # Generated analysis outputs; ignored except .gitkeep
If the central claim is supported, it suggests that the next major leap in AI capability may require treating selfhood as a first-class architectural and objective-design target — not merely an emergent side-effect of scale.
This is a high-risk, high-clarity research bet. We are deliberately prioritizing interpretive power (B) and causal clarity over SOTA-chasing.
If you build on this work before a paper is available, cite this repository and the commit hash. When a preprint exists, cite the preprint as the canonical research artifact.
Contributions that help close the remaining items in docs/TOP_TIER_COMPLETION_PLAN.md are very welcome.
Maintainer: gongjae
License: MIT
Contact: Open an issue or discussion on the public repository.
This repository was intentionally built with extremely high-quality documentation and infrastructure before large-scale experimentation. The vision, gaps, protocol, and success criteria were locked down first.