📝 Add mandatory field-notes system and Feb 12 session gotchas

robertdevore · robertdevore · commit 101a0131f2d8 · 2026-02-12T15:05:11.000-05:00
diff --git a/notes/2026-02-12_14-52_hashmap-fusion-jit-sealing-release.md b/notes/2026-02-12_14-52_hashmap-fusion-jit-sealing-release.md
@@ -0,0 +1,70 @@
+# Ruff Field Notes — Hashmap loop fusion, JIT sealing fix, and v0.9.0 release prep
+
+**Date:** 2026-02-12
+**Session:** 14:52 local
+**Branch/Commit:** main / 7f37e2b
+**Scope:** Implemented hashmap-oriented bytecode/compiler/VM optimizations, fixed JIT sealing regressions causing test failures, and completed release preparation for v0.9.0. Validated with full build/test and cross-language benchmarks.
+
+---
+
+## What I Changed
+- Added fused hashmap loop opcodes in `src/bytecode.rs` for common integer map accumulation/fill patterns.
+- Added compiler pattern lowering for canonical while-loop hashmap patterns in `src/compiler.rs`.
+- Implemented fused opcode execution paths and VM-level tests in `src/vm.rs`.
+- Fixed Cranelift block sealing regressions in `src/jit.rs` to eliminate panics in JIT tests.
+- Updated release/docs metadata in `CHANGELOG.md`, `ROADMAP.md`, `README.md`, and `Cargo.toml` for v0.9.0.
+- Ran `cargo build`, `cargo test`, and full cross-language benchmarks via `benchmarks/cross-language/run_benchmarks.sh`.
+- Created release commit and tag, then pushed `main` and `v0.9.0` to remote.
+
+## Gotchas (Read This Next Time)
+- **Gotcha:** Cranelift block sealing must happen in strict control-flow order.
+  - **Symptom:** JIT tests panic with Cranelift assertions around already-sealed or improperly sealed blocks.
+  - **Root cause:** Some paths sealed blocks early or attempted duplicate sealing when loop/header flow had multiple edges.
+  - **Fix:** Centralized and corrected sealing logic in `src/jit.rs` so blocks are sealed exactly once at the right stage.
+  - **Prevention:** Treat block sealing as a control-flow invariant; when touching JIT branching/loops, re-run targeted JIT tests before full suite.
+
+- **Gotcha:** Hashmap loop fusion only applies to specific canonical shapes.
+  - **Symptom:** Expected fusion does not occur for semantically equivalent loops with reordered or extra statements.
+  - **Root cause:** Compiler optimization is pattern-based, not a full general-purpose loop optimizer.
+  - **Fix:** Keep matcher strict and optimize known hot patterns first.
+  - **Prevention:** Add new fusion patterns intentionally with tests; do not assume all equivalent loops are currently optimized.
+
+- **Gotcha:** Benchmark conclusions can drift if non-comparable workloads are mixed.
+  - **Symptom:** Performance claims vary run-to-run when benchmark mix or warmup conditions change.
+  - **Root cause:** Startup overhead and workload differences dominate tiny tasks.
+  - **Fix:** Use the cross-language benchmark harness consistently and compare the same scripts across Ruff/Python/Go.
+  - **Prevention:** Keep benchmark table generation tied to one command and one benchmark set per report.
+
+## Things I Learned
+- JIT correctness invariants (SSA/block sealing) are as release-critical as throughput gains; performance changes must be paired with stability sweeps.
+- In this codebase, high-confidence optimization work means: targeted tests first, full suite second, cross-language benchmark last.
+- Release readiness should include metadata synchronization (`Cargo.toml`, changelog, roadmap, README), not just green tests.
+- Canonical-pattern fusion delivers meaningful wins quickly without requiring broad IR redesign.
+
+## Debug Notes (Only if applicable)
+- **Failing test / error:** Cranelift JIT panics related to block sealing invariants (already-sealed / incorrect sealing order).
+- **Repro steps:** Run `cargo test` after JIT/optimizer edits; failures appear in JIT-focused tests.
+- **Breakpoints / logs used:** Focused on JIT control-flow/sealing paths in `src/jit.rs`; iterated with targeted runs then full suite.
+- **Final diagnosis:** Block sealing was performed at inconsistent points for some control-flow shapes; corrected ordering removed panics.
+
+## Follow-ups / TODO (For Future Agents)
+- [ ] Add focused regression tests that isolate problematic sealing shapes (nested loops/branches) at JIT translation boundaries.
+- [ ] Expand hashmap fusion matcher coverage incrementally with benchmark-backed additions.
+- [ ] Automate release checklist into a single script (build + tests + benchmark + metadata checks).
+
+## Links / References
+- Files touched:
+  - `src/bytecode.rs`
+  - `src/compiler.rs`
+  - `src/vm.rs`
+  - `src/jit.rs`
+  - `CHANGELOG.md`
+  - `ROADMAP.md`
+  - `README.md`
+  - `Cargo.toml`
+- Related docs:
+  - `README.md`
+  - `ROADMAP.md`
+  - `CHANGELOG.md`
+  - `notes/FIELD_NOTES_SYSTEM.md`
+  - `notes/GOTCHAS.md`
diff --git a/notes/FIELD_NOTES_SYSTEM.md b/notes/FIELD_NOTES_SYSTEM.md
@@ -0,0 +1,286 @@
+# Ruff AI Agent Field Notes & Gotchas System
+
+This document defines a **mandatory post-work note-taking system** for AI agents working on the Ruff programming language. The goal is to capture *hands-on, experience-based knowledge* so future agents build faster, avoid mistakes, and develop correct mental models of the codebase.
+
+This is not documentation.
+This is **operational memory**.
+
+---
+
+## 🎯 Purpose
+
+After implementing features, fixing bugs, refactoring, or debugging Ruff, the AI agent must distill what it learned into structured Markdown notes.
+
+These notes should capture:
+
+* surprising behavior
+* incorrect assumptions
+* fragile areas of the code
+* ordering constraints
+* edge cases
+* rules the code implicitly relies on
+
+Future agents should be able to read these notes and *immediately* avoid repeated mistakes.
+
+---
+
+## 📁 Folder Structure
+
+```
+./notes/
+├── YYYY-MM-DD_HH-mm_short-kebab-summary.md
+├── YYYY-MM-DD_HH-mm_another-session.md
+├── GOTCHAS.md
+└── README.md (optional index)
+```
+
+* One notes file per work session
+* Notes are append-only (never overwrite or rename old files)
+
+---
+
+## 🧾 Session Notes File Rules
+
+### Filename format (required)
+
+```
+YYYY-MM-DD_HH-mm_short-kebab-summary.md
+```
+
+Examples:
+
+* `2026-01-25_10-14_parser-gotchas.md`
+* `2026-01-25_18-02_vm-scope-fix.md`
+
+If `./notes/` does not exist, create it.
+
+---
+
+## 🧠 When Notes Must Be Written
+
+Create or update a session notes file after:
+
+* adding a feature
+* fixing a bug
+* debugging a test failure
+* changing parser / lexer / AST / evaluator / runtime behavior
+* modifying CLI behavior
+* discovering surprising or non-obvious behavior
+* correcting an incorrect assumption
+
+**One work session = one notes file**
+
+---
+
+## 🧱 Required Session Notes Template
+
+Each session notes file **must use this exact structure**:
+
+```md
+# Ruff Field Notes — <short human title>
+
+**Date:** <YYYY-MM-DD>
+**Session:** <HH:mm local>
+**Branch/Commit:** <branch name> / <commit hash (if known)>
+**Scope:** <1–2 sentences describing what you worked on>
+
+---
+
+## What I Changed
+- <bullet list of concrete changes>
+- <include file paths when helpful>
+
+## Gotchas (Read This Next Time)
+- **Gotcha:** <what surprised you>
+  - **Symptom:** <what you observed>
+  - **Root cause:** <why it happened>
+  - **Fix:** <what resolved it>
+  - **Prevention:** <how to avoid it next time>
+
+(repeat for each gotcha)
+
+## Things I Learned
+- <mental model updates>
+- <rules of thumb>
+- <implicit invariants or ordering constraints>
+
+## Debug Notes (Only if applicable)
+- **Failing test / error:** <exact error output>
+- **Repro steps:** <how to reproduce>
+- **Breakpoints / logs used:** <where you looked>
+- **Final diagnosis:** <what it actually was>
+
+## Follow-ups / TODO (For Future Agents)
+- [ ] <specific next step>
+- [ ] <tech debt introduced or deferred>
+
+## Links / References
+- Files touched:
+  - `<path>`
+  - `<path>`
+- Related docs:
+  - `README.md`
+  - `ROADMAP.md`
+  - <other docs>
+```
+
+---
+
+## ✍️ Writing Style Rules (Critical)
+
+* Write like you are leaving a note to your **future self** with partial context
+* Be concrete, not theoretical
+* Include file paths, function names, enums, structs, and commands
+* If something *will surprise someone*, call it out explicitly
+* If something relies on an ordering constraint, write it as a **rule**
+* If unsure what to write, write *less*, but make it *specific*
+
+### 🚨 Mandatory Capture Rule: "Justified Behavior"
+
+If the agent has to **justify** why something is OK, expected, intentional, or safe, it **must be documented**.
+
+This includes moments where the agent says or implies:
+
+* "This warning is expected"
+* "This is intentional"
+* "This is safe because…"
+* "We can ignore this"
+* "It only happens in X, not Y"
+* "This is a known limitation"
+* "We’ll clean this up later"
+
+These statements compress non-obvious reasoning and **will not be obvious from the code alone**.
+
+When this happens, add an entry under **Gotchas** or **Things I Learned**.
+
+### Example (Required Documentation Pattern)
+
+```md
+- **Gotcha:** Compiler warning about `Spread` being unused as an `Expr`
+  - **Symptom:** Warning emitted during compile about `Spread` enum variant
+  - **Root cause:** `Spread` is intentionally NOT a standalone `Expr`
+  - **Fix:** None — this is expected behavior
+  - **Prevention:** Do not refactor `Spread` into `Expr` without redesigning
+    array/dict element handling. `Spread` is only valid within
+    `ArrayElement` and `DictElement`.
+```
+
+### Optional (High Value): Assumptions I Almost Made
+
+If applicable, add:
+
+```md
+## Assumptions I Almost Made
+- I initially assumed `Spread` should be an `Expr`, but that breaks
+  contextual evaluation rules
+```
+
+🚫 Avoid:
+
+* generic explanations
+* textbook definitions
+* restating obvious code behavior
+
+---
+
+## 🧠 Curated GOTCHAS.md (Deduplicated)
+
+In addition to session notes, maintain a **curated, long-lived gotchas file**:
+
+```
+./notes/GOTCHAS.md
+```
+
+### Purpose
+
+* High-signal summary of the most important pitfalls in Ruff
+* Deduplicated across sessions
+* Clean, readable, and short
+* Suitable for onboarding new agents
+
+Session notes are raw.
+`GOTCHAS.md` is refined.
+
+---
+
+## 📘 GOTCHAS.md Structure
+
+```md
+# Ruff — Known Gotchas & Sharp Edges
+
+This document contains the most important non-obvious pitfalls in the Ruff codebase.
+
+If you are new to the project, read this first.
+
+---
+
+## Parser & Syntax
+
+### Expression precedence is NOT inferred
+- **Problem:** <what breaks>
+- **Rule:** <explicit rule the parser expects>
+- **Why:** <design rationale>
+
+## Runtime / Evaluator
+
+### Variable scope is resolved at <stage>
+- **Problem:** <symptom>
+- **Rule:** <how scope resolution actually works>
+- **Implication:** <what not to assume>
+
+## CLI & Tooling
+
+### Tests must be run with <specific command>
+- **Problem:** <failure mode>
+- **Rule:** <correct workflow>
+
+---
+
+## Mental Model Summary
+
+- Ruff favors <design philosophy>
+- The parser assumes <key assumption>
+- The runtime guarantees <invariant>
+- Do NOT assume <common incorrect assumption>
+```
+
+### Rules for Updating GOTCHAS.md
+
+* Only add **confirmed, repeated, or high-impact** issues
+* Merge duplicates instead of appending
+* Prefer rules over stories
+* Reference session notes where the discovery came from
+
+Example:
+
+```md
+(Discovered during: 2026-01-25_10-14_parser-gotchas.md)
+```
+
+---
+
+## 📑 Optional: notes/README.md Index
+
+If the notes folder grows large, maintain an index:
+
+```md
+# Ruff Field Notes Index
+
+- 2026-01-25_10-14_parser-gotchas.md — Parser edge cases & failure modes
+- 2026-01-25_18-02_vm-scope-fix.md — Runtime scoping corrections
+```
+
+Only add entries for **high-signal sessions**.
+
+---
+
+## ✅ Definition of Done (Recommended)
+
+A task is **not complete** unless:
+
+* code compiles
+* tests pass
+* **session notes are written**
+* GOTCHAS.md is updated *if applicable*
+
+This system turns AI experience into durable project knowledge.
diff --git a/notes/GOTCHAS.md b/notes/GOTCHAS.md
@@ -752,7 +752,8 @@ If you are new to the project, read this first.
 ### Block sealing order matters for SSA
 
 - **Problem:** Panics with "unsealed block" or "block not filled" errors
-- **Rule:** Use two-pass translation - create all blocks first, then translate instructions, seal after adding terminator
+- **Rule:** Use two-pass translation - create all blocks first, then translate instructions, and seal blocks only after predecessor edges are finalized and terminators are emitted.
+- **Rule (critical):** Each block must be sealed exactly once. Early sealing or duplicate sealing on alternate control-flow paths can trigger Cranelift assertions (including already-sealed failures).
 - **Why:** Cranelift requires all predecessors of a block known before sealing (SSA form)
 - **Pattern:**
   ```rust
@@ -767,7 +768,7 @@ If you are new to the project, read this first.
   ```
 - **Implication:** Cannot seal blocks until all jumps to them are generated
 
-(Discovered during: 2026-01-27_14-51_jit-phase-3-completion.md)
+(Discovered during: 2026-01-27_14-51_jit-phase-3-completion.md, reinforced in: 2026-02-12_14-52_hashmap-fusion-jit-sealing-release.md)
 
 ### Hash-based variable resolution has collision risk
 
diff --git a/notes/README.md b/notes/README.md
@@ -23,13 +23,15 @@ These notes capture:
 
 ## Quick Navigation
 
+- **FIELD_NOTES_SYSTEM.md** - Mandatory workflow and template for post-work field notes
 - **GOTCHAS.md** - Start here! Curated list of the most important non-obvious pitfalls
 - Session notes below - Raw, chronological records of each work session
 
 ---
 
 ## Session Notes (Chronological)
 
+- **2026-02-12_14-52_hashmap-fusion-jit-sealing-release.md** — ✅ COMPLETED: Hashmap performance fusion + JIT sealing stability + v0.9.0 release prep. Added fused map opcodes in bytecode/compiler/VM, fixed Cranelift sealing regressions in `src/jit.rs`, validated with full build/tests/benchmarks, and finalized release metadata/tag push.
 - **2026-01-27_phase4e-jit-benchmarking.md** — ✅ COMPLETED: Phase 4E JIT Performance Benchmarking & Validation (v0.9.0). Added 7 micro-benchmark tests + infrastructure validation test. Created 5 real-world benchmark programs. Validated all Phase 4 subsystems: type profiling (11.5M obs/sec), guard generation (46µs per guard), cache lookups (27M/sec), specialization (57M decisions/sec). Phase 4 now 100% complete! ~3 hours implementation time. 3 commits, 198 tests passing.
 - **2026-01-26_vm-exception-handling.md** — ✅ COMPLETED: VM Exception Handling Implementation (v0.9.0 Phase 1). Implemented full exception handling in bytecode VM with BeginTry/EndTry/Throw/BeginCatch/EndCatch opcodes. Added exception handler stack for proper unwinding. Critical lesson: chunk restoration during call frame unwinding. Comprehensive test suite (9 scenarios). VM/interpreter parity achieved. ~3 hours implementation time.
 - **2026-01-26_interpreter-modularization-phase2.md** — ✅ PHASE 2 COMPLETE: Interpreter Modularization. Extracted ControlFlow enum (22 lines) to control_flow.rs and test framework (230 lines) to test_runner.rs. Reduced mod.rs from 14,285 to 14,071 lines (additional -214 lines). Total reduction: -731 lines from original 14,802 (~5%). Documented key decisions: why call_native_function_impl and register_builtins must stay in impl block. Clarified that remaining size is appropriate. Zero warnings, all tests passing.