Tooling skills n config from mtf dev#461
Open
goodboy wants to merge 19 commits into
Open
Conversation
Group `.claude/` ignores per-skill instead of a flat list: `ai.skillz` symlinks, `/open-wkt`, `/code-review-changes`, `/pr-msg`, `/commit-msg`. Add missing symlink entries (`yt-url-lookup` -> `resolve-conflicts`, `inter-skill-review`). Drop stale `Claude worktrees` section (already covered by `.claude/wkts/`). (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit d3d6f64) (cherry picked from commit 82aa0cc)
New "Inspect last failures" section reads the pytest `lastfailed` cache JSON directly — instant, no collection overhead, and filters to `tests/`-prefixed entries to avoid stale junk paths. Also, - add `jq` tool permission for `.pytest_cache/` files (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit ba86d48) (cherry picked from commit 2cad99f)
Rework section 3 from a worktree-only check into a structured 3-step flow: detect active venv, interpret results (Case A: active, B: none, C: worktree), then run import + collection checks. Deats, - Case B prompts via `AskUserQuestion` when no venv is detected, offering `uv sync` or manual activate - add `uv run` fallback section for envs where venv activation isn't practical - new allowed-tools: `uv run python`, `uv run pytest`, `uv pip show`, `AskUserQuestion` (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit b1a0753) (cherry picked from commit 2dab0fd)
Reshuffle `pyproject.toml` deps into per-python-version `[tool.uv.dependency-groups]`: - `subints` group: `msgspec>=0.21.0`, py>=3.14 - `eventfd` group: `cffi>=1.17.1`, py>=3.13,<3.14 - `sync_pause` group: `greenback`, py>=3.13,<3.14 (was in `devx`; moved out bc no 3.14 yet) Bump top-level `msgspec>=0.20.0` too. (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit 34d9d48) (factored: kept only the pyproject dep-group parts of "Raise `subint` floor to py3.14 and split dep-groups"; dropped tractor/spawn/_spawn.py + tractor/spawn/_subint.py) (cherry picked from commit ab6796d)
Fork-based backends (esp. `subint_forkserver`) can leak child actor processes on cancelled / SIGINT'd test runs; the zombies keep the tractor default registry (`127.0.0.1:1616` / `/tmp/registry@1616.sock`) bound, so every subsequent session can't bind and 50+ unrelated tests fail with the same `TooSlowError` / "address in use" signature. Document the pre-flight + post-cancel check as a mandatory step 4. Deats, - **primary signal**: `ss -tlnp | grep ':1616'` for a bound TCP registry listener — the authoritative check since :1616 is unique to our runtime - `pgrep -af` scoped to `$(pwd)/py[0-9]*/bin/python.* _actor_child_main|subint-forkserv` for leftover actor/forkserver procs — scoped deliberately so we don't false-flag legit long-running tractor- embedding apps like `piker` - `ls /tmp/registry@*.sock` for stale UDS sockets - scoped cleanup recipe (SIGTERM + SIGKILL sweep using the same `$(pwd)/py*` pattern, UDS `rm -f`, re-verify) plus a fallback for when a zombie holds :1616 but doesn't match the pattern: `ss -tlnp` → kill by PID - explicit false-positive warning calling out the `piker` case (`~/repos/piker/py*/bin/python3 -m tractor._child ...`) so a bare `pgrep` doesn't lead to nuking unrelated apps Goal: short-circuit the "spelunking into test code" rabbit-hole when the real cause is just a leaked PID from a prior session, without collateral damage to other tractor-embedding projects on the same box. (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit d093c31) (cherry picked from commit 1ebe15d)
The previous cleanup recipe went straight to SIGTERM+SIGKILL, which hides bugs: tractor is structured concurrent — `_trio_main` catches SIGINT as an OS-cancel and cascades `Portal.cancel_actor` over IPC to every descendant. So a graceful SIGINT exercises the actual SC teardown path; if it hangs, that's a real bug to file (the forkserver `:1616` zombie was originally suspected to be one of these but turned out to be a teardown gap in `_ForkedProc.kill()` instead). Deats, - step 1: `pkill -INT` scoped to `$(pwd)/py*` — no sleep yet, just send the signal - step 2: bounded wait loop (10 × 0.3s = ~3s) using `pgrep` to poll for exit. Loop breaks early on clean exit - step 3: `pkill -9` only if graceful timed out, w/ a logged escalation msg so it's obvious when SC teardown didn't complete - step 4: same SIGINT-first ladder for the rare `:1616`-holding zombie that doesn't match the cmdline pattern (find PID via `ss -tlnp`, then `kill -INT NNNN; sleep 1; kill -9 NNNN`) - steps 5-6: UDS-socket `rm -f` + re-verify unchanged Goal: surface real teardown bugs through the test- cleanup workflow instead of papering over them with `-9`. (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit 70d58c4) (cherry picked from commit 222784c)
Encode the hard-won lesson from the forkserver
cancel-cascade investigation into two skill docs
so future sessions grep-find it before spelunking
into trio internals.
Deats,
- `.claude/skills/conc-anal/SKILL.md`:
- new "Unbounded waits in cleanup paths"
section — rule: bound every `await X.wait()`
in cleanup paths with `trio.move_on_after()`
unless the setter is unconditionally
reachable. Recent example:
`ipc_server.wait_for_no_more_peers()` in
`async_main`'s finally (was unbounded,
deadlocked when any peer handler stuck)
- new "The capture-pipe-fill hang pattern"
section — mechanism, grep-pointers to the
existing `conftest.py` guards (`tests/conftest
.py:258`, `:316`), cross-ref to the full
post-mortem doc, and the grep-note: "if a
multi-subproc tractor test hangs, `pytest -s`
first, conc-anal second"
- `.claude/skills/run-tests/SKILL.md`: new
"Section 9: The pytest-capture hang pattern
(CHECK THIS FIRST)" with symptom / cause /
pre-existing guards to grep / three-step debug
recipe (try `-s`, lower loglevel, redirect
stdout/stderr) / signature of this bug vs. a
real code hang / historical reference
Cost several investigation sessions before the
capture-pipe issue surfaced — it was masked by
deeper cascade deadlocks. Once the cascades were
fixed, the tree tore down enough to generate
pipe-filling log volume. Lesson: **grep this
pattern first when any multi-subproc tractor test
hangs under default pytest but passes with `-s`.**
(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
(cherry picked from commit 4106ba7)
(cherry picked from commit 45c4420)
`timeout = 200` was firing via SIGALRM (the default `method='signal'`) which synchronously raises `Failed` in trio's main thread mid-`epoll.poll()`, abandoning trio's runner mid-flight and leaving `GLOBAL_RUN_CONTEXT` half- installed. EVERY subsequent `trio.run()` in the same pytest session then bails with `RuntimeError: Attempted to call run() from inside a run()`. Empirical impact: a session that hits a single 200s hang cascades into 30-40 false-positive failures across every downstream test file that uses `trio.run`. Recent UDS run saw 1 real timeout (`test_unregistered_err_still_relayed`) poison 38 sibling tests with cascade-fails — a debugging nightmare. Same architectural bug we already documented in `tests/test_advanced_streaming.py::test_dynamic_pub_sub` (see its module-level NOTE) — both `pytest-timeout` enforcement modes are incompatible with trio under fork- based spawn backends. Now scoped session-wide. For tests that legitimately need a wall-clock cap, the canonical pattern is `with trio.fail_after(N):` INSIDE the test — trio's own `Cancelled` machinery cleanly unwinds the actor nursery without disturbing global state. For CI: rely on job-level wall-clock timeouts (e.g. GitHub Actions `timeout-minutes`) to abort genuinely-stuck suites. `pyproject.toml` comment block spells this all out so a future contributor doesn't reach back for `timeout =` and re-introduce the bug. ALSO, bump `xonsh` to at least `0.23.0` release. (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit 3c366ca) (cherry picked from commit bbf4fe6)
Document the intermittent connect-refused failure in the registrar daemon test — root cause is the `daemon` fixture's blind `time.sleep()` readiness gate racing against the subproc's `bind()`/ `listen()` completion. Distinct from the cancel- cascade `TooSlowError` flake class. (this commit msg was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit 29f9928) (cherry picked from commit 15c3b67)
Draft plan for consolidating pytest CLI flags, ad-hoc env vars, and hardcoded fixture defaults into the existing (but unused) `RuntimeVars` struct as the single source of truth. Deats, - `_rtvars.py` leaf mod w/ `dump`/`load`/`get`/ `update` helpers using `str(dict)` + `ast.literal_eval` encoding - phased migration: test infra first, then runtime callers, then per-session bindspace - addresses concurrent pytest session collisions and subproc env propagation for `devx/` scripts (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit 7882c37) (cherry picked from commit 79c189b)
Follow-up note documenting why the deeper "Route B" fix for `LogSpec`/`apply_logspec()` true per-leaf-MODULE level control was NOT taken — in favor of the smaller sub-PACKAGE fix that shipped in 9c36363. Doc covers, - Status: what 9c36363 already gives (per-sub-pkg control at any nesting depth, `devx.debug` ≠ `devx`) vs. what remains unaddressed (per-leaf-mod levels, top-level lib mods like `tractor.to_asyncio` on the root logger). - "Route B" sketch: make logger *identity* the full dotted module path; mv the cosmetic leaf-trim out of logger-naming into the *formatter's* `{name}` rendering. - 6 breaking-change costs: every logger name changes, formatter rewrite, propagation/double-emit surface grows, level-inheritance semantics shift, `modden`/`piker` contract churn, `get_logger()` refactor risk. - Migration plan if pursued: extract a pure `_mk_logger_name()` helper w/ an exhaustive name-shape test matrix, swap `get_logger()` to use it for identity, swap formatter to use the display string, golden-diff rendered headers, coordinate w/ downstreams. - "Route A" alternative: a `logging.Filter` keyed on `record.module`/`pathname` for per-leaf control w/o name churn — lower risk, narrower power. - Recommendation: defer Route B; prefer Route A if per-leaf is needed soon; the shipped sub-PKG fix covers the common ask. Lives under `ai/tooling-todos/` since it's a deferred- work decision record, not a triage/conc-anal doc. (this patch was generated in some part by [`claude-code`][claude-code-gh]) [claude-code-gh]: https://github.com/anthropics/claude-code (cherry picked from commit 5b3c2e3) (cherry picked from commit 232d6cc)
There was a problem hiding this comment.
Pull request overview
This PR brings in development tooling configuration and internal documentation, including updates to Python/uv dependency-group setup, CI invocation tweaks, and new/updated Claude “skills” and analysis notes that support tractor’s async/concurrency-heavy development workflow.
Changes:
- Update project Python support metadata (now targeting Python 3.13–3.14) and restructure dependency groups (incl. per-group Python constraints via
tool.uv.dependency-groups). - Adjust CI test invocation flags and add/refresh internal runbook-style docs and concurrency notes.
- Add/update
.claude/skill docs and local agent permission settings.
Reviewed changes
Copilot reviewed 9 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
uv.lock |
Updates dev/repl xonsh source and lock metadata (currently appears inconsistent with pyproject.toml). |
pyproject.toml |
Bumps requires-python, adds 3.14 classifier, restructures dependency groups and uv group Python constraints. |
ai/tooling-todos/logspec_leaf_module_granularity_route_b.md |
Adds internal design notes on logging-spec granularity tradeoffs. |
ai/conc-anal/test_register_duplicate_name_daemon_connect_race_issue.md |
Adds internal analysis of a flaky daemon readiness/connect race in tests. |
ai/conc-anal/fork_thread_semantics_execution_vs_memory.md |
Adds internal reference doc explaining fork semantics in multithreaded programs. |
.gitignore |
Refines ignored Claude/AI tooling artifacts and local notes directories. |
.github/workflows/ci.yml |
Tweaks pytest invocation formatting and capture flag/comment in CI. |
.claude/skills/run-tests/SKILL.md |
Expands the run-tests skill guidance (uv venv detection, zombie registry checks, capture-hang diagnostics). |
.claude/skills/conc-anal/SKILL.md |
Extends concurrency-analysis skill guidance with additional hang/deadlock patterns. |
.claude/settings.local.json |
Updates local agent tool permissions and UI preferences. |
.claude/notes/rt_vars_lift_plan.md |
Adds a draft design plan for consolidating runtime vars into an env-var envelope. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
31
to
+34
| "Programming Language :: Python :: 3 :: Only", | ||
| "Programming Language :: Python :: 3.12", | ||
| "Programming Language :: Python :: 3.13", | ||
| "Programming Language :: Python :: 3.14", |
Comment on lines
+156
to
+157
| --capture=fd | ||
| # ^XXX^ can't work with --spawn-method=main_thread_forkserver |
| "Skill(open-wkt)", | ||
| "Skill(prompt-io)" | ||
| "Bash(echo \"EXIT=$?\")", | ||
| "Read(//tmp/**)" |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Add
claudedev-tooling skills + repo/pkg configMotivation
While building the
main_thread_forkserver(MTF) spawn backend alarge body of non-library dev-tooling accreted alongside the actual
tractorwork —claude-code skills, CI and packaging tweaks, and afew concurrency-analysis writeups. None of it imports or touches
tractor/library code, so there's no reason for it to ride behindthe in-flight backend; it can land on its own, first.
This PR factors out exactly that tractor-independent slice: it
modifies only
.claude/,.github/,.gitignore,ai/and thepyproject/uv.lockpackaging config — zerotractor/ortests/files. Part of the
claude-code helpers tracking in #417.Summary of changes
/run-testsskill:lastfailed-cacheinspection, fuller venv pre-flight coverage, a zombie-actor leak
check, and a SIGINT-first cleanup ladder (graceful cancel before
SIGKILL).--capture/fork-child hang lesson into the/run-testsand
/conc-analskills so future runs don't re-derive it./commit-msgfiles write perms in.claude/settings.local.json, and stash aRuntimeVarsenv-var-lift design note under
.claude/notes/..gitignoreby skill/purpose and ignore notes & snippetsubdirs.
pytestcapture.xonsh, splitpy-version-gated
uvdependency-groups, bumppytest, andadd-then-drop a global
pytest-timeoutcap (net-zero — theexplanatory NOTE stays, the 200s hard cap doesn't).
ai/conc-analwriteups: posix multithreaded-fork()execution-vs-memory semantics, the
test_register_duplicate_namedaemon-connect race, and the
logspecleaf-module granularity(Route B) plan.
Future follow up
Implement the
RuntimeVarsenv-var lift.claude/notes/rt_vars_lift_plan.mdsketches lifting runtime varsto env-var overrides; design only, not yet built.
Pursue the
logspecleaf-module Route B granularityai/conc-anal/logspec_leaf_module_granularity_route_b.mddocumentsthe per-leaf-module log-spec plan as follow-up.
(this pr content was generated in some part by
claude-code)