Skip to content

Tooling skills n config from mtf dev#461

Open
goodboy wants to merge 19 commits into
mainfrom
tooling_skills_n_config_from_mtf_dev
Open

Tooling skills n config from mtf dev#461
goodboy wants to merge 19 commits into
mainfrom
tooling_skills_n_config_from_mtf_dev

Conversation

@goodboy

@goodboy goodboy commented Jun 16, 2026

Copy link
Copy Markdown
Owner

Add claude dev-tooling skills + repo/pkg config

Motivation

While building the main_thread_forkserver (MTF) spawn backend a
large body of non-library dev-tooling accreted alongside the actual
tractor work — claude-code skills, CI and packaging tweaks, and a
few concurrency-analysis writeups. None of it imports or touches
tractor/ library code, so there's no reason for it to ride behind
the in-flight backend; it can land on its own, first.

This PR factors out exactly that tractor-independent slice: it
modifies only .claude/, .github/, .gitignore, ai/ and the
pyproject/uv.lock packaging config — zero tractor/ or tests/
files. Part of the claude-code helpers tracking in #417.

Summary of changes

  • Substantially expand the /run-tests skill: lastfailed-cache
    inspection, fuller venv pre-flight coverage, a zombie-actor leak
    check, and a SIGINT-first cleanup ladder (graceful cancel before
    SIGKILL).
  • Codify the --capture/fork-child hang lesson into the /run-tests
    and /conc-anal skills so future runs don't re-derive it.
  • Grant /commit-msg files write perms in
    .claude/settings.local.json, and stash a RuntimeVars
    env-var-lift design note under .claude/notes/.
  • Reorganize .gitignore by skill/purpose and ignore notes & snippet
    subdirs.
  • Flip CI back to default pytest capture.
  • Packaging churn, no lib impact: pin/bump xonsh, split
    py-version-gated uv dependency-groups, bump pytest, and
    add-then-drop a global pytest-timeout cap (net-zero — the
    explanatory NOTE stays, the 200s hard cap doesn't).
  • Land three ai/conc-anal writeups: posix multithreaded-fork()
    execution-vs-memory semantics, the test_register_duplicate_name
    daemon-connect race, and the logspec leaf-module granularity
    (Route B) plan.

Future follow up

  • Implement the RuntimeVars env-var lift
    .claude/notes/rt_vars_lift_plan.md sketches lifting runtime vars
    to env-var overrides; design only, not yet built.

  • Pursue the logspec leaf-module Route B granularity
    ai/conc-anal/logspec_leaf_module_granularity_route_b.md documents
    the per-leaf-module log-spec plan as follow-up.

(this pr content was generated in some part by claude-code)

goodboy added 19 commits June 16, 2026 14:21
(cherry picked from commit 9cf3d58)
(cherry picked from commit 9a7b595)
Group `.claude/` ignores per-skill instead of a
flat list: `ai.skillz` symlinks, `/open-wkt`,
`/code-review-changes`, `/pr-msg`, `/commit-msg`.
Add missing symlink entries (`yt-url-lookup` ->
`resolve-conflicts`, `inter-skill-review`). Drop
stale `Claude worktrees` section (already covered
by `.claude/wkts/`).

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit d3d6f64)
(cherry picked from commit 82aa0cc)
New "Inspect last failures" section reads the pytest
`lastfailed` cache JSON directly — instant, no
collection overhead, and filters to `tests/`-prefixed
entries to avoid stale junk paths.

Also,
- add `jq` tool permission for `.pytest_cache/` files

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit ba86d48)
(cherry picked from commit 2cad99f)
Rework section 3 from a worktree-only check into a
structured 3-step flow: detect active venv, interpret
results (Case A: active, B: none, C: worktree), then
run import + collection checks.

Deats,
- Case B prompts via `AskUserQuestion` when no venv
  is detected, offering `uv sync` or manual activate
- add `uv run` fallback section for envs where venv
  activation isn't practical
- new allowed-tools: `uv run python`, `uv run pytest`,
  `uv pip show`, `AskUserQuestion`

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit b1a0753)
(cherry picked from commit 2dab0fd)
(cherry picked from commit b524ee4)
(cherry picked from commit 83dab8e)
(cherry picked from commit 64ddc42)
(cherry picked from commit c4951c8)
Reshuffle `pyproject.toml` deps into per-python-version
`[tool.uv.dependency-groups]`:
- `subints` group: `msgspec>=0.21.0`, py>=3.14
- `eventfd` group: `cffi>=1.17.1`, py>=3.13,<3.14
- `sync_pause` group: `greenback`, py>=3.13,<3.14
  (was in `devx`; moved out bc no 3.14 yet)

Bump top-level `msgspec>=0.20.0` too.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 34d9d48)
(factored: kept only the pyproject dep-group parts of
 "Raise `subint` floor to py3.14 and split dep-groups"; dropped
 tractor/spawn/_spawn.py + tractor/spawn/_subint.py)
(cherry picked from commit ab6796d)
(cherry picked from commit 5998774)
(cherry picked from commit 19dd6fc)
Fork-based backends (esp. `subint_forkserver`) can
leak child actor processes on cancelled / SIGINT'd
test runs; the zombies keep the tractor default
registry (`127.0.0.1:1616` / `/tmp/registry@1616.sock`)
bound, so every subsequent session can't bind and
50+ unrelated tests fail with the same
`TooSlowError` / "address in use" signature. Document
the pre-flight + post-cancel check as a mandatory
step 4.

Deats,
- **primary signal**: `ss -tlnp | grep ':1616'` for a
  bound TCP registry listener — the authoritative
  check since :1616 is unique to our runtime
- `pgrep -af` scoped to `$(pwd)/py[0-9]*/bin/python.*
  _actor_child_main|subint-forkserv` for leftover
  actor/forkserver procs — scoped deliberately so we
  don't false-flag legit long-running tractor-
  embedding apps like `piker`
- `ls /tmp/registry@*.sock` for stale UDS sockets
- scoped cleanup recipe (SIGTERM + SIGKILL sweep
  using the same `$(pwd)/py*` pattern, UDS `rm -f`,
  re-verify) plus a fallback for when a zombie holds
  :1616 but doesn't match the pattern: `ss -tlnp` →
  kill by PID
- explicit false-positive warning calling out the
  `piker` case (`~/repos/piker/py*/bin/python3 -m
  tractor._child ...`) so a bare `pgrep` doesn't lead
  to nuking unrelated apps

Goal: short-circuit the "spelunking into test code"
rabbit-hole when the real cause is just a leaked PID
from a prior session, without collateral damage to
other tractor-embedding projects on the same box.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit d093c31)
(cherry picked from commit 1ebe15d)
The previous cleanup recipe went straight to
SIGTERM+SIGKILL, which hides bugs: tractor is
structured concurrent — `_trio_main` catches SIGINT
as an OS-cancel and cascades `Portal.cancel_actor`
over IPC to every descendant. So a graceful SIGINT
exercises the actual SC teardown path; if it hangs,
that's a real bug to file (the forkserver `:1616`
zombie was originally suspected to be one of these
but turned out to be a teardown gap in
`_ForkedProc.kill()` instead).

Deats,
- step 1: `pkill -INT` scoped to `$(pwd)/py*` — no
  sleep yet, just send the signal
- step 2: bounded wait loop (10 × 0.3s = ~3s) using
  `pgrep` to poll for exit. Loop breaks early on
  clean exit
- step 3: `pkill -9` only if graceful timed out, w/
  a logged escalation msg so it's obvious when SC
  teardown didn't complete
- step 4: same SIGINT-first ladder for the rare
  `:1616`-holding zombie that doesn't match the
  cmdline pattern (find PID via `ss -tlnp`, then
  `kill -INT NNNN; sleep 1; kill -9 NNNN`)
- steps 5-6: UDS-socket `rm -f` + re-verify
  unchanged

Goal: surface real teardown bugs through the test-
cleanup workflow instead of papering over them with
`-9`.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 70d58c4)
(cherry picked from commit 222784c)
(cherry picked from commit 76d1206)
(cherry picked from commit 1d70d33)
Encode the hard-won lesson from the forkserver
cancel-cascade investigation into two skill docs
so future sessions grep-find it before spelunking
into trio internals.

Deats,
- `.claude/skills/conc-anal/SKILL.md`:
  - new "Unbounded waits in cleanup paths"
    section — rule: bound every `await X.wait()`
    in cleanup paths with `trio.move_on_after()`
    unless the setter is unconditionally
    reachable. Recent example:
    `ipc_server.wait_for_no_more_peers()` in
    `async_main`'s finally (was unbounded,
    deadlocked when any peer handler stuck)
  - new "The capture-pipe-fill hang pattern"
    section — mechanism, grep-pointers to the
    existing `conftest.py` guards (`tests/conftest
    .py:258`, `:316`), cross-ref to the full
    post-mortem doc, and the grep-note: "if a
    multi-subproc tractor test hangs, `pytest -s`
    first, conc-anal second"
- `.claude/skills/run-tests/SKILL.md`: new
  "Section 9: The pytest-capture hang pattern
  (CHECK THIS FIRST)" with symptom / cause /
  pre-existing guards to grep / three-step debug
  recipe (try `-s`, lower loglevel, redirect
  stdout/stderr) / signature of this bug vs. a
  real code hang / historical reference

Cost several investigation sessions before the
capture-pipe issue surfaced — it was masked by
deeper cascade deadlocks. Once the cascades were
fixed, the tree tore down enough to generate
pipe-filling log volume. Lesson: **grep this
pattern first when any multi-subproc tractor test
hangs under default pytest but passes with `-s`.**

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 4106ba7)
(cherry picked from commit 45c4420)
`timeout = 200` was firing via SIGALRM (the default
`method='signal'`) which synchronously raises `Failed` in
trio's main thread mid-`epoll.poll()`, abandoning trio's
runner mid-flight and leaving `GLOBAL_RUN_CONTEXT` half-
installed. EVERY subsequent `trio.run()` in the same pytest
session then bails with
`RuntimeError: Attempted to call run() from inside a run()`.

Empirical impact: a session that hits a single 200s hang
cascades into 30-40 false-positive failures across every
downstream test file that uses `trio.run`. Recent UDS run
saw 1 real timeout (`test_unregistered_err_still_relayed`)
poison 38 sibling tests with cascade-fails — a debugging
nightmare.

Same architectural bug we already documented in
`tests/test_advanced_streaming.py::test_dynamic_pub_sub`
(see its module-level NOTE) — both `pytest-timeout`
enforcement modes are incompatible with trio under fork-
based spawn backends. Now scoped session-wide.

For tests that legitimately need a wall-clock cap, the
canonical pattern is `with trio.fail_after(N):` INSIDE the
test — trio's own `Cancelled` machinery cleanly unwinds
the actor nursery without disturbing global state.

For CI: rely on job-level wall-clock timeouts (e.g. GitHub
Actions `timeout-minutes`) to abort genuinely-stuck suites.

`pyproject.toml` comment block spells this all out so a
future contributor doesn't reach back for `timeout =` and
re-introduce the bug.

ALSO, bump `xonsh` to at least `0.23.0` release.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 3c366ca)
(cherry picked from commit bbf4fe6)
(cherry picked from commit 532a983)
(cherry picked from commit 23b8a80)
(cherry picked from commit 22cdf15)
(cherry picked from commit 781abf7)
Document the intermittent connect-refused failure in the registrar
daemon test — root cause is the `daemon` fixture's blind `time.sleep()`
readiness gate racing against the subproc's `bind()`/ `listen()`
completion. Distinct from the cancel- cascade `TooSlowError` flake
class.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 29f9928)
(cherry picked from commit 15c3b67)
Draft plan for consolidating pytest CLI flags,
ad-hoc env vars, and hardcoded fixture defaults
into the existing (but unused) `RuntimeVars`
struct as the single source of truth.

Deats,
- `_rtvars.py` leaf mod w/ `dump`/`load`/`get`/
  `update` helpers using `str(dict)` +
  `ast.literal_eval` encoding
- phased migration: test infra first, then
  runtime callers, then per-session bindspace
- addresses concurrent pytest session collisions
  and subproc env propagation for `devx/` scripts

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 7882c37)
(cherry picked from commit 79c189b)
(cherry picked from commit e329c31)
(cherry picked from commit 8d917cb)
Follow-up note documenting why the deeper "Route B" fix
for `LogSpec`/`apply_logspec()` true per-leaf-MODULE level
control was NOT taken — in favor of the smaller
sub-PACKAGE fix that shipped in 9c36363.

Doc covers,
- Status: what 9c36363 already gives (per-sub-pkg
  control at any nesting depth, `devx.debug` ≠ `devx`)
  vs. what remains unaddressed (per-leaf-mod levels,
  top-level lib mods like `tractor.to_asyncio` on the
  root logger).
- "Route B" sketch: make logger *identity* the full
  dotted module path; mv the cosmetic leaf-trim out of
  logger-naming into the *formatter's* `{name}`
  rendering.
- 6 breaking-change costs: every logger name changes,
  formatter rewrite, propagation/double-emit surface
  grows, level-inheritance semantics shift,
  `modden`/`piker` contract churn, `get_logger()`
  refactor risk.
- Migration plan if pursued: extract a pure
  `_mk_logger_name()` helper w/ an exhaustive name-shape
  test matrix, swap `get_logger()` to use it for
  identity, swap formatter to use the display string,
  golden-diff rendered headers, coordinate w/
  downstreams.
- "Route A" alternative: a `logging.Filter` keyed on
  `record.module`/`pathname` for per-leaf control w/o
  name churn — lower risk, narrower power.
- Recommendation: defer Route B; prefer Route A if
  per-leaf is needed soon; the shipped sub-PKG fix
  covers the common ask.

Lives under `ai/tooling-todos/` since it's a deferred-
work decision record, not a triage/conc-anal doc.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code

(cherry picked from commit 5b3c2e3)
(cherry picked from commit 232d6cc)
Copilot AI review requested due to automatic review settings June 16, 2026 20:15

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR brings in development tooling configuration and internal documentation, including updates to Python/uv dependency-group setup, CI invocation tweaks, and new/updated Claude “skills” and analysis notes that support tractor’s async/concurrency-heavy development workflow.

Changes:

  • Update project Python support metadata (now targeting Python 3.13–3.14) and restructure dependency groups (incl. per-group Python constraints via tool.uv.dependency-groups).
  • Adjust CI test invocation flags and add/refresh internal runbook-style docs and concurrency notes.
  • Add/update .claude/ skill docs and local agent permission settings.

Reviewed changes

Copilot reviewed 9 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
uv.lock Updates dev/repl xonsh source and lock metadata (currently appears inconsistent with pyproject.toml).
pyproject.toml Bumps requires-python, adds 3.14 classifier, restructures dependency groups and uv group Python constraints.
ai/tooling-todos/logspec_leaf_module_granularity_route_b.md Adds internal design notes on logging-spec granularity tradeoffs.
ai/conc-anal/test_register_duplicate_name_daemon_connect_race_issue.md Adds internal analysis of a flaky daemon readiness/connect race in tests.
ai/conc-anal/fork_thread_semantics_execution_vs_memory.md Adds internal reference doc explaining fork semantics in multithreaded programs.
.gitignore Refines ignored Claude/AI tooling artifacts and local notes directories.
.github/workflows/ci.yml Tweaks pytest invocation formatting and capture flag/comment in CI.
.claude/skills/run-tests/SKILL.md Expands the run-tests skill guidance (uv venv detection, zombie registry checks, capture-hang diagnostics).
.claude/skills/conc-anal/SKILL.md Extends concurrency-analysis skill guidance with additional hang/deadlock patterns.
.claude/settings.local.json Updates local agent tool permissions and UI preferences.
.claude/notes/rt_vars_lift_plan.md Adds a draft design plan for consolidating runtime vars into an env-var envelope.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pyproject.toml
Comment on lines 31 to +34
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3.14",
Comment thread .github/workflows/ci.yml
Comment on lines +156 to +157
--capture=fd
# ^XXX^ can't work with --spawn-method=main_thread_forkserver
"Skill(open-wkt)",
"Skill(prompt-io)"
"Bash(echo \"EXIT=$?\")",
"Read(//tmp/**)"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants